[REVIEW]: pudu: A Python library for agnostic feature selection and explainability of Machine Learning classification and regression problems.

editorialbot commented 1 year ago

Submitting author: @enricgrau Editor: @arfon Reviewers: @hbaniecki, @aksholokhov Archive: 10.5281/zenodo.10161346

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/cacb5b6520209b0c940bf46638df251d"><img src="https://joss.theoj.org/papers/cacb5b6520209b0c940bf46638df251d/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/cacb5b6520209b0c940bf46638df251d/status.svg)](https://joss.theoj.org/papers/cacb5b6520209b0c940bf46638df251d)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@hbaniecki & @aksholokhov, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @arfon know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @hbaniecki

📝 Checklist for @aksholokhov

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 1 year ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.11 s (775.3 files/s, 281992.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JavaScript                      15           2433           2497           9214
HTML                            19           1540             54           7508
SVG                              1              0              0           2671
Python                          19            600            783           1206
CSS                              4            185             35            762
XML                              1              0              2            711
TeX                              1             18              0            350
reStructuredText                12            161            106            293
YAML                             9             31             47            228
Markdown                         5             58              0            145
TOML                             1              0              1              3
-------------------------------------------------------------------------------
SUM:                            87           5026           3525          23091
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 1 year ago

Wordcount for paper.md is 1061

editorialbot commented 1 year ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/B978-1-55860-247-2.50037-1 is OK
- 10.1007/3-540-57868-4_57 is OK
- 10.1016/J.CJCA.2021.09.004 is OK
- 10.48550/arxiv.1602.04938 is OK
- 10.48550/arxiv.2010.07389 is OK
- 10.1613/JAIR.1.12228 is OK
- 10.1109/ACCESS.2020.2976199 is OK
- 10.1145/3351095.3375624 is OK
- 10.3389/FDATA.2021.688969 is OK
- 10.1109/iccv.2017.74 is OK
- 10.5281/ZENODO.6344451 is OK
- 10.1109/MCSE.2007.55 is OK
- 10.1038/s41586-020-2649-2 is OK

MISSING DOIs

- 10.1109/cvpr.2017.354 may be a valid DOI for title: Network Dissection: Quantifying Interpretability of Deep Visual Representation

INVALID DOIs

- None

arfon commented 1 year ago

@hbaniecki, @aksholokhov – This is the review thread for the paper. All of our communications will happen here from now on.

Please read the "Reviewer instructions & questions" in the first comment above. Please create your checklist typing:

@editorialbot generate my checklist

As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention https://github.com/openjournals/joss-reviews/issues/5873 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for the review process to be completed within about 4-6 weeks but please make a start well ahead of this as JOSS reviews are by their nature iterative and any early feedback you may be able to provide to the author will be very helpful in meeting this schedule.

editorialbot commented 1 year ago

Five most similar historical JOSS papers:

Yellowbrick: Visualizing the Scikit-Learn Model Selection Process Reviewers: @mnarayan Similarity score: 0.7650

Feature-engine: A Python package for feature engineering for machine learning Reviewers: @Jose-Augusto-C-M, @papachristoumarios, @bobturneruk Similarity score: 0.7590

pysr3: A Python Package for Sparse Relaxed Regularized Regression Reviewers: @blakeaw, @mhu48 Similarity score: 0.7583

Sensie: Probing the sensitivity of neural networks Reviewers: @ejhigson, @omshinde Similarity score: 0.7542

High-performance neural population dynamics modeling enabled by scalable computational infrastructure Reviewers: @richford, @tachukao Similarity score: 0.7530

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

hbaniecki commented 1 year ago

Review checklist for @hbaniecki

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/pudu-py/pudu? yes
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license? MIT
[x] Contribution and authorship: Has the submitting author (@enricgrau) made major contributions to the software? Does the full list of paper authors seem appropriate and complete? Authors contribution with CRediT
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines ~In its current state no~ yes
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item. No data
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item. No results
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item. No data

Functionality

[x] Installation: Does installation proceed as outlined in the documentation? Yes
[x] Functionality: Have the functional claims of the software been confirmed? Yes
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.) No claims

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is? ~In its current state no~ yes
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems). https://pudu-py.github.io/pudu/examples.html
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)? https://pudu-py.github.io/pudu/index.html
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support https://pudu-py.github.io/pudu/contributions.html

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided? High-level yes
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work? ~Unclear/incomplete~ yes
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages? ~No~ partially
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)? ~Some errors~ fixed
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax? ~No~ yes

hbaniecki commented 1 year ago

Apart from minor issues with documentation and examples, I have the following major concerns about this contribution:

1. Substantial scholarly effort is (currently) not evident. It is a relatively young software (4k downloads), there is no user base (no issues, no GitHub stars, no blogs or promotion of the software?), and no clear applications (e.g. no citations). With similar software existing already, {pudu} is not likely to be cited unless the software/paper clearly highlights its original purpose, e.g. an original subset of methods implemented, accessible visualisation, better API, unique audience.
2. Statement of need is unclear. I have a hard time understanding the motivation for developing this software. In its current state, the paper's title and description seem too generic. Should "sensitivity analysis" be highlighted in the title?
   - A clear use-case for this software would help, e.g. the paper mentions the RELIEF method but does not elaborate on applications/usefulness of the algorithm to show that the {pudu} package is indeed needed in situation X or for user Y.
   - The terminology of "Importance/speed/synergy/reactivation" is unclear and could be better described in the paper.
   - Adding a figure to the paper could help illustrate the statement of need.
3. State of the field is missing. There is no description of how this software compares to other similar packages. Consider the attached non-exhaustive list of software for explainability and feature selection. Relating {pudu} to other software could partially alleviate concerns mentioned in 1. & 2., i.e. show the added value of {pudu} over the already available solutions, motivate the need for such a new package, highlight the target audience etc.

I am open to discussion and hope the software paper can be improved to clearly state the motivation and effort.

References (non-exhaustive list)

Molnar et al. iml: An R package for Interpretable Machine Learning. JOSS 2018
Alber et al. iNNvestigate Neural Networks! JMLR 2019
Kokhlikyan et al. Captum: A unified and generic model interpretability library for PyTorch. arXiv:2009.07896 2020
Arya et al. AI Explainability 360: An Extensible Toolkit for Understanding Data and Machine Learning Models. JMLR 2020
(ours) dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python. JMLR 2021
Klaise et al. Alibi Explain: Algorithms for Explaining Machine Learning Models. JMLR 2021
Li et al. InterpretDL: Explaining Deep Models in PaddlePaddle. JMLR 2022
Zhu et al. abess: A Fast Best-Subset Selection Library in Python and R. JMLR 2022

hbaniecki commented 1 year ago

Quality of writing

There is some vague language, e.g. "help make sense of machine learning results", "Easy plotting of the results"
Acronyms "ML/XAI" are defined but later unused
Typos in references: "Bhatt et al. (2020)](Belle & Papantonis, 2021)" "Bau2018"

aksholokhov commented 1 year ago

Review checklist for @aksholokhov

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/pudu-py/pudu?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@enricgrau) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[ ] Functionality: Have the functional claims of the software been confirmed?
[ ] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[ ] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[ ] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[ ] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[ ] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

aksholokhov commented 1 year ago

@arfon my full review is here.

arfon commented 1 year ago

Thanks for your reviews @hbaniecki and @aksholokhov. @enricgrau – please take a look at the feedback from both reviewers and share your responses here. Of particular focus should be a response to @hbaniecki's feedback here: https://github.com/openjournals/joss-reviews/issues/5873#issuecomment-1761368406

enricgrau commented 1 year ago

Thanks to @hbaniecki and @aksholokhov for the impeccable reviews. We have been working through all of your comments and concerns, and we hope to respond to every point raised within the next couple of weeks. Thank you @arfon for your attention to this review. We are excited to show how much the article and documentation have improved once we finish with the corrections.

enricgrau commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

enricgrau commented 1 year ago

@arfon We have modified the paper.md and asked the editorialbot to re-generate the pdf but it is rendering the same old version. Can you help us with this? Shall we wait more time to re-generate? Thank you!

Edit: Could this be due to the version change from v0.3.0 to v0.3.2?

enricgrau commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

enricgrau commented 1 year ago

@editorialbot set v0.3.2 as version

editorialbot commented 1 year ago

I'm sorry @enricgrau, I'm afraid I can't do that. That's something only editors are allowed to do.

enricgrau commented 1 year ago

@arfon We have responded to @aksholokhov and @hbaniecki in issues https://github.com/pudu-py/pudu/issues/4 and https://github.com/pudu-py/pudu/issues/3. For now, please check the preview of the revised paper.md found in the repository here. The new pdf should be generated from this same file, so we hope this causes no problems in the review process. Thank you all again for your valuable time and have a great weekend! 😄

arfon commented 1 year ago

> @arfon We have modified the paper.md and asked the editorialbot to re-generate the pdf but it is rendering the same old version. Can you help us with this? Shall we wait more time to re-generate? Thank you!

I think this might be happening as you now have two paper.md files in the repository. @editorialbot will simply compile the first one it finds. Could you delete the one you do not want to be compiled?
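
A quick way to check for this situation is to list every `paper.md` in a local clone of the repository. The following is a minimal sketch, not part of @editorialbot; the helper name `find_paper_files` is hypothetical and it only assumes the repository has been cloned locally.

```python
from pathlib import Path


def find_paper_files(repo_root, name="paper.md"):
    """Return every file called `name` under repo_root, as sorted relative paths."""
    root = Path(repo_root)
    # rglob walks the whole tree, so copies in subfolders are found too.
    return sorted(p.relative_to(root) for p in root.rglob(name) if p.is_file())
```

Calling `find_paper_files(".")` from the repository root returns the relative paths of all matches; if more than one comes back, the copies that should not be compiled can be deleted or renamed.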

enricgrau commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

enricgrau commented 1 year ago

> > @arfon We have modified the paper.md and asked the editorialbot to re-generate the pdf but it is rendering the same old version. Can you help us with this? Shall we wait more time to re-generate? Thank you!
>
> I think this might be happening as you now have two paper.md files in the repository. @editorialbot will simply compile the first one it finds. Could you delete the one you do not want to be compiled?

That did the trick. Thank you!

hbaniecki commented 1 year ago

Hi, I believe the authors did a good job at improving the software/paper. My remaining comments are minor (see https://github.com/pudu-py/pudu/issues/4#issuecomment-1794775137).

I can recommend acceptance of the {pudu} software paper to JOSS.

aksholokhov commented 1 year ago

@arfon The authors addressed my feedback in full and I can recommend the acceptance of the {pudu} paper to JOSS as well.

arfon commented 1 year ago

@enricgrau – looks like we're very close to being done here. I will circle back here next week, but in the meantime, please give your own paper a final read to check for any potential typos etc.

After that, could you make a new release of this software that includes the changes that have resulted from this review? Then, please make an archive of the software in Zenodo/figshare/other service and update this thread with the DOI of the archive. For the Zenodo/figshare archive, please make sure that:

- The title of the archive is the same as the JOSS paper title
- The authors of the archive are the same as the JOSS paper authors

I can then move forward with accepting the submission.

arfon commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

enricgrau commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

enricgrau commented 1 year ago

@arfon I have made a final revision and created the Zenodo archive with the final version. I changed the title to match the paper and added all the authors. The DOI is 10.5281/zenodo.10161346

enricgrau commented 12 months ago

@arfon Just a friendly reminder. Thank you!

arfon commented 11 months ago

@enricgrau – my apologies, somehow I lost track of this one!

arfon commented 11 months ago

@editorialbot set 10.5281/zenodo.10161346 as archive

editorialbot commented 11 months ago

Done! archive is now 10.5281/zenodo.10161346

arfon commented 11 months ago

@editorialbot recommend-accept

editorialbot commented 11 months ago

Attempting dry run of processing paper acceptance...

editorialbot commented 11 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/S0924-2031(03)00045-6 is OK
- 10.3390/analytica3030020 is OK
- 10.1038/srep19414 is OK
- 10.1201/9781003328513-9 is OK
- 10.1016/J.ECOENV.2022.114405 is OK
- 10.1002/9781119763406.CH8 is OK
- 10.1038/s41524-022-00884-7 is OK
- 10.1016/B978-1-55860-247-2.50037-1 is OK
- 10.1007/3-540-57868-4_57 is OK
- 10.1016/J.CJCA.2021.09.004 is OK
- 10.48550/arxiv.1602.04938 is OK
- 10.48550/arxiv.2010.07389 is OK
- 10.1613/JAIR.1.12228 is OK
- 10.1109/ACCESS.2020.2976199 is OK
- 10.1145/3351095.3375624 is OK
- 10.3389/FDATA.2021.688969 is OK
- 10.1109/iccv.2017.74 is OK
- 10.5281/ZENODO.6344451 is OK
- 10.1109/cvpr.2017.354 is OK
- 10.1109/MCSE.2007.55 is OK
- 10.1038/s41586-020-2649-2 is OK
- 10.1145/3313831.3376219 is OK
- 10.21105/JOSS.05220 is OK
- 10.1145/3351095.3375624 is OK
- 10.3389/FDATA.2021.688969 is OK
- 10.1002/AENM.202103163 is OK
- 10.1039/d1ta01299a is OK
- 10.5281/ZENODO.4743323 is OK

MISSING DOIs

- 10.21203/rs.3.rs-2963888/v1 may be a valid DOI for title: The Disagreement Problem in Explainable Machine Learning: A Practitioner’s Perspective

INVALID DOIs

- 10.1116/1.5140587/247679 is INVALID
- 10.1103/REVMODPHYS.79.353/FIGURES/62/MEDIUM is INVALID

editorialbot commented 11 months ago

:warning: Error preparing paper acceptance. The generated XML metadata file is invalid.

ID ref-Bhatt2020 already defined
ID ref-Belle2021 already defined

arfon commented 11 months ago

@enricgrau – could you check your references in your BibTeX file please? It looks like there are duplicate entries for Bhatt2020 and Belle2021
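
Duplicate citation keys like these can be spotted before asking the bot to recompile. The sketch below is a rough check under stated assumptions: it uses a simple regular expression rather than a full BibTeX parser, treats keys as case-sensitive, and the function name `duplicate_bibtex_keys` is hypothetical.

```python
import re
from collections import Counter

# Matches the start of an entry such as "@article{Bhatt2020," and captures
# the entry type and the citation key.
ENTRY_KEY = re.compile(r"@\s*(\w+)\s*\{\s*([^,\s]+)\s*,")


def duplicate_bibtex_keys(bibtex_text):
    """Return citation keys that appear more than once, in sorted order."""
    keys = [
        key
        for entry_type, key in ENTRY_KEY.findall(bibtex_text)
        # Skip BibTeX directives, which are not bibliography entries.
        if entry_type.lower() not in {"comment", "string", "preamble"}
    ]
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)
```

Reading the bibliography file into a string and passing it to `duplicate_bibtex_keys` returns the keys that need to be merged or removed; an empty list means no key is defined twice.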

enricgrau commented 11 months ago

@editorialbot generate pdf

editorialbot commented 11 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

enricgrau commented 11 months ago

@arfon No problem :) I fixed the DOIs and also deleted the duplicate entries. Thank you!

arfon commented 11 months ago

@editorialbot recommend-accept

editorialbot commented 11 months ago

Attempting dry run of processing paper acceptance...

editorialbot commented 11 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/S0924-2031(03)00045-6 is OK
- 10.3390/analytica3030020 is OK
- 10.1116/1.5140587 is OK
- 10.1038/srep19414 is OK
- 10.1201/9781003328513-9 is OK
- 10.1016/J.ECOENV.2022.114405 is OK
- 10.1002/9781119763406.CH8 is OK
- 10.1103/REVMODPHYS.79.353 is OK
- 10.1038/s41524-022-00884-7 is OK
- 10.1016/B978-1-55860-247-2.50037-1 is OK
- 10.1007/3-540-57868-4_57 is OK
- 10.1016/J.CJCA.2021.09.004 is OK
- 10.48550/arxiv.1602.04938 is OK
- 10.48550/arxiv.2010.07389 is OK
- 10.1613/JAIR.1.12228 is OK
- 10.1109/ACCESS.2020.2976199 is OK
- 10.3389/FDATA.2021.688969 is OK
- 10.1109/iccv.2017.74 is OK
- 10.5281/ZENODO.6344451 is OK
- 10.1109/cvpr.2017.354 is OK
- 10.1109/MCSE.2007.55 is OK
- 10.1038/s41586-020-2649-2 is OK
- 10.1145/3313831.3376219 is OK
- 10.21105/JOSS.05220 is OK
- 10.1145/3351095.3375624 is OK
- 10.1002/AENM.202103163 is OK
- 10.1039/d1ta01299a is OK
- 10.5281/ZENODO.4743323 is OK

MISSING DOIs

- 10.21203/rs.3.rs-2963888/v1 may be a valid DOI for title: The Disagreement Problem in Explainable Machine Learning: A Practitioner’s Perspective

INVALID DOIs

- None

editorialbot commented 11 months ago

:wave: @openjournals/dsais-eics, this paper is ready to be accepted and published.

Check final proof :point_right::page_facing_up: Download article

If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/4826, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept
