[REVIEW]: AmpTorch: A Python package for scalable fingerprint-based neural network training on multi-element systems with integrated uncertainty quantification

editorialbot commented 1 year ago

Submitting author: !--author-handle-->@ajmedford@dhhagan<!--end-editor-- Reviewers: @ml-evs, @ianfhunter Archive: 10.5281/zenodo.8151492

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/ba551fc487e19526830cab97097fd86d"><img src="https://joss.theoj.org/papers/ba551fc487e19526830cab97097fd86d/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/ba551fc487e19526830cab97097fd86d/status.svg)](https://joss.theoj.org/papers/ba551fc487e19526830cab97097fd86d)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@ml-evs & @professoralkmin, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @dhhagan know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @ml-evs

📝 Checklist for @ianfhunter

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 1 year ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.26 s (824.6 files/s, 166125.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                             10           7091           1998          24319
Python                          48            982            474           5913
C/C++ Header                     9            161             60            719
ANTLR Grammar                  132              0              0            595
reStructuredText                 4             67             13            195
Markdown                         2             32              0            181
YAML                             6              8              9            140
TeX                              1              7              0             93
DOS Batch                        1              8              1             26
make                             1              4              7              9
-------------------------------------------------------------------------------
SUM:                           214           8360           2562          32190
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 1 year ago

Wordcount for paper.md is 1074

editorialbot commented 1 year ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.cpc.2016.05.010 is OK
- 10.48550/arxiv.1908.08381 is OK
- 10.1021/acs.jpcc.0c04225 is OK
- 10.1088/1361-648X/AA680E is OK

MISSING DOIs

- 10.21203/rs.3.rs-952157/v1 may be a valid DOI for title: A Universal Framework for Featurization of Atomistic Systems
- 10.1088/2632-2153/ac8fe0 may be a valid DOI for title: FINETUNA: Fine-tuning Accelerated Molecular Simulations

INVALID DOIs

- 10.1021/ACSCATAL.0C04525/SUPPL\_FILE/CS0C04525\_SI\_001.PDF URL is INVALID

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

ml-evs commented 1 year ago

Review checklist for @ml-evs

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/ulissigroup/amptorch?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@ajmedford) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

ml-evs commented 1 year ago

I started my review a few weeks ago but am still having installation issues raised at https://github.com/ulissigroup/amptorch/issues/108, @ajmedford could you please take a look?

ajmedford commented 1 year ago

@ml-evs Thanks for your efforts on this, and apologies for the delay. I just posted a note to https://github.com/ulissigroup/amptorch/issues/108, we would like your advice on how to best proceed.

dhhagan commented 1 year ago

@professoralkmin Sorry for the delay here - you can go ahead and generate your checklist by adding a comment with @editorialbot generate my checklist

ml-evs commented 1 year ago

Just pinging @ajmedford in case it was missed, I raised an issue with some comments on the code and documentation at https://github.com/ulissigroup/amptorch/issues/112.

dhhagan commented 1 year ago

Hey @professoralkmin - I just wanted to ping you and see if you had any questions about proceeding with the review?

ajmedford commented 1 year ago

@ml-evs - Thanks for the reminder and suggestions. Nicole has been working on this, and fixed a few more things along the way. We're hoping to have it turned around by the end of the week.

dhhagan commented 1 year ago

👋 @hiendn would you be willing to review this manuscript?

ajmedford commented 1 year ago

@dhhagan - We are in the process of a large revision to the code based on @ml-evs feedback, and expect to have the new version out in the next few days. I am not sure of the exact workflow of the revision process, but it might make sense to wait for these changes to the code before the next reviewer takes a look. Let me know what you think.

ajmedford commented 1 year ago

@ml-evs @dhhagan Thank you for taking the time to review the code and for providing valuable feedback. We appreciate your thoughtful comments and suggestions, which have helped us to improve the quality of our work. Given the comments, @nicoleyghu has led the implementation of a number of changes, which are listed below.

DONE

Re-organized the example folder and provided a readme.md file that walks through the general structure to energy (and forces) tasks for the two main fingerprinting + neural network models, and constructing and training with lmdb files. (Comment #3)
Completed the documentation for main functions and classes for the module and integrated it into the documentation (see API here). (Comment #5)
Made a default method for loading fitted pseudo-densities so the general users don't have to import it manually while allowing importing if specifically specified. (Comment #6)
Imported all test scripts into unittest, and made all relevant changes. The tests and examples should work now. (Comment #7)

There are also a few points that we have not completed yet, and the discussion is below:

[DOING] Comments #1 and #2: We are currently working on integrating the dataclass(es) for config of AtomsTrainer into the code, but as it requires refactoring how the code loads from Python Dictionary, it would probably happen for a later release.
Comment #2: We hope the example scripts now provide a better starting point for the users as to GMP+SingleNN or SF+BPNN's training schemes.
Comment #3: The key findings in GMP and Uncertainty Quantification paper require training with a larger dataset that is a bit difficult to incorporate into a demo, but the general training pertains to what is shown in the examples folder just with hyperparameter optimized for the given task. Hopefully this is sufficient.
Comment #8: We would like to finalize a release after hearing the feedback from reviewer(s).

Thank you again for your time and effort in reviewing our code and we look forward to your continued feedback. @dhhagan - please advise on the remainder of the peer review process. We would like to wait until the paper is fully "accepted" and then finalize the next release.

dhhagan commented 1 year ago

@ajmedford Just to clarify, you would like the review to cover v0.1 and not the new release, correct? I am still trying to find a second reviewer as one seems to have vanished. Otherwise, @ml-evs should be able to go through the changes you listed above and finalize their review.

ajmedford commented 1 year ago

@dhhagan We would like to have reviewers look at the current master branch. Once all the reviews are finalized, we will then release a new version that corresponds to the "accepted" paper. Let me know if this makes sense, or if you recommend something else.

dhhagan commented 1 year ago

Hi @ml-evs, have you had a chance to look over any of the changes at this point?

dhhagan commented 1 year ago

👋 @pmeier @ianfhunter - are either of you able to review this manuscript?

ianfhunter commented 1 year ago

👋 @pmeier @ianfhunter - are either of you able to review this manuscript?

Yes, I can provide a review if desired

ml-evs commented 1 year ago

Hi @ml-evs, have you had a chance to look over any of the changes at this point?

I have been off work due to a bereavement, but can take another look when I am back next week. I don't think my re-review is blocking for the other outstanding review.

pmeier commented 1 year ago

In general, I'm available as well. However, I'm a little worried about

-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                             10           7091           1998          24319
Python                          48            982            474           5913

I'm not a C++ dev and thus cannot review anything there beyond the basics. A quick look through the repository reveals that this the C++ is mostly inside the amptorch.descriptor.GMP module and thus relatively "contained". In case my review is needed, maybe @ajmedford can comment if extensive C++ knowledge is required.

ajmedford commented 1 year ago

@pmeier - I don't think the C++ should be too extensive. This is mostly just a performance optimization, and the technical details have been fairly well vetted at this point.

dhhagan commented 1 year ago

@editorialbot add @ianfhunter as reviewer

editorialbot commented 1 year ago

@ianfhunter added to the reviewers list!

dhhagan commented 1 year ago

Hi @ianfhunter sorry for the delay - I have added you as a reviewer and you should be able to generate the checklist by running @editorialbot generate my checklist. Please let me know if you have any questions either via a comment here or at david.hagan@quant-aq.com.

dhhagan commented 1 year ago

In general, I'm available as well. However, I'm a little worried about
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                             10           7091           1998          24319
Python                          48            982            474           5913
I'm not a C++ dev and thus cannot review anything there beyond the basics. A quick look through the repository reveals that this the C++ is mostly inside the amptorch.descriptor.GMP module and thus relatively "contained". In case my review is needed, maybe @ajmedford can comment if extensive C++ knowledge is required.

Hi @pmeier thanks for offering your time, but I think we've got it covered with 2 reviewers at this point. I will keep you in mind for future python reviews!

ianfhunter commented 1 year ago

Review checklist for @ianfhunter

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/ulissigroup/amptorch?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@ajmedford) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

ajmedford commented 1 year ago

@ianfhunter @ml-evs @dhhagan - let us know if there are any outstanding issues, or what the next steps in the process are. @nicoleyghu is heading off to an internship soon, and graduating at the end of summer, so we would like to get this wrapped up before then if possible.

ianfhunter commented 1 year ago

@dhhagan I'm finished with my review. Everything looks good and polished.

Well done 👏

ml-evs commented 1 year ago

I've raised a few final comments in https://github.com/ulissigroup/amptorch/issues/117 and https://github.com/ulissigroup/amptorch/issues/112 that I hope are straightforward to address, then I think this is good to go.

dhhagan commented 1 year ago

Hi @ajmedford - as soon as the issues from @ml-evs are addressed, I will take a final look and move through the approval process! Let me know if you have any questions.

ajmedford commented 1 year ago

Thanks @dhhagan! I think we have addressed the issues, but I will let @ml-evs confirm.

ml-evs commented 1 year ago

My suggestions have been implemented to my satisfaction --- I'm very happy to recommend this package for publication in JOSS, @dhhagan.

dhhagan commented 1 year ago

@editorialbot generate pdf

dhhagan commented 1 year ago

@editorialbot check references

editorialbot commented 1 year ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1088/2632-2153/ACA7B1 is OK
- 10.1016/j.cpc.2016.05.010 is OK
- 10.48550/arxiv.1908.08381 is OK
- 10.1021/acs.jpcc.0c04225 is OK
- 10.1088/1361-648X/AA680E is OK

MISSING DOIs

- 10.21203/rs.3.rs-952157/v1 may be a valid DOI for title: A Universal Framework for Featurization of Atomistic Systems
- 10.1088/2632-2153/ac8fe0 may be a valid DOI for title: FINETUNA: Fine-tuning Accelerated Molecular Simulations

INVALID DOIs

- 110.1002/qua.24890 is INVALID
- 10.1021/ACSCATAL.0C04525/SUPPL\_FILE/CS0C04525\_SI\_001.PDF URL is INVALID

editorialbot commented 1 year ago

:warning: An error happened when generating the pdf.

dhhagan commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:warning: An error happened when generating the pdf.

dhhagan commented 1 year ago

HI @ajmedford it looks like the bot is having a hard time generating the pdf due to what it claims to be a missing author affiliation, though I checked and it looks good to me. I will reach out to an EiC to see if they can help debug.

In the meantime, can you take a look at the missing and invalid DOIs per the comment above? If you feel they are properly formed, please let me know and I can manually verify and bypass the recommendation of editorialbot.

@arfon Do you have any idea why the bot is failing? I looked through the stack trace and it claims a missing author affiliation but it looks okay to me?

ajmedford commented 1 year ago

Sure, we can check the DOI's. A few papers have now appeared in journals so we can update those.

Not sure if this matters, but we did add one author (Jacob Paras) in between the initial submission and the final version. Perhaps this is causing an affiliation issue? Let us know if we need to make any changes.

ajmedford commented 1 year ago

@dhhagan - we have updated the DOIs now, and double-checked the author list. Let me know if it compiles now.

ml-evs commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:warning: An error happened when generating the pdf.

ml-evs commented 1 year ago

Just trying to help debug this @ajmedford, you can see the PDF generation errors here: https://github.com/openjournals/joss-papers/actions/runs/5253611987/jobs/9491155775

Its a bit hidden, but it looks like the remaining problem is just a typo in the affiliation setting:

/home/runner/work/_actions/xuanxu/paper-action/main/vendor/bundle/ruby/3.2.0/gems/theoj-1.5.4/lib/theoj/paper.rb:158:in `failure': 
Author (Jacob Paras) is missing affiliation (Theoj::Error)

I can make a quick PR that fixes this for you, then you should be able to run the same command I did above to check for any other errors.

@dhhagan could you then advise of the next step to get this out of the door? I also note that one of the reviewers listed above (professoralkmin) did not end up reviewing the paper.

ml-evs commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

dhhagan commented 1 year ago

@ml-evs thanks for fixing that issue! I am going to give it one more read-through and then should have it out the door by tonight/tomorrow.

ajmedford commented 1 year ago

Thanks @dhhagan! We just got feedback from the co-authors, and have the following list of minor changes that @nicoleyghu will work on getting implemented and pushed ASAP. Other than that it looks great!

L26: "relatively small datapoints" -> "relatively small datasets" L35, 36, 39, etc.: AMP -> Amp L70: "both constructing..." -> "both constructing and applying" L74: "Open Catalyst 2020 Dataset" -> "datasets in the Open Catalyst Project" L80: "fill in gap of" -> "fill this gap by"

openjournals / joss-reviews