pyOpenSci / software-submission

Submit your package for review by pyOpenSci here! If you have questions please post them here: https://pyopensci.discourse.group/

ncompare #146

Closed: danielfromearth closed this issue 3 months ago

danielfromearth commented 11 months ago

Submitting Author: Daniel Kaufman (@danielfromearth)
All current maintainers: (@danielfromearth)
Package Name: ncompare
One-Line Description of Package: ncompare compares two netCDF files at the command line, generating a report of the matching and non-matching groups, variables, and attributes.
Repository Link: https://github.com/nasa/ncompare
Version submitted: 1.4.0
Editor: @tomalrussell
Reviewer 1: @cmarmo
Reviewer 2: @cmtso
Archive: 10.5281/zenodo.10625407
JOSS DOI: 10.21105/joss.06490
Version accepted: 1.7.2
Date accepted (month/day/year): 02/06/2024


Code of Conduct & Commitment to Maintain Package

Description

This tool ("ncompare") compares the structure of two Network Common Data Form (NetCDF) files at the command line. It facilitates rapid, human-readable, multi-file evaluation by generating a formatted display of the matching and non-matching groups, variables, and associated metadata between two NetCDF datasets. The user has the option to colorize the terminal output for ease of viewing. As an option, ncompare can save comparison reports in text and/or comma-separated value (CSV) formats.

Scope

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo

Community Partnerships

If your package is associated with an existing community please check below:

[^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.

The target audience is anyone who manages the generation, manipulation, or validation of netCDF files. This package can be applied to these netCDF file tasks in any scientific discipline, although it is most relevant to applications with large multidimensional datasets, e.g., comparing climate models, Earth science data reanalyses, and remote sensing data.

The ncdiff function in the NCO (netCDF Operators) library, as well as ncmpidiff and nccmp, compute value differences but, as far as we are aware, do not have a dedicated function to show structural differences between netCDF4 datasets. Our package, ncompare, provides a lightweight Python-based tool for rapid visual comparisons of group and variable structures, attributes, and chunking.
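To make the distinction concrete, here is a short sketch contrasting the two approaches (the `ncdiff` usage follows NCO's documented three-argument form; filenames are placeholders):

```bash
# NCO's ncdiff computes value differences, writing per-variable
# subtractions to a third file
ncdiff file_a.nc file_b.nc value_differences.nc

# ncompare instead reports structural differences (groups, variables,
# attributes) directly in the terminal, without computing value diffs
ncompare file_a.nc file_b.nc
```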

Pre-submission inquiry #142

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication Options

JOSS Checks

- [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [x] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor 'utility' packages, including 'thin' API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] (NOT YET) The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] (NOT YET) The package is deposited in a long-term repository with the DOI:

*Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option allows reviewers to open smaller issues that can then be linked to PRs, rather than submitting a denser text-based review. It also allows you to demonstrate addressing each issue via PR links.

Confirm each of the following by checking the box.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

NickleDave commented 11 months ago

Thanks @danielfromearth for making the full submission!

Here are the initial editor-in-chief checks.

Almost everything we need to start the review is there, but there's one item I don't see.

I think you can add this at the top of the single-page docs you have now, basically repeating what you have in the README.

Please let me know if you can add that.
We're ready to find an editor and reviewers once you do.

Editor in Chief checks



Editor comments

danielfromearth commented 11 months ago

👋 Hi @NickleDave, thanks for the info! I've now updated the GitHub Pages to include the installation and basic usage section from the README.md.

NickleDave commented 10 months ago

Great, thank you @danielfromearth -- sorry I didn't reply sooner.

Just letting you know that @tomalrussell has graciously volunteered to be the editor for this review. We are now seeking reviewers and @tomalrussell will reply here with next steps once we have found them.

tomalrussell commented 10 months ago

Hi @danielfromearth, thanks for the package submission - we've now identified reviewers and can move on to the next steps.

👋🏻 welcome, @cmarmo and @cmtso! Thank you for volunteering to review for pyOpenSci ☺️ The review process is outlined in the guides below, and I'll aim to check in occasionally over the next weeks as we go.

Please fill out our pre-review survey

Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.

The following resources will help you complete your review:

  1. Here is the reviewers guide. This guide contains all of the steps and information needed to complete your review.
  2. Here is the review template that you will need to fill out and submit here as a comment, once your review is complete.

Please get in touch with any questions or concerns! Your review is due in ~3 weeks: 13 December 2023


Reviewers: @cmarmo @cmtso Due date: 13 December 2023

tomalrussell commented 9 months ago

Hi @cmarmo, @cmtso, just checking in - thanks for filling out the pre-review surveys! Are you both on track to post reviews next week? Do raise any questions here or drop me a line.

cmarmo commented 9 months ago

@tomalrussell , @danielfromearth , sorry for the silence! I'm glad to help with this review, and yes I'm going to take care of the review during the week! I had some commitments to honor with closer deadlines, but I'm available now.

cmtso commented 9 months ago

Hi @tomalrussell, it's been a busy two weeks. But I should be able to make the deadline next week.

cmtso commented 9 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than tall. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 3

Review Comments

The package looks very good: the package structure, documentation, and tests all look very good. It is a simple-to-use, Python-based command-line tool for comparing the structure of two netCDF files. Note that this tool only examines the structure of the files, so if two files have different values but the same fields/structure, this tool will not tell them apart.

I have tested the package with all the example tests provided, as well as some of my own data. However, there is no CI in place at the moment, which may be something to work on in the future.
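To make the structure-only behaviour above concrete, here is a minimal sketch using the netCDF4 library: it creates two files with identical structure but different data values, which (per the comment above) a structural comparison would report as matching.

```python
# Two files with the same dimensions and variables,
# differing only in their data values
from netCDF4 import Dataset
import numpy as np

for path, fill_value in [("a.nc", 1.0), ("b.nc", 2.0)]:
    with Dataset(path, "w") as ds:
        ds.createDimension("x", 3)
        var = ds.createVariable("temperature", "f4", ("x",))
        var[:] = np.full(3, fill_value, dtype="f4")

# Running `ncompare a.nc b.nc` would flag no structural differences,
# even though every value in `temperature` differs between the files.
```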

cmarmo commented 9 months ago

Hello everybody!

Thanks for giving me the opportunity to review ncompare! Please find my comments following.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than tall. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 2h

Review Comments

Below is a summary of my comments.

danielfromearth commented 8 months ago

Hi folks, a quick update: I'm wrapping up some final tweaks to respond to the review comments, and will also hopefully be able to add a JOSS paper to the repo next week.

danielfromearth commented 8 months ago

My responses to the comments and blank check-boxes from @cmtso's review can be found next to the green checkmarks (✅) below:

Documentation

The package includes all the following forms of documentation:

  • [ ] (partial) A statement of need clearly stating problems the software is designed to solve and its target audience in README.
  • [ ] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • [ ] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

The README should include, from top to bottom:

  • [x] Badges for:

    • [ ] Continuous integration and test coverage,
    • [ ] Docs building (if you have a documentation website),
    • [ ] A repostatus.org badge,
    • [ ] Python versions supported,
  • [ ] Citation information

Functionality

  • [ ] Automated tests:

    • [ ] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
    • [ ] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
  • [ ] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
  • [x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:

    • [x] Package supports modern versions of Python and not End of life versions.
    • [ ] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

  • [ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software
  • [ ] Authors: A list of authors with their affiliations
  • [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Estimated hours spent reviewing:

3

Review Comments

The package looks very good: the package structure, documentation, and tests all look very good. It is a simple-to-use, Python-based command-line tool for comparing the structure of two netCDF files. Note that this tool only examines the structure of the files, so if two files have different values but the same fields/structure, this tool will not tell them apart.

I have tested the package with all the example tests provided, as well as some of my own data. However, there is no CI in place at the moment, which may be something to work on in the future.

danielfromearth commented 8 months ago

My responses to the comments and blank check-boxes from @cmarmo's review can be found next to the green checkmarks (✅) below:

The README should include, from top to bottom:

  • [ ] Badges for:

    • [ ] Continuous integration and test coverage,
    • [ ] Docs building (if you have a documentation website),
    • [ ] A repostatus.org badge,
    • [ ] Python versions supported,
    • [x] Current package version (on PyPI / Conda).
  • [ ] Link to your documentation website.
  • [ ] Citation information

Functionality

  • [ ] Automated tests:
    • [ ] Tests cover essential functions of the package and a reasonable range of inputs and conditions: ncompare/core.py and ncompare/printing.py may benefit from improved test coverage; both files have coverage below 75%, and they are the core of the application (see the coverage command sketch below).
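One way to check this locally (assuming the `pytest-cov` plugin is installed; this is a generic invocation, not a command from ncompare's docs):

```bash
# Per-file coverage report, highlighting untested lines in
# ncompare/core.py and ncompare/printing.py
poetry run pytest --cov=ncompare --cov-report=term-missing tests
```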

For packages also submitting to JOSS

  • [ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software
  • [ ] Authors: A list of authors with their affiliations
  • [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Estimated hours spent reviewing:

2h

Review Comments

Below is a summary of my comments.

  • As ncompare is a shell command-line tool, it would be informative to have a --version option that prints the current version (see the sketch after this list).
  • I am not used to Poetry; the tests can indeed be run with poetry run pytest tests, but pytest tests from a Python venv works too. It would be nice to have this clarified in the README, as both installation methods are described there.
  • As Python 3.12 is out, it could be interesting to have CI for the latest Python version.
  • The license is partially available in the README and as a PDF in the license directory. As this is a specific NASA license that users are probably less familiar with, I suggest specifying its name in the README, perhaps with a link to the OSI website, for better discoverability.
  • I was able to run ncompare on the test data shipped with the package and on other files, but the notebook in ncompare/example (which is also used in the documentation) uses two files that I was unable to download (permission denied). I strongly recommend building examples with public data so they are reproducible.
  • I cannot find a documentation build in the CI; am I wrong? This would make it easier to keep notebooks and examples consistent for users.
  • The documentation only documents ncompare, which is fine because this is the main utility. However, the CONTRIBUTING file suggests that "functions should contain a docstring, though short or trivial functions may contain a 1-line docstring"; it is unclear to me (as a possible future contributor) how to distinguish trivial from non-trivial functions. Perhaps private versus public would be a more consistent distinction?
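As a follow-up to the --version suggestion above, here is a minimal sketch of how such a flag is commonly wired up with argparse (illustrative only; this assumes nothing about ncompare's actual CLI internals):

```python
# Illustrative only: report the installed package version via --version
import argparse
from importlib.metadata import version

parser = argparse.ArgumentParser(prog="ncompare")
parser.add_argument(
    "--version",
    action="version",
    version=f"%(prog)s {version('ncompare')}",  # e.g. "ncompare 1.4.0"
)
args = parser.parse_args()
```
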
danielfromearth commented 8 months ago

Hi @tomalrussell, thanks for shepherding and providing editorial review of this submission thus far.

We have made changes to the repository to address reviewer recommendations, and responded directly to the reviewers' comments above. In addition, please note that we have now included a JOSS manuscript in the repository (in the joss-paper branch). Cheers.

tomalrussell commented 8 months ago

Thanks for the super clear responses @danielfromearth !

@cmarmo and @cmtso - over to you to respond to the changes. Discussion is encouraged, we're in the "review follow-up" phase.

Could you pay particular attention to the For packages also submitting to JOSS section, now that the manuscript is up on the repository (paper.md and paper.bib)? It may be helpful to refer to the JOSS guidelines for more detail and background on the expectations here.

As ever, do ask or drop me a line if you need anything ☺️

Responses due in ~2 weeks: 31 January.

cmarmo commented 8 months ago

Hello @danielfromearth, thank you for all your work and your answers to my comments! I have checked all your improvements, and I still have some comments about the documentation:

I'm going to check the paper for JOSS in the next week... sorry for the two-step answer...

cmarmo commented 8 months ago

Hello @danielfromearth, I realized that you are merging your modifications into a develop branch rather than into main. I see that the links to the notebook and to the example files are fixed: thank you! It looks like the link to the local version of the license in the README still contains spaces, and it is still not rendered.

@tomalrussell, once this nit is fixed, I consider that the author has responded to my review and made changes to my satisfaction. I recommend approving this package.

I have also checked the paper for JOSS: I'm copying here the checklist related to JOSS.

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

cmtso commented 7 months ago

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

danielfromearth commented 7 months ago

With regard to the ncompare repository:

@cmarmo — Thanks for taking another look through this software, and I see what you're talking about! For some reason the link to the license appears incorrect in the README's preview on the develop branch homepage, but navigating to the README.md file itself displays it correctly (see here). I suspect that once it's merged into main, then the README preview will display correctly — and I will make note of that here once it's merged.

@cmtso — Thank you for your additional effort looking back through this software!

With regard to the JOSS scope:

@cmarmo:

  • [x] ...I believe ncompare can be considered as a tool to "extract knowledge from large data sets".

✅ Like @cmarmo, we also consider ncompare a tool to "extract knowledge from large data sets", but even more so a tool that "supports the functioning of research instruments or the execution of research experiments" (from JOSS's what-we-mean-by-research-software). We believe this could be made clearer in the manuscript; therefore, we are planning revisions that will include more description of how ncompare has been used in the preparation and evaluation of satellite mission data for researching Earth's atmosphere.

@cmarmo and @cmtso, respectively:

...However, note that JOSS states that "single-function packages are not acceptable." It is not clear to me how to interpret "single-function" and if ncompare can be considered as one.

...At the moment, it may be regarded as a minor 'utility' package. Perhaps a clearer roadmap of future feature extensions, covering a slightly larger (and more specific) scope, and providing a research use case in the paper would help with the JOSS submission.

✅ Although we also do not know JOSS's criteria for "single-function", we do not consider ncompare a single-function package, because its command-line interface includes several options for customizing the comparison and the generation of reports. Some of the feature roadmap under consideration is visible in the GitHub issues.

We note that ncompare has been intentionally designed not to be a "toolbox" of netCDF functions, and that is not on our roadmap, as toolbox libraries already exist. Instead, ncompare is designed to fill an open-source software gap, with a focus on comparison, visualizing comparisons, and generating the resulting reports. Furthermore, we consider ncompare not a utility package or thin API, but rather a fully fledged tool that is enabling research now, with functions that are extensible and can be built upon further.

We also appreciate the suggestion that "providing a research use case in the paper would help with the JOSS submission", and as mentioned above, we now plan to revise the manuscript to include descriptions of how ncompare has been used in the preparation and evaluation of data for an atmospheric science-focused satellite instrument.

@cmtso:

...I think this is a hugely useful tool to the research community.

✅ Thanks! We think so too :)

Note that because of the timeline of one of the satellite missions I refer to above, we may not be able to submit the revised manuscript to JOSS until April, when the mission data become public.

tomalrussell commented 7 months ago

Hi @danielfromearth, thanks for the thorough response.

On the question of scope for JOSS: in the end this is up to the JOSS editorial team, but do refer to this pyOpenSci review (as, for example, in this recent JOSS submission openjournals/joss-reviews#5332). I think you have a strong case for ncompare as an enabling tool that extracts useful data from large datasets. Revising the manuscript as you describe sounds worthwhile. I'm happy to recommend it for submission when you're ready.

On the last few nits: they all look to be resolved on develop, and merging and creating a release should be part of the wrap-up, so with that...


🎉 ncompare has been approved by pyOpenSci! Thank you @danielfromearth for submitting ncompare and many thanks again to @cmarmo and @cmtso for reviewing this package! 😸

Author Wrap Up Tasks

There are a few things left to do to wrap up this submission:

JOSS submission

Here are the next steps:

Editor Final Checks

Please complete the final steps to wrap up this review. @tomalrussell reminder to do the following:


If you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and documentation in the peer-review-guide.

danielfromearth commented 7 months ago

Author Wrap Up Tasks

There are a few things left to do to wrap up this submission:

  • [x] Activate Zenodo watching the repo if you haven't already done so.
  • [x] Tag and create a release to create a Zenodo version and DOI.
  • [x] Add the badge for pyOpenSci peer-review to the README.md of ncompare. The badge should be [![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/146).
  • [x] Please fill out the post-review survey. All maintainers and reviewers should fill this out.

JOSS submission

Here are the next steps:

  • [ ] Login to the JOSS website and fill out the JOSS submission form using your Zenodo DOI. When you fill out the form, be sure to mention and link to the approved pyOpenSci review. JOSS will tag your package for expedited review if it is already pyOpenSci approved.
  • [ ] Once the JOSS issue is opened for the package, we strongly suggest that you subscribe to issue updates. This will allow you to continue to update the issue labels on this review as it goes through the JOSS process.
  • [ ] Wait for a JOSS editor to approve the presubmission (which includes a scope check).
  • [ ] Once the package is approved by JOSS, you will be given instructions by JOSS about updating the citation information in your README file.
  • [ ] When the JOSS review is complete, add a comment to your review in the pyOpenSci software-review repo here that it has been approved by JOSS. An editor will then add the JOSS-approved label to this issue.

ncompare has been archived on Zenodo, with this DOI: 10.5281/zenodo.10625407

tomalrussell commented 7 months ago

Ace! I'll keep an eye on this issue for any updates on your JOSS submission whenever you can make the next steps.

You are also invited to write a short blog post about ncompare for pyOpenSci! If this sounds interesting, have a look at a couple of examples from movingpandas and pandera. There's a markdown example you could use to help draft a post. This is completely optional, but if you have time, we'd love to help promote your work.

lwasser commented 3 months ago

hey team. i'm checking in on this issue. did this issue / package get fast-tracked through JOSS? if it did, we need to do a few follow-up steps!

  1. cross-link this issue with the JOSS review issue. i see the pre-review but not the final review
  2. add the JOSS DOI to the header
  3. update the label to say joss accepted (in addition to pyos accepted!)

many thanks y'all!!

tomalrussell commented 3 months ago

Hey @lwasser thanks for checking in!

The package v1.9.0 has been accepted into JOSS, with review issue https://github.com/openjournals/joss-reviews/issues/6490

DOI updated in the issue header: https://doi.org/10.21105/joss.06490

lwasser commented 3 months ago

Fantastic! Thank you @tomalrussell!! We can close this as complete then, I think! Thank you all for supporting our open review process!!