pyOpenSci / software-submission

Submit your package for review by pyOpenSci here! If you have questions please post them here: https://pyopensci.discourse.group/

Presubmission inquiry: netCDF comparison tool python package #142

Closed danielfromearth closed 7 months ago

danielfromearth commented 8 months ago

Submitting Author: Daniel Kaufman (@danielfromearth)
Package Name: ncompare
One-Line Description of Package: Compare the structure of two netCDF files at the command line
Repository Link (if existing): https://github.com/nasa/ncompare


Code of Conduct & Commitment to Maintain Package

Description

This tool ("ncompare") compares the structure of two Network Common Data Form (NetCDF) files at the command line. It facilitates rapid comparisons by generating a formatted display of the matching and non-matching groups, variables, and associated metadata between two NetCDF datasets. The user has the option to colorize the terminal output for ease of viewing. As an option, ncompare can save comparison reports in text and/or comma-separated value (CSV) formats.

Community Partnerships

We partner with communities to support peer review with an additional layer of checks that satisfy community requirements. If your package fits into an existing community please check below:

Scope

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo
- [x] Unsure/Other (explain below)

When creating or modifying Network Common Data Form (netCDF) files, there is often a need to evaluate the differences between an original, unmodified file and a new, modified file, especially for validation and regression testing. ncompare was developed to avoid the ineffective process of manually opening two netCDF files and inspecting their contents to determine whether there are differences in the structure and shapes of groups and variables.

The target audience is anyone who manages the generation, manipulation, or validation of netCDF files. This package can be applied to these netCDF file tasks in any scientific discipline, although it would be most relevant to applications with large multidimensional datasets, e.g., for comparing climate models, for Earth science data reanalyses, and for remote sensing data.

The ncdiff function in the nco (netCDF Operators) library, as well as ncmpidiff and nccmp, compute value differences but, as far as we are aware, do not have a dedicated function to show structural differences between netCDF4 datasets. Our package, ncompare, provides a lightweight Python-based tool for rapid visual comparisons of group and variable structures, attributes, and chunking.
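To illustrate what a "structural" comparison means here, as opposed to a value comparison, the following is a minimal sketch and not ncompare's actual implementation. It assumes dataset-like objects that expose `variables` and `groups` mappings, with each variable carrying `dimensions` and `shape`. The helper names (`collect_structure`, `compare_structure`, `_var`, `_group`) are hypothetical, and the stand-in objects are built with `SimpleNamespace` so the example runs without any netCDF file.

```python
from types import SimpleNamespace


def collect_structure(group, path="/"):
    """Map each variable's full path to its (dimension names, shape)."""
    structure = {}
    for name, var in group.variables.items():
        structure[path + name] = (tuple(var.dimensions), tuple(var.shape))
    # Recurse into nested groups, extending the path as we go.
    for name, subgroup in group.groups.items():
        structure.update(collect_structure(subgroup, path + name + "/"))
    return structure


def compare_structure(a, b):
    """Report variables unique to each side and shared ones that differ."""
    left, right = collect_structure(a), collect_structure(b)
    shared = left.keys() & right.keys()
    return {
        "only_in_a": sorted(left.keys() - right.keys()),
        "only_in_b": sorted(right.keys() - left.keys()),
        "different": sorted(k for k in shared if left[k] != right[k]),
    }


# Hypothetical stand-ins for two opened datasets (no real files involved).
def _var(dimensions, shape):
    return SimpleNamespace(dimensions=dimensions, shape=shape)


def _group(variables=None, groups=None):
    return SimpleNamespace(variables=variables or {}, groups=groups or {})


file_a = _group(
    variables={"time": _var(("time",), (10,))},
    groups={"obs": _group(variables={"temp": _var(("time", "lat"), (10, 5))})},
)
file_b = _group(
    variables={"time": _var(("time",), (12,))},
    groups={
        "obs": _group(
            variables={
                "temp": _var(("time", "lat"), (10, 5)),
                "humidity": _var(("time", "lat"), (10, 5)),
            }
        )
    },
)

report = compare_structure(file_a, file_b)
print(report)
```

In this sketch no data values are read: only names, dimensions, and shapes are compared. Objects opened with the netCDF4 Python library expose similarly named `variables` and `groups` attributes, so the same functions could presumably be applied to real files, though that is an assumption of this example and not something stated in the submission.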


lwasser commented 8 months ago

hey there @danielfromearth great to see you here! 👋 i'm just dropping in to say hello given we had a brief interaction on the openscapes slack!! Someone from our team will be getting back to you in the next week or so about this submission!!

this looks like a really cool tool!! i sometimes miss working with netcdf and hdf files.

NickleDave commented 8 months ago

Welcome @danielfromearth! Thank you for this detailed presubmission inquiry.

ncompare is definitely in scope. Please proceed with a full submission, and please reference this issue by number when you do. Once you confirm you will do so, I will close this issue.

NickleDave commented 7 months ago

@danielfromearth I'll take your thumbs up as confirmation that you're going to submit 🙂 Closing this. We're looking forward to your submission!