afscgap: Ocean health tools for NOAA AFSC GAP species presence data

sampottinger commented 1 year ago

Submitting Author: Sam Pottinger (@sampottinger)
All current maintainers: @sampottinger, @gizarp
Package Name: afscgap
One-Line Description of Package: Community contributed Python-based tools for working with public bottom trawl surveys data from the NOAA Alaska Fisheries Science Center Groundfish Assessment Program (NOAA AFSC GAP).
Repository Link: https://github.com/SchmidtDSE/afscgap
Version submitted: 0.0.7
Editor: @ocefpaf Reviewer 1: @7yl4r Reviewer 2: @ayushanand18 Archive: JOSS DOI: Version accepted: 0.0.7 Date accepted (month/day/year): 05-30-2023

Code of Conduct & Commitment to Maintain Package

[X] I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package after should it be accepted.
[X] I have read and will commit to package maintenance after the review as per the pyOpenSci Policies Guidelines.

Description

Python-based tool set for interacting with bottom trawl surveys from the Ground Fish Assessment Program (GAP). This provides information about where certain species were seen and when under what conditions, information useful for research in ocean health.

It offers:

Pythonic access to the official NOAA AFSC GAP API service.
Memory-efficient tools for inference of the "negative" observations not provided by the API service but required for some common types of analysis.
Visualization tools for quickly exploring and creating comparisons within the dataset that provide an "on-ramp" to deeper analysis. The visual analytics components aim to serve both expert programmers and audiences with limited programming experience.

Note that GAP is an excellent dataset produced by the Resource Assessment and Conservation Engineering (RACE) Division of the Alaska Fisheries Science Center (AFSC) as part of the National Oceanic and Atmospheric Administration's Fisheries organization (NOAA Fisheries). See also the RACEBASE NOAA InPort entry.

Additional information at https://pyafscgap.org/.

Scope

Please indicate which category or categories. Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
- [X] Data retrieval
- [ ] Data extraction
- [X] Data processing/munging
- [ ] Data deposition
- [ ] Data validation and testing
- [X] Data visualization
- [ ] Workflow automation
- [ ] Citation management and bibliometrics
- [ ] Scientific software wrappers
- [ ] Database interoperability

related to data viz category: see presubmission Domain Specific & Community Partnerships

[X] Geospatial
[ ] Education
[ ] Pangeo

Community Partnerships

If your package is associated with an existing community please check below:

[ ] Pangeo
- [ ] My package adheres to the Pangeo standards listed in the pyOpenSci peer review guidebook

[^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.

For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):

This library supports retrieval of data from the official NOAA AFSC GAP REST API service but, as that service on its own is often not sufficient due to certain data limitations, it also offers zero catch inference as required for a number of common types of investigations (hence data processing). Finally, given the needs of the community and the vast breadth of the dataset, it offers a community application to explore the data which, in turn, can generate Python code to help users get started with continued analysis within their own scripts.

Who is the target audience and what are scientific applications of this package?

This project largely benefits scientific researchers in the ocean health space as this dataset is useful for fisheries management, biodiversity research, and marine science more generally. An example of what this analysis may look like is provided in our example notebook hosted on mybinder.org.

Are there other Python packages that accomplish the same thing? If so, how does yours differ?

We are not aware of other Python packages working with the AFSC GAP dataset.

If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Please see our presubmission.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

[X] does not violate the Terms of Service of any service it interacts with.

See NOAA InPort entry. We are primarily using Distribution 6 in the API form. As described, these data are available with the following access constraints: "There are no legal restrictions on access to the data. They reside in public domain and can be freely distributed."

[X] uses an OSI approved license.

See BSD License within repo.

[X] contains a README with instructions for installing the development version.

See project README.md.

[X] includes documentation with examples for all functions.

See project website and usage section of project website's documentation microsite.

[X] contains a tutorial with examples of its essential functions and uses.

See tutorial notebook on mybinder.

[X] has a test suite.

See main tests for package and supplemental tests for website (backend tests and frontend tests).

[X] has continuous integration setup, such as GitHub Actions CircleCI, and/or others.

See CI and CD (library, documentation).

Publication Options

[X] Do you wish to automatically submit to the Journal of Open Source Software? If so:

JOSS Checks

- [X] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. See [example notebook](https://mybinder.org/v2/gh/SchmidtDSE/afscgap/main?urlpath=/tree/index.ipynb) and [app intro](https://app.pyafscgap.org/). - [X] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. In addition to query compilation / emulation, we are hopeful that the negative record inference and complementing visualization tool are enough to escape the thin API client status. Coming in at over 10k lines of code, we would seem to fit some of the ["hard" criteria that the journal puts forward](https://joss.readthedocs.io/en/latest/submitting.html). Though this library happily provides Pythonic access to an API service, we believe that this contribution as a whole goes beyond a thin client by providing one of the only public mechanisms for conducting investigation requiring negative catch data and provides unique tools for comparative analysis within with the dataset. - [X] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. See https://github.com/SchmidtDSE/afscgap/blob/main/inst/paper.md (or PDF preview at https://github.com/SchmidtDSE/afscgap/blob/main/inst/paper.pdf). - [X] The package is deposited in a long-term repository with the DOI: We have submitted to Code Ocean. Please confirm if this will suffice. See 10.24433/CO.4905407.v1 / https://codeocean.com/capsule/4905407/tree/v1 *Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

[x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

We would be delighted to have your pull requests! 🎉

Confirm each of the following by checking the box.

[X] I have read the author guide.
[X] I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

[X] Last but not least please fill out our pre-review survey](https://forms.gle/F9mou7S3jhe8DMJ16). This helps us track submission and improve our peer review process. We will also ask our reviewers and editors to fill this out.

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The [editor template can be found here][Editor Template].

The [review template can be found here][Review Template].

sampottinger commented 1 year ago

This is a continuation of #91

sampottinger commented 1 year ago

Sorry updated issue body to fix a small formatting issue and to clarify that we have two maintainers.

NickleDave commented 1 year ago

No worries, edit as much as you need -- we will end up updating the metadata in the YAML header as we go anyway.

I will do initial checks and start finding an editor for this review early next week.

sampottinger commented 1 year ago

Fantastic thank you! Have a great weekend :)

NickleDave commented 1 year ago

You too! 🙂

NickleDave commented 1 year ago

Hi again @sampottinger I'm adding the initial editor checks below.

Good news! afscgap passes with flying colors. I made a couple of comments in the checklist but I don't see anything that needs to be addressed--looks like we're ready for review.

I will begin looking for an editor this week and update you here as soon as we have found one.

Editor in Chief checks

Hi there! Thank you for submitting your package for pyOpenSci review. Below are the basic checks that your package needs to pass to begin our review. If some of these are missing, we will ask you to work on them before the review process begins.

Please check our Python packaging guide for more information on the elements below.

[x] Installation The package can be installed from a community repository such as PyPI (preferred), and/or a community channel on conda (e.g. conda-forge, bioconda).
- might be worth adding a conda-forge package?
- [x] The package imports properly into a standard Python environment import package-name.
[x] Fit The package meets criteria for fit and overlap.
- yes, data retrieval (from GAP dataset through API), data processing, data visualization
[x] Documentation The package has sufficient online documentation to allow us to evaluate package function and scope without installing the package. This includes:
- [x] User-facing documentation that overviews how to install and start using the package.
- https://pyafscgap.org/devdocs/afscgap.html
- [x] Short tutorials that help a user understand how to use the package and what it can do for them.
- notebooks etc linked from landing page https://pyafscgap.org (that also links to install / API docs)
- [x] API documentation (documentation for your code's functions, classes, methods and attributes): this includes clearly written docstrings with variables defined using a standard docstring format. We suggest using the Numpy docstring format.
[x] Core GitHub repository Files
- [x] README The package has a README.md file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.
- [x] Contributing File The package has a CONTRIBUTING.md file that details how to install and contribute to the package.
- [x] Code of Conduct The package has a Code of Conduct file.
- adapted from p5.js COC
- [x] License The package has an OSI approved license. NOTE: We prefer that you have development instructions in your documentation too.
- BSD 3-Clause
[x] Issue Submission Documentation All of the information is filled out in the YAML header of the issue (located at the top of the issue template).
[x] Automated tests Package has a testing suite and is tested via GitHub actions or another Continuous Integration service.
[x] Repository The repository link resolves correctly.
[x] Package overlap The package doesn't entirely overlap with the functionality of other packages that have already been submitted to pyOpenSci.
[ ] Archive (JOSS only, may be post-review): The repository DOI resolves correctly.
[ ] Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

[x] Initial onboarding survey was filled out We appreciate each maintainer of the package filling out this survey individually. :raised_hands: Thank you authors in advance for setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:

Editor comments

sampottinger commented 1 year ago

Incredible @NickleDave thank you so much!

NickleDave commented 1 year ago

Hi @sampottinger and @gizarp, very happy to say that @ocefpaf has generously volunteered to act as editor for this review. He will introduce himself and take the lead from here.

sampottinger commented 1 year ago

Thanks @NickleDave and great to meet you @ocefpaf!

ocefpaf commented 1 year ago

Hi folks! Sorry for the delay in getting here. (Holiday+longweekend in South America this past few days.)

@NickleDave I'm catching up with the reading today. Please let me know if there are any urgent button/forms I need press/fill.

ocefpaf commented 1 year ago

@sampottinger I have a few really minor comments after a first pass. These are not blockers and you don't need to address them for this application! They are really just comments:

I see you have a module named utils. That can be problematic as we tend to put everything in there. It also make future refactors or even maintaining the code a bit hard. Most of the functions there seems to be "converters," right? Maybe renamed it to converters.py (horrible name, I know) and avoid the "utils.py" trap.
The query method argument list is daunting! I'm not sure if there is a way around it. I know you are limited by the data structure upstream. Maybe it is a matter of creating more examples and docs? Or other methods with a reduced search scope that calls the main one? Not sure if that even makes sense.
The package name suffers from the usual NOAA acronym problem. I understood it but I am "in the know" b/c I'm a NOS/IOOS/SECOORA contractor (see ;-p) . Maybe a friendlier name can help folks outside of NOAA-Fisheries find this package. However, if there is a user base, even a small one, please disregard this comment. Renaming a package after folks adopted it is worse than a bad name.

PS: I showed your package to a few colleagues at IOOS and they believe that a notebook with an afscgap example would make a nice addition to our Code Lab https://ioos.github.io/ioos_code_lab/content/intro.html. Let me know if you are interested in that.

sampottinger commented 1 year ago

Hey @ocefpaf! Thanks so much for all of these. Let me pull together a PR real fast and some comments!

sampottinger commented 1 year ago

Thanks again @ocefpaf for your thoughtful comments! Please see below.

I see you have a module named utils

This is a good suggestion. I made a PR at https://github.com/SchmidtDSE/afscgap/pull/49. Let me know if this refactor looks good to you!

The package name suffers from the usual NOAA acronym problem ... if there is a user base, even a small one, please disregard this comment

Yeah the dataset's whole name is NOAA AFSC GAP RACEBACSE FOSS :sweat_smile: . Thanks for being understanding on this one. I think we will politely decline this suggestion if that's OK. We have some community continuity concerns like you mentioned and some community sensitivity concerns we want to respect... we have to be a little careful around not appearing to have activity in other datasets (NOAA or otherwise) that may be at different moments in their open science journey than AFSC GAP or not (yet? :crossed_fingers:) open in the same way that the AFSC GAP API service is. Given our current context, this specificity is probably desirable for at least the medium-term future of the project.

a nice addition to our Code Lab

Amazing! We'd be honored :smile:. We'd be more than happy to co-develop something with you or another community member in that space. Just for the conversation, I'll highlight that we do have a mybinder example available (source in repo). Totally understand if it might not be quite the right vibe though.

The query method argument list is daunting!

I think that design choice is a strong opinion loosely held. I'm not sure if it is helpful but I'll document some thoughts we had internally but 100% open to additional ideas!

We went this route to fulfill a few goals: we wanted to reflect the structure of the API service (any combination of filters in the keyword set is valid in the NOAA API service so there isn't a clear functional decomposition) while also providing documented discoverability. There were some other ideas explored:

kwargs: Leaving the keyword arguments as variable is a bit easier for implementation but, though documentation for the AFSC GAP dataset exists, it's not necessarily convenient to reference / use from a Python context. So, we wanted to have docstring description for all of the available options to benefit IDEs, API doc generation, etc. We also wanted the possible list of filters to be enumerated for the user in those contexts as well.
pass objects: There's the option of a more ORM-like approach where a list of filters are passed like query(filters=[CommonNameFilter(eq='Pacific cod')] but we didn't like forcing a lot of our type system into client code as it would require quite a large number of types to maintain our goals around docstring coverage of the available filter parameters and discoverability.
builder chaining: Probably my personal favorite, we could turn this into an object with syntax like query().commonName(eq='Pacific cod').year(min=2021).get(). While I personally like this syntax, there's been conversation internally and with community partners for using this library in an educational context as well. As this syntax may be less common in earlier Python programming journeys, there was a feeling that the keyword arguments approach was more inclusive / approachable for someone in, say, an introductory Python course or a student not in a computer science curriculum (we got invited for one coming up soon!).

Just wanted to list this out but absolutely open to other thoughts! This just felt like the right balance around the broader goals with regards to documentation / discoverability, keeping the type system encapsulated (see below), and maintaining accessibility.

library

ocefpaf commented 1 year ago

This is a good suggestion. I made a PR at https://github.com/SchmidtDSE/afscgap/pull/49. Let me know if this refactor looks good to you!

Better than I thought. Hopefully that will make your life easier in the future.

I think we will politely decline this suggestion if that's OK.

No arguments here! It is better to keep the name now that the package is out.

Totally understand if it might not be quite the right vibe though.

It is a good fit. I'll ping you outside of pyOpenSci later to talk more about it.

Regarding the query functionality. I do like the idea of the builder chaining and it is not too uncommon. Pandas, pyjanitor and many other mainstream libraries use it. It is quite common in the R world too. Also, many similar packages, like argopy, have something like that.

With that said, again, anything I write here is not a blocker for acceptance. The current implementation is good and solid. Just suggestions for the future.

PS: I disagree with the learning curve there. Maybe it is my bais but I teach a lot of basic Python and I make sure to expose the students to a flat vs nested model for operations.

ocefpaf commented 1 year ago

:wave: Hi Tylar Murray and Ayush Anand! Thank you for volunteering to review for pyOpenSci!

The following resources will help you complete your review:

Here is the reviewers guide. This guide contains all of the steps and information needed to complete your review.
Here is the review template that you will need to fill out and submit here as a comment, once your review is complete.

Please get in touch with any questions or concerns! Your review is due: May 4th, 2023

Reviewers: @7yl4r and @ayushanand18 Due date: 2023-05-04

PS: You can comment, open issue, and/or PR directly in the afscgap for this review. Just make sure to cross-reference them here.

sampottinger commented 1 year ago

Thanks @ocefpaf!

Better than I thought.

Awesome! Merged in.

I disagree with the learning curve there.

That's helpful to have your perspective on that. Let me float this around a little bit and see how we feel about migration. I'll try to make a call on that in the next day or so to keep things moving.

With that said, again, anything I write here is not a blocker for acceptance. The current implementation is good and solid. Just suggestions for the future.

Thanks!

wave Hi Tylar Murray and Ayush Anand!

Lovely to meet you both! Thank you for your time. :D

ayushanand18 commented 1 year ago

Thank you so much @ocefpaf and @sampottinger! Great work building this package. The current implementation looks robust and clean.

But I do have some observations to make.

There is this function afscgap.query().get_bottom_temperature_c(), and I do think that it might be hard to guess the function name for someone who hasn't seen this dataset before. Mind renaming to something like get_bottom_temperature(unit="celsius"|'c')? This observation is similar to other functions like cpue_*(), duration_hr, etc. I think could have been derived from the direct field names in the dataset, and those who know about the dataset can easily figure out what the name would be, without looking at the docs, but it would be difficult for people from outside (like me) to figure out without going back-and-forth to the docs.
Also about the filters and getters, we can utilise the getitem() method in the class with something like this,
```
def __getitem__(self, key):
return self.__getattribute__(key) 
```
This might be more intuitive. Or to make it much easier can we can just go by the name get(key), and specify the key we want. Ref

I would love to hear your opinion, and these are surely non-blockers for this review!

sampottinger commented 1 year ago

Hello @ayushanand18 and @ocefpaf!

I do like the idea of the builder chaining and it is not too uncommon

Thanks again for the conversation @ocefpaf! I've started implementing the builder pattern at https://github.com/SchmidtDSE/afscgap/issues/50.

Mind renaming to something like get_bottom_temperature(unit="celsius"|'c')

Thanks for the suggestion @ayushanand18! I've started that at https://github.com/SchmidtDSE/afscgap/issues/51

we can utilise the getitem() method in the class

Thanks for this suggestion but I may pass on this if that's ok @ayushanand18.

In light of the stuff we did earlier in process, our reasoning is that the library carries an explicit goal to provide docstrings and type annotations for these fields. I think, given the complexity of the data model itself as well as the difficulty in using some of the upstream documentation from NOAA directly in the Python ecosystem, I want to leave those methods there for discoverability and encourage use of those typed interfaces unless the user "opts out" first.

To that end, there is a to_dict method on afscgap.model.Record for those that prefer that key-based approach or want to read into something like pandas.DataFrame but we hope requiring to_dict first helps inform the user that the object returned from a query does offer those other interfaces, information which may be missed if they simply try treating the result as a dictionary. In other words, I'd rather have the user first explicitly decline use of the default typed / strong interface given the benefits it offers for static checking and affordance.

sampottinger commented 1 year ago

Hello all! @ocefpaf, the builder interface was implemented in https://github.com/SchmidtDSE/afscgap/pull/52.

sampottinger commented 1 year ago

Mind renaming to something like get_bottom_temperature(unit="celsius"|'c')

Thanks @ayushanand18! I have implemented in https://github.com/SchmidtDSE/afscgap/pull/55

sampottinger commented 1 year ago

I am in the process of confirming https://github.com/SchmidtDSE/afscgap/pull/54 which includes the following requested changes before they are resolved on main:

https://github.com/SchmidtDSE/afscgap/pull/54 will also soon include updates to documentation.

ayushanand18 commented 1 year ago

I've started that at SchmidtDSE/afscgap#51

That looks great!

Thanks for this suggestion but I may pass on this if that's ok .

It's okay. At the end of the day, we need to follow what's best for the user of the package.

To that end, there is a to_dict method on afscgap.model.Record for those that prefer that key-based approach or want to read into something like pandas.DataFrame [...]

Sounds good. But that leaves me a question of whether the afsc.model.Record object returns a nested dictionary in any case with the to_dict() method. If the answer is yes, then probably we might include a function to_pandas() that unnests the nested dictionary and returns a DataFrame object so that user doesn't take the pain. But again this is just a suggestion.

ocefpaf commented 1 year ago

given the complexity of the data model itself as well as the difficulty in using some of the upstream documentation from NOAA directly in the Python ecosystem

I hear you. My day job is to try to make that better and, most of the times, we fail :-/

sampottinger commented 1 year ago

I hear you. My day job is to try to make that better and, most of the times, we fail :-/

@ocefpaf - I'm sorry I didn't mean it like that! Perhaps just that the docstrings are extra convenient in this context :sweat_smile: . The upstream documentation for the datasets have to address both internal and external users with different kinds of access permissions / modalities. It just means it's a big schema with lots of (great) data but also takes a little while to get oriented towards. I think / hope that the docstrings help a bit but this library also has a simpler task since it knows which dataset the user is working with. Also, since I know they are in Python, I can also do a few things to help the user that the upstream service can't do.

Unrelated but thanks for your comment on the issue in the repo about the builder pattern!

That looks great!

Thank you! Appreciate your suggestions @ayushanand18.

object returns a nested dictionary in any case with the to_dict() method

It is not nested! It uses the same field names and linear record structure from the API service.

sampottinger commented 1 year ago

Hello all! Your requests are implemented on main and the issues in the library are closed. Thanks for your help!

ocefpaf commented 1 year ago

@ocefpaf - I'm sorry I didn't mean it like that!

No harm done! I was mostly venting my own day-job frustration. Like in Software, agreeing on standards is hard!

Most of the time, even when we do agree on them, we only see its flaws when we start to implement Software to use them. By that time it is too late and now we need to circumvent the issues with Software and/or metadata patches.

ocefpaf commented 1 year ago

@ayushanand18 thanks for all the feedback!

@7yl4r do you believe you can take a look at this project in the next few days? If not, please let us know!

ocefpaf commented 1 year ago

PS: Don't forget to close your review with the reviewer template. See https://github.com/pyOpenSci/software-submission/issues/93#issuecomment-1507160030.

ayushanand18 commented 1 year ago

@ayushanand18 thanks for all the feedback!

Thanks @ocefpaf, I'm done with my comments on the review.

ayushanand18 commented 1 year ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[x] Vignette(s) demonstrating major functionality that runs successfully locally.
[x] Function Documentation: for all user-facing functions.
[x] Examples for all user-facing functions.
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[x] Badges for:
- [x] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [x] Python versions supported,
- [x] Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

[x] Short description of package goals.
[x] Package installation instructions
[x] Any additional setup required to use the package (authentication tokens, etc.)
[x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[x] Link to your documentation website.
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[x] Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

[x] Package documentation is clear and easy to find and use.
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use
[x] The package is easy to install

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed.
[x] Performance: Any performance claims of the software been confirmed.
[x] Automated tests:
- [x] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [x] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [x] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[ ] A short summary describing the high-level functionality of the software
[ ] Authors: A list of authors with their affiliations
[ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
[ ] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 0.5

Review Comments

Thanks for the submission @sampottinger! Everything looks good. ~However, my only concern is of including Installation Instructions for the dev version of the package through GitHub.~

sampottinger commented 1 year ago

Hello @ayushanand18! Thanks very much for your review.

However, my only concern is of including Installation Instructions for the dev version of the package through GitHub.

I’m sorry are you recommending that we include additional installation details for installing direct from GitHub (eg: pip install git+https://github.com/SchmidtDSE/afscgap.git@main)? Just wanted to double check.

sampottinger commented 1 year ago

Also @ocefpaf / @ayushanand18 - we were hoping to try at JOSS (we know they don’t like API wrappers but we are hoping that zero catch inference, ORDS compilation / emulation, visualization, etc take us over the line). The paper.md is in the inst folder. LMK if there is something else we should do to facilitate. Just noticed that the section on JOSS was left unchecked.

sampottinger commented 1 year ago

Sorry @ayushanand18 - Added this PR to address your final comment: https://github.com/SchmidtDSE/afscgap/pull/72. LMK if this addresses your concern. Thanks again very much for your time! We very much appreciate all of your help.

ocefpaf commented 1 year ago

LMK if there is something else we should do to facilitate. Just noticed that the section on JOSS was left unchecked.

@lwasser should the PyOS reviewers fill that part or do you get someone from JOSS doing that?

lwasser commented 1 year ago

@ocefpaf this is a great question. JOSS does not normally check those boxes. I think typically the submittor / author would want to check those to indicate that they have all of the items needed by JOSS. but JOSS will review the paper - we do not need to review the paper. It is my understanding that JOSS will also double check scope. But it is good for the author to check first. I hope this helps. feel free to ask more questions if it's unclear as we can always adjust our processes if we need to.

@sampottinger thank you for submitting to us and @ayushanand18 @7yl4r so happy to have you here as reviewers! 👋

sampottinger commented 1 year ago

Thank you @lwasser !

ayushanand18 commented 1 year ago

I’m sorry

No need to be sorry on that, I should have been clearer in my comments :)

are you recommending that we include additional installation details for installing direct from GitHub

Yes, I meant that. The PR looks good, and I don't have anything to add.

sampottinger commented 1 year ago

Thanks @ayushanand18 !

sampottinger commented 1 year ago

Hello all! Just checking in. Is there anything else I can do to help in the process? Thanks!

7yl4r commented 1 year ago

Sorry for the delay on my part. I have filled the review template in this gist but am still looking through things. All looks great. There are some specific details from pyOpenSci to look at. My personal preference would be to simplify the documentation text, but I do not yet have a recommendation on how to do that.

sampottinger commented 1 year ago

Thanks! Looking forward to your feedback @7yl4r!

sampottinger commented 1 year ago

Hello @7yl4r! Thank you for https://github.com/SchmidtDSE/afscgap/issues/75. I have responded there but I was unable to reproduce your behavior in a fresh Ubuntu VM. Just checking in... is there anything else I can help with at this time? Thank you!

ocefpaf commented 1 year ago

There are some specific details from pyOpenSci to look at. My personal preference would be to simplify the documentation text, but I do not yet have a recommendation on how to do that.

Please feel free to ping me directly or open issues in the GH repo for the docs. Documentation improvements are always very welcomed :-)

ocefpaf commented 1 year ago

Sorry for the delay on my part. I have filled the review template in this gist but am still looking through things.

When you are done please paste them as a comment here. @lwasser will aggregate the reviews later and having them in the original issue helps with the process.

sampottinger commented 1 year ago

I have addressed and closed https://github.com/SchmidtDSE/afscgap/issues/75. Thank you for your help!

lwasser commented 1 year ago

thank you all for being here and for using the templates!! @ocefpaf @7yl4r it is SOOOO very helpful to us to have consistent templates for many reasons. so i truly appreciate your doing this and pasting them here. you can still add any adhoc review text that you wish to add to the template. ✨

sampottinger commented 1 year ago

Thanks everyone! I don't believe there are any tasks / unresolved feedback pending on our end right now. Please let me know if there is anything else we can do. 🥳

7yl4r commented 1 year ago

Package Review

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[x] Vignette(s) demonstrating major functionality that runs successfully locally.
[x] Function Documentation: for all user-facing functions.
[x] Examples for all user-facing functions.
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[x] Badges for:
- [x] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [x] Python versions supported,
- [x] Current package version (on PyPI / Conda).
[x] Short description of package goals.
[x] Package installation instructions
[ ] ~~Any additional setup required to use the package (authentication tokens, etc.)~~
[x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[x] Link to your documentation website.
[ ] ~~If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.~~
[ ] Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

[x] Package documentation is clear and easy to find and use.
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use
[x] The package is easy to install

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed.
[ ] Performance: Any performance claims of the software been confirmed.
[x] Automated tests:
- [x] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [ ] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [x] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[ ] A short summary describing the high-level functionality of the software
[ ] Authors: A list of authors with their affiliations
[ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
[ ] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 2

Review Comments

Just a few general comments on my experince digging into this package:

The README is long; I have a personal preference for a very terse README that links out to the heavy documentation.
Badges should technically be above the package description according to review template, but I don't see this as important.
Overall I found the documentation to be extremely detailed - perhaps overwhelmingly so. Ensuring the quickstart examples and notebooks are front and center is very important. The readme already does a good job of this, but perhaps links out to more usage examples would be beneficial.

ocefpaf commented 1 year ago

@sampottinger I move to the next stage, "awaiting reviewers changes" but I do believe you implement all of the suggestions already, right? @7yl4r and @ayushanand18 is there anything pending on your side?

7yl4r commented 1 year ago

Everything looks good on my side. I can tell a lot of great work has gone into this project and it is in good shape.

pyOpenSci / software-submission