[REVIEW]: STITCHES: a Python package to amalgamate existing Earth system model output into new scenario realizations

editorialbot commented 1 year ago

Submitting author: @abigailsnyder Editor: @observingClouds Reviewers: @znicholls, @Zeitsperre Archive: 10.5281/zenodo.11094934

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/ad81e6a435c13ae644a7ca8cb0ffbc35"><img src="https://joss.theoj.org/papers/ad81e6a435c13ae644a7ca8cb0ffbc35/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/ad81e6a435c13ae644a7ca8cb0ffbc35/status.svg)](https://joss.theoj.org/papers/ad81e6a435c13ae644a7ca8cb0ffbc35)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@znicholls & @Zeitsperre, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @observingClouds know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @znicholls

📝 Checklist for @Zeitsperre

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 1 year ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.04 s (1004.0 files/s, 138666.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          23            675            957           1664
Jupyter Notebook                 2              0            786            406
reStructuredText                 5            290            308            260
Markdown                         7             62              0            223
TeX                              1              4              0             52
YAML                             2             10              4             45
DOS Batch                        1              8              1             26
make                             1              4              7              9
-------------------------------------------------------------------------------
SUM:                            42           1053           2063           2685
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 1 year ago

Wordcount for paper.md is 857

editorialbot commented 1 year ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5194/esd-2022-14 is OK
- 10.1017/9781009157940.001 is OK
- 10.5194/gmd-9-1937-2016 is OK
- 10.5194/gmd-9-3461-2016 is OK
- 10.1038/nclimate3310 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot commented 1 year ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

observingClouds commented 1 year ago

Welcome to the review process 🎉 @znicholls, @Zeitsperre please start your review by typing @editorialbot generate my checklist in a comment below.

znicholls commented 1 year ago

Review checklist for @znicholls

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/JGCRI/stitches?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@abigailsnyder) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
- I agree that summary and statement of need are backwards
- The summary could be made clearer for non-specialist audiences, but it is there. (As a suggestion, the summary could say something more like, "There is a need to inspect the interaction between climate change and impacts. At the moment this isn't possible with our most expensive tools. This package provides a way to build that link and examine the interaction without the computational cost. In a scientific paper, it has been shown that this method does not come with unworkably large errors [cite Tebaldi paper].")
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
- Also there and states needs very clearly. References to other work could be better explained and fleshed out I think, particularly discussion of why other tools (e.g. MESMER, METEOR I would guess?) don't achieve what STITCHES does. A clearer reference to the full scientific explanation in this section would also be helpful I think as that would be what scientific readers need to fully understand how the tool works.
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
- I don't know of any other package which attempts to do such stitching. This package does quite a lot of ESGF data handling so it could be compared to other packages which do such handling and processing such as ESMValTool, but that could also be out of scope.
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

znicholls commented 1 year ago

Just an FYI, I probably won't get to this until the end of the month unfortunately

Zeitsperre commented 1 year ago

Hi there! I'll be taking a look at this very soon, hopefully next week. Thanks again for reaching out to me, @observingClouds.

Zeitsperre commented 1 year ago

Review checklist for @Zeitsperre

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/JGCRI/stitches?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@abigailsnyder) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
- Data is installed via a helper function available at the top-level of the library. Points to a Zenodo DOI repository.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
- No results in paper.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
- Package sources are only available via GitHub (PyPI?)
[x] Functionality: Have the functional claims of the software been confirmed?
- The quick-start examples can be reproduced locally on Linux. Code logic assumes POSIX environment (no Windows support).
- Generating new data does not seem to be feasible (stitches.make_tas_archive) as fetch calls to pangeo are not "lazy" (estimated ~8h to collect data values on a reasonably fast connection; memory requirements are probably significant).
- Stitches largely operates with direct calls to values (loading operations) and uses pandas DataFrames for internal logic before converting back to xarray/NetCDF. Xarray is used for fetching data, and nearly all operations are xarray-compatible. Why not leverage xarray/intake data stores to reduce computational/memory/bandwidth load?
- Data values are loaded early and/or copied very often within the logic (slowdowns, thread safety concerns).
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally, these should be handled with an automated package management solution.
- Dependencies for installation of the package are present in requirements.txt. Requirements pin pandas<1.5 (should be updated) and use pkg_resources (deprecated).
- Documentation build requirements (recipe with sphinx + sphinx extensions) are not present in setup.py or repository.
- Testing requirements make use of standard libraries (no additional dependencies needed).
- Package metadata in setup() doesn't adhere to PEP standards, no use of package metadata classifiers (https://pypi.org/classifiers/).
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
- An example notebook is present within the repository and built documentation, showcasing general purpose usage: construction of scenarios, as well as validation that the scenarios do not diverge significantly from source data.
- Documentation is unclear on the data schema that is expected for the primary analysis operations (Necessary fields? Data formats? Expected outputs?)
- It isn't clear if the library can work with other existing intake-esm data stores or is hardcoded for the data facets seen in https://storage.googleapis.com/cmip6/pangeo-cmip6.json.
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
- API is showcased in the documentation. Functions use reST-format docstrings but have no type annotations.
- Docstrings do not follow a consistent format, though they mostly adhere to PEP 257 conventions (https://peps.python.org/pep-0257/).
- Package API imports all functions to the top-level (would significantly benefit from modular divisions between testing/setup functions, analysis functions, and visualization functions).
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
- There are GitHub Workflow CI tests configured for the repository. Tests are found within the package (stitches.tests) but should be moved to top-level / excluded from wheel.
- Testing configuration relies on outdated installation method ($ python setup.py install).
- Testing setup relies on python calls to library (python -c 'import stitches; stitches.install_package_data()') but this should be made part of the testing setup stage or exposed via a CLI.
- Only Linux * Python3.9 is tested (would benefit from a testing matrix). GitHub Workflows are using deprecated Actions (actions/checkout@v1). Running tests locally shows that they pass (unittest or pytest) but shows many DeprecationWarnings.
- Code coverage reporting via Codecov seems to be misconfigured and not up-to-date (local testing shows 52%; Badge shows 8%).
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support
- Repository contains a CONTRIBUTING.md guide specific for the project.
- Authors ask that contributors clarify whether contributions are copyright-restricted (mentions of a project called "Hector"?) - this could be made easier with Pull Request Templates.
- No Issue Templates or Pull Request Templates.
- Testing/tooling setup and metrics are not mentioned in contributor guidelines.
- No guidance is mentioned on how to report problems with software
- Repository and ReadMe identify Code of Conduct (Contributor Covenant v2.0). No contact method is listed for reporting abuse.

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
- Summary and Statement of Need appear to be mislabelled.
- Summary (Statement of Need) is relatively clear and highlights well how stitches is unique in its approaches.
- Summary (Statement of Need) could be shortened and made more accessible to general audiences.
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
- Statement of Need (Summary) covers the value of stitches within the field of impact analysis.
- The value of output from stitches to help in impact modelling is made clear, but the intended audience is only explicitly stated in the final paragraph. Should be stated earlier.
- Stitches builds on existing science frameworks with a unique approach, but could be clearer in stating the relative strengths/weaknesses of its approach versus others (speed? versatility? portability? statistical/physical consistency?)
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
- I'm not familiar with python packages that aid in generating new variable curves from ESM runs via stitched model data, but it would be good to briefly talk about others that may exist in the scientific Python community (I will search around as well).
- It would be interesting to mention the unique value of stitches in scenario development as compared to multimodel ensemble statistics.
- Other packages are not explored. Mention is made to PANGEO and PANGEO-backed software (xarray), but no comparisons are made to other scenario-building packages.
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
- Paper could benefit from some editing; some repetitive statements, some redundant phrases could be removed.
- A few statements require citations: "While many existing ESM emulation methods rely on ‘bottom up’ methods", "Research from the climate science community has indicated that many ESM output variables are tightly dependent upon the GSAT trajectory and thus scenario independent", "emulators trained with bottom-up methods often can only handle a small number of variables jointly (e.g. temperature and precipitation)", etc.
- The description of the library could benefit from an explanation of the data structure or a brief summary of the capabilities or core functionality.
- An example of the expected outputs from the tool (e.g. Quickstarter examples) would help showcase the capabilities of stitches.
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Zeitsperre commented 1 year ago

@observingClouds @abigailsnyder

My review here is complete. I think that STITCHES does something quite unique that I haven't seen before, and can genuinely see the value of it in performing global-scale impact modelling analyses. Thanks again for asking me to review this software.

I think there are quite a few improvements that could be made (many of them relatively low-effort) that I would gladly open as issues (or submit as fixes in Pull Requests) if they are welcome (and if Pull Requests from reviewers are allowed). Please let me know.

observingClouds commented 1 year ago

Thanks @Zeitsperre for the update! Please open issues over in the STITCHES' repository. If it helps the discussion of the issue feel free to open a PR as well.

znicholls commented 1 year ago

@observingClouds re conflict of interest: I have published with both Kalyn and Claudia previously. Sorry I should have read that more closely earlier. I would request that the COI is simply noted here (and ideally waived) given that the community is relatively small and I think I can provide an impartial review nonetheless.

znicholls commented 1 year ago

I have also completed my review. In the process, I have made a number of issues (all cross-linked here) to follow up with areas where I haven't ticked the box above yet.

In general, I think I could use the package without too much trouble but I would really struggle to extend it beyond its main use case. The major reason is that much of the internal data is built around pandas data frames, but it wasn't clear to me where I should look to understand what sort of form these data frames should take or what the data in them should be. This may require a separate section on the various 'models' used in the code (e.g. https://pyam-iamc.readthedocs.io/en/stable/data.html). Such sections can be a bit painful to write and maintain, but they generally go a long way towards helping new users and maintainers understand how the system works. In particular, it wasn't clear to me what the recipe should look like nor the archive data (I could sort of guess from the examples, but I am not sure I would guess correctly so explicit docs could be very helpful). The other option would be to use something like pandera to add more structure to the data frames and communicate more explicitly what each column should be and means. Each option comes with pros and cons, but I think I would find it very hard to build a mental model of the package as it is currently written and documented.
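
As one illustration of the kind of explicit data model discussed above, the sketch below checks a table against an assumed column layout using plain pandas rather than pandera. The column names variable, year, value, ensemble, experiment and model come from this thread; the unit column, the `TARGET_DATA_SCHEMA` mapping and the `validate_target_data` helper are hypothetical and are not part of the stitches API.

```python
import pandas as pd
from pandas.api import types as pdt

# Assumed layout for a target_data table. The "unit" column is a suggestion
# from this review, not something the package is known to require.
TARGET_DATA_SCHEMA = {
    "variable": pdt.is_string_dtype,
    "experiment": pdt.is_string_dtype,
    "ensemble": pdt.is_string_dtype,
    "model": pdt.is_string_dtype,
    "unit": pdt.is_string_dtype,
    "year": pdt.is_integer_dtype,
    "value": pdt.is_float_dtype,
}


def validate_target_data(df: pd.DataFrame) -> pd.DataFrame:
    """Raise ValueError if df does not match the assumed layout; otherwise return df."""
    missing = [col for col in TARGET_DATA_SCHEMA if col not in df.columns]
    if missing:
        raise ValueError(f"target_data is missing columns: {missing}")
    for col, dtype_check in TARGET_DATA_SCHEMA.items():
        if not dtype_check(df[col]):
            raise ValueError(f"column {col!r} has unexpected dtype {df[col].dtype}")
    return df


# Example table with made-up values.
example = pd.DataFrame(
    {
        "variable": ["tas", "tas"],
        "experiment": ["ssp245", "ssp245"],
        "ensemble": ["r1i1p1f1", "r1i1p1f1"],
        "model": ["CanESM5", "CanESM5"],
        "unit": ["degC", "degC"],
        "year": [2015, 2016],
        "value": [1.1, 1.2],
    }
)
validated = validate_target_data(example)
```

A documented mapping like this doubles as a reference for users preparing their own inputs, which is the main benefit either option (a docs section or a schema library) would bring.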

Arising from the above, there were a few areas where the functionality of the package wasn't super clear to me. I think adding a section like the 'model' section above and/or refactoring would make it much clearer what is going on. The issues were:

- Why are 'ensemble', 'experiment' and 'model' needed as part of the target_data? Could this not work just based on grouping by everything except the key columns of variable, year and value? In addition, shouldn't there be some units in target_data and perhaps also a reference period? Using a different reference period from that which was used to create the archive in the first place could cause havoc, no?
- Is the pangeo table hard-coded? This is probably fine for CMIP6, but seems problematic as we rapidly move beyond CMIP6 (perhaps this can be left for future work, but it seemed a bit odd to me to not give users a way to use a different archive if they want).
- Is the historical/scenario split also intentionally hard-coded to know about the SSPs? This also seems like it could be problematic if anyone wanted to use STITCHES in a different context or after CMIP6 (or before, e.g. CMIP5).
- Does the stitching only pull data from one model or can you end up with stitching that joins together windows/samples from multiple different models (e.g. CanESM5 and NASA-GISS output)? Reading the diagram, I think the answer is you can have more than one model but looking at the code, it wasn't super clear to me (e.g. this line `csv_to_load = [file for file in all_files if (model in file)][0]` in gmat_stitching confused me, maybe the stitching_id handles this?).
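
To make the first question concrete, here is a minimal pandas sketch of grouping by everything except the key columns. It only illustrates the idea and does not describe how stitches currently handles target_data; the `split_trajectories` helper and the example values are hypothetical.

```python
import pandas as pd

# Key columns named in the question above; every other column is treated as an identifier.
KEY_COLUMNS = ["variable", "year", "value"]


def split_trajectories(target_data: pd.DataFrame) -> dict:
    """Return one year-sorted trajectory per unique combination of non-key columns."""
    id_columns = [col for col in target_data.columns if col not in KEY_COLUMNS]
    if not id_columns:
        raise ValueError("target_data needs at least one identifying column")
    return {
        key: group.sort_values("year").reset_index(drop=True)
        for key, group in target_data.groupby(id_columns)
    }


# Made-up data: two ensemble members of the same model and experiment.
target_data = pd.DataFrame(
    {
        "variable": ["tas"] * 4,
        "experiment": ["ssp245"] * 4,
        "ensemble": ["r1i1p1f1", "r1i1p1f1", "r2i1p1f1", "r2i1p1f1"],
        "model": ["CanESM5"] * 4,
        "year": [2016, 2015, 2015, 2016],
        "value": [1.2, 1.1, 1.0, 1.3],
    }
)
trajectories = split_trajectories(target_data)
```

With this approach the identifying columns need no fixed names, so a table carrying, say, only a scenario label would be split the same way.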

In general, the repo would also benefit from the use of some other tools e.g. code linters and auto-formatters. They would make the code much more readable and introduce very little cost (at least in my opinion).

Zeitsperre commented 1 year ago

(Hi all, I've been meaning to open some issues/PRs in the repo (and will when I find a minute), but I wanted to piggyback on Zeb's great comments here)

In general, I think I could use the package without too much trouble but I would really struggle to extend it beyond its main use case. The major reason is that much of the internal data is built around pandas data frames, but it wasn't clear to me where I should look to understand what sort of form these data frames should take or what the data in them should be. This may require a separate section on the various 'models' used in the code (e.g. https://pyam-iamc.readthedocs.io/en/stable/data.html). Such sections can be a bit painful to write and maintain, but they generally go a long way towards helping new users and maintainers understand how the system works.

I am very much in agreement with this. Having worked with CORDEX/CMIP5/CMIP6 data for many years, I also felt it wasn't entirely clear how I could format my local NetCDF collections for use in STITCHES. A data model would provide a first step towards implementing methods for generalizing use-cases.

In particular, it wasn't clear to me what the recipe should look like nor the archive data (I could sort of guess from the examples, but I am not sure I would guess correctly so explicit docs could be very helpful). The other option would be to use something like pandera to add more structure to the data frames and communicate more explicitly what each column should be and means. Each option comes with pros and cons, but I think I would find it very hard to build a mental model of the package as it is currently written and documented.

I mentioned in my review that xarray and intake are used to fetch data by STITCHES, and I found it odd that effort was spent converting their Dataset format to CSVs and pandas DataFrames. The xarray format is quite robust/extensible (https://docs.xarray.dev/en/stable/user-guide/data-structures.html) and preserves the metadata fetched from Pangeo. Together with tools such as intake and dask, it allows subsetted data requests to be structured in advance of the GET requests, significantly reducing download time and memory requirements. I'm not familiar with Pandera, but it seems to have integrated dask support as well. Adherence to a standardized data structure would be essential to making the project much more portable.
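
A rough sketch of the "subset first, convert last" pattern I have in mind is below. It uses a small synthetic in-memory Dataset as a stand-in for data fetched from a remote store, so nothing here is lazy by itself; with a dask-backed Dataset the same selection and reduction would only be evaluated when the values are requested. All names and values are made up for illustration and are not taken from the stitches code.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a dataset opened from a remote store.
years = np.arange(2015, 2025)
ds = xr.Dataset(
    {
        "tas": (
            ("year", "lat", "lon"),
            np.full((years.size, 2, 3), 288.0),
            {"units": "K"},
        )
    },
    coords={"year": years, "lat": [-45.0, 45.0], "lon": [0.0, 120.0, 240.0]},
    attrs={"source_id": "CanESM5", "experiment_id": "ssp245"},
)

# Select the window of interest before doing any computation.
window = ds["tas"].sel(year=slice(2016, 2018))

# Reduce over space while keeping the variable metadata.
# This is an unweighted mean for brevity; a real global mean needs area weights.
mean_tas = window.mean(dim=("lat", "lon"), keep_attrs=True)

# Convert to a small pandas table only at the very end.
table = mean_tas.to_dataframe().reset_index()
```

The table that reaches pandas has three rows instead of the full grid, and the units attribute is still attached to `mean_tas` up to the point of conversion.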

In general, the repo would also benefit from the use of some other tools e.g. code linters and auto-formatters. They would make the code much more readable and introduce very little cost (at least in my opinion).

In my work, I do a fair amount of package maintenance and coding standards enforcement, and would be willing to give a hand in this way. If my schedule allows for it, I'd be more than happy to open a PR.

observingClouds commented 1 year ago

Thank you @Zeitsperre and @znicholls for your reviews and for starting the discussion on some action items. Given the agreement between your reviews, I suggest that @abigailsnyder and their colleagues go ahead and address these common issues/suggestions.

observingClouds commented 1 year ago

@znicholls, regarding your potential COI, could you let me know in which aspect you fall under the JOSS COI policy? Thank you!

znicholls commented 1 year ago

could you let me know in which aspect you fall under the JOSS COI policy?

In which aspect I fall under or fail under? I have published papers with both Kalyn and Claudia in the last two years. I am also on a scientific steering committee with Claudia and plan to publish with Kalyn again in the next 12 months.

abigailsnyder commented 1 year ago

Thanks very much to the reviewers for their feedback, I look forward to addressing it! I see you have both opened issues and/or PRs, but please feel free to open any additional ones.

We might be slightly slower addressing these than is typical for our group due to various folks being out on vacations at different points.

Thank you so much again!

observingClouds commented 1 year ago

could you let me know in which aspect you fall under the JOSS COI policy?

In which aspect I fall under or fail under? I have published papers with both Kalyn and Claudia in the last two years. I am also on a scientific steering committee with Claudia and plan to publish with Kalyn again in the next 12 months.

Thanks @znicholls for giving me additional detail about your COI. Upon consultation with the JOSS team, we will waive the COI for this submission. As you mentioned already, for the future it would be great to know about any potential COI earlier. Maybe we also need to adjust the workflow on our side and ask for the COI before the start of the review process.

observingClouds commented 1 year ago

@abigailsnyder just checking back if the comments and issues could already be addressed. Please ping me here if you have done so. Thank you.

abigailsnyder commented 1 year ago

Thank you for checking in! I have time blocked out this week and next to address review comments on this

abigailsnyder commented 1 year ago

@observingClouds We are working on this still (and juggling summer schedules) and are hoping to have it ready to go by end of next week. Thank you for your patience! We will likely do a full R2R all in one place when it is ready.

observingClouds commented 1 year ago

Hi @abigailsnyder, I just want to check back and see how you are progressing. Please let me know if you have any questions. Cheers, Hauke

kdorheim commented 1 year ago

Hi @observingClouds, @abigailsnyder is currently out of the office and we are waiting for another coauthor to respond, but we hope to resolve the outstanding issues as quickly as possible. Thanks!

kthyng commented 1 year ago

@kdorheim @abigailsnyder What is the timeframe for this response? We'd like to label this submission as "paused" if it will be more than another week, and we would request a withdrawal and resubmission if it will be more than a few weeks given the already extended wait here. (We can't keep the reviewers on the hook for too long...)

abigailsnyder commented 1 year ago

@kthyng I will check with the co-author whose time has been a challenge and get back with you. Thank you

abigailsnyder commented 1 year ago

@kthyng my co-author thinks working on this tomorrow and getting the remaining parts done in a week is feasible. The main challenge in their availability has been family caregiving, so it's a little hard to predict. We have addressed most of the points and I'm happy to post the in-progress R2R document, if that's helpful. Or I'm more than happy to enter this submission into a paused phase if that would work better.

I appreciate the journal's patience and willingness to work with us during a period where more than a few of us have had big life things coming up.

kthyng commented 1 year ago

@abigailsnyder I am very sympathetic to the realities of life! Let's see if we can pause this for 1 month while your team works in the background.

It sounds like the authors have some steps to do on their end that are taking some extra time, but if the reviewers @znicholls, @Zeitsperre have any steps to do on their end that don't overlap, please consider doing so.

Separately, @znicholls, @Zeitsperre are you ok with a 1 month pause on this submission after which you would be requested to finish reviewing this submission (after any items you might be able to do now)?

Zeitsperre commented 1 year ago

All good for me. Life happens.

Some of the comments we made require some significant changes, so I completely understand the need to not rush the implementation.

I had been holding off on opening PRs as the changes might be significant (mainly formatting). If things are on hold, this might be a good time to propose them. Will open a PR if I have some time next week.

znicholls commented 1 year ago

Yep fine for me too (I don't have any outstanding things to do so this is just sitting in the background for me)

kthyng commented 1 year ago

Ok I will "pause" this submission and let's get back together on here by the end of November.

kthyng commented 1 year ago

Ok, we have about reached the date we originally set. @abigailsnyder How are things looking on the author side of things? @Zeitsperre will you be able to open an issue for the comments you had?

Zeitsperre commented 1 year ago

@kthyng Yes, can do in the next day or so!

abigailsnyder commented 1 year ago

@kthyng - I checked with my co-author today and he is confident he can complete the remaining pieces this week. Thanks

kthyng commented 1 year ago

ok thanks all! I'll remove the "paused" label.

observingClouds commented 11 months ago

Hi @abigailsnyder, Now that we are "back in business" I want to check how things are progressing. I still see some open issues in the repository. Maybe you can give us a short update on your plans? Thank you!

abigailsnyder commented 11 months ago

@observingClouds We are reconvening on Monday, post-AGU, to get things finished off.

abigailsnyder commented 11 months ago

@observingClouds We have addressed all comments and performed a new release of stitches (https://github.com/JGCRI/stitches/releases/tag/v0.11.0). All changes in response to the review here are now on main. Attached here is a PDF of our R2R. We gathered every comment we could find across this review and the issues and PRs opened by reviewers in the stitches repo into one place to organize and address them.

We thank the reviewers and editors for their thorough comments and patience as we have navigated this revision. As it is about to be Winter Holidays, we may be slow to respond in the coming weeks. Hopefully the reviewers and editors will be as well, enjoying a break. We look forward to addressing any additional comments or new reviews in the new year.

stitches_joss_r2r.pdf

observingClouds commented 10 months ago

@editorialbot generate pdf

editorialbot commented 10 months ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

observingClouds commented 10 months ago

Dear @abigailsnyder, Thank you for your response to the reviewers' comments. You mentioned a few changes you made to the manuscript, e.g. referencing the MESMER tool, however, I currently cannot find these changes in paper.md. Could you please confirm that those changes are pushed to GitHub? As soon as those changes are visible, I'll notify the reviewers to take one last look at your responses and changes.

Thank you.

Cheers, Hauke

abigailsnyder commented 10 months ago

@observingClouds Thank you for catching this! It was one of the earlier changes we made and it got stranded on a branch. It has been merged into main and can now be seen at https://github.com/JGCRI/stitches/blob/main/paper/paper.md. Thanks again! Abigail

observingClouds commented 10 months ago

@editorialbot generate pdf

editorialbot commented 10 months ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

observingClouds commented 10 months ago

Thanks @abigailsnyder for merging the changes.

observingClouds commented 10 months ago

Dear @Zeitsperre, Dear @znicholls, Thank you very much for your review and for sharing your expertise, e.g. by setting up the linters and opening some pull requests. Now that your comments have been responded to, I would like you to do a last check of whether everything has been addressed accordingly. In particular, I would like you to:

check all boxes in your checklist if those checkpoints are now satisfied
check if the code/functions are now documented well enough that potential new collaborators could extend the functionality to files/scenarios beyond CMIP6, e.g. is it clear what the input data looks like? You both mentioned that this was not clear so far. While I do not expect that this software package currently supports much beyond CMIP6, I do want to make sure that it can be expanded in the future and could be useful during the next CMIPs.

Thank you very much for your time and expertise.

Hauke

znicholls commented 10 months ago

Thanks Hauke, on the to-do list for next week

znicholls commented 10 months ago

@observingClouds I've gone through things again. I think this is now good enough.

My only hesitation was on the functionality question. A key step seems to be pre-processing of CMIP6. In my opinion, this still isn't documented in any real sense. I note that you could argue that this pre-processing is not part of the package, but the pre-processed data is key to using the package, so I would be more comfortable if it were included.

I would also note that the only installation option is still from source. That's fine, it works, but it is a bit odd for a Python package when the barrier to releasing on PyPI is so low.

My other comments are in https://github.com/JGCRI/stitches/issues/78; they are not review-blocking, though.

observingClouds commented 10 months ago

Thanks for your feedback @znicholls. This is very much appreciated.
