[REVIEW]: Clouddrift: a Python package to accelerate the use of Lagrangian data for atmospheric, oceanic, and climate sciences

editorialbot commented 1 month ago

Submitting author: !--author-handle-->@selipot@AnjaliSandip<!--end-editor-- Reviewers: @rcaneill, @malmans2 Archive: Pending

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/7c264f98cf7182b12a0ec70b84897dee"><img src="https://joss.theoj.org/papers/7c264f98cf7182b12a0ec70b84897dee/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/7c264f98cf7182b12a0ec70b84897dee/status.svg)](https://joss.theoj.org/papers/7c264f98cf7182b12a0ec70b84897dee)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@selipot & @rcaneill & @malmans2, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @AnjaliSandip know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @rcaneill

📝 Checklist for @malmans2

editorialbot commented 1 month ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 1 month ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.pocean.2008.02.002 is OK
- 10.1016/j.ocemod.2017.11.008 is OK
- 10.25921/x46c-3620 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot commented 1 month ago

Software report:

github.com/AlDanial/cloc v 1.90  T=0.01 s (599.4 files/s, 24727.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Markdown                         2             25              0             79
TeX                              1              2              0             35
YAML                             1              0              2             22
-------------------------------------------------------------------------------
SUM:                             4             27              2            136
-------------------------------------------------------------------------------

Commit count by author:

     6  Shane Elipot
     2  Kevin Santana
     1  Philippe Miron
     1  Philippe Miron, Ph. D

editorialbot commented 1 month ago

Paper file info:

📄 Wordcount for paper.md is 919

✅ The paper includes a Statement of need section

editorialbot commented 1 month ago

License info:

✅ License found: Apache License 2.0 (Valid open source OSI approved license)

editorialbot commented 1 month ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

rcaneill commented 1 month ago

Review checklist for @rcaneill

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/Cloud-Drift/clouddrift?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@selipot) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

malmans2 commented 1 month ago

Review checklist for @malmans2

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/Cloud-Drift/clouddrift?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@selipot) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

malmans2 commented 1 month ago

Hi @selipot,

I'll try to review this submission next week, sorry about the delay. I'll open issues if I have recommendations, and will provide a final recap here once I'm done. I have one question regarding the authorship: Rick Lumpkin is the only author who did not contribute to the code on GitHub. Could you please clarify how they contributed to Clouddrift?

Thanks!

AnjaliSandip commented 1 month ago

@rcaneill How close are you to completing the review?

malmans2 commented 1 month ago

Hi everyone,

I've been reviewing this submission for the past few days and have opened issues as needed. Here is a summary of the outstanding issues:

Contribution and Authorship: Needs clarification. See this comment.
Functionality/Performance/Example Usage/Functionality Documentation: My goal was to assess these points by running the examples and demo notebooks. However, the example usage and the notebook I tried did not work for me. See issues #449 and #452.
Installation Instructions: The dependencies upon which Clouddrift is built are not clearly mentioned anywhere. See issue #446.
State of the Field: I don't see a discussion about the state of the field in the paper. Is Clouddrift unique and cannot be compared to other libraries, whether Python or otherwise? This is a genuine question, as I'm not an expert in Lagrangian data at all.
References: Xarray is the only library mentioned in the paper, but I see a dozen of mandatory dependencies. Is Clouddrift meant to be an extension of Xarray? If yes, I suggest making it clearer in the paper. If no, I think more core dependencies need to be mentioned and cited.

I will finish my review when the example code/notebooks are fixed. If I was unlucky and picked the wrong notebook, please let me know which notebooks I should try.

rcaneill commented 1 month ago

@rcaneill How close are you to completing the review?

I hope to finish it on Friday

rcaneill commented 3 weeks ago

Hi,

I did my review, here are the comments:

Contribution and authorship: @selipot made major contributions. However, I do not see any commit from Rick Lumpkin
Reproducibility: importing numpy was missing in the paper https://github.com/Cloud-Drift/clouddrift/pull/455
Example usage: apart from the comment that @malmans2 made, the mosaic example is also broken: https://github.com/Cloud-Drift/clouddrift/issues/456 I believe that the examples notebooks are very good and full of information and should also be included into the main documentation
References: I think that xarray should be properly cited, and not only mentioned with an url
Installation instruction: It seems that the deps are handled by hand. The devs deps are found in file CONTRIBUTING.md (will be solved by https://github.com/Cloud-Drift/clouddrift/pull/451)

As @malmans2 , I'll have a proper look when the binder notebooks will be fixed. I did not re-write some comments I had that @malmans2 already covered, and I moved the comments I initially made on the checklist comment in this comment (I had misunderstood how to proceed)

selipot commented 3 weeks ago

Hi @selipot,

I'll try to review this submission next week, sorry about the delay. I'll open issues if I have recommendations, and will provide a final recap here once I'm done. I have one question regarding the authorship: Rick Lumpkin is the only author who did not contribute to the code on GitHub. Could you please clarify how they contributed to Clouddrift?

Thanks!

While Rick Lumpkin did not commit directly to the repo, he is an official collaborator of the NSF grant which is funding this work. He has consistently joined our weekly meeting to discuss strategies, datasets, and functionality of the software. As such, he is an integral part of the team that produced this software.

malmans2 commented 3 weeks ago

OK, we just needed this on record. I ticked the authorship box.

kevinsantana11 commented 2 weeks ago

commenting here to help Shane with any necessary future changes

AnjaliSandip commented 2 weeks ago

@selipot How close are you to finishing addressing the issues raised by the reviewers?

selipot commented 2 weeks ago

@selipot How close are you to finishing addressing the issues raised by the reviewers?

Very close! expect to hear from me in the following days.

selipot commented 1 week ago

Hi everyone,

I've been reviewing this submission for the past few days and have opened issues as needed. Here is a summary of the outstanding issues:

Contribution and Authorship: Needs clarification. See this comment.

This has now been clarified.

Functionality/Performance/Example Usage/Functionality Documentation: My goal was to assess these points by running the examples and demo notebooks. However, the example usage and the notebook I tried did not work for me. See issues #449 and #452.

We reviewed and assessed that the clouddrift-examples notebooks were not the best to showcase the library, having been designed a long time ago. We have therefore archived that repository. In the main README.md of the library we know point to gdp-get-started, mosaic-get-started, hurdat2-get-started, 3 notebooks that are running well, including in a binder environment. In addition, we link to a more advanced repository with further examples using a large cloud-based Lagrangian dataset.

Installation Instructions: The dependencies upon which Clouddrift is built are not clearly mentioned anywhere. See issue #446.

This issue has now been resolved.

State of the Field: I don't see a discussion about the state of the field in the paper. Is Clouddrift unique and cannot be compared to other libraries, whether Python or otherwise? This is a genuine question, as I'm not an expert in Lagrangian data at all.

To address this concern we have now added a new paragraph in the text of the paper (lines 31-45).

References: Xarray is the only library mentioned in the paper, but I see a dozen of mandatory dependencies. Is Clouddrift meant to be an extension of Xarray? If yes, I suggest making it clearer in the paper. If no, I think more core dependencies need to be mentioned and cited.

This has now been addressed in lines 31-45 of the revised paper.

I will finish my review when the example code/notebooks are fixed. If I was unlucky and picked the wrong notebook, please let me know which notebooks I should try.

The three notebooks mentioned above are now fixed.

selipot commented 1 week ago

I did my review, here are the comments:

Contribution and authorship: @selipot made major contributions. However, I do not see any commit from Rick Lumpkin

The contributions of Rick Lumpkin to the paper and software have now been clarified.

Reproducibility: importing numpy was missing in the paper 🐛 fix missing numpy in the paper's example Cloud-Drift/clouddrift#455

Now fixed.

Example usage: apart from the comment that @malmans2 made, the mosaic example is also broken: 🐛 binder mosaic is broken Cloud-Drift/clouddrift#456 I believe that the examples notebooks are very good and full of information and should also be included into the main documentation

These notebooks are now fixed. In a future version of the documentation we will probably include the notebooks but prefer to dissociate the ones mentioned above from the library at the moment.

References: I think that xarray should be properly cited, and not only mentioned with an url

Xarray is now properly cited, along with other core dependencies.

Installation instruction: It seems that the deps are handled by hand. The devs deps are found in file CONTRIBUTING.md (will be solved by ⭐ optional dep in pyproject.toml Cloud-Drift/clouddrift#451)

The installation instructions have now been updated.

As @malmans2 , I'll have a proper look when the binder notebooks will be fixed. I did not re-write some comments I had that @malmans2 already covered, and I moved the comments I initially made on the checklist comment in this comment (I had misunderstood how to proceed)

selipot commented 1 week ago

@AnjaliSandip we have now completed our revision. @malmans2 and @rcaneill thank you so much for your detailed and thoughtful comments. We hope that we have satisfactorily revised the paper and library and addressed each one of your comments (See point by point replies above).

malmans2 commented 1 week ago

@editorialbot generate pdf

editorialbot commented 1 week ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

malmans2 commented 1 week ago

Thanks @selipot for addressing my comments.

I think something is still not quite right with the authorship and Zenodo:

The paper authors are: Elipot, Miron, Curcic, Santana, and Lumpkin.
Lumpkin is missing in the .zenodo.json.
Lumpkin is missing in the version linked in the README, and there is an additional author: Vadim Bertrand.
Lumpkin is missing in the latest version on Zenodo, and the version in the title is not correct (it should be 0.36 instead of 0.35).

If the Zenodo authors are different on purpose, I'm not sure whether this is allowed by JOSS, so maybe the editor @AnjaliSandip can comment on that?

Other than that, in my opinion, the submission is in line with JOSS standards and is ready for publication. Clouddrift looks like a great tool to deal with Lagrangian data. I will advertise it with my former group at JHU that works extensively with Lagrangian analysis applied to physical oceanography. On this regard, you might find this JOSS paper interesting: Seaduck

kthyng commented 1 week ago

@editorialbot remove @selipot as reviewer

editorialbot commented 1 week ago

@selipot removed from the reviewers list!

AnjaliSandip commented 1 week ago

@selipot Could you address @malmans2 question on authorship?

selipot commented 1 week ago

Thanks @selipot for addressing my comments.

I think something is still not quite right with the authorship and Zenodo:

The paper authors are: Elipot, Miron, Curcic, Santana, and Lumpkin.

Lumpkin is missing in the .zenodo.json.

Lumpkin is missing in the version linked in the README, and there is an additional author: Vadim Bertrand.

Lumpkin is missing in the latest version on Zenodo, and the version in the title is not correct (it should be 0.36 instead of 0.35).

If the Zenodo authors are different on purpose, I'm not sure whether this is allowed by JOSS, so maybe the editor @AnjaliSandip can comment on that?

We have now made sure that the JOSS list of authors matchs the list of authors in the zenodo archive by adding Rick Lumpkin to the .zenodo.json file. His addition as an author will be triggered with the next release if @AnjaliSandip thinks that is appropriate?

Other than that, in my opinion, the submission is in line with JOSS standards and is ready for publication. Clouddrift looks like a great tool to deal with Lagrangian data. I will advertise it with my former group at JHU that works extensively with Lagrangian analysis applied to physical oceanography. On this regard, you might find this JOSS paper interesting: Seaduck

Thank you! We will definitely have a look at Seaduck.

selipot commented 1 week ago

@malmans2 the title of the zenodo archive should now reflect the correct version, matching the release.

rcaneill commented 1 week ago

@editorialbot generate pdf

editorialbot commented 1 week ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

rcaneill commented 1 week ago

Hi @selipot, thanks for addressing the comments, I have now checked all boxes of the checklist. Thanks to the updated notebooks, I have been able to tests the library on binder and everything works as expected!

I have one last comment that should not be hard to address: I struggled to understand how to produce a compatible xarray Dataset. Correct me if I’m wrong, but you only need a dataset with 1 dimension that represent the observation (e.g. "obs"), one dimension that represent each individual particle (e.g. "traj"), and one variable along "traj" that contains the rowsize. If I understood this correctly, could you provide a minimal example with synthetic data in the documentation? This would be more or less the same example as the velocity in the paper, but written as xarray Dataset. I believe that it would help new beginners.

When this last issue is fixed, to my opinion, the publication will be ready for publication!

selipot commented 1 week ago

@rcaneill I have a PR to add the example you requested. Would you be able to have a look?

rcaneill commented 1 week ago

@rcaneill I have a PR to add the example you requested. Would you be able to have a look?

@selipot this is perfect :)

selipot commented 1 week ago

@AnjaliSandip @rcaneill PR has been merged and documentation is updated

AnjaliSandip commented 1 week ago

Thanks @selipot! @rcaneill @malmans2 is the paper ready for publication?

rcaneill commented 1 week ago

Thanks @selipot! @rcaneill @malmans2 is the paper ready for publication?

Yes!

malmans2 commented 1 week ago

Thanks @selipot! @rcaneill @malmans2 is the paper ready for publication?

Yes!

AnjaliSandip commented 6 days ago

@editorialbot generate pdf

AnjaliSandip commented 6 days ago

@editorialbot check references

editorialbot commented 6 days ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.pocean.2008.02.002 is OK
- 10.1016/j.ocemod.2017.11.008 is OK
- 10.25921/x46c-3620 is OK
- 10.5194/gmd-12-3571-2019 is OK
- 10.21105/joss.02425 is OK
- 10.5281/zenodo.4547006 is OK
- 10.1038/s41586-020-2649-2 is OK
- 10.1038/s41592-019-0686-2 is OK
- 10.5281/zenodo.3509134 is OK
- 10.25080/Majora-92bf1922-00a is OK
- 10.3389/fclim.2021.782909 is OK

MISSING DOIs

- None

INVALID DOIs

- https://doi.org/10.1175/BAMS-D-14-00110.1 is INVALID because of 'https://doi.org/' prefix
- https://doi.org/10.5334/jors.148 is INVALID because of 'https://doi.org/' prefix

editorialbot commented 6 days ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

AnjaliSandip commented 6 days ago

@selipot Can you fix the invaid DOIs?

philippemiron commented 6 days ago

@AnjaliSandip It should be fixed now.

AnjaliSandip commented 6 days ago

@selipot can you make a new release of the software being reviewed and deposit a new archive of the software with Zenodo/figshare?

AnjaliSandip commented 6 days ago

@editorialbot check references

editorialbot commented 6 days ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.pocean.2008.02.002 is OK
- 10.1016/j.ocemod.2017.11.008 is OK
- 10.25921/x46c-3620 is OK
- 10.5194/gmd-12-3571-2019 is OK
- 10.1175/BAMS-D-14-00110.1 is OK
- 10.21105/joss.02425 is OK
- 10.5281/zenodo.4547006 is OK
- 10.5334/jors.148 is OK
- 10.1038/s41586-020-2649-2 is OK
- 10.1038/s41592-019-0686-2 is OK
- 10.5281/zenodo.3509134 is OK
- 10.25080/Majora-92bf1922-00a is OK
- 10.3389/fclim.2021.782909 is OK

MISSING DOIs

- None

INVALID DOIs

- None

philippemiron commented 6 days ago

@AnjaliSandip Just published a new version. And the zenodo archive was updated (https://doi.org/10.5281/zenodo.11081647).

openjournals / joss-reviews