[REVIEW]: quadcleanR: An R Package for the Cleanup and Visualization of Quadrat Data

editorialbot commented 2 years ago

Submitting author: !--author-handle-->@DominiqueMaucieri@fboehm<!--end-editor-- Reviewers: @ViniciusBRodrigues, @diazrenata Archive: 10.5281/zenodo.7222742

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/613731dfb5cfe31aad5ce73b88280f6b"><img src="https://joss.theoj.org/papers/613731dfb5cfe31aad5ce73b88280f6b/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/613731dfb5cfe31aad5ce73b88280f6b/status.svg)](https://joss.theoj.org/papers/613731dfb5cfe31aad5ce73b88280f6b)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@ViniciusBRodrigues & @diazrenata, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @fboehm know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @ViniciusBRodrigues

📝 Checklist for @diazrenata

editorialbot commented 2 years ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 2 years ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.06 s (930.9 files/s, 92386.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
R                               35            328            897           1101
Markdown                         5            172              0            682
HTML                             1             90              5            518
Rmd                              4            429            311            479
YAML                             8             36             10            179
TeX                              1             15              0            107
-------------------------------------------------------------------------------
SUM:                            54           1070           1223           3066
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 2 years ago

Wordcount for paper.md is 988

editorialbot commented 2 years ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1371/journal.pone.0130312 is OK
- 10.1353/psc.2004.0021 is OK
- 10.3354/meps138093 is OK
- 10.2307/2257968 is OK
- 10.5962/bhl.title.56234 is OK
- 10.1038/nature14258 is OK
- 10.1038/nature22899 is OK
- 10.1111/ecog.03148 is OK
- 10.2307/2845590 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

ViniciusBRodrigues commented 2 years ago

Review checklist for @ViniciusBRodrigues

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/DominiqueMaucieri/quadcleanR?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@DominiqueMaucieri) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

diazrenata commented 2 years ago

Review checklist for @diazrenata

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/DominiqueMaucieri/quadcleanR?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@DominiqueMaucieri) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

diazrenata commented 2 years ago

Review summary

The primary functionality of the quadcleanR package is the crop_area function, which standardizes the sampling density and effort of data collected from quadrats of different dimensions to allow for robust comparisons between different-sized quadrats. The package also includes a number of data wrangling utility functions and a shiny app for basic exploratory plotting.

I can see this being a useful contribution, given that many ecologists have to confront quadrats of nonstandard area. The functions work as described, the function documentation is sufficient, and the cheatsheet and shiny app help with user onboarding. I do have some suggestions that would help clarify how this package adds to the wider ecosystem of available tools, and to improve the package vignettes as aids to a general user population.

Specific suggestions

Writing

This draft of the manuscript does not contain a current state of the field section identifying how this package complements existing tools.

This would be especially helpful, e.g. for the cleaning functions (a number of which are similar to functionality available in tidyverse, or operations I would be just as likely to code up from scratch specific to a given dataset) and the shiny app (which renders basic exploratory plots that seem similar to what users are probably doing on their own).

(For example - a general user might find it simpler to use dplyr::select or dplyr::filter rather than learn the syntax for quadcleanR::keep_rm, or the dplyr::rename family rather than create an extra dataframe and learn quadcleanR::change_names).

Documentation

I suggest adding structure to the vignettes to tighten the focus on the added-value of the functions provided in this package. For example, the Simple Cleaning Quadrat Data vignette is a detailed narrative walk through cleaning a specific dataset, including a mix of quadcleanR and other package functions woven together. That would make it a great methods supplement to a manuscript, but it's not so easy for a user to read it and quickly understand the quadcleanR functions. There is also a fair amount of repeated text between the different vignettes, which makes it harder for a user to extract the important information from each one.

You don't have to necessarily do it this way, but one suggestion would be to create a more detailed outline of the topics/functions highlighted in each vignette and structuring each section around demonstrating that group of functions.

It may also be helpful to use a lightly preprocessed version of the datasets used in the vignette, to get some of the additional data cleaning out of the way so you can focus the vignette specifically on quadcleanR functions.

Shiny app

The demo version of the shiny app (https://dominiquemaucieri.shinyapps.io/example/) is fun but has a lot of variables that aren't explained. With so many options, it's hard for someone who isn't familiar with this dataset to use the dashboard to generate informative plots. I think it'd be a more useful demo for a naive user if there were just a few options for faceting/grouping variables, and these were either described or given very intuitive names.

Substantive scholarly effort

I leave this up to the editor, and anything clarified in a "state-of-the-field" section of the manuscript, but from my perspective the primary unique contribution for this package is the crop_area function. There are a number of cleaning functions that are excellent utilities, but for me are in a grey area between useful scripts for streamlining a specific data cleaning workflow and general-use utilities that substantially improve on functionality already available. Again, I'm happy to see this clarified with a little more conversation.

Code

Finally, I just want to note that I've run and tested the functions as they are written, but I haven't done extensive stress-testing to see if I can break them with manufactured use cases. I notice that in the source code there isn't a lot of error handling built in. I think this is reasonable for this package because most of the functions are pretty small and lightweight, but in general I would encourage more defensive programming in a package intended for widespread use.

fboehm commented 2 years ago

@ViniciusBRodrigues and @diazrenata - Thank you for the initial reviews! @DominiqueMaucieri - please feel free to respond to the comments above in this comment thread. Thanks again!

fboehm commented 2 years ago

@DominiqueMaucieri - do you have any questions about how to proceed? I'm happy to help. Thanks!

DominiqueMaucieri commented 2 years ago

@fboehm Should I make adjustments first and then respond to reviewer comments, or respond first and then make adjustments after? Thanks

fboehm commented 2 years ago

@DominiqueMaucieri - Please implement the suggestions that the reviewers have shared. Once you've done that, please let us know in this thread. The submission can't be published until the reviewers complete their checklists, so please do what you must to enable them to check the unchecked checklist items. See above for the details of their checklists. Please let me know if you have additional questions.

DominiqueMaucieri commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

DominiqueMaucieri commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

DominiqueMaucieri commented 2 years ago

Thank you @fboehm for your time as editor and @ViniciusBRodrigues and @diazrenata for your reviews and insight, as these comments have helped me to produce a manuscript and package that are more applicable to the wider ecological and conservation community.

Response to @diazrenata:

Writing This draft of the manuscript does not contain a current state of the field section identifying how this package complements existing tools.

This would be especially helpful, e.g. for the cleaning functions (a number of which are similar to functionality available in tidyverse, or operations I would be just as likely to code up from scratch specific to a given dataset) and the shiny app (which renders basic exploratory plots that seem similar to what users are probably doing on their own).

(For example - a general user might find it simpler to use dplyr::select or dplyr::filter rather than learn the syntax for quadcleanR::keep_rm, or the dplyr::rename family rather than create an extra dataframe and learn quadcleanR::change_names).

Response: I have added a current state of the field section to the manuscript explaining why I believe these functions have a place amid the already immense number of cleaning functions in R. What the reviewer has said is completely true, that a comfortable coder “would be just as likely to code up from scratch” some of the functions I have made; however, this package was not created for a comfortable coder, as much as created for researchers who find themselves with quadrat data that is very uncleaned and will take a lot of coding to get their data to a cleaned state. The focus of this package is to standardize the way the quadrats are cropped for analyses, and the secondary focus is to streamline the process of cleaning quadrat data so the time of researchers can be put into the analysis of their data and not just getting the data into a usable state. Therefore, while many of these functions may resemble cleaning functions that already exist, they all have extra options and specifications which allow for these already existing functions to be built upon and accomplish many steps with a single function, especially for quadrat data.

Documentation I suggest adding structure to the vignettes to tighten the focus on the added-value of the functions provided in this package. For example, the Simple Cleaning Quadrat Data vignette is a detailed narrative walk through cleaning a specific dataset, including a mix of quadcleanR and other package functions woven together. That would make it a great methods supplement to a manuscript, but it's not so easy for a user to read it and quickly understand the quadcleanR functions. There is also a fair amount of repeated text between the different vignettes, which makes it harder for a user to extract the important information from each one.

You don't have to necessarily do it this way, but one suggestion would be to create a more detailed outline of the topics/functions highlighted in each vignette and structuring each section around demonstrating that group of functions.

It may also be helpful to use a lightly preprocessed version of the datasets used in the vignette, to get some of the additional data cleaning out of the way so you can focus the vignette specifically on quadcleanR functions.

Response: I began with changing the language around the vignettes. These vignettes do focus on a specific kind of quadrat data, those downloaded from CoralNet, and that has been made clearer. Then to incorporate more kinds of data, I have taken the reviewers suggestion and created a very focused vignette titled “Simple Examples of Functions” to walk package users through the different clean up functions. This vignette follows the same structured outline as the cheat sheet and uses a more preprocessed dataset which I called “corals”.

Shiny app The demo version of the shiny app (https://dominiquemaucieri.shinyapps.io/example/) is fun but has a lot of variables that aren't explained. With so many options, it's hard for someone who isn't familiar with this dataset to use the dashboard to generate informative plots. I think it'd be a more useful demo for a naive user if there were just a few options for faceting/grouping variables, and these were either described or given very intuitive names.

Response: I have gone back through the vignettes and added suggestions for variables to select to get a nice-looking plot. I also added another shiny example in the “Simple Examples of Functions” which is simpler and more intuitive to work with.

Substantive scholarly effort I leave this up to the editor, and anything clarified in a "state-of-the-field" section of the manuscript, but from my perspective the primary unique contribution for this package is the crop_area function. There are a number of cleaning functions that are excellent utilities, but for me are in a grey area between useful scripts for streamlining a specific data cleaning workflow and general-use utilities that substantially improve on functionality already available. Again, I'm happy to see this clarified with a little more conversation.

Response: I will also leave this decision up to the editor. I believe this package will be a useful contribution to the field of ecology and I have already seen this package help collaborators cleanup their data

Code Finally, I just want to note that I've run and tested the functions as they are written, but I haven't done extensive stress-testing to see if I can break them with manufactured use cases. I notice that in the source code there isn't a lot of error handling built in. I think this is reasonable for this package because most of the functions are pretty small and lightweight, but in general I would encourage more defensive programming in a package intended for widespread use.

Response: Thank you for looking into the error handling of my package. I have tried to implement error handling where I can such as when there are different lengths of vectors being used as function parameters. I have gone through and added more error handling code especially to deal with NAs and NaNs, but I would also appreciate hearing if there are ways you would add more error handling in as well.

fboehm commented 2 years ago

Thank you, @DominiqueMaucieri for the many changes! @diazrenata - please review the changes made and respond to them. I agree with the author that this submission likely meets the criterion for substantial scholarly effort.

Thanks again!

fboehm commented 2 years ago

@diazrenata - once you've had a chance to review the revisions from @DominiqueMaucieri , please feel free to update the review checklist, if possible. If additional revisions are needed, please feel free to discuss them here. Thanks again!

fboehm commented 2 years ago

@diazrenata - please see the replies from @DominiqueMaucieri above and feel free to continue the conversation in this thread. Thanks again!

diazrenata commented 2 years ago

@DominiqueMaucieri Thank you for the revision! It looks good to me at this point - one small point on the Shiny app, but totally optional, and more a question of streamlining the user experience!

State of the field/statement of need

Thank you for clarifying this, and highlighting the intention of providing a standardized workflow for novices working with CoralNet data.

Documentation

The Simple Examples of Functions vignette is a great starting place and really helps a user onboard quickly. I also appreciate including a simple version of the data, to facilitate this.

Shiny app

I see the suggested sets of variables in the vignettes, and these are helpful.

On this point - I don't believe this is a necessary change, but it might be helpful for users if there is a description of variables either included or linked to from the Shiny dashboard.

Substantive scholarly effort

Fantastic!

Error handling

Thank you for putting some thought into this! Given that these data are coming from a relatively standardized source, this looks like an appropriate level of error handling to me.

fboehm commented 2 years ago

Thanks so much for your reviews, @ViniciusBRodrigues and @diazrenata !

@DominiqueMaucieri - the reviewers have recommended your submission for publication. There are a few more small steps before it gets published. The next step requires me to proofread the manuscript. I anticipate completing that within a few days. I'll comment here again once I've done so. Thanks again!

fboehm commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

fboehm commented 2 years ago

@DominiqueMaucieri - I've now examined the pdf of the paper. I have one small suggestion for a fix that I ask you to implement:

line 46: "accounting the size" should be "accounting for the size" Thanks again!

fboehm commented 2 years ago

@DominiqueMaucieri - A few more suggestions - in looking at the dois for the pdf, I see that the last one, for the Williams reference, seems to resolve to a book review in a journal. Can you please check on this? Does the doi point to what you want to cite?

fboehm commented 2 years ago

@DominiqueMaucieri - Also, in the references, the shiny reference needs to have a capitalized "R". You can achieve this by putting curly braces around the R in your bib file, {R}. Please fix this.

fboehm commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1371/journal.pone.0130312 is OK
- 10.1353/psc.2004.0021 is OK
- 10.3354/meps138093 is OK
- 10.2307/2257968 is OK
- 10.5962/bhl.title.56234 is OK
- 10.1038/nature14258 is OK
- 10.1038/nature22899 is OK
- 10.1111/ecog.03148 is OK
- 10.2307/2845590 is OK

MISSING DOIs

- None

INVALID DOIs

- None

DominiqueMaucieri commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

DominiqueMaucieri commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1371/journal.pone.0130312 is OK
- 10.1353/psc.2004.0021 is OK
- 10.3354/meps138093 is OK
- 10.5962/bhl.title.56234 is OK
- 10.1038/nature14258 is OK
- 10.1038/nature22899 is OK
- 10.1111/ecog.03148 is OK
- 10.2307/2845590 is OK

MISSING DOIs

- None

INVALID DOIs

- None

DominiqueMaucieri commented 2 years ago

@fboehm I have updated line 46, and corrected the Wilsons and shiny references

fboehm commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

fboehm commented 2 years ago

Thanks for making those small changes, @DominiqueMaucieri. The remaining DOIs look good.

fboehm commented 2 years ago

@DominiqueMaucieri - please make a new release of your repository and archive it. Please report here the archive doi and the new release version number. Thanks again!

DominiqueMaucieri commented 2 years ago

@fboehm The new version is v1.1.1 and I have archived it on Zenodo with DOI: 10.5281/zenodo.7222742

fboehm commented 2 years ago

@editorialbot set version as v1.1.1

editorialbot commented 2 years ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@editorialbot commands

fboehm commented 2 years ago

@editorialbot set v1.1.1 as version

editorialbot commented 2 years ago

Done! version is now v1.1.1

fboehm commented 2 years ago

@editorialbot set 10.5281/zenodo.7222742 as archive

editorialbot commented 2 years ago

Done! Archive is now 10.5281/zenodo.7222742

fboehm commented 2 years ago

@DominiqueMaucieri - Thanks! I neglected to tell you that we actually need the title of the zenodo archive to match exactly that of the paper.md. Can you please fix the title on the zenodo archive?

Thanks again!

DominiqueMaucieri commented 2 years ago

@fboehm I have updated the title, they should match now

fboehm commented 2 years ago

Thanks, @DominiqueMaucieri ! It now looks good!

fboehm commented 2 years ago

@editorialbot recommend-accept

editorialbot commented 2 years ago

Attempting dry run of processing paper acceptance...

openjournals / joss-reviews