ropensci / software-review

rOpenSci Software Peer Review.

forcis: An R client to access the FORCIS database #660

Open ahasverus opened 1 month ago

ahasverus commented 1 month ago

Submitting Author Name: Nicolas Casajus
Submitting Author Github Handle: @ahasverus
Other Package Authors Github handles: @MatGreco90, @ChaabaneS, @xgiraud
Repository: https://github.com/frbcesab/forcis
Version submitted: 0.1.0
Submission type: Standard
Editor: @beatrizmilz
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en


Package: forcis
Type: Package
Title: An R Client to Access the FORCIS Database
Version: 0.1.0
Authors@R: c(
    person(given   = "Nicolas",
           family  = "Casajus",
           role    = c("aut", "cre", "cph"),
           email   = "nicolas.casajus@fondationbiodiversite.fr",
           comment = c(ORCID = "0000-0002-5537-5294")),
    person(given   = "Mattia",
           family  = "Greco",
           role    = "aut",
           email   = "mattia_greco@outlook.com",
           comment = c(ORCID = "0000-0003-2416-6235")),
    person(given   = "Sonia",
           family  = "Chaabane",
           role    = "aut",
           email   = "sonia.chaabane@gmail.com",
           comment = c(ORCID = "0000-0002-4653-8610")),
    person(given   = "Xavier",
           family  = "Giraud",
           role    = "aut",
           email   = "giraud@cerege.fr",
           comment = c(ORCID = "0000-0001-5067-8176")),
    person(given   = "Thibault",
           family  = "de Garidel-Thoron",
           role    = "aut",
           email   = "garidel@cerege.fr",
           comment = c(ORCID = "0000-0001-8983-9571")),
    person(given   = "Khalil",
           family  = "Hammami",
           role    = "ctb",
           email   = "khalil.hammami@enetcom.usf.tn"))
Description: Provides an interface to the FORCIS database 
    (<https://zenodo.org/doi/10.5281/zenodo.7390791>) on global foraminifera
    distribution. This package allows to download and to handle FORCIS data.
    It is part of the FRB-CESAB working group FORCIS.
    <https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/forcis/>.
URL: https://frbcesab.github.io/forcis
BugReports: https://github.com/FRBCesab/forcis/issues
License: GPL (>= 2)
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
VignetteBuilder: knitr
Depends: 
    R (>= 2.10)
Imports: 
    dplyr,
    ggplot2,
    jsonlite,
    rlang,
    sf,
    tidyr,
    utils,
    vroom
Suggests: 
    fs,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0),
    withr
Config/testthat/edition: 3

Scope

This package is designed to download the FORCIS data files hosted on Zenodo. It includes functions to download (data retrieval), select, filter, reshape, and visualize data (data munging).

This package should be of interest to scientists working on Foraminifera species distribution and interested in the FORCIS database (spatial analyses, time series analyses, etc.). The package has been developed to simplify data wrangling, help users avoid common pitfalls, and get data ready for analysis and visualization.

No other package exists to handle the FORCIS database. Note that we are the authors of the database and have already published a data paper describing it.

Not applicable.

Pre-submission inquiry: #655 Editor: @adamhsparks

The function pkgcheck::pkgcheck() returns the following report:

── forcis 0.1.0 ────────────────────────────────────────────

✔ Package name is available
✔ has a 'codemeta.json' file.
✔ has a 'contributing' file.
✔ uses 'roxygen2'.
✔ 'DESCRIPTION' has a URL field.
✔ 'DESCRIPTION' has a BugReports field.
✔ Package has at least one HTML vignette
✔ All functions have examples.
✔ Package has continuous integration checks.
✔ Package coverage is 97.4%.
✔ R CMD check found no errors.
✔ R CMD check found no warnings.

ℹ Current status:
✔ This package may be submitted.

The package goodpractice returns warnings:

  • Write unit tests: some functions are difficult to test (HTTP requests)
  • Avoid calling setwd(): this function is used in unit tests in combination with withr::defer()

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options

- [x] The package is novel and will be of interest to the broad readership of the journal.
- [x] The manuscript describing the package is no longer than 3000 words.
- [x] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

ropensci-review-bot commented 1 month ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 1 month ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 1 month ago

Checks for forcis (v0.1.0)

git hash: e80b91c5

Package License: GPL (>= 2)


1. Package Dependencies

Details of Package Dependency Usage

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type       |package   | ncalls|
|:----------|:---------|------:|
|internal   |base      |    105|
|internal   |forcis    |     66|
|internal   |graphics  |      3|
|imports    |utils     |     56|
|imports    |sf        |     12|
|imports    |vroom     |     10|
|imports    |jsonlite  |      3|
|imports    |dplyr     |     NA|
|imports    |ggplot2   |     NA|
|imports    |rlang     |     NA|
|imports    |tidyr     |     NA|
|suggests   |fs        |     NA|
|suggests   |knitr     |     NA|
|suggests   |rmarkdown |     NA|
|suggests   |testthat  |     NA|
|suggests   |withr     |     NA|
|linking_to |NA        |     NA|

The tallies of functions used in each package are listed below. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

file.path (11), c (7), which (7), data.frame (6), for (6), seq_len (6), suppressWarnings (6), length (5), list.files (5), rbind (5), unique (5), as.numeric (4), colnames (3), tryCatch (3), url (3), drop (2), file (2), format (2), lapply (2), months (2), paste0 (2), strsplit (2), unlist (2), as.Date (1), gsub (1), nrow (1), options (1), readline (1), readLines (1), which.max (1)

forcis

get_species_names (11), data_to_sf (5), species_list (5), get_available_versions (3), get_metadata (3), cpr_north_filename (2), cpr_south_filename (2), get_current_version (2), get_latest_version (2), add_data_type (1), check_field_in_data (1), check_if_character (1), check_if_df (1), check_if_path_exists (1), check_if_valid_taxonomy (1), check_required_columns (1), check_unique_taxonomy (1), check_version (1), compute_abundances (1), compute_concentrations (1), compute_frequencies (1), convert_to_long_format (1), crs_robinson (1), data_types (1), date_format (1), download_file (1), download_forcis_db (1), filter_by_bbox (1), filter_by_month (1), filter_by_ocean (1), filter_by_polygon (1), filter_by_species (1), filter_by_year (1), geom_basemap (1), get_data_type (1), get_required_columns (1), get_version_metadata (1), plankton_net_filename (1), pump_filename (1), sediment_trap_filename (1)

utils

data (55), download.file (1)

sf

st_intersects (6), st_bbox (3), st_crs (2), st_as_sf (1)

vroom

vroom (10)

graphics

polygon (3)

jsonlite

read_json (3)

**NOTE:** Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties

The package has:

- code in R (100% in 33 files) and
- 5 authors
- 6 vignettes
- no internal data file
- 8 imported packages
- 31 exported functions (median 27 lines of code)
- 81 non-exported functions in R (median 16 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html)

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                  | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R                  |    33|       90.1|           |
|files_vignettes          |     6|       96.8|           |
|files_tests              |    35|       98.0|           |
|loc_R                    |  1549|       76.5|           |
|loc_vignettes            |   418|       71.2|           |
|loc_tests                |  1200|       86.4|           |
|num_vignettes            |     6|       97.7|TRUE       |
|n_fns_r                  |   112|       77.5|           |
|n_fns_r_exported         |    31|       78.2|           |
|n_fns_r_not_exported     |    81|       77.6|           |
|n_fns_per_file_r         |     2|       31.7|           |
|num_params_per_fn        |     2|        8.2|           |
|loc_per_fn_r             |    19|       57.7|           |
|loc_per_fn_r_exp         |    27|       58.8|           |
|loc_per_fn_r_not_exp     |    16|       53.2|           |
|rel_whitespace_R         |    48|       92.5|           |
|rel_whitespace_vignettes |    68|       89.5|           |
|rel_whitespace_tests     |    55|       95.6|TRUE       |
|doclines_per_fn_exp      |    38|       46.8|           |
|doclines_per_fn_not_exp  |     0|        0.0|TRUE       |
|fn_call_network_size     |   157|       84.7|           |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks

#### 3a. Continuous Integration Badges

[![R-CMD-check.yaml](https://github.com/FRBCesab/forcis/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/FRBCesab/forcis/actions) [![pkgdown.yaml](https://github.com/FRBCesab/forcis/actions/workflows/pkgdown.yaml/badge.svg)](https://github.com/FRBCesab/forcis/actions)

**GitHub Workflow Results**

|          id|name                       |conclusion |sha    | run_number|date       |
|-----------:|:--------------------------|:----------|:------|----------:|:----------|
| 11082922089|pages build and deployment |success    |702763 |        164|2024-09-28 |
| 11082912111|pkgdown                    |success    |e80b91 |        199|2024-09-28 |
| 11082809930|R CMD Check                |success    |e80b91 |        198|2024-09-28 |
| 11082809923|Test coverage              |success    |e80b91 |         94|2024-09-28 |
| 11082912112|Update CITATION.cff        |success    |e80b91 |         23|2024-09-28 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following check_fail:

1. no_import_package_as_a_whole

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 97.38

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

No functions have cyclocomplexity >= 15

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 7 potential issues:

message | number of times
--- | ---
Avoid changing the working directory, or restore it in on.exit | 2
Avoid library() and require() calls in packages | 5


Package Versions

|package  |version  |
|:--------|:--------|
|pkgstats |0.1.6.17 |
|pkgcheck |0.1.2.58 |


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

adamhsparks commented 4 weeks ago

@ropensci-review-bot assign @beatrizmilz as editor

ropensci-review-bot commented 4 weeks ago

Assigned! @beatrizmilz is now the editor

beatrizmilz commented 3 weeks ago

Hi @ahasverus! I'm Beatriz, and I'll be the editor for your submission. 👋

beatrizmilz commented 3 weeks ago

Editor checks:

Editor comments

Hi @ahasverus! Congratulations to you and the team on this great package, and also on the [publication of the data paper on Nature](https://www.nature.com/articles/s41597-023-02264-2).

The comments below are related to the Editor Checks above.



For example:

How it is now:

On this page

    Setup
    select_taxonomy()
    select_forcis_columns()
    filter_by_month()
    filter_by_year()
    filter_by_bbox()
    filter_by_ocean()
    filter_by_polygon()
    filter_by_species()
    convert_to_long_format() 

Idea: (this is just an example)


    Setup
    Selecting columns
       Selecting columns by taxonomy
       Selecting required columns 
    Filtering rows
       Filter by month of data collection
       Filter by year of data collection
       Filter by location (bounding box)
       Filter by ocean
       Filter by polygon
       Filter by species
    Reshaping
       Convert to long format

From what I checked in some tests (e.g. test-plot_record_by_month.R), the tests only verify the class of the plot created. Could you use vdiffr to improve the tests for the functions that create plots?
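As a rough illustration of the vdiffr suggestion, a snapshot test could look like the sketch below. The plotting function `plot_monthly_counts_demo()` and the toy data are hypothetical stand-ins invented for this example, not forcis code; only `vdiffr::expect_doppelganger()` and the testthat helpers are existing API.

```r
library(ggplot2)
library(testthat)

# Hypothetical stand-in for a package plotting function (not part of forcis):
# draws one bar per month from a data frame with columns `month` and `n`.
plot_monthly_counts_demo <- function(data) {
  ggplot(data, aes(x = factor(month), y = n)) +
    geom_col() +
    labs(x = "Month", y = "Number of records")
}

# Toy data invented for this example.
toy_counts <- data.frame(
  month = 1:12,
  n     = c(5, 8, 12, 20, 31, 40, 38, 29, 18, 11, 7, 4)
)

test_that("monthly counts plot matches the stored snapshot", {
  skip_if_not_installed("vdiffr")

  p <- plot_monthly_counts_demo(toy_counts)

  # Structural check, comparable to a class-only test.
  expect_true(inherits(p, "ggplot"))

  # Visual check: the plot is rendered to SVG and compared with the snapshot
  # recorded under tests/testthat/_snaps/ on a previous run.
  vdiffr::expect_doppelganger("monthly-counts-demo", p)
})
```

Inside a package test suite, the first run records the reference SVG and later runs fail if the rendered figure changes, which catches regressions that a class check cannot.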


ahasverus commented 3 weeks ago

Hi @beatrizmilz!

Thank you for this first round of comments. I will start looking into them in the next few days and come back to you as soon as possible.

beatrizmilz commented 3 weeks ago

Hi! I hope the comments are helpful. Most of them are suggestions, so feel free to work on the ones that make sense to you.

The comments about testing are the most important, since they are recommendations from the dev guide!

ahasverus commented 3 weeks ago

Hi @beatrizmilz!

No, all your comments are very useful. I have started to work on some of them.


A good practice to ensure that the user works with the latest version of the database might be to add this line at the beginning of the script:

download_forcis_db(version = NULL, ...)

Answer: Thanks for reporting the lack of clarity of this section. I have added a few sentences to clarify this paragraph in the vignette Database versions (commit 3ab5baa). I hope it's clearer.


  • (2) In the vignette Select, reshape, and filter data: it's not clear to me what the required columns mean. I saw the page for the function get_required_columns(), and there is a list of columns. But could it be possible to describe a bit more? Why are they required? Is that because these columns are the most important for basic analysis?

Answer: I have added a few sentences to explain why these columns are required in the vignette Select, reshape, and filter data and in the documentation of the function get_required_columns() (commit 06a053b).


  • (3) In the vignettes (for example Select, reshape, and filter data): The names of the sections are the names of functions. I think it's best to name sections with a description of the task. You can see some examples in vignettes of other packages in rOpenSci: magick, tabulapdf, etc.

Answer: Thanks for this suggestion. I have modified the section names in the vignettes Select, reshape, and filter data and Data visualization (commit 81db6ae).


  • (6) This is more of a question. On the package webpage, it says that the package has been developed for the Centre for the Synthesis and Analysis of Biodiversity. Does this mean that they are the funder? If so, they can be added as fnd in the authors list. This post can be useful to understand the three-letter codes used in the authors list in DESCRIPTION.

Answer: Indeed, the research group FORCIS has been funded by the FRB-CESAB. I have added it to the DESCRIPTION file with the role fnd (commit 6bd6e2b).
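For reference, a funder entry of this kind can be written with `utils::person()` as in the sketch below. The exact name and wording used in commit 6bd6e2b are not shown in this thread, so the values here are assumptions for illustration only.

```r
# Illustrative only: the name string is an assumption, not copied from the
# package's DESCRIPTION file.
funder <- utils::person(
  given = "FRB-CESAB",
  role  = "fnd"  # MARC relator code for "funder"
)

# In DESCRIPTION, an entry like this would be appended to the Authors@R
# vector alongside the existing person() calls.
print(funder)
```

Because a funder is usually an organisation rather than an individual, the entry typically carries only a name and the `fnd` role, without an email address or ORCID.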


Regarding your comments (4) and (5), I need to read more about the packages vdiffr and httptest to improve and implement unit tests for plotting functions and HTTP requests.

I will come back to you very soon. Thanks again.