`chopin`: Computation for Climate and Health research On Parallelized INfrastructure

sigmafelix commented 2 months ago

Submitting Author Name: Insang Song Submitting Author Github Handle: !--author1-->@sigmafelix@kyle-messier@beatrizmilz<!--end-editor-- Reviewers: @robitalec, @Aariq

Archive: TBD Version accepted: TBD Language: en

Paste the full DESCRIPTION file inside a code block below:

Package: chopin
Title: CHOPIN: Computation for Climate and Health research On Parallelized INfrastructure
Version: 0.6.2.20240423
Authors@R: c(
    person("Insang", "Song", , "geoissong@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0001-8732-3256")),
    person("Kyle", "Messier", role = c("aut", "ctb"),
           comment = c(ORCID = "0000-0001-9508-9623"))
  )
Description: It enables users with basic understanding on geospatial data
    and sf and terra functions to parallelize geospatial operations for
    geospatial exposure assessment modeling and covariate computation.
    Parallelization is done by dividing large datasets into sub-regions
    with regular grids and data's own hierarchy. A computation over the
    large number of raster files can be parallelized with a chopin
    function as well.
License: MIT + file LICENSE
URL: https://github.com/NIEHS/chopin
BugReports: https://github.com/NIEHS/chopin/issues
Depends: 
    R (>= 4.1)
Encoding: UTF-8
LazyData: true
LazyDataCompression: xz
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
Imports: 
    anticlust,
    dplyr (>= 1.1.0),
    exactextractr (>= 0.8.2),
    future.apply,
    igraph,
    methods,
    rlang (>= 0.4.9),
    sf (>= 1.0-10),
    stars (>= 0.6-0),
    terra (>= 1.7-18)
Suggests: 
    callr,
    covr,
    DiagrammeR,
    doFuture,
    doParallel,
    future,
    future.batchtools,
    future.callr,
    knitr,
    rmarkdown,
    spatstat.random,
    testthat (>= 3.0.0),
    units,
    withr
VignetteBuilder: knitr
Config/testthat/edition: 3

Scope

Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
- [ ] data retrieval
- [ ] data extraction
- [ ] data munging
- [ ] data deposition
- [ ] data validation and testing
- [ ] workflow automation
- [ ] version control
- [ ] citation management and bibliometrics
- [ ] scientific software wrappers
- [ ] field and lab reproducibility tools
- [ ] database software bindings
- [x] geospatial data
- [ ] text analysis
Explain how and why the package falls under these categories (briefly, 1-2 sentences): : chopin supports parallel processing for functions in popular spatial data manipulation packages sf and terra on a high-level parallelization framework future. This feature fits to the geospatial data category.
Who is the target audience and what are scientific applications of this package? : Our first target audience group is spatial epidemiologists and health geographers who want to calculating spatial covariates from spatial and spatiotemporal datasets including climate, transportation, demographics, topography, hydrography, and others. We expect that users are cognizant of basic geographic information system/science. The wider audience could take advantage of the flexibility of this package for expediting spatial operations.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category? : A selection of functions in terra (e.g., *app and predict) supports internal parallelization, where a single dataset is accepted. To the best of our knowledge, no packaged solution exists for parallelization of spatial operations where two datasets are involved. sprawl (GitHub-only; not maintained) partially overlaps this package's functionality in that it includes convenience functions connecting multiple basic spatial operations. Besides the R functions, a handful of teaching or demonstration materials briefly covered parallelization of spatial data applications ([1], [2], [3]).
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? : Not applicable
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted. : Presubmission inquiry of the previous version of this package -- #630 commented by @ldecicco-USGS
Explain reasons for any pkgcheck items which your package is unable to pass. : Coverage rate (99%) and installation size (25.7 MB; of which data (2.0 MB) and extdata (23.1 MB) exceeded recommended sizes) resulted in notes.

Technical checks

Confirm each of the following by checking the box.

[x] I have read the rOpenSci packaging guide.
[x] I have read the author guide and I expect to maintain this package for at least 2 years or to find a replacement.

This package:

[x] does not violate the Terms of Service of any service it interacts with.
[x] has a CRAN and OSI accepted license.
[x] contains a README with instructions for installing the development version.
[x] includes documentation with examples for all functions, created with roxygen2.
[x] contains a vignette with examples of its essential functions and uses.
[x] has a test suite.
[x] has continuous integration, including reporting of test coverage.

Publication options

[x] Do you intend for this package to go on CRAN?
[ ] Do you intend for this package to go on Bioconductor?
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options

- [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

[x] I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Thank you very much for your consideration!

ropensci-review-bot commented 2 months ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 2 months ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 months ago

Checks for chopin (v0.6.2.20240423)

git hash: 26153abb

:heavy_check_mark: Package name is available
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_multiplication_x: does not have a 'contributing' file.
:heavy_multiplication_x: The following function has no documented return value: [clip_ras_ext]
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_check_mark: Package coverage is 99.6%.
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE

1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 123| |internal |chopin | 33| |internal |graphics | 8| |internal |stats | 6| |internal |utils | 2| |imports |terra | 38| |imports |sf | 20| |imports |dplyr | 5| |imports |methods | 5| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |igraph | 2| |imports |anticlust | 1| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |doParallel | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (9), switch (9), lapply (7), c (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), data.frame (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), by (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)

terra

buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

graphics

points (8)

stats

dist (3), quantile (3)

dplyr

n (2), left_join (1), mutate (1), summarize (1)

methods

is (3), el (2)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

igraph

graph_from_edgelist (2)

utils

combn (2)

anticlust

balanced_clustering (1)

stars

st_as_stars (1)

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1163| 72.0| | |loc_vignettes | 652| 83.8| | |loc_tests | 1193| 89.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 32| 85.6| | |rel_whitespace_tests | 17| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 59| 69.7| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. `goodpractice` and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 8817187321|pages build and deployment |success |a8aaed | 33|2024-04-24 | | 8817133786|pkgdown |success |26153a | 69|2024-04-24 | | 8817133742|R-CMD-check |success |26153a | 149|2024-04-24 | | 8817133730|test-coverage-local |success |26153a | 17|2024-04-24 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following notes: 1. checking installed package size ... NOTE installed size is 25.6Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb 2. checking re-building of vignette outputs ... NOTE Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-load-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 1 potential issues: message | number of times --- | --- Lines should not be more than 80 characters. | 1

Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.3.13 | |pkgcheck |0.1.2.21 |

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

sigmafelix commented 2 months ago

@ropensci-review-bot check package

ropensci-review-bot commented 2 months ago

Thanks, about to send the query.

ropensci-review-bot commented 2 months ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 months ago

Checks for chopin (v0.6.2.20240423)

git hash: c0e53b6e

:heavy_check_mark: Package name is available
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_check_mark: Package coverage is 99.6%.
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.

Package License: MIT + file LICENSE

1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 123| |internal |chopin | 33| |internal |graphics | 8| |internal |stats | 6| |internal |utils | 2| |imports |terra | 38| |imports |sf | 20| |imports |dplyr | 5| |imports |methods | 5| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |igraph | 2| |imports |anticlust | 1| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |doParallel | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (9), switch (9), lapply (7), c (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), data.frame (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), by (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)

terra

buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

graphics

points (8)

stats

dist (3), quantile (3)

dplyr

n (2), left_join (1), mutate (1), summarize (1)

methods

is (3), el (2)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

igraph

graph_from_edgelist (2)

utils

combn (2)

anticlust

balanced_clustering (1)

stars

st_as_stars (1)

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1163| 72.0| | |loc_vignettes | 652| 83.8| | |loc_tests | 1193| 89.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 32| 85.6| | |rel_whitespace_tests | 17| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 59| 69.7| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. `goodpractice` and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 8818976584|pages build and deployment |success |1b88a2 | 36|2024-04-24 | | 8818927586|pkgdown |success |f8c659 | 71|2024-04-24 | | 8818927598|R-CMD-check |success |f8c659 | 151|2024-04-24 | | 8818927585|test-coverage-local |success |f8c659 | 19|2024-04-24 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following notes: 1. checking installed package size ... NOTE installed size is 25.6Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb 2. checking re-building of vignette outputs ... NOTE Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-load-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 1 potential issues: message | number of times --- | --- Lines should not be more than 80 characters. | 1

Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.3.13 | |pkgcheck |0.1.2.21 |

Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

sigmafelix commented 2 months ago

@ropensci-review-bot help

ropensci-review-bot commented 2 months ago

Hello @sigmafelix, here are the things you can ask me to do:


# Add an author's response info to the ROpenSci logs
@ropensci-review-bot submit response <AUTHOR_RESPONSE_URL>

# List all available commands
@ropensci-review-bot help

# Show our Code of Conduct
@ropensci-review-bot code of conduct

# Invite the author of a package to the corresponding rOpenSci team. This command should be issued by the author of the package.
@ropensci-review-bot invite me to ropensci/package-name

# Adds package's repo to the rOpenSci team. This command should be issued after approval and transfer of the package.
@ropensci-review-bot finalize transfer of package-name

# Various package checks
@ropensci-review-bot check package

# Checks srr documentation for stats packages
@ropensci-review-bot check srr

sigmafelix commented 2 months ago

@ropensci-review-bot submit response https://github.com/ropensci/software-review/issues/638#issuecomment-2075277239

ropensci-review-bot commented 2 months ago

Couldn't find entry for chopin in the packages log

ldecicco-USGS commented 2 months ago

@ropensci-review-bot assign @beatrizmilz as editor

ropensci-review-bot commented 2 months ago

Assigned! @beatrizmilz is now the editor

beatrizmilz commented 1 month ago

Hello @sigmafelix ! Thank you for your submission. I am the editor who will be working with you on this package.

Editor checks:

[x] Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,
- [x] Is the case for the package well made?
- [x] Is the reference index page clear (grouped by topic if necessary)?
- [ ] Are vignettes readable, sufficiently detailed and not just perfunctory?
[x] Fit: The package meets criteria for fit and overlap.
[x] Installation instructions: Are installation instructions clear enough for human users?
[x] Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?
[x] Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
[x] License: The package has a CRAN or OSI accepted license.
[x] Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?

Editor comments

1. Seems like the package was created in the organization called Spatiotemporal-Exposures-and-Toxicology and then was transfered to NIEHS, and some of the links in the documentation are still pointing to the old organization. Can you update them to point to the new organization, please? This might help: https://github.com/search?q=repo%3ANIEHS%2Fchopin%20Spatiotemporal-Exposures-and-Toxicology&type=code
1. The vignette Extracting Weather/Climate Geospatial Data with chopin is well detailed and informative. However, the vignette Generate computational grids can be more detailed. It would be great if you could improve that vignette with more text explaining the code and the steps.
1. I ran devtools::test() and all the tests passed. However, I noticed that there is a lot of messages from dplyr joins, because aparently there are some joins that are being done without declaring which are the key columns. This is not a problem, but it would be nice if you could add the by argument to the dplyr::**_join() functions in the code. Here is part of the messages that I got:

✔ |         41 | processing [8.1s]                                             
⠏ |          0 | tests                                                         Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
⠴ |          6 | tests                                                         Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
⠙ |         12 | tests                                                         Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id, FIPS)`
Joining with `by = join_by(from_id, to_id)`
✔ |         13 | tests

1. The package has >99% of code covered by tests. 🎉

Can you please address these comments? Thank you!

sigmafelix commented 1 month ago

@beatrizmilz I appreciate your time to review our package. I will address all your points in the revision on top of fixing some bugs we identified in the meantime and respond to you soon. Thank you.

sigmafelix commented 1 month ago

@beatrizmilz Thank you for your patience. I think I addressed all comments in the current version available in main. Please let me know if you have additional comments.

beatrizmilz commented 1 month ago

Thank you @sigmafelix !

beatrizmilz commented 1 month ago

@ropensci-review-bot seeking reviewers

ropensci-review-bot commented 1 month ago

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/638_status.svg)](https://github.com/ropensci/software-review/issues/638)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

beatrizmilz commented 1 month ago

@ropensci-review-bot check package

ropensci-review-bot commented 1 month ago

Thanks, about to send the query.

ropensci-review-bot commented 1 month ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 1 month ago

Checks for chopin (v0.6.3.20240515)

git hash: 094a6460

:heavy_check_mark: Package name is available
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_check_mark: Package coverage is 99.7%.
:heavy_multiplication_x: R CMD check found 1 error.
:heavy_check_mark: R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE

1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 133| |internal |chopin | 33| |internal |graphics | 8| |internal |stats | 6| |internal |utils | 2| |imports |terra | 38| |imports |sf | 20| |imports |dplyr | 10| |imports |methods | 5| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |igraph | 2| |imports |anticlust | 1| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |doParallel | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (9), switch (9), c (7), lapply (7), by (6), data.frame (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)

terra

buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

dplyr

left_join (6), n (2), mutate (1), summarize (1)

graphics

points (8)

stats

dist (3), quantile (3)

methods

is (3), el (2)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

igraph

graph_from_edgelist (2)

utils

combn (2)

anticlust

balanced_clustering (1)

stars

st_as_stars (1)

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1176| 72.3| | |loc_vignettes | 831| 88.0| | |loc_tests | 1289| 90.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 33| 90.4| | |rel_whitespace_tests | 16| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 61| 70.3| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. `goodpractice` and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 9116757038|pages build and deployment |success |9e3af8 | 43|2024-05-16 | | 9116716549|pkgdown |success |094a64 | 77|2024-05-16 | | 9116716555|R-CMD-check |success |094a64 | 157|2024-05-16 | | 9116716552|test-coverage-local |success |094a64 | 25|2024-05-16 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following error: 1. checking re-building of vignette outputs ... ERROR Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-gen-points-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following note: 1. checking installed package size ... NOTE installed size is 25.8Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 2 potential issues: message | number of times --- | --- Avoid using sapply, consider vapply instead, that's type safe | 1 Lines should not be more than 80 characters. This line is 81 characters. | 1

Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.5.2 | |pkgcheck |0.1.2.34 |

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

beatrizmilz commented 1 month ago

Hi @sigmafelix ! There was one error on the R CMD check: https://github.com/ropensci/software-review/issues/638#issuecomment-2125717203 . Could you please verify that, before we proceed to add the reviewers?

Thank you!

sigmafelix commented 1 month ago

Hi @sigmafelix ! There was one error on the R CMD check: #638 (comment) . Could you please verify that, before we proceed to add the reviewers?

Thank you!

@beatrizmilz Thank you for the update. In the most recent version of chopin, I think the R CMD CHECK error was fixed. Thank you!

beatrizmilz commented 1 month ago

@ropensci-review-bot check package

ropensci-review-bot commented 1 month ago

Thanks, about to send the query.

ropensci-review-bot commented 1 month ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 1 month ago

Checks for chopin (v0.7.0.20240525)

git hash: 052c8627

:heavy_check_mark: Package name is available
:heavy_check_mark: has a 'codemeta.json' file.
:heavy_check_mark: has a 'contributing' file.
:heavy_check_mark: uses 'roxygen2'.
:heavy_check_mark: 'DESCRIPTION' has a URL field.
:heavy_check_mark: 'DESCRIPTION' has a BugReports field.
:heavy_check_mark: Package has at least one HTML vignette
:heavy_check_mark: All functions have examples.
:heavy_check_mark: Package has continuous integration checks.
:heavy_check_mark: Package coverage is 98.1%.
:heavy_check_mark: R CMD check found no errors.
:heavy_check_mark: R CMD check found no warnings.

Package License: MIT + file LICENSE

1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 144| |internal |chopin | 32| |internal |graphics | 8| |internal |stats | 5| |internal |utils | 2| |imports |terra | 44| |imports |sf | 20| |imports |dplyr | 10| |imports |methods | 6| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |collapse | 3| |imports |anticlust | 2| |imports |igraph | 2| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (10), switch (9), c (7), lapply (7), by (6), unlist (6), nrow (5), which (5), data.frame (4), if (4), names (4), split (4), mapply (3), try (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), expand.grid (2), grep (2), grepl (2), length (2), log10 (2), mode (2), paste (2), round (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), unname (2), args (1), as.data.frame (1), as.list (1), as.logical (1), as.numeric (1), as.vector (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), matrix (1), ncol (1), paste0 (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), startsWith (1), strsplit (1), suppressWarnings (1), table (1), vector (1)

terra

buffer (9), crop (8), crs (6), distance (4), ext (4), vect (4), nearby (2), project (2), rast (2), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (14), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), check_subject (1), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

dplyr

left_join (6), n (2), mutate (1), summarize (1)

graphics

points (8)

methods

is (4), el (2)

stats

quantile (3), dist (2)

collapse

rowbind (3)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

anticlust

balanced_clustering (2)

igraph

graph_from_edgelist (2)

utils

combn (2)

stars

st_as_stars (1)

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 11 imported packages - 34 exported functions (median 17 lines of code) - 35 non-exported functions in R (median 32 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 8| 88.2| | |loc_R | 1281| 74.5| | |loc_vignettes | 744| 86.2| | |loc_tests | 1449| 91.3| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 69| 66.3| | |n_fns_r_exported | 34| 81.0| | |n_fns_r_not_exported | 35| 57.4| | |n_fns_per_file_r | 7| 78.2| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 18| 42.0| | |loc_per_fn_r_not_exp | 32| 80.3| | |rel_whitespace_R | 11| 63.3| | |rel_whitespace_vignettes | 34| 89.0| | |rel_whitespace_tests | 15| 85.9| | |doclines_per_fn_exp | 36| 45.0| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 68| 72.4| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. `goodpractice` and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 9308315230|pages build and deployment |success |764f4b | 53|2024-05-30 | | 9308277371|pkgdown |success |052c86 | 88|2024-05-30 | | 9308277362|R-CMD-check |success |052c86 | 168|2024-05-30 | | 9308277364|test-coverage-local |success |052c86 | 36|2024-05-30 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following note: 1. checking installed package size ... NOTE installed size is 25.7Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 98.11 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) The following function have cyclocomplexity >= 15: function | cyclocomplexity --- | --- par_grid | 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 7 potential issues: message | number of times --- | --- Avoid library() and require() calls in packages | 7

Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.5.2 | |pkgcheck |0.1.2.41 |

Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

beatrizmilz commented 1 month ago

Hi @sigmafelix ! There was one error on the R CMD check: #638 (comment) . Could you please verify that, before we proceed to add the reviewers? Thank you!

@beatrizmilz Thank you for the update. In the most recent version of chopin, I think the R CMD CHECK error was fixed. Thank you!

Great!

beatrizmilz commented 1 month ago

@ropensci-review-bot assign @robitalec as reviewer

ropensci-review-bot commented 1 month ago

@robitalec added to the reviewers list. Review due date is 2024-06-20. Thanks @robitalec for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 1 month ago

@robitalec: If you haven't done so, please fill this form for us to update our reviewers records.

beatrizmilz commented 1 week ago

@ropensci-review-bot assign @Aariq as reviewer

ropensci-review-bot commented 1 week ago

@Aariq added to the reviewers list. Review due date is 2024-07-08. Thanks @Aariq for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 1 week ago

@Aariq: If you haven't done so, please fill this form for us to update our reviewers records.

ropensci-review-bot commented 1 week ago

:calendar: @robitalec you have 2 days left before the due date for your review (2024-06-20).

robitalec commented 1 week ago

Package Review

[X] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

Note, I'm placing checkboxes below each bullet, to highlight the places where I think improvements would be beneficial. I won't checkbox all the things that are already completed to keep it tidier.

The package includes all the following forms of documentation:

[X] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[ ] Installation instructions: for the development version of package and any non-standard dependencies in README
- [ ] Consider listing system requirements
- [ ] Consider linking to dependency eg. {sf} install instructions
[ ] Vignette(s) demonstrating major functionality that runs successfully locally
- [X] All vignettes run successfully locally
- [ ] Consider revising, restructuring (details below)
[ ] Function Documentation: for all exported functions in R help
- [ ] Missing top-level documentation
[ ] Examples for all exported functions in R Help that run successfully locally
- [ ] Missing example for extract_at
[ ] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
- [ ] There are no contributing guidelines in the README or CONTRIBUTING
- [ ] Consider adding an .Rproj file for collaborators

Functionality

[X] Installation: Installation succeeds as documented.
[X] Functionality: Any functional claims of the software been confirmed.
[X] Performance: Any performance claims of the software been confirmed.
[ ] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- [ ] Consider improving descriptions and commenting of tests
- [ ] Consider replacing tests expect_no_error with more specific expectations
[ ] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines
- [ ] See suggestions below
- [ ] See details in other sections

Review Comments

Thank you for inviting me to review @beatrizmilz and, as always, a huge thank you to the broader rOpenSci community for all of your incredible contributions, inclusion and support.

Thank you @sigmafelix and @kyle-messier for this interesting package with functionality I haven't seen elsewhere that should be a great tool for a wide userbase.

Before I start - I really link the name because it evokes chopping up spatial objects into chunks and -- music! Great choice.

My first main and most important comment is that, at the moment, I find {chopin} hard to get started using.

I think the pitch is currently too narrow (also mentioned here), which might turn away folks who don't identify with "health" or "climate" research. I think there is potential for many users across disciplines who have large spatial data (everything is spatial!) to use {chopin} to perform analyses in more manageable chunks, allowing folks to improve the efficiency of their analyses and extend the potential of their machines.

When I open the README, one of the first things I see is notes on data restrictions and z dimensions and spherical geometry. I think this should come later in a caveats section or FAQ vignette.

Next, we get to "Basic design" which reads more like an overview of functions but also gets into which packages underpin {chopin} functionality, package versions, and registering parallel workers. I think this is the section that really needs to give users a concise and clear overview of the functions. This can and should be reused in other places to follow the recommended principle of multiple points of entry, where users should be anticipated to first encounter {chopin} across every source of information (README, documentation, vignettes) and given a clear introduction.

"We suggest two flowcharts to help which function to use for parallel processing below" - note there is only one flowchart visible below and it isn't linked to in this sentence
extract_at_poly missing description

It seems there are currently three main families of functions exported and number of helper or internal functions:

par_*
- par_grid and others for running functions in parallel
- par_make_* for making grids
summarize_*
extract_at_*

These are the kinds of questions I'm left as an early user with after reading the overview

Should par_make_* functions be split into a new grid_* family of functions?
What is the difference to a user between extract_at_buffer with a kernel weight and summary_aw an area-weighted summary? What is the difference between par_make_grid and par_make_gridset?
If par_group_grid is an extension of par_group_balanced for padded grids, should it be renamed something like par_group_balanced_padded? Or should padded = TRUE be added as an argument?
Are regular R users going to find useful functions here or is {chopin} just for HPC users?

From the Description, current version

Package: chopin
Title: CHOPIN: Computation for Climate and Health research On Parallelized INfrastructure
Description: It enables users with basic understanding on geospatial data
    and sf and terra functions to parallelize geospatial operations for
    geospatial exposure assessment modeling and covariate computation.
    Parallelization is done by dividing large datasets into sub-regions
    with regular grids and data's own hierarchy. A computation over the
    large number of raster files can be parallelized with a chopin
    function as well.

Suggestions:

I think there's a lot of useful information in the flowchart, how can we incorporate more obviously?
No need to repeat the package name, capitalized, in the title
I think the acronym is neat, but the name stands on it's own. Broadening the scope would help users understand how they can use {chopin} and this starts at the title
I don't think it's necessary to state the expectations of users in the description
See more examples here

Package: chopin
Title: Parallelize Geospatial Operations Over Sub-Regions, Regular Grids and Lists of Rasters
Description: It enables users to parallelize geospatial operations over
    sub-regions, regular grids and lists of rasters. Parallelization 
    is done by dividing large datasets into sub-regions
    with regular grids and data's own hierarchy. A computation over the
    large number of raster files can be parallelized with a chopin
    function as well. (TODO: maybe mention {future}, {exactextractr}, {terra}, {sf})

Another reason that I might be finding {chopin} hard to get started using is that it seems to obscure lots of the regular terra/sf steps I would typically do myself as an R user. For example, the extract_at_buffer wraps the exact_extract function from {exactextractr} but before calling it, it runs terra::buffer, manages projections, crops, calculates extents, etc. This is also the case in functions like reproject_std and vect_valid_repair which reproject/repair objects using functions terra or sf based on the object type.

This also shows itself where the authors manually switch within functions depending on terra or sf input. For example, using dep_check and testing if == 'terra' else if == 'sf'. This feels like an opportunity to take advantage of S3 methods. Read more in Advanced R: S3 and R Packages: 11.9. An example applied is in {tmap}'s internal function tmapShape. Obviously this would require a fair amount of refactoring - I just wanted to highlight an alternative approach that might be more flexible to adding new supported data types in the future.

This is definitely an open ended comment - I just wonder where the point is for users where hiding away managing regular steps in a spatial analysis (like projections, extents, clipping, etc) actually results in more confusion than it helps.

My last general comment is - how do you envision {chopin} fitting into the broader High Performance Computing landscape in R? One thing I'm personally really interested is how it might interface with {targets}. Looks like maybe I'm saved from commenting too much on this, as my co reviewer @Aariq is one of the authors on the new {geotargets} package along with @njtierney and @brownag (thank you!). I have been eagerly following their progress as a regular {targets} user that has encountered the challenges of including spatial data in {targets} workflows. I wonder how these tools can be brought together or, alternatively, how the authors might communicate their differences and where a user might choose one over the other.

Below: detailed comments following sections in Documentation and Functionality bullets from the top.

Documentation

System requirements

Suggestions:

Consider listing system requirements (eg. netcdf) using recommendations here
See example from {magick}

Vignettes

Issues:

An overview of the vignettes and README material:

README
- "Examples will navigate par_grid, par_hierarchy, and par_multirasters functions in chopin to parallelize geospatial operations."
  - par_grid: parallelize over artificial grid polygons
  - chopin::par_hierarchy: parallelize geospatial computations using intrinsic data hierarchy
  - par_multirasters: parallelize over multiple rasters
  - Parallelization of a generic geospatial operation
Good practice of using chopin in HPC
- Assumptions
- Basic workflow
Generate computational grids
- Computational grids
- Types of computational grids and their generation
- Visualize computational grids
Extracting Weather/Climate Geospatial Data with chopin
- Introduction
- TerraClimate
- PRISM dataset
- Scaled up examples
- Discussion: which strategy is better? stacked vs file-based parallelization

It is unclear to a user which vignette is the introductory one. The rOpenSci Packaging Guide suggests "The general vignette should present a series of examples progressing in complexity from basic to advanced usage.". At the moment, this looks most like the "Extracting Weather/Climate Geospatial Data with chopin" vignette but the title is fairly specific and it is listed last on the website.

Consider splitting information about HPC or hardware into its own vignette. Also consider moving system.time benchmarking to a separate vignette? This could also potentially include tips for saving time and memory, and/or hardware recommendations.

Consider incorporating a section in the introductory vignette with a caveats section

why is parallelization slower
comparing single thread vs multi thread calculations
why we turn off spherical geometry

Consider revising the vignettes and README to develop bullet points into paragraphs. Some shorter lists to compare options or list steps can be helpful but, generally, I find a full page of bullets hard to process.

Images on the website are giving 404 errors

Generate computational grids

Package issues

Extracting Weather/Climate Geospatial Data with chopin
- pkgs <- c("chopin", "terra", "stars", "sf", "future", "doFuture", "dplyr", "parallelly", "tigris", "tictoc") invisible(sapply(pkgs, library, character.only = TRUE, quietly = TRUE))
  - packages
  - consider loading packages in way that most users will be more familiar with eg. library(chopin)

Using :: or both :: and library in vignettes (users are likely more familiar with library())

Generate computational grids
Extracting Weather/Climate Geospatial Data with chopin

Suggestions:

Missing link to external functions / packages

Generate computational grids
- {anticlust}
Good practice of using chopin in HPC
- {future.apply}
Extracting Weather/Climate Geospatial Data with chopin
- {tigris}
- {terra}
- ...

Function documentation

Issues:

Missing top-level documentation

See usethis::use_package_doc
Consider using this space to translate the flowchart into a written overview of the main functions with inputs and outputs

Suggestions:

Missing links to {chopin} functions, external functions, other packages

anticlust par_group_balanced
sf (throughout)
terra (throughout)
par_group_balanced from par_group_grid, ...

par_group_balanced

Users may benefit from a definition of an "inexhaustive grid"

Missing documentation of argument defaults

clip_ras_ext nqsegs = 180L
extract_at_poly func = mean
...

Unclear functions

Given current documentation, it is unclear why/when/where a user use eg. the following function
- https://niehs.github.io/chopin/reference/par_fallback.html

Use of ...

Consider explicitly demonstrating and describing this functionality in documentation and vignette
- Eg. par_grid

Examples

Issues:

Missing example

extract_at (reuse example see roxygen2 vignette)

Suggestions:

Examples not run with if (FALSE) or dontrun. Consider a smaller toy example?

par_grid
par_hierarchy
par_merge_grid
par_multirasters
vect_valid_repair

Using :: or both :: and library in examples

clip_vec_ext
clip_ras_ext
datamod
dep_check
dep_switch
...

Community guidelines

Issues:

There are no contributing guidelines in the README or CONTRIBUTING

Functionality

Automated tests

Issues:

There is good test coverage in the package but I am a bit concerned about the number of times the authors use the expect_no_error function. When I think of effective package testing, I think of both clarity for the user reading the tests and in the authors setting up expectations for how the package may fail (anticipating your own errors, warnings or messages). What does the test expect_no_error communicate to the user or set up as expectations?

Here are some ideas for expanding or adjusting the tests:

inline comments in the test scripts to explain to readers/our future selves as authors (example)
narrow the scope of each test_that() call and expand on the description
- from [https://testthat.r-lib.org/reference/test_that.html], "A test encapsulates a series of expectations about a small, self-contained unit of functionality"
- comparing in {chopin} eg. test-gridding.R (L116) to eg.{rnaturalearth} test_ne_states.R
  - we see {chopin} have one blanket description "grid split is well done" to contains all the tests
  - the {rnaturalearth} test has four smaller, more digestible test_that chunks that run a series of related tests
  - as a reader, I know what I'm looking at and find it easy to look for a specific test
  - as a potential contributor, I know which test might be related to my bug, or I have a good example of how to structure a potential new test in a pull request
more examples: dplyr test-join-cols.R, purrr test-pluck.R

library(data.table)

paths <- dir('../chopin/tests/testthat', full.names = TRUE)
DT <- data.table(path = paths)

counts <- DT[, tstrsplit(grep('testthat::', readLines(path), value = TRUE), c('\\('), keep = 1), by = path]

counts[, .N, V1][order(-N)]

Test function	N
testthat::expect_no_error	59
testthat::expect_error	47
testthat::test_that	30
testthat::expect_equal	20
testthat::expect_true	16
testthat::expect_s4_class	12
testthat::expect_s3_class	12
testthat::expect_no_error	12
testthat::expect_equal	10
testthat::expect_message	7
testthat::expect_error	6
testthat::expect_s3_class	5
testthat::expect_true	4
testthat::expect_condition	2
testthat::expect_warning	2
testthat::expect_warning	2
testthat::expect_no_warning	1

counts[, .N, .(path, V1)][order(path, -N)]

path	Test function	N
test-any_class_args.R	testthat::expect_true	5
test-any_class_args.R	testthat::test_that	1
test-any_class_args.R	testthat::expect_s4_class	1
test-check.R	testthat::expect_equal	10
test-check.R	testthat::test_that	9
test-check.R	testthat::expect_no_error	8
test-check.R	testthat::expect_error	8
test-check.R	testthat::expect_equal	4
test-check.R	testthat::expect_s3_class	2
test-check.R	testthat::expect_s4_class	2
test-check.R	testthat::expect_error	1
test-distribute_suites.R	testthat::expect_no_error	9
test-distribute_suites.R	testthat::expect_no_error	8
test-distribute_suites.R	testthat::expect_s3_class	5
test-distribute_suites.R	testthat::test_that	4
test-distribute_suites.R	testthat::expect_s3_class	4
test-distribute_suites.R	testthat::expect_true	4
test-distribute_suites.R	testthat::expect_error	2
test-distribute_suites.R	testthat::expect_true	2
test-distribute_suites.R	testthat::expect_error	2
test-distribute_suites.R	testthat::expect_equal	2
test-distribute_suites.R	testthat::expect_equal	2
test-distribute_suites.R	testthat::expect_condition	2
test-distribute_suites.R	testthat::expect_s4_class	1
test-gridding.R	testthat::expect_error	18
test-gridding.R	testthat::expect_no_error	17
test-gridding.R	testthat::expect_message	7
test-gridding.R	testthat::test_that	6
test-gridding.R	testthat::expect_true	6
test-gridding.R	testthat::expect_s4_class	6
test-gridding.R	testthat::expect_equal	5
test-gridding.R	testthat::expect_warning	2
test-gridding.R	testthat::expect_warning	1
test-gridding.R	testthat::expect_s3_class	1
test-gridding.R	testthat::expect_no_warning	1
test-par_fallback.R	testthat::test_that	1
test-par_fallback.R	testthat::expect_no_error	1
test-par_fallback.R	testthat::expect_error	1
test-preprocessing.R	testthat::expect_equal	4
test-preprocessing.R	testthat::test_that	2
test-preprocessing.R	testthat::expect_error	2
test-preprocessing.R	testthat::expect_equal	2
test-preprocessing.R	testthat::expect_s3_class	1
test-preprocessing.R	testthat::expect_s4_class	1
test-processing.R	testthat::expect_no_error	25
test-processing.R	testthat::expect_error	19
test-processing.R	testthat::test_that	7
test-processing.R	testthat::expect_s3_class	4
test-processing.R	testthat::expect_no_error	3
test-processing.R	testthat::expect_true	3
test-processing.R	testthat::expect_s4_class	1
test-processing.R	testthat::expect_equal	1
test-processing.R	testthat::expect_warning	1

Packaging guidelines

Issues:

URLs

Found some URLs moved or broken

> urlchecker::url_check()
! Warning: vignettes/v02_climate_examples.Rmd:135:107 Moved
Fourteen variables are available in TerraClimate data. The table below is from [the TerraClimate website](http://www.climatologylab.org/terraclimate-variables.html).
                                                                                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                                                                          https://www.climatologylab.org/terraclimate-variables.html
✖ Error: man/par_make_gridset.Rd:45:11 Error: CRAN URL not in canonical form
See \href{https://cran.r-project.org/web/packages/units/vignettes/measurement_units_in_R.html}{link}
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: man/summarize_sedc.Rd:86:13 403: Forbidden
\item \href{https://dx.doi.org/10.1021/es203152a}{Messier KP, Akita Y, Serre ML. (2012). Integrating Address Geocoding, Land Use Regression, and Spatiotemporal Geostatistical Estimation for Groundwater Tetrachloroethylene. \emph{Environmental Science & Technology} 46(5), 2772-2780. }
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: README.md:656:16 404: Not Found
    [vignette](https://kyle-messier.github.io/chopin/articles/v02_climate_examples.html)
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: man/summarize_sedc.Rd:87:32 Error: SSL certificate problem: unable to get local issuer certificate
\item Wiesner C. (n.d.). \href{https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html}{Euclidean Sum of Exponentially Decaying Contributions Tutorial}
                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: README.md:74:24 Error: SSL certificate problem: unable to get local issuer certificate
        contributions](https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

README

avoid #### Last edited: March 22, 2024 risk of being outdated, users can see last modification date on GitHub
increase the font size in the diagram?
add citation information
installation instructions
- could use details for installing dependencies, system requirements
- no need to list devtools install if listing remotes since devtools just wraps remotes
- remove commented out install.packages() until {chopin} is on CRAN
git clone https://github.com/Spatiotemporal-Exposures-and-Toxicology/chopin
- repo moved
source(\"README_run.r\")
- DiagrammeR needs to be installed, consider adding to the rlang check installed line

Functions

par_cut_coords
- Could consider using sf::st_zm(drop = TRUE)
extract_at_poly
- Could consider using st_crs(polys) == st_crs(surf) instead of assuming the same coordinate reference systems are used

Suggestions:

codemetar

consider some of these ways to keep it up to date

print()

is_bbox_within_reference uses print(). Consider using message() instead

Code style

consider running styler::style_pkg
pre commit hook https://styler.r-lib.org/articles/third-party-integrations.html
see this branch in my fork for what the results would look like: https://github.com/robitalec/chopin/tree/fix/styler

Citation file

consider adding a CITATION file with usethis::use_citation(), see here

README

Consider mentioning related packages or alternative approaches

Functions

Are these internal functions?
- datamod
- dep_check
- par_def_q
- par_fallback

sigmafelix commented 1 week ago

Thank you for the thorough review @robitalec! I am going to start from the points on which I can work immediately, and wait for the other review from @Aariq.

To your comment on the benefits/advantages of hiding necessary tasks for spatial data handling, I agree that this practice would confuse experienced users. In the beginning of the function design, we wanted to make parallelization easier for users who are less experienced with spatial data such that would have difficulty with obtaining variables from source data. Thus, most of intermediate steps between data input and export the final table were obscured by design. For details, in the current version of chopin, a vector input is reprojected to a raster input's coordinate system in favor of keeping raster cell values and layouts as they are. For experienced users, I think I could elaborate on the design choices in an introductory vignette for the package. Thank you.

robitalec commented 1 week ago

Great, that sounds good @sigmafelix. Please let me know if there are any comments in my review that aren't clear.

Aariq commented 6 days ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Briefly describe any working relationship you have (had) with the package authors.
☒ As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

☒ A statement of need: clearly stating problems the software is designed to solve and its target audience in README
☒ Installation instructions: for the development version of package and any non-standard dependencies in README
☐ Vignette(s): demonstrating major functionality that runs successfully locally
- v02_climate_examples.Rmd is not reproducible locally for me
☒ Function Documentation: for all exported functions
☒ Examples: (that run successfully locally) for all exported functions
☐ Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
- There’s a code of conduct, but no contribution guidelines

Functionality

☒ Installation: Installation succeeds as documented.
☒ Functionality: Any functional claims of the software have been confirmed.
☒ Performance: Any performance claims of the software have been confirmed.
- Although, it would be good to see more examples where the overhead cost of parallelizaiton is worth it
☒ Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
☒ Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 3.5

☒ Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer (“rev” role) in the package DESCRIPTION file.

Review Comments

The main functionality of this package is really nice. I’m especially impressed with the thought that went into how to break rasters into tiles for optimal parallelization. I’m looking forward to using this package (or at least adapting its ideas) in my own workflows. I think there is some overlap between chopin and geotargets, in that you can do parallel operations via “branching” in targets workflows, but I like that chopin doesn’t require knowing targets (which can be a bit of a paradigm shift in how one writes R code). I also think the functions for creating grids to iterate over could work well with branching in a targets workflow, and I’m looking forward to exploring that. A few things stand out to me as areas for potential improvement: First, it seems that a major limitation to the parallel processing workflow is that it will not work with plan(multisession) or with plan(multicore) on RStudio or Windows. When I try plan(multisession) on macOS with RStudio, I get the error Error in eval(expr, p) : object 'ncpoints_tr' not found or subscript out of bounds, I assume because the raster is not passed to the parallel processes or not wrap()ed and unwrap()ed if it is. That leaves the command line on macOS or linux as the only ways to run this workflow currently. I’ll be honest—that’s enough of a barrier for me that I’m unlikely to use this package. I’m curious if there’s a plan to get plan(multisession) working or if there’s a reason it can’t work. I discovered this limitation on my own running the examples in the readme, but it might be good to document this limitation in the readme and/or in the @note for par_grid(), par_heirarchy(), etc. Second, the majority (or at least very many) of the examples showing parallel workflows are actually slower than the sequential versions. I want to better understand when the parallel processing really helps—either through examples that actually show a time savings, or through more discussion in the documentation of when it is useful (ideally with specifics like raster extent and resolution, number of points, system specs, number of grids, etc.). Third, this package includes a lot of helper functions. Some of these seem like they maybe shouldn’t be exported because they obfuscate more than abstract. Others seem useful, but also like they open users to potentially writing confusing code. I’ve included some more detailed notes on this below.

README.md

GitHub can process and display mermaid charts in README.md, so rather than displaying an image of your mermaid graph, you could consider just including a mermaid chunk with ```mermaid—that gives viewers zoom and pan controls. I’m not sure how pkgdown would handle it though.

I think the “To run the examples” section is probably not necessary.

The section on “why parallelization is slower” points the reader to the climate example vignette. However, most of the examples there also seem to be faster as single threaded operations (just based on outputs from tictoc or system.time(), haven’t been able to test myself).

Documentation

The ... arg is described as “placeholder” in the documentation for extract_at(), but seems like it should say “additional arguments passed to extract_at_buffer() or extract_at_poly()”. The ... argument for extract_at_buffer() and extract_at_poly() might be more accurately documented as “currently unused”.

Also regarding ..., I’m not sure it’s good advice to recommend users use position based arguments with ... as in ?par_grid. It would be safer to recommend using named arguments—e.g. for fun_dist = terra::extract users should pass x and y arguments.

In summarize_sedc() I would love a little more detail about what exactly SEDC is in the description.

Vignettes

Images in the v01_par_make_gridset vignette do not show up on the pkgdown website.
Consider precomputing the climate example vignette rather than just using unexecuted code chunks—this may make it easier to keep the vignette up to date as the package evolves. The lwgeom package appears to be required for that vignette, the saveRDS() code fails if the input/ directory doesn’t exist, and there doesn’t appear to be any code or instructions for downloading the TerraClimate data, so I can’t reproduce the code in this vignette easily.

Functions

Just a few notes on things that stand out to me:

I think a lot of the helper functions might be viewed as less helpful by some (which is fine, they don’t have to use them) because they are not explicit about what objects they take or return. While the package authors likely viewed this as a convenience, others may view these helpers as danger. E.g., let’s say you switch your code to start with a sf object instead of a terra object—now dep_switch() does something different than it did before even though that part of the code has not changed! A “safer” alternative would be st_as_sf() which does the same thing whether run on a SpatVector or an sf object. Similarly, I could imagine a wrapper around terra::vect(), that when given a SpatVector does nothing, would be a “safer” alternative when expecting a SpatVector output.
any_class_args() I don’t quite understand the problem this is attempting to solve, but it feels like something that should not be exported. Also, why not use inherits() here in place of grepl(obj, class)?
dep_check() Another place where inherits() might be cleaner. E.g. I think this does the same: r dep_check <- function(input) { classes <- c("sf", "stars", "SpatVector", "SpatRaster", "SpatVectorProxy") if (!inherits(input, classes)) stop("Input should be one of sf or Spat* object.\n") class <- classes[inherits(input, classes, which = TRUE)] ifelse(class %in% c("SpatVector", "SpatRaster", "SpatVectorProxy"), "terra", "sf") }
dep_switch() is a nice shortcut, but I feel it makes code a little harder to read since it’s not immediately clear whether it is translating from sf/stars to terra or the other way around.
get_clip_ext(): a little confusing/surprising that the description is “setting the clipping extent” (does it get or set?). This is another example of a helper that could potentially do more damage than good because of its assumption that the projection has units in meters (there don’t appear to be any checks on this).
par_def_q(): this seems like an internal function that shouldn’t be exported—it obscures what it does compared to just an example using quantiles() with seq(). It doesn’t appear to be used internally either, so maybe just removed?

Tests

I agree with @robitalec that test coverage is great, but there is maybe too much use of expect_no_error(). Possible improvments might include additional checks the output has the expected structure, checks that the expected object type is returned with expect_s3_class(), and/or using expect_snapshot() to at least check that the output of functions doesn’t unexpectedly change. It would be good to see tests with different future::plan()s running on different operating systems with CI. Some combination of withr::local_options(list(future.plan = "multisession")), and testthat::skip_on_os() should make that possible.

sigmafelix commented 5 days ago

Thank you for the prompt and detailed review, @Aariq! There are a lot of functions/details I need to learn, but I am sure that these functions will help me to improve the package. multicore plan was applied in the most of use cases because it ran without serialization of terra objects. I am going to integrate serialization at the beginning of parallelization (like geotargets) or another workaround in the revision. I will try to submit a revision in a few weeks. Thanks!

kyle-messier commented 5 days ago

@robitalec @Aariq This is what peer-review is all about! Thank y'all so much for the thoughtful and detailed reviews. It has been a tremendous learning experience. We look forward to improving chopin and your comments will also make other packages and projects better too.

beatrizmilz commented 4 days ago

Thank you for your reviews @robitalec and @Aariq !

beatrizmilz commented 3 days ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/638#issuecomment-2187531795 time 3.5

ropensci-review-bot commented 3 days ago

Logged review for Aariq (hours: 3.5)

beatrizmilz commented 3 days ago

Hi @robitalec! How many hours did you spend to make the review? (an estimation, we need this information to record the review!) Thank you!

robitalec commented 3 days ago

Approximately 8 hours @beatrizmilz

Thank you all!

beatrizmilz commented 3 days ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/638#issuecomment-2181114794 time 8

ropensci-review-bot commented 3 days ago

Logged review for robitalec (hours: 8)

ropensci / software-review