Open sigmafelix opened 2 months ago
Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help
for help.
:rocket:
Editor check started
:wave:
git hash: 26153abb
Important: All failing checks above must be addressed prior to proceeding
Package License: MIT + file LICENSE
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
|type |package | ncalls|
|:----------|:-----------------|------:|
|internal |base | 123|
|internal |chopin | 33|
|internal |graphics | 8|
|internal |stats | 6|
|internal |utils | 2|
|imports |terra | 38|
|imports |sf | 20|
|imports |dplyr | 5|
|imports |methods | 5|
|imports |exactextractr | 3|
|imports |future.apply | 3|
|imports |rlang | 3|
|imports |igraph | 2|
|imports |anticlust | 1|
|imports |stars | 1|
|suggests |callr | NA|
|suggests |covr | NA|
|suggests |DiagrammeR | NA|
|suggests |doFuture | NA|
|suggests |doParallel | NA|
|suggests |future | NA|
|suggests |future.batchtools | NA|
|suggests |future.callr | NA|
|suggests |knitr | NA|
|suggests |rmarkdown | NA|
|suggests |spatstat.random | NA|
|suggests |testthat | NA|
|suggests |units | NA|
|suggests |withr | NA|
|linking_to |NA | NA|
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(
list (9), switch (9), lapply (7), c (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), data.frame (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), by (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)
buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)
dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)
st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)
points (8)
dist (3), quantile (3)
n (2), left_join (1), mutate (1), summarize (1)
is (3), el (2)
exact_extract (3)
future_lapply (3)
inject (3)
graph_from_edgelist (2)
combn (2)
balanced_clustering (1)
st_as_stars (1)
base
terra
chopin
sf
graphics
stats
dplyr
methods
exactextractr
future.apply
rlang
igraph
utils
anticlust
stars
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1163| 72.0| | |loc_vignettes | 652| 83.8| | |loc_tests | 1193| 89.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 32| 85.6| | |rel_whitespace_tests | 17| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 59| 69.7| | ---
Click to see the interactive network visualisation of calls between objects in package
goodpractice
and other checks#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 8817187321|pages build and deployment |success |a8aaed | 33|2024-04-24 | | 8817133786|pkgdown |success |26153a | 69|2024-04-24 | | 8817133742|R-CMD-check |success |26153a | 149|2024-04-24 | | 8817133730|test-coverage-local |success |26153a | 17|2024-04-24 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following notes: 1. checking installed package size ... NOTE installed size is 25.6Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb 2. checking re-building of vignette outputs ... NOTE Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-load-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 1 potential issues: message | number of times --- | --- Lines should not be more than 80 characters. | 1
|package |version | |:--------|:--------| |pkgstats |0.1.3.13 | |pkgcheck |0.1.2.21 |
Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.
@ropensci-review-bot check package
Thanks, about to send the query.
:rocket:
Editor check started
:wave:
git hash: c0e53b6e
Package License: MIT + file LICENSE
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
|type |package | ncalls|
|:----------|:-----------------|------:|
|internal |base | 123|
|internal |chopin | 33|
|internal |graphics | 8|
|internal |stats | 6|
|internal |utils | 2|
|imports |terra | 38|
|imports |sf | 20|
|imports |dplyr | 5|
|imports |methods | 5|
|imports |exactextractr | 3|
|imports |future.apply | 3|
|imports |rlang | 3|
|imports |igraph | 2|
|imports |anticlust | 1|
|imports |stars | 1|
|suggests |callr | NA|
|suggests |covr | NA|
|suggests |DiagrammeR | NA|
|suggests |doFuture | NA|
|suggests |doParallel | NA|
|suggests |future | NA|
|suggests |future.batchtools | NA|
|suggests |future.callr | NA|
|suggests |knitr | NA|
|suggests |rmarkdown | NA|
|suggests |spatstat.random | NA|
|suggests |testthat | NA|
|suggests |units | NA|
|suggests |withr | NA|
|linking_to |NA | NA|
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(
list (9), switch (9), lapply (7), c (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), data.frame (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), by (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)
buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)
dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)
st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)
points (8)
dist (3), quantile (3)
n (2), left_join (1), mutate (1), summarize (1)
is (3), el (2)
exact_extract (3)
future_lapply (3)
inject (3)
graph_from_edgelist (2)
combn (2)
balanced_clustering (1)
st_as_stars (1)
base
terra
chopin
sf
graphics
stats
dplyr
methods
exactextractr
future.apply
rlang
igraph
utils
anticlust
stars
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1163| 72.0| | |loc_vignettes | 652| 83.8| | |loc_tests | 1193| 89.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 32| 85.6| | |rel_whitespace_tests | 17| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 59| 69.7| | ---
Click to see the interactive network visualisation of calls between objects in package
goodpractice
and other checks#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 8818976584|pages build and deployment |success |1b88a2 | 36|2024-04-24 | | 8818927586|pkgdown |success |f8c659 | 71|2024-04-24 | | 8818927598|R-CMD-check |success |f8c659 | 151|2024-04-24 | | 8818927585|test-coverage-local |success |f8c659 | 19|2024-04-24 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following notes: 1. checking installed package size ... NOTE installed size is 25.6Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb 2. checking re-building of vignette outputs ... NOTE Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-load-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 1 potential issues: message | number of times --- | --- Lines should not be more than 80 characters. | 1
|package |version | |:--------|:--------| |pkgstats |0.1.3.13 | |pkgcheck |0.1.2.21 |
This package is in top shape and may be passed on to a handling editor
@ropensci-review-bot help
Hello @sigmafelix, here are the things you can ask me to do:
# Add an author's response info to the ROpenSci logs
@ropensci-review-bot submit response <AUTHOR_RESPONSE_URL>
# List all available commands
@ropensci-review-bot help
# Show our Code of Conduct
@ropensci-review-bot code of conduct
# Invite the author of a package to the corresponding rOpenSci team. This command should be issued by the author of the package.
@ropensci-review-bot invite me to ropensci/package-name
# Adds package's repo to the rOpenSci team. This command should be issued after approval and transfer of the package.
@ropensci-review-bot finalize transfer of package-name
# Various package checks
@ropensci-review-bot check package
# Checks srr documentation for stats packages
@ropensci-review-bot check srr
@ropensci-review-bot submit response https://github.com/ropensci/software-review/issues/638#issuecomment-2075277239
Couldn't find entry for chopin in the packages log
@ropensci-review-bot assign @beatrizmilz as editor
Assigned! @beatrizmilz is now the editor
Hello @sigmafelix ! Thank you for your submission. I am the editor who will be working with you on this package.
Spatiotemporal-Exposures-and-Toxicology
and then was transfered to NIEHS
, and some of the links in the documentation are still pointing to the old organization. Can you update them to point to the new organization, please? This might help: https://github.com/search?q=repo%3ANIEHS%2Fchopin%20Spatiotemporal-Exposures-and-Toxicology&type=codechopin
is well detailed and informative. However, the vignette Generate computational grids can be more detailed. It would be great if you could improve that vignette with more text explaining the code and the steps.devtools::test()
and all the tests passed. However, I noticed that there is a lot of messages from dplyr joins, because aparently there are some joins that are being done without declaring which are the key columns. This is not a problem, but it would be nice if you could add the by
argument to the dplyr::**_join()
functions in the code. Here is part of the messages that I got:✔ | 41 | processing [8.1s]
⠏ | 0 | tests Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
⠴ | 6 | tests Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
⠙ | 12 | tests Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id, FIPS)`
Joining with `by = join_by(from_id, to_id)`
✔ | 13 | tests
Can you please address these comments? Thank you!
@beatrizmilz I appreciate your time to review our package. I will address all your points in the revision on top of fixing some bugs we identified in the meantime and respond to you soon. Thank you.
@beatrizmilz Thank you for your patience. I think I addressed all comments in the current version available in main
. Please let me know if you have additional comments.
Thank you @sigmafelix !
@ropensci-review-bot seeking reviewers
Please add this badge to the README of your package repository:
[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/638_status.svg)](https://github.com/ropensci/software-review/issues/638)
Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news
@ropensci-review-bot check package
Thanks, about to send the query.
:rocket:
Editor check started
:wave:
git hash: 094a6460
Important: All failing checks above must be addressed prior to proceeding
Package License: MIT + file LICENSE
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
|type |package | ncalls|
|:----------|:-----------------|------:|
|internal |base | 133|
|internal |chopin | 33|
|internal |graphics | 8|
|internal |stats | 6|
|internal |utils | 2|
|imports |terra | 38|
|imports |sf | 20|
|imports |dplyr | 10|
|imports |methods | 5|
|imports |exactextractr | 3|
|imports |future.apply | 3|
|imports |rlang | 3|
|imports |igraph | 2|
|imports |anticlust | 1|
|imports |stars | 1|
|suggests |callr | NA|
|suggests |covr | NA|
|suggests |DiagrammeR | NA|
|suggests |doFuture | NA|
|suggests |doParallel | NA|
|suggests |future | NA|
|suggests |future.batchtools | NA|
|suggests |future.callr | NA|
|suggests |knitr | NA|
|suggests |rmarkdown | NA|
|suggests |spatstat.random | NA|
|suggests |testthat | NA|
|suggests |units | NA|
|suggests |withr | NA|
|linking_to |NA | NA|
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(
list (9), switch (9), c (7), lapply (7), by (6), data.frame (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)
buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)
dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)
st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)
left_join (6), n (2), mutate (1), summarize (1)
points (8)
dist (3), quantile (3)
is (3), el (2)
exact_extract (3)
future_lapply (3)
inject (3)
graph_from_edgelist (2)
combn (2)
balanced_clustering (1)
st_as_stars (1)
base
terra
chopin
sf
dplyr
graphics
stats
methods
exactextractr
future.apply
rlang
igraph
utils
anticlust
stars
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1176| 72.3| | |loc_vignettes | 831| 88.0| | |loc_tests | 1289| 90.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 33| 90.4| | |rel_whitespace_tests | 16| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 61| 70.3| | ---
Click to see the interactive network visualisation of calls between objects in package
goodpractice
and other checks#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 9116757038|pages build and deployment |success |9e3af8 | 43|2024-05-16 | | 9116716549|pkgdown |success |094a64 | 77|2024-05-16 | | 9116716555|R-CMD-check |success |094a64 | 157|2024-05-16 | | 9116716552|test-coverage-local |success |094a64 | 25|2024-05-16 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following error: 1. checking re-building of vignette outputs ... ERROR Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-gen-points-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following note: 1. checking installed package size ... NOTE installed size is 25.8Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 2 potential issues: message | number of times --- | --- Avoid using sapply, consider vapply instead, that's type safe | 1 Lines should not be more than 80 characters. This line is 81 characters. | 1
|package |version | |:--------|:--------| |pkgstats |0.1.5.2 | |pkgcheck |0.1.2.34 |
Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.
Hi @sigmafelix ! There was one error on the R CMD check: https://github.com/ropensci/software-review/issues/638#issuecomment-2125717203 . Could you please verify that, before we proceed to add the reviewers?
Thank you!
Hi @sigmafelix ! There was one error on the R CMD check: #638 (comment) . Could you please verify that, before we proceed to add the reviewers?
Thank you!
@beatrizmilz Thank you for the update. In the most recent version of chopin
, I think the R CMD CHECK error was fixed. Thank you!
@ropensci-review-bot check package
Thanks, about to send the query.
:rocket:
Editor check started
:wave:
git hash: 052c8627
Package License: MIT + file LICENSE
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
|type |package | ncalls|
|:----------|:-----------------|------:|
|internal |base | 144|
|internal |chopin | 32|
|internal |graphics | 8|
|internal |stats | 5|
|internal |utils | 2|
|imports |terra | 44|
|imports |sf | 20|
|imports |dplyr | 10|
|imports |methods | 6|
|imports |exactextractr | 3|
|imports |future.apply | 3|
|imports |rlang | 3|
|imports |collapse | 3|
|imports |anticlust | 2|
|imports |igraph | 2|
|imports |stars | 1|
|suggests |callr | NA|
|suggests |covr | NA|
|suggests |DiagrammeR | NA|
|suggests |doFuture | NA|
|suggests |future | NA|
|suggests |future.batchtools | NA|
|suggests |future.callr | NA|
|suggests |knitr | NA|
|suggests |rmarkdown | NA|
|suggests |spatstat.random | NA|
|suggests |testthat | NA|
|suggests |units | NA|
|suggests |withr | NA|
|linking_to |NA | NA|
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(
list (10), switch (9), c (7), lapply (7), by (6), unlist (6), nrow (5), which (5), data.frame (4), if (4), names (4), split (4), mapply (3), try (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), expand.grid (2), grep (2), grepl (2), length (2), log10 (2), mode (2), paste (2), round (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), unname (2), args (1), as.data.frame (1), as.list (1), as.logical (1), as.numeric (1), as.vector (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), matrix (1), ncol (1), paste0 (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), startsWith (1), strsplit (1), suppressWarnings (1), table (1), vector (1)
buffer (9), crop (8), crs (6), distance (4), ext (4), vect (4), nearby (2), project (2), rast (2), intersect (1), is.lonlat (1), nlyr (1)
dep_check (14), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), check_subject (1), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), extract_at_buffer_kernel (1), par_group_balanced (1)
st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)
left_join (6), n (2), mutate (1), summarize (1)
points (8)
is (4), el (2)
quantile (3), dist (2)
rowbind (3)
exact_extract (3)
future_lapply (3)
inject (3)
balanced_clustering (2)
graph_from_edgelist (2)
combn (2)
st_as_stars (1)
base
terra
chopin
sf
dplyr
graphics
methods
stats
collapse
exactextractr
future.apply
rlang
anticlust
igraph
utils
stars
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 11 imported packages - 34 exported functions (median 17 lines of code) - 35 non-exported functions in R (median 32 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 8| 88.2| | |loc_R | 1281| 74.5| | |loc_vignettes | 744| 86.2| | |loc_tests | 1449| 91.3| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 69| 66.3| | |n_fns_r_exported | 34| 81.0| | |n_fns_r_not_exported | 35| 57.4| | |n_fns_per_file_r | 7| 78.2| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 18| 42.0| | |loc_per_fn_r_not_exp | 32| 80.3| | |rel_whitespace_R | 11| 63.3| | |rel_whitespace_vignettes | 34| 89.0| | |rel_whitespace_tests | 15| 85.9| | |doclines_per_fn_exp | 36| 45.0| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 68| 72.4| | ---
Click to see the interactive network visualisation of calls between objects in package
goodpractice
and other checks#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 9308315230|pages build and deployment |success |764f4b | 53|2024-05-30 | | 9308277371|pkgdown |success |052c86 | 88|2024-05-30 | | 9308277362|R-CMD-check |success |052c86 | 168|2024-05-30 | | 9308277364|test-coverage-local |success |052c86 | 36|2024-05-30 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following note: 1. checking installed package size ... NOTE installed size is 25.7Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 98.11 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) The following function have cyclocomplexity >= 15: function | cyclocomplexity --- | --- par_grid | 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 7 potential issues: message | number of times --- | --- Avoid library() and require() calls in packages | 7
|package |version | |:--------|:--------| |pkgstats |0.1.5.2 | |pkgcheck |0.1.2.41 |
This package is in top shape and may be passed on to a handling editor
Hi @sigmafelix ! There was one error on the R CMD check: #638 (comment) . Could you please verify that, before we proceed to add the reviewers? Thank you!
@beatrizmilz Thank you for the update. In the most recent version of
chopin
, I think the R CMD CHECK error was fixed. Thank you!
Great!
@ropensci-review-bot assign @robitalec as reviewer
@robitalec added to the reviewers list. Review due date is 2024-06-20. Thanks @robitalec for accepting to review! Please refer to our reviewer guide.
rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.
@robitalec: If you haven't done so, please fill this form for us to update our reviewers records.
@ropensci-review-bot assign @Aariq as reviewer
@Aariq added to the reviewers list. Review due date is 2024-07-08. Thanks @Aariq for accepting to review! Please refer to our reviewer guide.
rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.
@Aariq: If you haven't done so, please fill this form for us to update our reviewers records.
:calendar: @robitalec you have 2 days left before the due date for your review (2024-06-20).
Note, I'm placing checkboxes below each bullet, to highlight the places where I think improvements would be beneficial. I won't checkbox all the things that are already completed to keep it tidier.
The package includes all the following forms of documentation:
extract_at
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).
expect_no_error
with more specific expectationsThank you for inviting me to review @beatrizmilz and, as always, a huge thank you to the broader rOpenSci community for all of your incredible contributions, inclusion and support.
Thank you @sigmafelix and @kyle-messier for this interesting package with functionality I haven't seen elsewhere that should be a great tool for a wide userbase.
Before I start - I really link the name because it evokes chopping up spatial objects into chunks and -- music! Great choice.
My first main and most important comment is that, at the moment, I find {chopin} hard to get started using.
I think the pitch is currently too narrow (also mentioned here), which might turn away folks who don't identify with "health" or "climate" research. I think there is potential for many users across disciplines who have large spatial data (everything is spatial!) to use {chopin} to perform analyses in more manageable chunks, allowing folks to improve the efficiency of their analyses and extend the potential of their machines.
When I open the README, one of the first things I see is notes on data restrictions and z dimensions and spherical geometry. I think this should come later in a caveats section or FAQ vignette.
Next, we get to "Basic design" which reads more like an overview of functions but also gets into which packages underpin {chopin} functionality, package versions, and registering parallel workers. I think this is the section that really needs to give users a concise and clear overview of the functions. This can and should be reused in other places to follow the recommended principle of multiple points of entry, where users should be anticipated to first encounter {chopin} across every source of information (README, documentation, vignettes) and given a clear introduction.
extract_at_poly
missing descriptionIt seems there are currently three main families of functions exported and number of helper or internal functions:
par_*
par_grid
and others for running functions in parallelpar_make_*
for making gridssummarize_*
extract_at_*
These are the kinds of questions I'm left as an early user with after reading the overview
par_make_*
functions be split into a new grid_*
family of functions?extract_at_buffer
with a kernel weight and summary_aw
an area-weighted summary? What is the difference between par_make_grid
and par_make_gridset
?par_group_grid
is an extension of par_group_balanced for padded grids, should it be renamed something like par_group_balanced_padded
? Or should padded = TRUE
be added as an argument?From the Description, current version
Package: chopin
Title: CHOPIN: Computation for Climate and Health research On Parallelized INfrastructure
Description: It enables users with basic understanding on geospatial data
and sf and terra functions to parallelize geospatial operations for
geospatial exposure assessment modeling and covariate computation.
Parallelization is done by dividing large datasets into sub-regions
with regular grids and data's own hierarchy. A computation over the
large number of raster files can be parallelized with a chopin
function as well.
Suggestions:
Package: chopin
Title: Parallelize Geospatial Operations Over Sub-Regions, Regular Grids and Lists of Rasters
Description: It enables users to parallelize geospatial operations over
sub-regions, regular grids and lists of rasters. Parallelization
is done by dividing large datasets into sub-regions
with regular grids and data's own hierarchy. A computation over the
large number of raster files can be parallelized with a chopin
function as well. (TODO: maybe mention {future}, {exactextractr}, {terra}, {sf})
Another reason that I might be finding {chopin} hard to get started using is that it seems to obscure lots of the regular terra/sf steps I would typically do myself as an R user. For example, the extract_at_buffer
wraps the exact_extract
function from {exactextractr} but before calling it, it runs terra::buffer
, manages projections, crops, calculates extents, etc. This is also the case in functions like reproject_std
and vect_valid_repair
which reproject/repair objects using functions terra or sf based on the object type.
This also shows itself where the authors manually switch within functions depending on terra or sf input. For example, using dep_check
and testing if == 'terra' else if == 'sf'
. This feels like an opportunity to take advantage of S3 methods. Read more in Advanced R: S3 and R Packages: 11.9. An example applied is in {tmap}'s internal function tmapShape
. Obviously this would require a fair amount of refactoring - I just wanted to highlight an alternative approach that might be more flexible to adding new supported data types in the future.
This is definitely an open ended comment - I just wonder where the point is for users where hiding away managing regular steps in a spatial analysis (like projections, extents, clipping, etc) actually results in more confusion than it helps.
My last general comment is - how do you envision {chopin} fitting into the broader High Performance Computing landscape in R? One thing I'm personally really interested is how it might interface with {targets}. Looks like maybe I'm saved from commenting too much on this, as my co reviewer @Aariq is one of the authors on the new {geotargets} package along with @njtierney and @brownag (thank you!). I have been eagerly following their progress as a regular {targets} user that has encountered the challenges of including spatial data in {targets} workflows. I wonder how these tools can be brought together or, alternatively, how the authors might communicate their differences and where a user might choose one over the other.
Below: detailed comments following sections in Documentation and Functionality bullets from the top.
Suggestions:
Issues:
An overview of the vignettes and README material:
chopin
in HPC
chopin
It is unclear to a user which vignette is the introductory one. The rOpenSci Packaging Guide suggests "The general vignette should present a series of examples progressing in complexity from basic to advanced usage.". At the moment, this looks most like the "Extracting Weather/Climate Geospatial Data with chopin
" vignette but the title is fairly specific and it is listed last on the website.
Consider splitting information about HPC or hardware into its own vignette. Also consider moving system.time benchmarking to a separate vignette? This could also potentially include tips for saving time and memory, and/or hardware recommendations.
Consider incorporating a section in the introductory vignette with a caveats section
Consider revising the vignettes and README to develop bullet points into paragraphs. Some shorter lists to compare options or list steps can be helpful but, generally, I find a full page of bullets hard to process.
Images on the website are giving 404 errors
Package issues
chopin
pkgs <- c("chopin", "terra", "stars", "sf", "future", "doFuture", "dplyr", "parallelly", "tigris", "tictoc") invisible(sapply(pkgs, library, character.only = TRUE, quietly = TRUE))
library(chopin)
Using :: or both :: and library in vignettes (users are likely more familiar with library()
)
chopin
Suggestions:
Missing link to external functions / packages
chopin
in HPC
chopin
Issues:
Missing top-level documentation
usethis::use_package_doc
Suggestions:
Missing links to {chopin} functions, external functions, other packages
par_group_balanced
par_group_balanced
from par_group_grid
, ...par_group_balanced
Missing documentation of argument defaults
clip_ras_ext
nqsegs = 180Lextract_at_poly
func = meanUnclear functions
Use of ...
par_grid
Issues:
Missing example
extract_at
(reuse example see roxygen2 vignette)Suggestions:
Examples not run with if (FALSE)
or dontrun. Consider a smaller toy example?
par_grid
par_hierarchy
par_merge_grid
par_multirasters
vect_valid_repair
Using :: or both :: and library in examples
clip_vec_ext
clip_ras_ext
datamod
dep_check
dep_switch
Issues:
There are no contributing guidelines in the README or CONTRIBUTING
Issues:
There is good test coverage in the package but I am a bit concerned about the number of times the authors use the expect_no_error
function.
When I think of effective package testing, I think of both clarity for the user reading the tests and in the authors setting up expectations for how the package may fail (anticipating your own errors, warnings or messages). What does the test expect_no_error
communicate to the user or set up as expectations?
Here are some ideas for expanding or adjusting the tests:
test_that()
call and expand on the description
test-gridding.R
(L116) to eg.{rnaturalearth} test_ne_states.R
library(data.table)
paths <- dir('../chopin/tests/testthat', full.names = TRUE)
DT <- data.table(path = paths)
counts <- DT[, tstrsplit(grep('testthat::', readLines(path), value = TRUE), c('\\('), keep = 1), by = path]
counts[, .N, V1][order(-N)]
Test function | N |
---|---|
testthat::expect_no_error | 59 |
testthat::expect_error | 47 |
testthat::test_that | 30 |
testthat::expect_equal | 20 |
testthat::expect_true | 16 |
testthat::expect_s4_class | 12 |
testthat::expect_s3_class | 12 |
testthat::expect_no_error | 12 |
testthat::expect_equal | 10 |
testthat::expect_message | 7 |
testthat::expect_error | 6 |
testthat::expect_s3_class | 5 |
testthat::expect_true | 4 |
testthat::expect_condition | 2 |
testthat::expect_warning | 2 |
testthat::expect_warning | 2 |
testthat::expect_no_warning | 1 |
counts[, .N, .(path, V1)][order(path, -N)]
path | Test function | N |
---|---|---|
test-any_class_args.R | testthat::expect_true | 5 |
test-any_class_args.R | testthat::test_that | 1 |
test-any_class_args.R | testthat::expect_s4_class | 1 |
test-check.R | testthat::expect_equal | 10 |
test-check.R | testthat::test_that | 9 |
test-check.R | testthat::expect_no_error | 8 |
test-check.R | testthat::expect_error | 8 |
test-check.R | testthat::expect_equal | 4 |
test-check.R | testthat::expect_s3_class | 2 |
test-check.R | testthat::expect_s4_class | 2 |
test-check.R | testthat::expect_error | 1 |
test-distribute_suites.R | testthat::expect_no_error | 9 |
test-distribute_suites.R | testthat::expect_no_error | 8 |
test-distribute_suites.R | testthat::expect_s3_class | 5 |
test-distribute_suites.R | testthat::test_that | 4 |
test-distribute_suites.R | testthat::expect_s3_class | 4 |
test-distribute_suites.R | testthat::expect_true | 4 |
test-distribute_suites.R | testthat::expect_error | 2 |
test-distribute_suites.R | testthat::expect_true | 2 |
test-distribute_suites.R | testthat::expect_error | 2 |
test-distribute_suites.R | testthat::expect_equal | 2 |
test-distribute_suites.R | testthat::expect_equal | 2 |
test-distribute_suites.R | testthat::expect_condition | 2 |
test-distribute_suites.R | testthat::expect_s4_class | 1 |
test-gridding.R | testthat::expect_error | 18 |
test-gridding.R | testthat::expect_no_error | 17 |
test-gridding.R | testthat::expect_message | 7 |
test-gridding.R | testthat::test_that | 6 |
test-gridding.R | testthat::expect_true | 6 |
test-gridding.R | testthat::expect_s4_class | 6 |
test-gridding.R | testthat::expect_equal | 5 |
test-gridding.R | testthat::expect_warning | 2 |
test-gridding.R | testthat::expect_warning | 1 |
test-gridding.R | testthat::expect_s3_class | 1 |
test-gridding.R | testthat::expect_no_warning | 1 |
test-par_fallback.R | testthat::test_that | 1 |
test-par_fallback.R | testthat::expect_no_error | 1 |
test-par_fallback.R | testthat::expect_error | 1 |
test-preprocessing.R | testthat::expect_equal | 4 |
test-preprocessing.R | testthat::test_that | 2 |
test-preprocessing.R | testthat::expect_error | 2 |
test-preprocessing.R | testthat::expect_equal | 2 |
test-preprocessing.R | testthat::expect_s3_class | 1 |
test-preprocessing.R | testthat::expect_s4_class | 1 |
test-processing.R | testthat::expect_no_error | 25 |
test-processing.R | testthat::expect_error | 19 |
test-processing.R | testthat::test_that | 7 |
test-processing.R | testthat::expect_s3_class | 4 |
test-processing.R | testthat::expect_no_error | 3 |
test-processing.R | testthat::expect_true | 3 |
test-processing.R | testthat::expect_s4_class | 1 |
test-processing.R | testthat::expect_equal | 1 |
test-processing.R | testthat::expect_warning | 1 |
Issues:
URLs
> urlchecker::url_check()
! Warning: vignettes/v02_climate_examples.Rmd:135:107 Moved
Fourteen variables are available in TerraClimate data. The table below is from [the TerraClimate website](http://www.climatologylab.org/terraclimate-variables.html).
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
https://www.climatologylab.org/terraclimate-variables.html
✖ Error: man/par_make_gridset.Rd:45:11 Error: CRAN URL not in canonical form
See \href{https://cran.r-project.org/web/packages/units/vignettes/measurement_units_in_R.html}{link}
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: man/summarize_sedc.Rd:86:13 403: Forbidden
\item \href{https://dx.doi.org/10.1021/es203152a}{Messier KP, Akita Y, Serre ML. (2012). Integrating Address Geocoding, Land Use Regression, and Spatiotemporal Geostatistical Estimation for Groundwater Tetrachloroethylene. \emph{Environmental Science & Technology} 46(5), 2772-2780. }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: README.md:656:16 404: Not Found
[vignette](https://kyle-messier.github.io/chopin/articles/v02_climate_examples.html)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: man/summarize_sedc.Rd:87:32 Error: SSL certificate problem: unable to get local issuer certificate
\item Wiesner C. (n.d.). \href{https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html}{Euclidean Sum of Exponentially Decaying Contributions Tutorial}
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: README.md:74:24 Error: SSL certificate problem: unable to get local issuer certificate
contributions](https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
README
#### Last edited: March 22, 2024
risk of being outdated, users can see last modification date on GitHubgit clone https://github.com/Spatiotemporal-Exposures-and-Toxicology/chopin
source(\"README_run.r\")
Functions
par_cut_coords
sf::st_zm(drop = TRUE)
extract_at_poly
st_crs(polys) == st_crs(surf)
instead of assuming the same coordinate reference systems are usedSuggestions:
codemetar
print()
is_bbox_within_reference
uses print()
. Consider using message()
insteadCode style
Citation file
usethis::use_citation()
, see hereREADME
Functions
datamod
dep_check
par_def_q
par_fallback
Thank you for the thorough review @robitalec! I am going to start from the points on which I can work immediately, and wait for the other review from @Aariq.
To your comment on the benefits/advantages of hiding necessary tasks for spatial data handling, I agree that this practice would confuse experienced users. In the beginning of the function design, we wanted to make parallelization easier for users who are less experienced with spatial data such that would have difficulty with obtaining variables from source data. Thus, most of intermediate steps between data input and export the final table were obscured by design. For details, in the current version of chopin
, a vector input is reprojected to a raster input's coordinate system in favor of keeping raster cell values and layouts as they are. For experienced users, I think I could elaborate on the design choices in an introductory vignette for the package. Thank you.
Great, that sounds good @sigmafelix. Please let me know if there are any comments in my review that aren't clear.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).
Estimated hours spent reviewing: 3.5
The main functionality of this package is really nice. I’m especially impressed with the thought that went into how to break rasters into tiles for optimal parallelization. I’m looking forward to using this package (or at least adapting its ideas) in my own workflows. I think there is some overlap between chopin
and geotargets
, in that you can do parallel operations via “branching” in targets
workflows, but I like that chopin
doesn’t require knowing targets
(which can be a bit of a paradigm shift in how one writes R code). I also think the functions for creating grids to iterate over could work well with branching in a targets
workflow, and I’m looking forward to exploring that. A few things stand out to me as areas for potential improvement: First, it seems that a major limitation to the parallel processing workflow is that it will not work with plan(multisession)
or with plan(multicore)
on RStudio or Windows. When I try plan(multisession)
on macOS with RStudio, I get the error Error in eval(expr, p) : object 'ncpoints_tr' not found
or subscript out of bounds
, I assume because the raster is not passed to the parallel processes or not wrap()
ed and unwrap()
ed if it is. That leaves the command line on macOS or linux as the only ways to run this workflow currently. I’ll be honest—that’s enough of a barrier for me that I’m unlikely to use this package. I’m curious if there’s a plan to get plan(multisession)
working or if there’s a reason it can’t work. I discovered this limitation on my own running the examples in the readme, but it might be good to document this limitation in the readme and/or in the @note
for par_grid()
, par_heirarchy()
, etc. Second, the majority (or at least very many) of the examples showing parallel workflows are actually slower than the sequential versions. I want to better understand when the parallel processing really helps—either through examples that actually show a time savings, or through more discussion in the documentation of when it is useful (ideally with specifics like raster extent and resolution, number of points, system specs, number of grids, etc.). Third, this package includes a lot of helper functions. Some of these seem like they maybe shouldn’t be exported because they obfuscate more than abstract. Others seem useful, but also like they open users to potentially writing confusing code. I’ve included some more detailed notes on this below.
GitHub can process and display mermaid charts in README.md, so rather than displaying an image of your mermaid graph, you could consider just including a mermaid chunk with ```mermaid
—that gives viewers zoom and pan controls. I’m not sure how pkgdown
would handle it though.
I think the “To run the examples” section is probably not necessary.
The section on “why parallelization is slower” points the reader to the climate example vignette. However, most of the examples there also seem to be faster as single threaded operations (just based on outputs from tictoc
or system.time()
, haven’t been able to test myself).
The ...
arg is described as “placeholder” in the documentation for extract_at()
, but seems like it should say “additional arguments passed to extract_at_buffer()
or extract_at_poly()
”. The ...
argument for extract_at_buffer()
and extract_at_poly()
might be more accurately documented as “currently unused”.
Also regarding ...
, I’m not sure it’s good advice to recommend users use position based arguments with ...
as in ?par_grid
. It would be safer to recommend using named arguments—e.g. for fun_dist = terra::extract
users should pass x
and y
arguments.
In summarize_sedc()
I would love a little more detail about what exactly SEDC is in the description.
lwgeom
package appears to be required for that vignette, the saveRDS()
code fails if the input/
directory doesn’t exist, and there doesn’t appear to be any code or instructions for downloading the TerraClimate data, so I can’t reproduce the code in this vignette easily.Just a few notes on things that stand out to me:
I think a lot of the helper functions might be viewed as less helpful by some (which is fine, they don’t have to use them) because they are not explicit about what objects they take or return. While the package authors likely viewed this as a convenience, others may view these helpers as danger. E.g., let’s say you switch your code to start with a sf
object instead of a terra
object—now dep_switch()
does something different than it did before even though that part of the code has not changed! A “safer” alternative would be st_as_sf()
which does the same thing whether run on a SpatVector or an sf object. Similarly, I could imagine a wrapper around terra::vect()
, that when given a SpatVector does nothing, would be a “safer” alternative when expecting a SpatVector output.
any_class_args()
I don’t quite understand the problem this is attempting to solve, but it feels like something that should not be exported. Also, why not use inherits()
here in place of grepl(obj, class)
?
dep_check()
Another place where inherits()
might be cleaner. E.g. I think this does the same: r dep_check <- function(input) { classes <- c("sf", "stars", "SpatVector", "SpatRaster", "SpatVectorProxy") if (!inherits(input, classes)) stop("Input should be one of sf or Spat* object.\n") class <- classes[inherits(input, classes, which = TRUE)] ifelse(class %in% c("SpatVector", "SpatRaster", "SpatVectorProxy"), "terra", "sf") }
dep_switch()
is a nice shortcut, but I feel it makes code a little harder to read since it’s not immediately clear whether it is translating from sf
/stars
to terra
or the other way around.
get_clip_ext()
: a little confusing/surprising that the description is “setting the clipping extent” (does it get or set?). This is another example of a helper that could potentially do more damage than good because of its assumption that the projection has units in meters (there don’t appear to be any checks on this).
par_def_q()
: this seems like an internal function that shouldn’t be exported—it obscures what it does compared to just an example using quantiles()
with seq()
. It doesn’t appear to be used internally either, so maybe just removed?
I agree with @robitalec that test coverage is great, but there is maybe too much use of expect_no_error()
. Possible improvments might include additional checks the output has the expected structure, checks that the expected object type is returned with expect_s3_class()
, and/or using expect_snapshot()
to at least check that the output of functions doesn’t unexpectedly change. It would be good to see tests with different future::plan()
s running on different operating systems with CI. Some combination of withr::local_options(list(future.plan = "multisession"))
, and testthat::skip_on_os()
should make that possible.
Thank you for the prompt and detailed review, @Aariq! There are a lot of functions/details I need to learn, but I am sure that these functions will help me to improve the package. multicore
plan was applied in the most of use cases because it ran without serialization of terra
objects. I am going to integrate serialization at the beginning of parallelization (like geotargets
) or another workaround in the revision. I will try to submit a revision in a few weeks. Thanks!
@robitalec @Aariq This is what peer-review is all about! Thank y'all so much for the thoughtful and detailed reviews. It has been a tremendous learning experience. We look forward to improving chopin
and your comments will also make other packages and projects better too.
Thank you for your reviews @robitalec and @Aariq !
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/638#issuecomment-2187531795 time 3.5
Logged review for Aariq (hours: 3.5)
Hi @robitalec! How many hours did you spend to make the review? (an estimation, we need this information to record the review!) Thank you!
Approximately 8 hours @beatrizmilz
Thank you all!
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/638#issuecomment-2181114794 time 8
Logged review for robitalec (hours: 8)
Submitting Author Name: Insang Song Submitting Author Github Handle: !--author1-->@sigmafelix<!--end-author1-- Other Package Authors Github handles: (comma separated, delete if none) !--author-others-->@kyle-messier<!--end-author-others-- Repository: https://github.com/NIEHS/chopin Version submitted: 0.6.2.20240423 Submission type: Standard Editor: !--editor-->@beatrizmilz<!--end-editor-- Reviewers: @robitalec, @Aariq
Archive: TBD Version accepted: TBD Language: en
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences): :
chopin
supports parallel processing for functions in popular spatial data manipulation packagessf
andterra
on a high-level parallelization frameworkfuture
. This feature fits to the geospatial data category.Who is the target audience and what are scientific applications of this package? : Our first target audience group is spatial epidemiologists and health geographers who want to calculating spatial covariates from spatial and spatiotemporal datasets including climate, transportation, demographics, topography, hydrography, and others. We expect that users are cognizant of basic geographic information system/science. The wider audience could take advantage of the flexibility of this package for expediting spatial operations.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category? : A selection of functions in
terra
(e.g.,*app
andpredict
) supports internal parallelization, where a single dataset is accepted. To the best of our knowledge, no packaged solution exists for parallelization of spatial operations where two datasets are involved. sprawl (GitHub-only; not maintained) partially overlaps this package's functionality in that it includes convenience functions connecting multiple basic spatial operations. Besides the R functions, a handful of teaching or demonstration materials briefly covered parallelization of spatial data applications ([1], [2], [3]).(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? : Not applicable
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted. : Presubmission inquiry of the previous version of this package -- #630 commented by @ldecicco-USGS
Explain reasons for any
pkgcheck
items which your package is unable to pass. : Coverage rate (99%) and installation size (25.7 MB; of which data (2.0 MB) and extdata (23.1 MB) exceeded recommended sizes) resulted in notes.Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
[x] Do you intend for this package to go on CRAN?
[ ] Do you intend for this package to go on Bioconductor?
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)Code of conduct
Thank you very much for your consideration!