ropensci / software-review

rOpenSci Software Peer Review.
286 stars 104 forks source link

`chopin`: Computation for Climate and Health research On Parallelized INfrastructure #638

Open sigmafelix opened 2 months ago

sigmafelix commented 2 months ago

Submitting Author Name: Insang Song Submitting Author Github Handle: !--author1-->@sigmafelix<!--end-author1-- Other Package Authors Github handles: (comma separated, delete if none) !--author-others-->@kyle-messier<!--end-author-others-- Repository: https://github.com/NIEHS/chopin Version submitted: 0.6.2.20240423 Submission type: Standard Editor: !--editor-->@beatrizmilz<!--end-editor-- Reviewers: @robitalec, @Aariq

Archive: TBD Version accepted: TBD Language: en


Package: chopin
Title: CHOPIN: Computation for Climate and Health research On Parallelized INfrastructure
Version: 0.6.2.20240423
Authors@R: c(
    person("Insang", "Song", , "geoissong@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0001-8732-3256")),
    person("Kyle", "Messier", role = c("aut", "ctb"),
           comment = c(ORCID = "0000-0001-9508-9623"))
  )
Description: It enables users with basic understanding on geospatial data
    and sf and terra functions to parallelize geospatial operations for
    geospatial exposure assessment modeling and covariate computation.
    Parallelization is done by dividing large datasets into sub-regions
    with regular grids and data's own hierarchy. A computation over the
    large number of raster files can be parallelized with a chopin
    function as well.
License: MIT + file LICENSE
URL: https://github.com/NIEHS/chopin
BugReports: https://github.com/NIEHS/chopin/issues
Depends: 
    R (>= 4.1)
Encoding: UTF-8
LazyData: true
LazyDataCompression: xz
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
Imports: 
    anticlust,
    dplyr (>= 1.1.0),
    exactextractr (>= 0.8.2),
    future.apply,
    igraph,
    methods,
    rlang (>= 0.4.9),
    sf (>= 1.0-10),
    stars (>= 0.6-0),
    terra (>= 1.7-18)
Suggests: 
    callr,
    covr,
    DiagrammeR,
    doFuture,
    doParallel,
    future,
    future.batchtools,
    future.callr,
    knitr,
    rmarkdown,
    spatstat.random,
    testthat (>= 3.0.0),
    units,
    withr
VignetteBuilder: knitr
Config/testthat/edition: 3

Scope

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options - [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

Thank you very much for your consideration!

ropensci-review-bot commented 2 months ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 2 months ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 months ago

Checks for chopin (v0.6.2.20240423)

git hash: 26153abb

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 123| |internal |chopin | 33| |internal |graphics | 8| |internal |stats | 6| |internal |utils | 2| |imports |terra | 38| |imports |sf | 20| |imports |dplyr | 5| |imports |methods | 5| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |igraph | 2| |imports |anticlust | 1| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |doParallel | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (9), switch (9), lapply (7), c (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), data.frame (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), by (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)

terra

buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

graphics

points (8)

stats

dist (3), quantile (3)

dplyr

n (2), left_join (1), mutate (1), summarize (1)

methods

is (3), el (2)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

igraph

graph_from_edgelist (2)

utils

combn (2)

anticlust

balanced_clustering (1)

stars

st_as_stars (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1163| 72.0| | |loc_vignettes | 652| 83.8| | |loc_tests | 1193| 89.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 32| 85.6| | |rel_whitespace_tests | 17| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 59| 69.7| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 8817187321|pages build and deployment |success |a8aaed | 33|2024-04-24 | | 8817133786|pkgdown |success |26153a | 69|2024-04-24 | | 8817133742|R-CMD-check |success |26153a | 149|2024-04-24 | | 8817133730|test-coverage-local |success |26153a | 17|2024-04-24 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following notes: 1. checking installed package size ... NOTE installed size is 25.6Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb 2. checking re-building of vignette outputs ... NOTE Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-load-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 1 potential issues: message | number of times --- | --- Lines should not be more than 80 characters. | 1


Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.3.13 | |pkgcheck |0.1.2.21 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

sigmafelix commented 2 months ago

@ropensci-review-bot check package

ropensci-review-bot commented 2 months ago

Thanks, about to send the query.

ropensci-review-bot commented 2 months ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 months ago

Checks for chopin (v0.6.2.20240423)

git hash: c0e53b6e

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 123| |internal |chopin | 33| |internal |graphics | 8| |internal |stats | 6| |internal |utils | 2| |imports |terra | 38| |imports |sf | 20| |imports |dplyr | 5| |imports |methods | 5| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |igraph | 2| |imports |anticlust | 1| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |doParallel | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (9), switch (9), lapply (7), c (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), data.frame (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), by (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)

terra

buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

graphics

points (8)

stats

dist (3), quantile (3)

dplyr

n (2), left_join (1), mutate (1), summarize (1)

methods

is (3), el (2)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

igraph

graph_from_edgelist (2)

utils

combn (2)

anticlust

balanced_clustering (1)

stars

st_as_stars (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1163| 72.0| | |loc_vignettes | 652| 83.8| | |loc_tests | 1193| 89.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 32| 85.6| | |rel_whitespace_tests | 17| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 59| 69.7| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 8818976584|pages build and deployment |success |1b88a2 | 36|2024-04-24 | | 8818927586|pkgdown |success |f8c659 | 71|2024-04-24 | | 8818927598|R-CMD-check |success |f8c659 | 151|2024-04-24 | | 8818927585|test-coverage-local |success |f8c659 | 19|2024-04-24 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following notes: 1. checking installed package size ... NOTE installed size is 25.6Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb 2. checking re-building of vignette outputs ... NOTE Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-load-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 1 potential issues: message | number of times --- | --- Lines should not be more than 80 characters. | 1


Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.3.13 | |pkgcheck |0.1.2.21 |


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

sigmafelix commented 2 months ago

@ropensci-review-bot help

ropensci-review-bot commented 2 months ago

Hello @sigmafelix, here are the things you can ask me to do:


# Add an author's response info to the ROpenSci logs
@ropensci-review-bot submit response <AUTHOR_RESPONSE_URL>

# List all available commands
@ropensci-review-bot help

# Show our Code of Conduct
@ropensci-review-bot code of conduct

# Invite the author of a package to the corresponding rOpenSci team. This command should be issued by the author of the package.
@ropensci-review-bot invite me to ropensci/package-name

# Adds package's repo to the rOpenSci team. This command should be issued after approval and transfer of the package.
@ropensci-review-bot finalize transfer of package-name

# Various package checks
@ropensci-review-bot check package

# Checks srr documentation for stats packages
@ropensci-review-bot check srr
sigmafelix commented 2 months ago

@ropensci-review-bot submit response https://github.com/ropensci/software-review/issues/638#issuecomment-2075277239

ropensci-review-bot commented 2 months ago

Couldn't find entry for chopin in the packages log

ldecicco-USGS commented 2 months ago

@ropensci-review-bot assign @beatrizmilz as editor

ropensci-review-bot commented 2 months ago

Assigned! @beatrizmilz is now the editor

beatrizmilz commented 1 month ago

Hello @sigmafelix ! Thank you for your submission. I am the editor who will be working with you on this package.

Editor checks:

Editor comments

✔ |         41 | processing [8.1s]                                             
⠏ |          0 | tests                                                         Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
⠴ |          6 | tests                                                         Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id)`
Joining with `by = join_by(from_id, to_id)`
⠙ |         12 | tests                                                         Joining with `by = join_by(from_id)`
Joining with `by = join_by(to_id, FIPS)`
Joining with `by = join_by(from_id, to_id)`
✔ |         13 | tests

Can you please address these comments? Thank you!

sigmafelix commented 1 month ago

@beatrizmilz I appreciate your time to review our package. I will address all your points in the revision on top of fixing some bugs we identified in the meantime and respond to you soon. Thank you.

sigmafelix commented 1 month ago

@beatrizmilz Thank you for your patience. I think I addressed all comments in the current version available in main. Please let me know if you have additional comments.

beatrizmilz commented 1 month ago

Thank you @sigmafelix !

beatrizmilz commented 1 month ago

@ropensci-review-bot seeking reviewers

ropensci-review-bot commented 1 month ago

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/638_status.svg)](https://github.com/ropensci/software-review/issues/638)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

beatrizmilz commented 1 month ago

@ropensci-review-bot check package

ropensci-review-bot commented 1 month ago

Thanks, about to send the query.

ropensci-review-bot commented 1 month ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 1 month ago

Checks for chopin (v0.6.3.20240515)

git hash: 094a6460

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 133| |internal |chopin | 33| |internal |graphics | 8| |internal |stats | 6| |internal |utils | 2| |imports |terra | 38| |imports |sf | 20| |imports |dplyr | 10| |imports |methods | 5| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |igraph | 2| |imports |anticlust | 1| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |doParallel | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (9), switch (9), c (7), lapply (7), by (6), data.frame (6), unlist (6), split (5), which (5), names (4), grepl (3), if (3), mapply (3), nrow (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), expand.grid (2), log10 (2), mode (2), paste (2), round (2), rownames (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), args (1), as.data.frame (1), as.logical (1), as.numeric (1), as.vector (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), strsplit (1), suppressWarnings (1), table (1), try (1), vector (1)

terra

buffer (9), crop (8), ext (4), rast (4), distance (3), crs (2), nearby (2), project (2), crds (1), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (15), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), ext2poly (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

dplyr

left_join (6), n (2), mutate (1), summarize (1)

graphics

points (8)

stats

dist (3), quantile (3)

methods

is (3), el (2)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

igraph

graph_from_edgelist (2)

utils

combn (2)

anticlust

balanced_clustering (1)

stars

st_as_stars (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 10 imported packages - 33 exported functions (median 17 lines of code) - 34 non-exported functions in R (median 29 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 9| 89.6| | |loc_R | 1176| 72.3| | |loc_vignettes | 831| 88.0| | |loc_tests | 1289| 90.2| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 67| 65.5| | |n_fns_r_exported | 33| 80.4| | |n_fns_r_not_exported | 34| 56.6| | |n_fns_per_file_r | 7| 77.9| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 17| 40.3| | |loc_per_fn_r_not_exp | 29| 77.1| | |rel_whitespace_R | 12| 61.7| | |rel_whitespace_vignettes | 33| 90.4| | |rel_whitespace_tests | 16| 85.6| | |doclines_per_fn_exp | 37| 45.3| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 61| 70.3| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 9116757038|pages build and deployment |success |9e3af8 | 43|2024-05-16 | | 9116716549|pkgdown |success |094a64 | 77|2024-05-16 | | 9116716555|R-CMD-check |success |094a64 | 157|2024-05-16 | | 9116716552|test-coverage-local |success |094a64 | 25|2024-05-16 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following error: 1. checking re-building of vignette outputs ... ERROR Error(s) in re-building vignettes: ... --- re-building ‘v00_good_practice_parallelization.Rmd’ using rmarkdown --- finished re-building ‘v00_good_practice_parallelization.Rmd’ --- re-building ‘v01_par_make_gridset.Rmd’ using rmarkdown File figures/nc-gen-points-1.png not found in resource path Error: processing vignette 'v01_par_make_gridset.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v01_par_make_gridset.Rmd’ --- re-building ‘v02_climate_examples.Rmd’ using rmarkdown File figures/climate-se-states.png not found in resource path Error: processing vignette 'v02_climate_examples.Rmd' failed with diagnostics: pandoc document conversion failed with error 99 --- failed re-building ‘v02_climate_examples.Rmd’ SUMMARY: processing the following files failed: ‘v01_par_make_gridset.Rmd’ ‘v02_climate_examples.Rmd’ Error: Vignette re-building failed. Execution halted R CMD check generated the following note: 1. checking installed package size ... NOTE installed size is 25.8Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.65 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 2 potential issues: message | number of times --- | --- Avoid using sapply, consider vapply instead, that's type safe | 1 Lines should not be more than 80 characters. This line is 81 characters. | 1


Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.5.2 | |pkgcheck |0.1.2.34 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

beatrizmilz commented 1 month ago

Hi @sigmafelix ! There was one error on the R CMD check: https://github.com/ropensci/software-review/issues/638#issuecomment-2125717203 . Could you please verify that, before we proceed to add the reviewers?

Thank you!

sigmafelix commented 1 month ago

Hi @sigmafelix ! There was one error on the R CMD check: #638 (comment) . Could you please verify that, before we proceed to add the reviewers?

Thank you!

@beatrizmilz Thank you for the update. In the most recent version of chopin, I think the R CMD CHECK error was fixed. Thank you!

beatrizmilz commented 1 month ago

@ropensci-review-bot check package

ropensci-review-bot commented 1 month ago

Thanks, about to send the query.

ropensci-review-bot commented 1 month ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 1 month ago

Checks for chopin (v0.7.0.20240525)

git hash: 052c8627

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:-----------------|------:| |internal |base | 144| |internal |chopin | 32| |internal |graphics | 8| |internal |stats | 5| |internal |utils | 2| |imports |terra | 44| |imports |sf | 20| |imports |dplyr | 10| |imports |methods | 6| |imports |exactextractr | 3| |imports |future.apply | 3| |imports |rlang | 3| |imports |collapse | 3| |imports |anticlust | 2| |imports |igraph | 2| |imports |stars | 1| |suggests |callr | NA| |suggests |covr | NA| |suggests |DiagrammeR | NA| |suggests |doFuture | NA| |suggests |future | NA| |suggests |future.batchtools | NA| |suggests |future.callr | NA| |suggests |knitr | NA| |suggests |rmarkdown | NA| |suggests |spatstat.random | NA| |suggests |testthat | NA| |suggests |units | NA| |suggests |withr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (10), switch (9), c (7), lapply (7), by (6), unlist (6), nrow (5), which (5), data.frame (4), if (4), names (4), split (4), mapply (3), try (3), abs (2), any (2), as.integer (2), ceiling (2), class (2), expand.grid (2), grep (2), grepl (2), length (2), log10 (2), mode (2), paste (2), round (2), seq (2), seq_len (2), sum (2), t (2), tryCatch (2), unique (2), unname (2), args (1), as.data.frame (1), as.list (1), as.logical (1), as.numeric (1), as.vector (1), exp (1), for (1), formals (1), ifelse (1), intersect (1), lengths (1), matrix (1), ncol (1), paste0 (1), pi (1), rbind (1), Reduce (1), search (1), seq_along (1), sort (1), sprintf (1), startsWith (1), strsplit (1), suppressWarnings (1), table (1), vector (1)

terra

buffer (9), crop (8), crs (6), distance (4), ext (4), vect (4), nearby (2), project (2), rast (2), intersect (1), is.lonlat (1), nlyr (1)

chopin

dep_check (14), any_class_args (4), dep_switch (4), datamod (2), get_clip_ext (2), check_subject (1), clip_ras_ext (1), clip_vec_ext (1), crs_check (1), extract_at_buffer_kernel (1), par_group_balanced (1)

sf

st_crs (3), st_relate (3), st_as_sf (2), st_area (1), st_as_sfc (1), st_bbox (1), st_centroid (1), st_coordinates (1), st_covered_by (1), st_geometry_type (1), st_intersection (1), st_intersects (1), st_make_grid (1), st_transform (1), st_within (1)

dplyr

left_join (6), n (2), mutate (1), summarize (1)

graphics

points (8)

methods

is (4), el (2)

stats

quantile (3), dist (2)

collapse

rowbind (3)

exactextractr

exact_extract (3)

future.apply

future_lapply (3)

rlang

inject (3)

anticlust

balanced_clustering (2)

igraph

graph_from_edgelist (2)

utils

combn (2)

stars

st_as_stars (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 7 files) and - 2 authors - 3 vignettes - 2 internal data files - 11 imported packages - 34 exported functions (median 17 lines of code) - 35 non-exported functions in R (median 32 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 7| 45.7| | |files_vignettes | 3| 92.4| | |files_tests | 8| 88.2| | |loc_R | 1281| 74.5| | |loc_vignettes | 744| 86.2| | |loc_tests | 1449| 91.3| | |num_vignettes | 3| 94.2| | |data_size_total | 2100956| 97.8|TRUE | |data_size_median | 1050478| 98.5|TRUE | |n_fns_r | 69| 66.3| | |n_fns_r_exported | 34| 81.0| | |n_fns_r_not_exported | 35| 57.4| | |n_fns_per_file_r | 7| 78.2| | |num_params_per_fn | 3| 33.6| | |loc_per_fn_r | 21| 62.0| | |loc_per_fn_r_exp | 18| 42.0| | |loc_per_fn_r_not_exp | 32| 80.3| | |rel_whitespace_R | 11| 63.3| | |rel_whitespace_vignettes | 34| 89.0| | |rel_whitespace_tests | 15| 85.9| | |doclines_per_fn_exp | 36| 45.0| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 68| 72.4| | ---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![check-standard.yaml](https://github.com/NIEHS/chopin/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/NIEHS/chopin/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 9308315230|pages build and deployment |success |764f4b | 53|2024-05-30 | | 9308277371|pkgdown |success |052c86 | 88|2024-05-30 | | 9308277362|R-CMD-check |success |052c86 | 168|2024-05-30 | | 9308277364|test-coverage-local |success |052c86 | 36|2024-05-30 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following note: 1. checking installed package size ... NOTE installed size is 25.7Mb sub-directories of 1Mb or more: data 2.0Mb extdata 23.1Mb R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 98.11 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) The following function have cyclocomplexity >= 15: function | cyclocomplexity --- | --- par_grid | 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 7 potential issues: message | number of times --- | --- Avoid library() and require() calls in packages | 7


Package Versions

|package |version | |:--------|:--------| |pkgstats |0.1.5.2 | |pkgcheck |0.1.2.41 |


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

beatrizmilz commented 1 month ago

Hi @sigmafelix ! There was one error on the R CMD check: #638 (comment) . Could you please verify that, before we proceed to add the reviewers? Thank you!

@beatrizmilz Thank you for the update. In the most recent version of chopin, I think the R CMD CHECK error was fixed. Thank you!

Great!

beatrizmilz commented 1 month ago

@ropensci-review-bot assign @robitalec as reviewer

ropensci-review-bot commented 1 month ago

@robitalec added to the reviewers list. Review due date is 2024-06-20. Thanks @robitalec for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 1 month ago

@robitalec: If you haven't done so, please fill this form for us to update our reviewers records.

beatrizmilz commented 1 week ago

@ropensci-review-bot assign @Aariq as reviewer

ropensci-review-bot commented 1 week ago

@Aariq added to the reviewers list. Review due date is 2024-07-08. Thanks @Aariq for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 1 week ago

@Aariq: If you haven't done so, please fill this form for us to update our reviewers records.

ropensci-review-bot commented 1 week ago

:calendar: @robitalec you have 2 days left before the due date for your review (2024-06-20).

robitalec commented 1 week ago

Package Review

Documentation

Note, I'm placing checkboxes below each bullet, to highlight the places where I think improvements would be beneficial. I won't checkbox all the things that are already completed to keep it tidier.

The package includes all the following forms of documentation:

Functionality

Review Comments

Thank you for inviting me to review @beatrizmilz and, as always, a huge thank you to the broader rOpenSci community for all of your incredible contributions, inclusion and support.

Thank you @sigmafelix and @kyle-messier for this interesting package with functionality I haven't seen elsewhere that should be a great tool for a wide userbase.

Before I start - I really link the name because it evokes chopping up spatial objects into chunks and -- music! Great choice.


My first main and most important comment is that, at the moment, I find {chopin} hard to get started using.

I think the pitch is currently too narrow (also mentioned here), which might turn away folks who don't identify with "health" or "climate" research. I think there is potential for many users across disciplines who have large spatial data (everything is spatial!) to use {chopin} to perform analyses in more manageable chunks, allowing folks to improve the efficiency of their analyses and extend the potential of their machines.

When I open the README, one of the first things I see is notes on data restrictions and z dimensions and spherical geometry. I think this should come later in a caveats section or FAQ vignette.

Next, we get to "Basic design" which reads more like an overview of functions but also gets into which packages underpin {chopin} functionality, package versions, and registering parallel workers. I think this is the section that really needs to give users a concise and clear overview of the functions. This can and should be reused in other places to follow the recommended principle of multiple points of entry, where users should be anticipated to first encounter {chopin} across every source of information (README, documentation, vignettes) and given a clear introduction.

It seems there are currently three main families of functions exported and number of helper or internal functions:

These are the kinds of questions I'm left as an early user with after reading the overview

From the Description, current version

Package: chopin
Title: CHOPIN: Computation for Climate and Health research On Parallelized INfrastructure
Description: It enables users with basic understanding on geospatial data
    and sf and terra functions to parallelize geospatial operations for
    geospatial exposure assessment modeling and covariate computation.
    Parallelization is done by dividing large datasets into sub-regions
    with regular grids and data's own hierarchy. A computation over the
    large number of raster files can be parallelized with a chopin
    function as well.

Suggestions:

Package: chopin
Title: Parallelize Geospatial Operations Over Sub-Regions, Regular Grids and Lists of Rasters
Description: It enables users to parallelize geospatial operations over
    sub-regions, regular grids and lists of rasters. Parallelization 
    is done by dividing large datasets into sub-regions
    with regular grids and data's own hierarchy. A computation over the
    large number of raster files can be parallelized with a chopin
    function as well. (TODO: maybe mention {future}, {exactextractr}, {terra}, {sf})

Another reason that I might be finding {chopin} hard to get started using is that it seems to obscure lots of the regular terra/sf steps I would typically do myself as an R user. For example, the extract_at_buffer wraps the exact_extract function from {exactextractr} but before calling it, it runs terra::buffer, manages projections, crops, calculates extents, etc. This is also the case in functions like reproject_std and vect_valid_repair which reproject/repair objects using functions terra or sf based on the object type.

This also shows itself where the authors manually switch within functions depending on terra or sf input. For example, using dep_check and testing if == 'terra' else if == 'sf'. This feels like an opportunity to take advantage of S3 methods. Read more in Advanced R: S3 and R Packages: 11.9. An example applied is in {tmap}'s internal function tmapShape. Obviously this would require a fair amount of refactoring - I just wanted to highlight an alternative approach that might be more flexible to adding new supported data types in the future.

This is definitely an open ended comment - I just wonder where the point is for users where hiding away managing regular steps in a spatial analysis (like projections, extents, clipping, etc) actually results in more confusion than it helps.


My last general comment is - how do you envision {chopin} fitting into the broader High Performance Computing landscape in R? One thing I'm personally really interested is how it might interface with {targets}. Looks like maybe I'm saved from commenting too much on this, as my co reviewer @Aariq is one of the authors on the new {geotargets} package along with @njtierney and @brownag (thank you!). I have been eagerly following their progress as a regular {targets} user that has encountered the challenges of including spatial data in {targets} workflows. I wonder how these tools can be brought together or, alternatively, how the authors might communicate their differences and where a user might choose one over the other.


Below: detailed comments following sections in Documentation and Functionality bullets from the top.

Documentation

System requirements

Suggestions:

Vignettes

Issues:

An overview of the vignettes and README material:

It is unclear to a user which vignette is the introductory one. The rOpenSci Packaging Guide suggests "The general vignette should present a series of examples progressing in complexity from basic to advanced usage.". At the moment, this looks most like the "Extracting Weather/Climate Geospatial Data with chopin" vignette but the title is fairly specific and it is listed last on the website.

Consider splitting information about HPC or hardware into its own vignette. Also consider moving system.time benchmarking to a separate vignette? This could also potentially include tips for saving time and memory, and/or hardware recommendations.

Consider incorporating a section in the introductory vignette with a caveats section

Consider revising the vignettes and README to develop bullet points into paragraphs. Some shorter lists to compare options or list steps can be helpful but, generally, I find a full page of bullets hard to process.

Images on the website are giving 404 errors

Package issues

Using :: or both :: and library in vignettes (users are likely more familiar with library())

Suggestions:

Missing link to external functions / packages

Function documentation

Issues:

Missing top-level documentation

Suggestions:

Missing links to {chopin} functions, external functions, other packages

par_group_balanced

Missing documentation of argument defaults

Unclear functions

Use of ...

Examples

Issues:

Missing example

Suggestions:

Examples not run with if (FALSE) or dontrun. Consider a smaller toy example?

Using :: or both :: and library in examples

Community guidelines

Issues:

There are no contributing guidelines in the README or CONTRIBUTING

Functionality

Automated tests

Issues:

There is good test coverage in the package but I am a bit concerned about the number of times the authors use the expect_no_error function. When I think of effective package testing, I think of both clarity for the user reading the tests and in the authors setting up expectations for how the package may fail (anticipating your own errors, warnings or messages). What does the test expect_no_error communicate to the user or set up as expectations?

Here are some ideas for expanding or adjusting the tests:

library(data.table)

paths <- dir('../chopin/tests/testthat', full.names = TRUE)
DT <- data.table(path = paths)

counts <- DT[, tstrsplit(grep('testthat::', readLines(path), value = TRUE), c('\\('), keep = 1), by = path]

counts[, .N, V1][order(-N)]
Test function N
testthat::expect_no_error 59
testthat::expect_error 47
testthat::test_that 30
testthat::expect_equal 20
testthat::expect_true 16
testthat::expect_s4_class 12
testthat::expect_s3_class 12
testthat::expect_no_error 12
testthat::expect_equal 10
testthat::expect_message 7
testthat::expect_error 6
testthat::expect_s3_class 5
testthat::expect_true 4
testthat::expect_condition 2
testthat::expect_warning 2
testthat::expect_warning 2
testthat::expect_no_warning 1
counts[, .N, .(path, V1)][order(path, -N)]
path Test function N
test-any_class_args.R testthat::expect_true 5
test-any_class_args.R testthat::test_that 1
test-any_class_args.R testthat::expect_s4_class 1
test-check.R testthat::expect_equal 10
test-check.R testthat::test_that 9
test-check.R testthat::expect_no_error 8
test-check.R testthat::expect_error 8
test-check.R testthat::expect_equal 4
test-check.R testthat::expect_s3_class 2
test-check.R testthat::expect_s4_class 2
test-check.R testthat::expect_error 1
test-distribute_suites.R testthat::expect_no_error 9
test-distribute_suites.R testthat::expect_no_error 8
test-distribute_suites.R testthat::expect_s3_class 5
test-distribute_suites.R testthat::test_that 4
test-distribute_suites.R testthat::expect_s3_class 4
test-distribute_suites.R testthat::expect_true 4
test-distribute_suites.R testthat::expect_error 2
test-distribute_suites.R testthat::expect_true 2
test-distribute_suites.R testthat::expect_error 2
test-distribute_suites.R testthat::expect_equal 2
test-distribute_suites.R testthat::expect_equal 2
test-distribute_suites.R testthat::expect_condition 2
test-distribute_suites.R testthat::expect_s4_class 1
test-gridding.R testthat::expect_error 18
test-gridding.R testthat::expect_no_error 17
test-gridding.R testthat::expect_message 7
test-gridding.R testthat::test_that 6
test-gridding.R testthat::expect_true 6
test-gridding.R testthat::expect_s4_class 6
test-gridding.R testthat::expect_equal 5
test-gridding.R testthat::expect_warning 2
test-gridding.R testthat::expect_warning 1
test-gridding.R testthat::expect_s3_class 1
test-gridding.R testthat::expect_no_warning 1
test-par_fallback.R testthat::test_that 1
test-par_fallback.R testthat::expect_no_error 1
test-par_fallback.R testthat::expect_error 1
test-preprocessing.R testthat::expect_equal 4
test-preprocessing.R testthat::test_that 2
test-preprocessing.R testthat::expect_error 2
test-preprocessing.R testthat::expect_equal 2
test-preprocessing.R testthat::expect_s3_class 1
test-preprocessing.R testthat::expect_s4_class 1
test-processing.R testthat::expect_no_error 25
test-processing.R testthat::expect_error 19
test-processing.R testthat::test_that 7
test-processing.R testthat::expect_s3_class 4
test-processing.R testthat::expect_no_error 3
test-processing.R testthat::expect_true 3
test-processing.R testthat::expect_s4_class 1
test-processing.R testthat::expect_equal 1
test-processing.R testthat::expect_warning 1
Packaging guidelines

Issues:

URLs

> urlchecker::url_check()
! Warning: vignettes/v02_climate_examples.Rmd:135:107 Moved
Fourteen variables are available in TerraClimate data. The table below is from [the TerraClimate website](http://www.climatologylab.org/terraclimate-variables.html).
                                                                                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                                                                          https://www.climatologylab.org/terraclimate-variables.html
✖ Error: man/par_make_gridset.Rd:45:11 Error: CRAN URL not in canonical form
See \href{https://cran.r-project.org/web/packages/units/vignettes/measurement_units_in_R.html}{link}
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: man/summarize_sedc.Rd:86:13 403: Forbidden
\item \href{https://dx.doi.org/10.1021/es203152a}{Messier KP, Akita Y, Serre ML. (2012). Integrating Address Geocoding, Land Use Regression, and Spatiotemporal Geostatistical Estimation for Groundwater Tetrachloroethylene. \emph{Environmental Science & Technology} 46(5), 2772-2780. }
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: README.md:656:16 404: Not Found
    [vignette](https://kyle-messier.github.io/chopin/articles/v02_climate_examples.html)
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: man/summarize_sedc.Rd:87:32 Error: SSL certificate problem: unable to get local issuer certificate
\item Wiesner C. (n.d.). \href{https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html}{Euclidean Sum of Exponentially Decaying Contributions Tutorial}
                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✖ Error: README.md:74:24 Error: SSL certificate problem: unable to get local issuer certificate
        contributions](https://mserre.sph.unc.edu/BMElab_web/SEDCtutorial/index.html)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

README

Functions

Suggestions:

codemetar

print()

Code style

Citation file

README

Functions

sigmafelix commented 1 week ago

Thank you for the thorough review @robitalec! I am going to start from the points on which I can work immediately, and wait for the other review from @Aariq.

To your comment on the benefits/advantages of hiding necessary tasks for spatial data handling, I agree that this practice would confuse experienced users. In the beginning of the function design, we wanted to make parallelization easier for users who are less experienced with spatial data such that would have difficulty with obtaining variables from source data. Thus, most of intermediate steps between data input and export the final table were obscured by design. For details, in the current version of chopin, a vector input is reprojected to a raster input's coordinate system in favor of keeping raster cell values and layouts as they are. For experienced users, I think I could elaborate on the design choices in an introductory vignette for the package. Thank you.

robitalec commented 1 week ago

Great, that sounds good @sigmafelix. Please let me know if there are any comments in my review that aren't clear.

Aariq commented 6 days ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 3.5


Review Comments

The main functionality of this package is really nice. I’m especially impressed with the thought that went into how to break rasters into tiles for optimal parallelization. I’m looking forward to using this package (or at least adapting its ideas) in my own workflows. I think there is some overlap between chopin and geotargets, in that you can do parallel operations via “branching” in targets workflows, but I like that chopin doesn’t require knowing targets (which can be a bit of a paradigm shift in how one writes R code). I also think the functions for creating grids to iterate over could work well with branching in a targets workflow, and I’m looking forward to exploring that. A few things stand out to me as areas for potential improvement: First, it seems that a major limitation to the parallel processing workflow is that it will not work with plan(multisession) or with plan(multicore) on RStudio or Windows. When I try plan(multisession) on macOS with RStudio, I get the error Error in eval(expr, p) : object 'ncpoints_tr' not found or subscript out of bounds, I assume because the raster is not passed to the parallel processes or not wrap()ed and unwrap()ed if it is. That leaves the command line on macOS or linux as the only ways to run this workflow currently. I’ll be honest—that’s enough of a barrier for me that I’m unlikely to use this package. I’m curious if there’s a plan to get plan(multisession) working or if there’s a reason it can’t work. I discovered this limitation on my own running the examples in the readme, but it might be good to document this limitation in the readme and/or in the @note for par_grid(), par_heirarchy(), etc. Second, the majority (or at least very many) of the examples showing parallel workflows are actually slower than the sequential versions. I want to better understand when the parallel processing really helps—either through examples that actually show a time savings, or through more discussion in the documentation of when it is useful (ideally with specifics like raster extent and resolution, number of points, system specs, number of grids, etc.). Third, this package includes a lot of helper functions. Some of these seem like they maybe shouldn’t be exported because they obfuscate more than abstract. Others seem useful, but also like they open users to potentially writing confusing code. I’ve included some more detailed notes on this below.

README.md

GitHub can process and display mermaid charts in README.md, so rather than displaying an image of your mermaid graph, you could consider just including a mermaid chunk with ```mermaid—that gives viewers zoom and pan controls. I’m not sure how pkgdown would handle it though.

I think the “To run the examples” section is probably not necessary.

The section on “why parallelization is slower” points the reader to the climate example vignette. However, most of the examples there also seem to be faster as single threaded operations (just based on outputs from tictoc or system.time(), haven’t been able to test myself).

Documentation

The ... arg is described as “placeholder” in the documentation for extract_at(), but seems like it should say “additional arguments passed to extract_at_buffer() or extract_at_poly()”. The ... argument for extract_at_buffer() and extract_at_poly() might be more accurately documented as “currently unused”.

Also regarding ..., I’m not sure it’s good advice to recommend users use position based arguments with ... as in ?par_grid. It would be safer to recommend using named arguments—e.g. for fun_dist = terra::extract users should pass x and y arguments.

In summarize_sedc() I would love a little more detail about what exactly SEDC is in the description.

Vignettes

Functions

Just a few notes on things that stand out to me:

Tests

I agree with @robitalec that test coverage is great, but there is maybe too much use of expect_no_error(). Possible improvments might include additional checks the output has the expected structure, checks that the expected object type is returned with expect_s3_class(), and/or using expect_snapshot() to at least check that the output of functions doesn’t unexpectedly change. It would be good to see tests with different future::plan()s running on different operating systems with CI. Some combination of withr::local_options(list(future.plan = "multisession")), and testthat::skip_on_os() should make that possible.

sigmafelix commented 5 days ago

Thank you for the prompt and detailed review, @Aariq! There are a lot of functions/details I need to learn, but I am sure that these functions will help me to improve the package. multicore plan was applied in the most of use cases because it ran without serialization of terra objects. I am going to integrate serialization at the beginning of parallelization (like geotargets) or another workaround in the revision. I will try to submit a revision in a few weeks. Thanks!

kyle-messier commented 5 days ago

@robitalec @Aariq This is what peer-review is all about! Thank y'all so much for the thoughtful and detailed reviews. It has been a tremendous learning experience. We look forward to improving chopin and your comments will also make other packages and projects better too.

beatrizmilz commented 4 days ago

Thank you for your reviews @robitalec and @Aariq !

beatrizmilz commented 3 days ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/638#issuecomment-2187531795 time 3.5

ropensci-review-bot commented 3 days ago

Logged review for Aariq (hours: 3.5)

beatrizmilz commented 3 days ago

Hi @robitalec! How many hours did you spend to make the review? (an estimation, we need this information to record the review!) Thank you!

robitalec commented 3 days ago

Approximately 8 hours @beatrizmilz

Thank you all!

beatrizmilz commented 3 days ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/638#issuecomment-2181114794 time 8

ropensci-review-bot commented 3 days ago

Logged review for robitalec (hours: 8)