ropensci / software-review

rOpenSci Software Peer Review.

'nuts': Convert European Regional Data #623

Closed AAoritz closed 3 months ago

AAoritz commented 5 months ago

Date accepted: 2024-03-14
Submitting Author Name: Moritz Hennicke
Submitting Author Github Handle: @AAoritz
Other Package Authors Github handles: @krausewe
Repository: https://github.com/AAoritz/nuts/
Version submitted: 0.0.0.9000
Submission type: Standard
Editor: @maelle
Reviewers: @nolwenn, @jospueyo

Archive: TBD
Version accepted: TBD
Language: en


Package: nuts
Title: 'nuts': Convert European Regional Data
Version: 0.0.0.9000
Authors@R: c(person("Moritz", "Hennicke", email = "AAoritz@posteo.de", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0001-6811-1821")),
         person("Werner", "Krause", email = "werner.krause@uni-potsdam.de", role = c("aut"),
           comment = c(ORCID = "0000-0002-5069-7964")))
Description: Motivated by changing administrative boundaries over time, the 'nuts' package can convert European regional data with NUTS codes between versions (2006, 2010, 2013, 2016 and 2021) and levels (NUTS 1, NUTS 2 and NUTS 3). The package uses spatial interpolation based on granular (100mx100m) population and land use data provided by the European Commission's Joint Research Center.  
URL: https://AAoritz.github.io/nuts/
BugReports: https://github.com/AAoritz/nuts/
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
Suggests: 
    distill,
    eurostat,
    formatR,
    ggalluvial,
    ggfittext,
    ggplot2,
    ggpubr,
    ggrepel,
    gridExtra,
    kableExtra,
    knitr,
    raster,
    RColorBrewer,
    readr,
    rmarkdown,
    sf,
    terra,
    testthat,
    tidyr
Config/testthat/edition: 3
VignetteBuilder: knitr
Imports:
    crayon,
    dplyr,
    stringr
Depends: 
    R (>= 3.5.0)
LazyData: true

Scope

Data munging: Linking regional data from different sources at the level of NUTS codes is often complicated by the usage of different versions or varying levels of geographical granularity across sources. The package can be used to harmonize NUTS versions and levels across different data sources using spatial interpolation.

Data validation and testing: The package includes routine tasks to test for the validity and completeness of NUTS codes.

Geospatial data: NUTS codes are the dominant format for European regional data.

The target audience is academics, journalists and data scientists interested in European regional data. Users who want to exploit changes within NUTS regions over time face the challenge that administrative boundaries are redrawn over time. The package enables the construction of consistent panel data across NUTS regions and over time by harmonizing NUTS regions to one common version or level of granularity.
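As a minimal sketch of this workflow (based on a call that appears later in this thread; only the classification step is shown, and the conversion step is omitted):

library(dplyr)
library(nuts)

# Sketch: detect the NUTS version of the four-character (NUTS 2) codes in the
# `geo` column of the bundled `patents` data, restricted to Germany in 2012.
df <- patents %>%
  filter(unit == "NR", nchar(geo) == 4, time == 2012) %>%
  filter(grepl("^DE", geo)) %>%
  classify_nuts(data = ., nuts_code = "geo")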

To our knowledge, there is currently no package targeted at the conversion of NUTS versions using spatial interpolation.

The regions package allows for re-coding of NUTS codes from version to version without spatial interpolation. The package offers some code validation routines, but not an automated detection of the NUTS version used.

Yes

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

Code of conduct

ropensci-review-bot commented 5 months ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 5 months ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 5 months ago

Checks for nuts (v0.0.0.9000)

git hash: 815f2a79

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type |package | ncalls|
|:----------|:------------|------:|
|internal |utils | 26|
|internal |base | 25|
|imports |crayon | NA|
|imports |dplyr | NA|
|imports |stringr | NA|
|suggests |distill | NA|
|suggests |eurostat | NA|
|suggests |formatR | NA|
|suggests |ggalluvial | NA|
|suggests |ggfittext | NA|
|suggests |ggplot2 | NA|
|suggests |ggpubr | NA|
|suggests |ggrepel | NA|
|suggests |gridExtra | NA|
|suggests |kableExtra | NA|
|suggests |knitr | NA|
|suggests |raster | NA|
|suggests |RColorBrewer | NA|
|suggests |readr | NA|
|suggests |rmarkdown | NA|
|suggests |sf | NA|
|suggests |terra | NA|
|suggests |testthat | NA|
|suggests |tidyr | NA|
|linking_to |NA | NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

utils

data (26)

base

get (5), paste0 (5), c (4), names (4), attributes (2), by (2), missing (2), list (1)

**NOTE:** No imported packages appear to have associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in R (100% in 7 files) and
- 2 authors
- 1 vignette
- 4 internal data files
- 3 imported packages
- 3 exported functions (median 132 lines of code)
- 3 non-exported functions in R (median 260 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html). The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure | value| percentile|noteworthy |
|:------------------------|------:|----------:|:----------|
|files_R | 7| 45.7| |
|files_vignettes | 2| 85.7| |
|files_tests | 4| 79.0| |
|loc_R | 602| 53.3| |
|loc_vignettes | 874| 88.9| |
|loc_tests | 702| 81.2| |
|num_vignettes | 1| 64.8| |
|data_size_total | 625034| 92.9| |
|data_size_median | 156882| 93.1| |
|n_fns_r | 6| 6.6| |
|n_fns_r_exported | 3| 12.9| |
|n_fns_r_not_exported | 3| 5.3| |
|n_fns_per_file_r | 1| 0.2|TRUE |
|num_params_per_fn | 6| 79.0| |
|loc_per_fn_r | 193| 99.1|TRUE |
|loc_per_fn_r_exp | 132| 94.8| |
|loc_per_fn_r_not_exp | 260| 99.5|TRUE |
|rel_whitespace_R | 18| 55.4| |
|rel_whitespace_vignettes | 29| 89.2| |
|rel_whitespace_tests | 12| 68.9| |
|doclines_per_fn_exp | 60| 72.8| |
|doclines_per_fn_not_exp | 0| 0.0|TRUE |
|fn_call_network_size | 0| 0.0|TRUE |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges

[![R-CMD-check.yaml](https://github.com/AAoritz/nuts/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/AAoritz/nuts/actions)

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:----------------------------------------------|:----------|:------|----------:|:----------|
| 7671209982|pages build and deployment with artifacts-next |success |4aab2b | 11|2024-01-26 |
| 7671179570|pkgdown |success |815f2a | 14|2024-01-26 |
| 7671179564|R-CMD-check |success |815f2a | 8|2024-01-26 |
| 7671179577|test-coverage |success |815f2a | 12|2024-01-26 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following check_fail:

1. no_import_package_as_a_whole

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 94.97

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
convert_nuts_level | 24
convert_nuts_version | 21
classify_nuts | 19

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 184 potential issues:

message | number of times
--- | ---
Avoid library() and require() calls in packages | 22
Lines should not be more than 80 characters. | 116
Use <-, not =, for assignment. | 46


Package Versions

|package |version |
|:--------|:--------|
|pkgstats |0.1.3.9 |
|pkgcheck |0.1.2.11 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

AAoritz commented 5 months ago

Hi rOpenSci Team and @ropensci-review-bot!

Thank you for your time and consideration!

mpadge commented 5 months ago

@AAoritz Sorry about the inconvenience there. We've updated our system to detect distill vignettes, so your package will now pass those tests. (The server might take a few days to incorporate those updates; please be patient.)

AAoritz commented 5 months ago

Wow, thanks @mpadge! Looking forward to the review process!

jhollist commented 5 months ago

Yep, thanks @mpadge. Will wait a bit before sending another check request to let those changes work through.

@AAoritz I think this looks good and is a fit for rOpenSci. I will start working on finding a handling editor for your submission!

AAoritz commented 5 months ago

Great news @jhollist! Thank you!

jhollist commented 5 months ago

@ropensci-review-bot check package

ropensci-review-bot commented 5 months ago

Thanks, about to send the query.

ropensci-review-bot commented 5 months ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 5 months ago

Checks for nuts (v0.0.0.9000)

git hash: cf771ced

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type |package | ncalls|
|:----------|:------------|------:|
|internal |utils | 26|
|internal |base | 25|
|imports |crayon | NA|
|imports |dplyr | NA|
|imports |stringr | NA|
|suggests |distill | NA|
|suggests |eurostat | NA|
|suggests |formatR | NA|
|suggests |ggalluvial | NA|
|suggests |ggfittext | NA|
|suggests |ggplot2 | NA|
|suggests |ggpubr | NA|
|suggests |ggrepel | NA|
|suggests |gridExtra | NA|
|suggests |kableExtra | NA|
|suggests |knitr | NA|
|suggests |raster | NA|
|suggests |RColorBrewer | NA|
|suggests |readr | NA|
|suggests |rmarkdown | NA|
|suggests |sf | NA|
|suggests |terra | NA|
|suggests |testthat | NA|
|suggests |tidyr | NA|
|linking_to |NA | NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

utils

data (26)

base

get (5), paste0 (5), c (4), names (4), attributes (2), by (2), missing (2), list (1)

**NOTE:** No imported packages appear to have associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in R (100% in 7 files) and
- 2 authors
- 1 vignette
- 4 internal data files
- 3 imported packages
- 3 exported functions (median 132 lines of code)
- 3 non-exported functions in R (median 260 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html). The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure | value| percentile|noteworthy |
|:------------------------|------:|----------:|:----------|
|files_R | 7| 45.7| |
|files_vignettes | 2| 85.7| |
|files_tests | 4| 79.0| |
|loc_R | 602| 53.3| |
|loc_vignettes | 874| 88.9| |
|loc_tests | 702| 81.2| |
|num_vignettes | 1| 64.8| |
|data_size_total | 625034| 92.9| |
|data_size_median | 156882| 93.1| |
|n_fns_r | 6| 6.6| |
|n_fns_r_exported | 3| 12.9| |
|n_fns_r_not_exported | 3| 5.3| |
|n_fns_per_file_r | 1| 0.2|TRUE |
|num_params_per_fn | 6| 79.0| |
|loc_per_fn_r | 193| 99.1|TRUE |
|loc_per_fn_r_exp | 132| 94.8| |
|loc_per_fn_r_not_exp | 260| 99.5|TRUE |
|rel_whitespace_R | 18| 55.4| |
|rel_whitespace_vignettes | 29| 89.2| |
|rel_whitespace_tests | 12| 68.9| |
|doclines_per_fn_exp | 60| 72.8| |
|doclines_per_fn_not_exp | 0| 0.0|TRUE |
|fn_call_network_size | 0| 0.0|TRUE |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges

[![R-CMD-check.yaml](https://github.com/AAoritz/nuts/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/AAoritz/nuts/actions)

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:----------------------------------------------|:----------|:------|----------:|:----------|
| 7694130869|pages build and deployment with artifacts-next |success |38d719 | 13|2024-01-29 |
| 7694097562|pkgdown |success |cf771c | 16|2024-01-29 |
| 7694097559|R-CMD-check |success |cf771c | 10|2024-01-29 |
| 7694097589|test-coverage |success |cf771c | 14|2024-01-29 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following check_fail:

1. no_import_package_as_a_whole

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 94.97

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
convert_nuts_level | 24
convert_nuts_version | 21
classify_nuts | 19

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 184 potential issues:

message | number of times
--- | ---
Avoid library() and require() calls in packages | 22
Lines should not be more than 80 characters. | 116
Use <-, not =, for assignment. | 46


Package Versions

|package |version |
|:--------|:--------|
|pkgstats |0.1.3.9 |
|pkgcheck |0.1.2.13 |


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

jhollist commented 5 months ago

@AAoritz looking good! Assigning editor in just a few.

jhollist commented 5 months ago

@ropensci-review-bot assign @maelle as editor

ropensci-review-bot commented 5 months ago

Assigned! @maelle is now the editor

maelle commented 5 months ago

Editor checks:

Editor comments

Many thanks for your submission @AAoritz & @krausewe! Useful package! I know of the French equivalent to nuts, COGugaison, so I was vaguely aware of the idea: https://antuki.github.io/COGugaison/ :smile_cat:

I have a few comments that I'd like to resolve or discuss before I proceed to the reviewer search.

Contributing guide

Docs

User interface

output <- list(
  data = data, 
  versions_data = data_all_versions,  
  missing_data = data_missing_nuts
)

so that the user could then use output[["missing_data"]] instead of output[[3]], which is less readable.

stop("Input 'nuts_code' must be provided as a string.")

could become

cli::cli_abort("Input {.arg nuts_code} must be provided as a string, not {.obj_type_friendly {nuts_code}}.")

(note that this function also requires importing rlang in DESCRIPTION)

and info messages could use cli_alert_ functions https://cli.r-lib.org/reference/cli_alert.html
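For example, a hedged sketch of what such messages could look like (the message text is invented for illustration):

from_version <- "2016"
to_version <- "2021"
# Info and success alerts support glue-style interpolation of local variables:
cli::cli_alert_info("Converting NUTS codes from version {from_version} to {to_version}.")
cli::cli_alert_success("All NUTS codes classified.")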

Code

not_enough_codes <- (length(data$from_code[check_nuts_codes]) < length(data$from_code) &&
               length(data$from_code[check_nuts_codes]) > 0)
if (not_enough_codes)

(Also note the use of && instead of &; see https://lintr.r-lib.org/reference/vector_logic_linter.html.)
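A tiny illustration of the distinction (not code from the package):

codes_ok <- c(TRUE, TRUE, FALSE)
any(codes_ok) & length(codes_ok) > 1         # `&` is vectorised and returns a vector in general
if (any(codes_ok) && length(codes_ok) > 1) { # `&&` is scalar and short-circuits,
  message("some codes matched")              # which is what if() expects
}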

Tests

manure_indic_DE_2003 <- function() {
  manure %>%
    filter(nchar(geo) == 4) %>%
    filter(indic_ag == "I07A_EQ_Y") %>%
    select(-indic_ag) %>%
    filter(grepl("^DE", geo)) %>%
    filter(time == 2003) %>%
    select(-time)
}

and in tests/testthat/test-classify_nuts.R you'd have

test_that("Needs geo var 1", {
  expect_error(
    manure_indic_DE_2003() %>% classify_nuts(nuts_code = NULL),
    "Input 'nuts_code' cannot be NULL."
  )
})

test_that("Needs geo var 2", {
  expect_error(manure_indic_DE_2003() %>% classify_nuts())
})

test_that("nuts_code not valid", {
  expect_error(
    manure_indic_DE_2003() %>% classify_nuts(nuts_code = 1),
    "Input 'nuts_code' must be provided as a string."
  )
})

See https://blog.r-hub.io/2020/11/18/testthat-utility-belt/ and https://r-pkgs.org/testing-design.html + https://r-pkgs.org/testing-advanced.html

Happy to answer any question and to discuss! Thanks again for submitting your package!

AAoritz commented 5 months ago

Hi @maelle! Many thanks for these super useful checks and comments. @krausewe and I will start looking into them now and we will get back to you soon.

krausewe commented 4 months ago

Hi @maelle!

We have addressed the very helpful remarks & recommendations.

Please find below the list of commits.

We are looking forward to the next steps!

Contributing guide

Docs

User interface

Code

Tests

maelle commented 4 months ago

Awesome, thank you! I'll now look for reviewers.

Note that @mpadge's and my tech note is now published: https://ropensci.org/blog/2024/02/06/verbosity-control-packages/

maelle commented 4 months ago

@ropensci-review-bot seeking reviewers

ropensci-review-bot commented 4 months ago

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/623_status.svg)](https://github.com/ropensci/software-review/issues/623)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

krausewe commented 4 months ago

@ropensci-review-bot.

Added the peer review status badge.

NEWS.md was already created and included.

maelle commented 4 months ago

@ropensci-review-bot add @nolwenn to reviewers

ropensci-review-bot commented 4 months ago

@nolwenn added to the reviewers list. Review due date is 2024-03-01. Thanks @nolwenn for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 4 months ago

@nolwenn: If you haven't done so, please fill this form for us to update our reviewers records.

AAoritz commented 4 months ago

Hi @maelle! We are a bit lost implementing verbosity control with local_options. This example from your blog does not seem to work with cli:

> pkg_message <- function(...) {
+   is_verbose_mode <- (getOption("mypackage.verbose", "quiet") == "verbose")
+   if (is_verbose_mode) {
+     # Options local to this function only; reset on exit!
+     rlang::local_options(rlib_message_verbosity = "verbose")
+   }
+   rlang::inform(...)
+ }
> pkg_message("normal message")
normal message
> rlang::local_options(rlib_message_verbosity = "quiet")
Setting global deferred event(s).
i These will be run:
  * Automatically, when the R session ends.
  * On demand, if you call `withr::deferred_run()`.
i Use `withr::deferred_clear()` to clear them without executing.
> pkg_message("suppressed message")
> rlang::local_options(mypackage.verbose = "verbose")
> pkg_message("reawakened message")
reawakened message

> withr::deferred_run()
Ran 2/2 deferred expressions
> pkg_message <- function(...) {
+   is_verbose_mode <- (getOption("mypackage.verbose", "quiet") == "verbose")
+   if (is_verbose_mode) {
+     # Options local to this function only; reset on exit!
+     rlang::local_options(rlib_message_verbosity = "verbose")
+   }
+   cli::cli_h1(...)
+ }
> pkg_message("normal message")

── normal message ───────────────────────────────────────────────────────────────────────────────────────────────
> rlang::local_options(rlib_message_verbosity = "quiet")
Setting global deferred event(s).
i These will be run:
  * Automatically, when the R session ends.
  * On demand, if you call `withr::deferred_run()`.
i Use `withr::deferred_clear()` to clear them without executing.
> pkg_message("suppressed message")

── suppressed message ───────────────────────────────────────────────────────────────────────────────────────────
> rlang::local_options(mypackage.verbose = "verbose")
> pkg_message("reawakened message")

── reawakened message ───────────────────────────────────────────────────────────────────────────────────────────
> 
maelle commented 4 months ago

Right, it seems specific to rlang (cc @mpadge).

How about something simpler like

pkg_message <- function(...) {
  is_verbose_mode <- (getOption("mypackage.verbose", "quiet") == "verbose")
  if (is_verbose_mode) {
    cli::cli_alert(...)
  }

}

withr::with_options(list("mypackage.verbose" = "quiet"), {
  pkg_message("pof")
})

withr::with_options(list("mypackage.verbose" = "verbose"), {
  pkg_message("pof")
})
#> → pof

Created on 2024-02-13 with reprex v2.1.0

:thinking:

mpadge commented 4 months ago

@AAoritz The options we wrote about in the blog post only apply to the cli_inform(), cli_warn(), and cli_abort() functions, not to any other cli functions. If you want to use others such as cli_h<N>(), you'd need to further adapt the custom handler to not call that at all if verbose == "quiet".
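For instance, a minimal sketch of such a wrapper, assuming a package-level nuts.verbose option (the helper name and the default value are hypothetical):

# Sketch: only emit the cli heading when the package option asks for verbosity;
# otherwise do nothing at all.
nuts_heading <- function(...) {
  if (getOption("nuts.verbose", "verbose") == "verbose") {
    cli::cli_h1(...)
  }
  invisible(NULL)
}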

AAoritz commented 4 months ago

Thank you @maelle and @mpadge! I wrapped our cli message as suggested in https://github.com/AAoritz/nuts/commit/47ee441ea6ea98b30326157b803f81dba5557b42 by @maelle. Here is the result:

rlang::local_options(nuts.verbose = "quiet")
df <- patents %>%
  filter(unit == "NR", nchar(geo) == 4, time == 2012) %>%
  filter(grepl("^DE", geo)) %>%
  classify_nuts(data = ., nuts_code = "geo")

withr::deferred_run()
#> Ran 1/1 deferred expressions
rlang::local_options(nuts.verbose = "verbose")
df <- patents %>%
  filter(unit == "NR", nchar(geo) == 4, time == 2012) %>%
  filter(grepl("^DE", geo)) %>%
  classify_nuts(data = ., nuts_code = "geo")
#> 
#> ── Classifying version of NUTS codes ───────────────────────────────────────────
#> Within groups defined by country:
#> ! These NUTS codes cannot be identified or classified: DEXX and DEZZ.
#> ✔ Unique NUTS version classified.
#> ✔ No missing NUTS codes.

withr::deferred_run()
#> Ran 2/2 deferred expressions
rlang::local_options(rlib_message_verbosity = "quiet")
df <- patents %>%
  filter(unit == "NR", nchar(geo) == 4, time == 2012) %>%
  filter(grepl("^DE", geo)) %>%
  classify_nuts(data = ., nuts_code = "geo")

Created on 2024-02-14 with reprex v2.0.2

How would you recommend documenting this option? Below the examples in the function-level help/documentation?

mpadge commented 4 months ago

@AAoritz Would you please be so kind as to ask that question on our discussion forum? The issue of where to document verbosity control is important, and would be good to have in that general context, rather than here.

maelle commented 4 months ago

@ropensci-review-bot add @jospueyo to reviewers

ropensci-review-bot commented 4 months ago

@jospueyo added to the reviewers list. Review due date is 2024-03-07. Thanks @jospueyo for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 4 months ago

@jospueyo: If you haven't done so, please fill this form for us to update our reviewers records.

jospueyo commented 4 months ago

Package Review

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 3 h


Review Comments

The package is useful, very useful indeed, and also well designed. Most of my comments below are just suggestions that IMHO will make the package more intuitive, more robust, or more easily maintainable. Don't take them as mandatory; in the end, this is your package and you should have the last word.

Disclaimer: Apologies if some comments sound tough or rude. English is not my native language.

General comments

Function naming does not follow the rOpenSci package guidelines, specifically the object_verb() design. Please start your exported functions with nuts_; this also facilitates autocompletion.

Variables in vignettes should use underscores rather than dots, which can clash with S3 methods.

The group_vars argument could be changed to .by, to be consistent with dplyr functions.

convert_nuts_level might be renamed nuts_aggregate to make explicit that the level change is aggregation only, not disaggregation.

Provide wrapper functions to inspect sub-elements of classify_nuts, something like:

missing_rm could also provide some imputation technique, such as using the mean. In the future, it might be linked to imputation packages such as mice.

There are some inconsistencies in your constructions that can make the package harder to maintain, especially if there is a maintainer change in the future. For instance, sometimes you write conditions explicitly, if(condition == FALSE) do something..., and sometimes as if(!condition) do something.... You should check the constructions you used and try to be consistent. BTW, you should avoid using T and F; replace them with TRUE and FALSE.

You should not supply default values for mandatory arguments. Otherwise, if you run the function without providing these values, it raises a cryptic error instead of the usual Error in classify_nuts() : argument "data" is missing, with no default.
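A quick illustration of the point (the function name is hypothetical, not the package code):

# Sketch: with no default, R itself raises the standard, readable error as soon
# as the missing argument is used.
classify_sketch <- function(data, nuts_code) {
  force(data)
  nuts_code %in% colnames(data)
}
classify_sketch()
#> Error in force(data) : argument "data" is missing, with no default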

When an argument accepts a set of values, such as multiple_versions or weights, it is preferable to pass a vector with all options as the default, to show all possible values, e.g. multiple_versions = c('break', 'most_frequent'). Then you use the first element, which would be break by default. BTW, break is a concept that not everybody might be familiar with; maybe using error would be more intuitive.
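A sketch of that pattern with base R's match.arg() (hypothetical function name):

# The vector lists all accepted values, the first one is the default, and any
# other value produces an informative error automatically.
nuts_convert_sketch <- function(data, multiple_versions = c("break", "most_frequent")) {
  multiple_versions <- match.arg(multiple_versions)
  multiple_versions
}
nuts_convert_sketch(data = NULL)                              # "break"
nuts_convert_sketch(data = NULL, multiple_versions = "typo")  # error listing the valid choices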

When you check whether an argument is one of the accepted values, you should contemplate that users could pass a vector of length > 1 (never underestimate end-users). If this happens, an error is raised because if() only accepts length-1 vectors. You should do something like if(!all(nuts_code %in% colnames(data))).

The “nuts.verbose” option should be mentioned in the help of each function. Otherwise, users cannot know what verbosity options they have.
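One way to surface this could be a shared roxygen section repeated (or templated) in each exported function's documentation, e.g.:

#' @section Verbosity:
#'   Messages can be silenced for a session with
#'   `options(nuts.verbose = "quiet")` and re-enabled with
#'   `options(nuts.verbose = "verbose")`.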

Documentation

In the table about converting relative values, some * are interpreted by markdown as italics. This makes it impossible to interpret the weighted-average formulas. You should fix it, possibly by converting them to equations or to code.

Imports

All the scoped verbs (*_at) were superseded in current versions of dplyr. Replace them with the new forms, such as summarize(across()).
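For example (illustrative data, not the package code):

library(dplyr)
df <- tibble(country = c("DE", "DE", "FR"), values = c(1, 2, 3), pop = c(10, 20, 30))
# Superseded scoped verb:
df %>% group_by(country) %>% summarize_at(vars(values, pop), sum)
# Current equivalent with across():
df %>% group_by(country) %>% summarize(across(c(values, pop), sum))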

The pipe from magrittr and as_tibble from the tibble package can be imported from dplyr, so you can remove two packages from Imports. You will still depend on tibble and magrittr, but indirectly through dplyr rather than directly.
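A sketch of the corresponding roxygen imports, assuming dplyr stays in Imports and re-exports both (it does in current versions):

#' @importFrom dplyr %>%
#' @importFrom dplyr as_tibble
NULL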

Likewise, if you don’t want to depend on stringr, you can easily replace str_remove with gsub(), passing "" as the replacement, and str_detect with grepl().
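For example:

x <- c("DE21", "FR10", "DE30")
grepl("^DE", x)      # base equivalent of stringr::str_detect(x, "^DE")
gsub("^DE", "", x)   # base equivalent of stringr::str_remove(x, "^DE")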

classify_nuts (suggested name: nuts_classify)

convert_nuts_version (suggested name: nuts_convert_version)

convert_nuts_level (suggested name: nuts_aggregate)

L151: This chunk of code is repeated in convert_nuts_version. It is complex enough that you should avoid repeating yourself. I suggest moving the whole chunk to a new function and calling that function in both places. This makes your code more maintainable.

Looking forward to your responses!

maelle commented 4 months ago

Thanks a lot @jospueyo! :pray:

Could you please add the number of hours you spent reviewing near "Estimated hours spent reviewing:" in the template? Thanks!

krausewe commented 4 months ago

Thanks so much @jospueyo! All of this is super helpful.

We will start addressing the recommendations asap!

maelle commented 4 months ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/623#issuecomment-1951446662 time 3

ropensci-review-bot commented 4 months ago

Logged review for jospueyo (hours: 3)

nolwenn commented 4 months ago

nuts package review

Briefly describe any working relationship you have (had) with the package authors.

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 2


Review Comments

The package could be really useful for many applications where a spatio-temporal question is involved. At the time of my review, the authors had already addressed some of the comments of reviewer 1, so you will excuse me if some of my comments are outdated.

Installation instructions (GitHub)

You have an error, an extra "/": it should be pak::pak("AAoritz/nuts") and not pak::pak("AAoritz/nuts/").

At first load I had the message:

Error: package or namespace load failed for ‘nuts’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]): namespace ‘vctrs’ 0.6.3 is already loaded, but >= 0.6.4 is required

I had to close and reopen R/RStudio, then it worked fine.

Vignette(s)

The Rmd file has been updated with the previous recommendations by @jospueyo concerning the naming of the functions (which is good), but not the HTML version at https://aaoritz.github.io/nuts/articles/nuts-vignette.html. The HTML version of the vignette needs to be rebuilt and pushed for online display.

In the vignette, the authors do not explicitly load the libraries required for the data management in the examples, like the meta-package tidyverse with dplyr for filter() and other data-management functions.

In the vignette, maps are displayed with various NUTS versions over time. Even though this is not the goal of the nuts package, a note mentioning which package(s) (or software) were used to create those maps could be of interest to the reader.

Functionality

The function nuts_test_multiple_versions does not have an example. Would it be possible to have at least one?

After a nuts_convert_version(), it might be interesting to have a helper function that computes the estimated variation of the values between the different NUTS versions. It could be the difference (increase or decrease) in absolute or relative frequency at the higher NUTS level, like the 7.7% that you estimated in your first manure example ("Spatial interpolation in a nutshell").

Hope it helps,

Best, Nolwenn

krausewe commented 4 months ago

Amazing, @nolwenn. Thank you for the feedback!

(And sorry for already starting to revise the package while you were still reviewing.)

maelle commented 4 months ago

Thanks a lot @nolwenn for your review! :pray:

maelle commented 4 months ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/623#issuecomment-1961501137 time 2

ropensci-review-bot commented 4 months ago

Logged review for nolwenn (hours: 2)

maelle commented 4 months ago

@AAoritz @krausewe both reviews are in! :tada:

AAoritz commented 3 months ago

Review by @jospueyo

Many thanks @jospueyo for your time and effort! We highly appreciate your thorough and constructive feedback; the package has greatly improved thanks to your comments. To summarize our reply: we have no objections and adopted all comments, except for the three points below that may require further discussion.

Adopted points

General comments

Documentation

Imports

classify_nuts (suggested name: nuts_classify)

  • L310, you should add the class “nuts.classified” instead of overwriting the class “list”: class(x) <- c("new_class", class(x)). Otherwise, if the object is no longer a list but only your custom class, this could break some methods for lists and cause unexpected behavior if the output of classify_nuts is used for something other than feeding your other functions: https://github.com/AAoritz/nuts/commit/7d7b2b3ee3bff70667cb625c5b969d389619d918
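A small illustration of the difference (illustrative object, not the package code):

x <- list(data = head(mtcars), missing = character(0))
class(x) <- c("nuts.classified", class(x))  # appends the class, keeps "list"
inherits(x, "list")   # TRUE: list methods keep working
x[["missing"]]        # ordinary list access is unaffected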

convert_nuts_version (suggested name: nuts_convert_version)

convert_nuts_level (suggested name: nuts_aggregate)

Discussion points

  • When you check if an argument is one of the accepted values, you should contemplate that users could pass a vector of length > 1 (never underestimate end-users). If this happens, an error is given because if() can only accept 1-length vectors. You should do something like if(!all(nuts_code %in% colnames(data))):

We implemented specific checks for the case where a vector with length(argument) > 1 is passed to a single-length argument. https://github.com/AAoritz/nuts/commit/f052aa4ffd95bc29b8e5a46c6c69fcb123341db9

  • group_vars argument could be changed to .by:

This is an interesting development in dplyr. It does circumvent the danger of forgetting to ungroup(). It is still marked as experimental, though. Would it make sense to wait on this? https://github.com/tidyverse/dplyr/blob/HEAD/R/by.R

  • missing_rm could also provide some imputation technique, such as using the mean. In the future, it might be linked to imputation packages such as mice:

This is a very interesting suggestion. However, we tend to think of imputation as a separate process from conversion/aggregation. Currently, imputation with the mice package could be done by joining a subset of data(all_nuts_codes) to the user's dataset and imputing across the universe of NUTS regions. Moreover, imputation with the method used by the mice package (Van Buuren, S. (2018). Flexible Imputation of Missing Data) comes with many different assumptions whose legitimacy is highly context-dependent. We would currently be more comfortable leaving this task to the users. We do have some ideas for dealing with missing observations; please see our discussion with @nolwenn below.

Review by @nolwenn

Thank you for your great input @nolwenn! Those comments are really valuable for this package. We adopted all of your suggestions with the exception of two points that we would like to discuss further.

Adopted comments

Discussion points

  • Error: package or namespace load failed for ‘nuts’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]): namespace ‘vctrs’ 0.6.3 is already loaded, but >= 0.6.4 is required:

We currently don't see where this comes from. Does this error persist with the most recent version?

  • After a nuts_convert_version(), it might be interesting to have a helping function that computes the estimated variations of the values between the different NUTS versions. It could be the difference (increase or decrease) in absolute or relative frequency at the higher NUT level. Like to get the 7.7% that you estimated in your first manure example (Spatial interpolation in a nutshell)

This is an interesting proposal. If we understand your point correctly, we could include an option in nuts_convert_version() that returns, e.g., a column in the converted dataset containing more details on the conversion. It could for instance be a string that records the path by which a NUTS region was converted, similar to our explanation of the methodology in the vignette. For instance, where a region ZZ72 is created from regions ZZ68 and ZZ70, the converted data set would have a column nuts_code equal to ZZ72 and a new variable conversion_path equal to 0.6 * ZZ68 + 0.2 * ZZ70. Currently, this type of information can be extracted by subsetting and transforming the tibble data(cross_walks).

Along similar lines, we have thought about an option that reports or makes use of the share of conversion/aggregation weights that is missing due to missing regions. For instance, when region ZZ72 is created mainly from ZZ68 and only marginally from ZZ70, and the latter is missing, e.g. ZZ72 = 0.8 * ZZ68 + 0.01 * NA, setting ZZ70 to 0 by using missing_rm = TRUE is less dramatic than when the weight placed on ZZ70 is larger. By computing the share of weights missing (in this example 0.01 / 0.81), we could change the missing_rm option into a threshold at which users feel confident assuming 0 for the missing regions (e.g. missing_rm > 0.02).
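A back-of-the-envelope sketch of that share with the example numbers above (illustrative values only):

w     <- c(ZZ68 = 0.80, ZZ70 = 0.01)  # conversion weights feeding ZZ72
value <- c(ZZ68 = 100,  ZZ70 = NA)    # ZZ70 is missing in the source data
share_missing <- sum(w[is.na(value)]) / sum(w)
share_missing                         # 0.01 / 0.81, roughly 0.012
# A threshold-based missing_rm could then assume 0 for the missing region only
# when share_missing stays below, say, 0.02.
sum(w * value, na.rm = TRUE)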


Many thanks! @krausewe and I are looking forward to both of your replies!

maelle commented 3 months ago

@AAoritz @krausewe thank you! Could you please record your response for the bot? https://devdevguide.netlify.app/bot_cheatsheet#submit-response-to-reviewers

@jospueyo @nolwenn could you please read the response of the authors and respond? Once you are happy with the changes, please use the approval template https://devdevguide.netlify.app/approval2template

jospueyo commented 3 months ago

Reviewer Response

This has been awesome. I enjoyed reviewing your package and monitoring its evolution. I hope you are truly happy with my suggestions; I certainly am with your responses.

Regarding the .by argument, I didn't know it was still experimental; otherwise, I wouldn't have recommended the change.

I also agree with your opinion about imputation of missing values.

Final approval (post-review)

Estimated hours spent reviewing: 3.5

AAoritz commented 3 months ago

@ropensci-review-bot submit response https://github.com/ropensci/software-review/issues/623#issuecomment-1976920666

ropensci-review-bot commented 3 months ago

Logged author response!

ropensci-review-bot commented 3 months ago

@AAoritz, @krausewe: please post your response with @ropensci-review-bot submit response <url to issue comment> if you haven't done so already (this is an automatic reminder).

Here's the author guide for response. https://devguide.ropensci.org/authors-guide.html