ropensci / software-review

rOpenSci Software Peer Review.
daiquiri: Data quality reporting for temporal datasets #535

Closed phuongquan closed 1 year ago

phuongquan commented 2 years ago

Date accepted: 2022-10-25

Submitting Author Name: T. Phuong Quan
Submitting Author Github Handle: @phuongquan
Other Package Authors Github handles: (comma separated, delete if none)
Repository: https://github.com/phuongquan/daiquiri
Version submitted: 0.7.1
Submission type: Standard
Editor: @maurolepore
Reviewers: @brad-cannell, @elinw

Due date for @brad-cannell: 2022-07-27
Due date for @elinw: 2022-08-17

Archive: TBD
Version accepted: TBD
Language: en

Package: daiquiri
Type: Package
Title: Data Quality Reporting for Temporal Datasets
Version: 0.7.1
Authors@R: c(
    person(c("T.", "Phuong"), "Quan", email = "phuong.quan@ndm.ox.ac.uk",
        role = c("aut", "cre"), comment = c(ORCID = "0000-0001-8566-1817")),
    person("Jack", "Cregan", role = "ctb"),
    person(family = "University of Oxford", role = "cph"),
    person(family = "National Institute for Health Research (NIHR)", role = "fnd")
    )
Description: Generate reports that enable quick visual review of 
    temporal shifts in record-level data. Time series plots showing aggregated 
    values are automatically created for each data field (column) depending on its 
    contents (e.g. min/max/mean values for numeric data, no. of distinct 
    values for categorical data), as well as overviews for missing values, 
    non-conformant values, and duplicated rows. The resulting reports are sharable 
    and can contribute to forming a transparent record of the entire analysis process. 
    It is designed with Electronic Health Records in mind, but can be used for 
    any type of record-level temporal data (i.e. tabular data where each row represents 
    a single “event”, one column contains the "event date", and other columns 
    contain any associated values for the event).
URL: https://github.com/phuongquan/daiquiri
BugReports: https://github.com/phuongquan/daiquiri/issues
License: GPL (>=3)
Encoding: UTF-8
Imports:
    data.table (>= 1.12.8),
    readr (>= 1.3.1),
    ggplot2 (>= 3.1.0),
    scales (>= 1.1.0),
    cowplot (>= 0.9.3),
    rmarkdown,
    reactable (>= 0.2.3),
    utils,
    stats
RoxygenNote: 7.1.2
Suggests:
    covr,
    knitr,
    testthat (>= 3.0.0),
    codemetar
VignetteBuilder: knitr
Config/testthat/edition: 3

Scope

It takes a generic data frame containing raw, record-level, temporal data, and generates a data quality report that enables quick visual review of any unexpected temporal shifts in measures such as missingness, min/max/mean/distinct values, and non-conformance.
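To make the idea of a "temporal shift" in missingness or summary values concrete, here is a minimal base R sketch. It is not daiquiri's implementation or API: the `records` data frame, its column names, and the `summarise_day()` helper are all hypothetical, and the package itself produces plots and reports rather than a bare table.

```r
# Hypothetical record-level data: one row per event, one "event date" column.
records <- data.frame(
  event_date = as.Date(c("2022-01-01", "2022-01-01", "2022-01-02",
                         "2022-01-02", "2022-01-03")),
  value = c(1.5, NA, 2.0, 4.0, NA)
)

# Hypothetical helper: summarise one day's values.
summarise_day <- function(x) {
  present <- x[!is.na(x)]
  has_values <- length(present) > 0
  c(
    n = length(x),
    n_missing = sum(is.na(x)),
    min = if (has_values) min(present) else NA_real_,
    max = if (has_values) max(present) else NA_real_,
    mean = if (has_values) mean(present) else NA_real_
  )
}

# Split the values by day and stack the per-day summaries into one table.
per_day <- lapply(split(records$value, records$event_date), summarise_day)
by_day <- do.call(rbind, per_day)
by_day <- data.frame(event_date = as.Date(rownames(by_day)), by_day)
rownames(by_day) <- NULL

print(by_day)
```

Plotting each column of `by_day` against `event_date` gives the kind of per-field time series in which a sudden change in missingness or range becomes visible.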

The target audience is all researchers who analyse data from large, temporal datasets, particularly routinely-collected data such as electronic health records. The package helps them to quickly check for temporal biases in their data before embarking on their main analyses. It also helps them to do this in a thorough, consistent and transparent way (since the reports are shareable), hence increasing the quality of their studies as well as trust in the scientific process.

To my knowledge, there are a small number of R packages that generate summary statistics and/or data quality reports (the two most similar being dataquieR and DQAstats), but none that assist in identifying temporal changes in the data, nor any that are as lightweight to use and consume.

Yes

https://github.com/ropensci/software-review/issues/527

Here is the list of items reported by pkgcheck version 0.0.3.11. The latest development version (0.0.3.13) is not currently working on Windows.

  1. These functions do not have examples: [daiquiri] - This is not actually a function, daiquiri is the package help page and includes links to the "most useful" function and vignette.
  2. Package uses global assignment operator (‘<<-’). - This is necessary for a withCallingHandlers() call
  3. Package has no continuous integration checks. - The package uses GitHub Actions for R-CMD-check and for covr so I don't know why this is showing as a fail.
  4. Namespace in Imports field not imported from: ‘reactable’ All declared Imports should be used - The 'reactable' package is in fact used but is probably being missed because it is called from an Rmd file in the inst/rmd folder
  5. There are two functions with cyclocomplexity above 15: a) validate_params_type() and b) aggregatefield(). These functions are essentially large switch statements covering a) all arguments to all exported functions (for validating what a user passes to the function, and where some arguments are used across multiple functions), and b) all the different ways that the package can aggregate a data field (e.g. mean, min, max). The only obvious way I can see to reduce the cyclocomplexity score would be to separate out the items into subfunctions, but I judge that would actually make the code harder to follow rather than easier. I am open to other suggestions.
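For readers unfamiliar with why item 2 arises, the following is a minimal sketch of the general pattern, not code taken from daiquiri. The `collect_warnings()` function is hypothetical. A handler passed to `withCallingHandlers()` is itself a function, so it needs `<<-` to update a variable that lives in the function that called it.

```r
# Collect every warning raised by `expr` while still returning its value.
collect_warnings <- function(expr) {
  warnings_seen <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      # `<-` here would only create a copy local to the handler;
      # `<<-` updates `warnings_seen` in the enclosing function.
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      # Stop the warning from propagating further, then resume `expr`.
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = warnings_seen)
}

res <- collect_warnings({
  warning("first problem")
  warning("second problem")
  42
})
print(res)
```

In this sketch `<<-` never touches the global environment, because the nearest enclosing definition of `warnings_seen` is inside `collect_warnings()`. Static checks usually flag the operator regardless of where the assignment lands.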

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options

- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

ropensci-review-bot commented 2 years ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 2 years ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 years ago

Checks for daiquiri (v0.7.1)

git hash: c316aeb1

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>=3)


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type |package | ncalls|
|:----------|:----------|------:|
|internal |base | 465|
|internal |daiquiri | 120|
|internal |graphics | 6|
|imports |ggplot2 | 63|
|imports |stats | 26|
|imports |data.table | 17|
|imports |utils | 9|
|imports |scales | 8|
|imports |readr | 5|
|imports |cowplot | 5|
|imports |rmarkdown | 1|
|imports |reactable | NA|
|suggests |covr | NA|
|suggests |knitr | NA|
|suggests |testthat | NA|
|suggests |codemetar | NA|
|linking_to |NA | NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (83), c (29), format (23), sum (22), length (20), names (20), vapply (17), structure (16), is.na (15), for (12), inherits (11), with (11), character (10), is.nan (10), max (9), suppressWarnings (8), seq_along (7), class (6), labels (6), min (6), paste0 (6), which (6), as.character (5), call (5), logical (5), message (5), ncol (5), options (5), anyNA (4), comment (4), lapply (4), mean (4), quote (4), unique (4), as.integer (3), file (3), nrow (3), vector (3), by (2), ceiling (2), data.frame (2), file.path (2), formals (2), grep (2), nchar (2), strsplit (2), which.max (2), as.Date (1), as.list (1), as.numeric (1), col (1), do.call (1), dQuote (1), emptyenv (1), floor (1), gsub (1), is.symbol (1), log10 (1), missing (1), new.env (1), open (1), paste (1), q (1), rle (1), row.names (1), save (1), seq (1), sort (1), substring (1), sys.calls (1), system.file (1), typeof (1), unlist (1), warning (1), withCallingHandlers (1)

daiquiri

fieldtypes (15), datafield (14), fieldtype (11), get_datafield_max (4), get_datafield_min (4), is.fieldtype_timepoint (3), timepoint_as_aggregationunit (3), aggregate_data (2), aggtype_friendlyname (2), fieldtypes_to_cols (2), ft_allfields (2), ft_ignore (2), ft_timepoint (2), get_datafield_missing (2), identify_duplicaterows (2), is.fieldtype (2), log_initialise (2), plot_overview_heatmap_static (2), plot_overview_totals_static (2), aggregateallfields (1), aggregatefield (1), create_report (1), export_aggregated_data (1), fieldtypes_template (1), fieldtypes_to_string (1), ft_categorical (1), ft_datetime (1), ft_duplicates (1), ft_freetext (1), ft_numeric (1), ft_simple (1), ft_uniqueidentifier (1), get_aggfunctions (1), get_collector (1), get_dataclass (1), get_datafield_basetype (1), get_datafield_count (1), get_datafield_fieldtype_name (1), get_datafield_validation_warnings_n (1), get_datafield_vector (1), get_fieldtype_name (1), is.aggregatedata (1), is.aggregatefield (1), is.datafield (1), is.fieldtype_calculated (1), is.fieldtype_datetime (1), is.fieldtype_ignore (1), is.fieldtype_numeric (1), is.fieldtypes (1), is.sourcedata (1), log_close (1), log_function_end (1), log_function_start (1), log_message (1), plot_overview_combo_static (1), plot_timeseries_static (1), prepare_data (1), report_data (1), summarise_aggregated_data (1), summarise_source_data (1), yscale_breaks (1)

ggplot2

element_text (17), theme (7), ggplot (6), element_blank (5), labs (5), aes_string (4), a (3), element_rect (3), scale_fill_gradient (3), scale_x_date (3), facet_grid (2), geom_point (2), geom_line (1), scale_y_continuous (1), unit (1)

stats

df (10), dt (7), heatmap (6), median (3)

data.table

data.table (10), fifelse (4), as.data.table (2), copy (1)

utils

data (7), object.size (1), packageNa (1)

scales

label_date_short (5), breaks_pretty (3)

graphics

title (6)

cowplot

plot_grid (5)

readr

read_delim (2), col_character (1), cols (1), type_convert (1)

rmarkdown

render (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in R (100% in 8 files) and
- 1 authors
- 1 vignette
- no internal data file
- 9 imported packages
- 18 exported functions (median 10 lines of code)
- 130 non-exported functions in R (median 8 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html)

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R | 8| 50.7| |
|files_vignettes | 1| 68.4| |
|files_tests | 7| 86.4| |
|loc_R | 1314| 75.1| |
|loc_vignettes | 143| 37.3| |
|loc_tests | 690| 80.9| |
|num_vignettes | 1| 64.8| |
|n_fns_r | 148| 84.9| |
|n_fns_r_exported | 18| 64.2| |
|n_fns_r_not_exported | 130| 88.2| |
|n_fns_per_file_r | 11| 88.4| |
|num_params_per_fn | 2| 10.4| |
|loc_per_fn_r | 8| 20.0| |
|loc_per_fn_r_exp | 10| 22.2| |
|loc_per_fn_r_not_exp | 8| 22.6| |
|rel_whitespace_R | 18| 74.6| |
|rel_whitespace_vignettes | 45| 47.5| |
|rel_whitespace_tests | 21| 80.2| |
|doclines_per_fn_exp | 68| 78.4| |
|doclines_per_fn_not_exp | 0| 0.0|TRUE |
|fn_call_network_size | 141| 84.4| |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice and other checks (click to open)

#### 3a. Continuous Integration Badges

[![R-CMD-check](https://github.com/phuongquan/daiquiri/workflows/R-CMD-check/badge.svg)](https://github.com/phuongquan/daiquiri/actions)

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:-------------|:----------|:------|----------:|:----------|
| 2332879957|R-CMD-check |success |c316ae | 17|2022-05-16 |
| 2332879944|test-coverage |success |c316ae | 16|2022-05-16 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following note:

1. checking dependencies in R code ... NOTE
   Namespace in Imports field not imported from: ‘reactable’
   All declared Imports should be used.

R CMD check generated the following check_fails:

1. cyclocomp
2. rcmdcheck_imports_not_imported_from

#### Test coverage with [covr](https://covr.r-lib.org/)

ERROR: Test Coverage Failed

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
validate_params_type | 84
aggregatefield | 56

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 550 potential issues:

message | number of times
--- | ---
Lines should not be more than 80 characters. | 550
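The submission describes the two functions flagged for cyclocomplexity as essentially large switch statements. As a rough illustration only, the sketch below contrasts a switch-based aggregator with a lookup table of functions, which is one possible way to dispatch without a long chain of branches. The function names and aggregation types are hypothetical and are not daiquiri's code. Whether this form is easier to follow is a judgment call, and its effect on the score reported by cyclocomp has not been checked here.

```r
# Hypothetical aggregator written as a single switch statement.
aggregate_switch <- function(x, type) {
  switch(type,
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE),
    missing_n = sum(is.na(x)),
    stop("Unknown aggregation type: ", type)
  )
}

# The same behaviour expressed as a named list of small functions.
aggregators <- list(
  min = function(x) min(x, na.rm = TRUE),
  max = function(x) max(x, na.rm = TRUE),
  mean = function(x) mean(x, na.rm = TRUE),
  missing_n = function(x) sum(is.na(x))
)

aggregate_lookup <- function(x, type) {
  fn <- aggregators[[type]]
  # `[[` on a list returns NULL for a name that is not present.
  if (is.null(fn)) stop("Unknown aggregation type: ", type)
  fn(x)
}

x <- c(3, NA, 7, 5)
for (type in names(aggregators)) {
  cat(type, ":", aggregate_switch(x, type), aggregate_lookup(x, type), "\n")
}
```

Adding a new aggregation type to the lookup version means adding one entry to `aggregators`, leaving the dispatching function unchanged.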


Package Versions

|package |version |
|:--------|:--------|
|pkgstats |0.0.4.30 |
|pkgcheck |0.0.3.19 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

phuongquan commented 2 years ago

Hi I think there is a problem with pkgcheck.

As you can see from the badge on https://github.com/phuongquan/daiquiri, code coverage is 96%, so I don't know why the Test Coverage check has failed your side.

Also, as I mentioned in the submission, the item "These functions do not have examples: [daiquiri]" does not appear to be appropriate since daiquiri is not a function but is the package itself (following https://r-pkgs.org/man.html#man-packages). If you want me to add examples for a selection of functions in the package I can do so, but it is not clear to me if that is appropriate.

Thanks.

mpadge commented 2 years ago

@phuongquan You're right about the functions not having examples. That was a bug which has now been fixed, so thanks! You can ignore that result. The coverage issue is, however, more problematic. I've tried to run covr::package_coverage() (as well as codecov()) in a couple of different docker containers and always encounter this error:

simpleWarning in utils::install.packages(repos = NULL, lib = tmp_lib, pkg$path, : installation of package ‘/daiquiri’ had non-zero exit status

simpleWarning in file(con, "r"): cannot open file '/tmp/RtmpRuQwPt/R_LIBS14f26d8825f2/daiquiri/R/daiquiri': No such file or directory

Error in file(con, "r"): cannot open the connection

This is clearly what causes the fails on our system. You should be able to reproduce by cloning your repo in a clean docker container and trying for yourself. I nevertheless see that the GitHub action works, and that your coverage is indeed as you claim, so you may also choose to ignore that if you like. My guess is that it's related to {renv}. covr installs the package in a temporary library location, but our check system loads the package prior to that and so populates your renv directory. That then does not get copied across to the temporary library location used by covr, and so installation fails. It works on GitHub actions because that uses a fresh clone of the repo which populates renv in the temporary library location.

phuongquan commented 2 years ago

Hi I just wanted to check if there is anything you are waiting on me for?

emilyriederer commented 2 years ago

Hi @phuongquan - I apologize for the delay on my end. Thank you again for your submission. I will work on finding an editor and follow up shortly.

emilyriederer commented 2 years ago

@ropensci-review-bot assign @maurolepore as editor

ropensci-review-bot commented 2 years ago

Assigned! @maurolepore is now the editor

maurolepore commented 2 years ago

Hi @phuongquan, I'm pleased to be the handling editor of this submission.

I'll re-run checks now. I see the comments above suggest some checks may fail for reasons beyond your control. I'll explore those issues and walk through the editor's template by the end of this week.

maurolepore commented 2 years ago

@ropensci-review-bot check package

ropensci-review-bot commented 2 years ago

Thanks, about to send the query.

ropensci-review-bot commented 2 years ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 years ago

Checks for daiquiri (v0.7.1)

git hash: c316aeb1

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>=3)


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type |package | ncalls|
|:----------|:----------|------:|
|internal |base | 465|
|internal |daiquiri | 120|
|internal |graphics | 6|
|imports |ggplot2 | 63|
|imports |stats | 26|
|imports |data.table | 17|
|imports |utils | 9|
|imports |scales | 8|
|imports |readr | 5|
|imports |cowplot | 5|
|imports |rmarkdown | 1|
|imports |reactable | NA|
|suggests |covr | NA|
|suggests |knitr | NA|
|suggests |testthat | NA|
|suggests |codemetar | NA|
|linking_to |NA | NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (83), c (29), format (23), sum (22), length (20), names (20), vapply (17), structure (16), is.na (15), for (12), inherits (11), with (11), character (10), is.nan (10), max (9), suppressWarnings (8), seq_along (7), class (6), labels (6), min (6), paste0 (6), which (6), as.character (5), call (5), logical (5), message (5), ncol (5), options (5), anyNA (4), comment (4), lapply (4), mean (4), quote (4), unique (4), as.integer (3), file (3), nrow (3), vector (3), by (2), ceiling (2), data.frame (2), file.path (2), formals (2), grep (2), nchar (2), strsplit (2), which.max (2), as.Date (1), as.list (1), as.numeric (1), col (1), do.call (1), dQuote (1), emptyenv (1), floor (1), gsub (1), is.symbol (1), log10 (1), missing (1), new.env (1), open (1), paste (1), q (1), rle (1), row.names (1), save (1), seq (1), sort (1), substring (1), sys.calls (1), system.file (1), typeof (1), unlist (1), warning (1), withCallingHandlers (1)

daiquiri

fieldtypes (15), datafield (14), fieldtype (11), get_datafield_max (4), get_datafield_min (4), is.fieldtype_timepoint (3), timepoint_as_aggregationunit (3), aggregate_data (2), aggtype_friendlyname (2), fieldtypes_to_cols (2), ft_allfields (2), ft_ignore (2), ft_timepoint (2), get_datafield_missing (2), identify_duplicaterows (2), is.fieldtype (2), log_initialise (2), plot_overview_heatmap_static (2), plot_overview_totals_static (2), aggregateallfields (1), aggregatefield (1), create_report (1), export_aggregated_data (1), fieldtypes_template (1), fieldtypes_to_string (1), ft_categorical (1), ft_datetime (1), ft_duplicates (1), ft_freetext (1), ft_numeric (1), ft_simple (1), ft_uniqueidentifier (1), get_aggfunctions (1), get_collector (1), get_dataclass (1), get_datafield_basetype (1), get_datafield_count (1), get_datafield_fieldtype_name (1), get_datafield_validation_warnings_n (1), get_datafield_vector (1), get_fieldtype_name (1), is.aggregatedata (1), is.aggregatefield (1), is.datafield (1), is.fieldtype_calculated (1), is.fieldtype_datetime (1), is.fieldtype_ignore (1), is.fieldtype_numeric (1), is.fieldtypes (1), is.sourcedata (1), log_close (1), log_function_end (1), log_function_start (1), log_message (1), plot_overview_combo_static (1), plot_timeseries_static (1), prepare_data (1), report_data (1), summarise_aggregated_data (1), summarise_source_data (1), yscale_breaks (1)

ggplot2

element_text (17), theme (7), ggplot (6), element_blank (5), labs (5), aes_string (4), a (3), element_rect (3), scale_fill_gradient (3), scale_x_date (3), facet_grid (2), geom_point (2), geom_line (1), scale_y_continuous (1), unit (1)

stats

df (10), dt (7), heatmap (6), median (3)

data.table

data.table (10), fifelse (4), as.data.table (2), copy (1)

utils

data (7), object.size (1), packageNa (1)

scales

label_date_short (5), breaks_pretty (3)

graphics

title (6)

cowplot

plot_grid (5)

readr

read_delim (2), col_character (1), cols (1), type_convert (1)

rmarkdown

render (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in R (100% in 8 files) and
- 1 authors
- 1 vignette
- no internal data file
- 9 imported packages
- 18 exported functions (median 10 lines of code)
- 130 non-exported functions in R (median 8 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html)

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R | 8| 50.7| |
|files_vignettes | 1| 68.4| |
|files_tests | 7| 86.4| |
|loc_R | 1314| 75.1| |
|loc_vignettes | 143| 37.3| |
|loc_tests | 690| 80.9| |
|num_vignettes | 1| 64.8| |
|n_fns_r | 148| 84.9| |
|n_fns_r_exported | 18| 64.2| |
|n_fns_r_not_exported | 130| 88.2| |
|n_fns_per_file_r | 11| 88.4| |
|num_params_per_fn | 2| 10.4| |
|loc_per_fn_r | 8| 20.0| |
|loc_per_fn_r_exp | 10| 22.2| |
|loc_per_fn_r_not_exp | 8| 22.6| |
|rel_whitespace_R | 18| 74.6| |
|rel_whitespace_vignettes | 45| 47.5| |
|rel_whitespace_tests | 21| 80.2| |
|doclines_per_fn_exp | 68| 78.4| |
|doclines_per_fn_not_exp | 0| 0.0|TRUE |
|fn_call_network_size | 141| 84.4| |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges

[![R-CMD-check](https://github.com/phuongquan/daiquiri/workflows/R-CMD-check/badge.svg)](https://github.com/phuongquan/daiquiri/actions)

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:-------------|:----------|:------|----------:|:----------|
| 2332879957|R-CMD-check |success |c316ae | 17|2022-05-16 |
| 2332879944|test-coverage |success |c316ae | 16|2022-05-16 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following note:

1. checking dependencies in R code ... NOTE
   Namespace in Imports field not imported from: ‘reactable’
   All declared Imports should be used.

R CMD check generated the following check_fails:

1. cyclocomp
2. rcmdcheck_imports_not_imported_from

#### Test coverage with [covr](https://covr.r-lib.org/)

ERROR: Test Coverage Failed

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
validate_params_type | 84
aggregatefield | 56

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 550 potential issues:

message | number of times
--- | ---
Lines should not be more than 80 characters. | 550


4. Other Checks

Details of other checks (click to open)

:heavy_multiplication_x: The following 5 function names are duplicated in other packages:

- `aggregate_data` from simITS
- `create_report` from DataExplorer, prodigenr, reporter
- `log_close` from logr
- `prepare_data` from bbsBayes, bigstep, childsds, corporaexplorer, disaggregation, fHMM, ggasym, multigroup, multimorbidity, mutualinf, nmm, parsnip, PLNmodels, sglOptim, shapr, ssMousetrack
- `read_data` from creditmodel, deaR, deforestable, diverse, ecocomDP, GeodesiCL, logib, metrix, prepdat, qtlpoly, RTextTools, sjlabelled, whippr


Package Versions

|package |version |
|:--------|:--------|
|pkgstats |0.0.4.55 |
|pkgcheck |0.0.3.40 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

mpadge commented 2 years ago

Note: We're still working on a fix for the package coverage failure there. In the meantime, see the package's GitHub workflow for the actual coverage: 96%. Sorry for the inconvenience @phuongquan.

maurolepore commented 2 years ago

Dear @phuongquan,

Thanks again for this awesome submission! I'm here to help you pass review smoothly. I looked for opportunities to make the package follow rOpenSci's guidelines as closely as possible, and to make the reviewers' job as easy as possible. After the section "Editor checks" you'll see my comments:

Please let me know what questions you have.

Editor checks:


Editor comments

TODO

Documentation:

Installation instructions:

# install.packages("devtools")
devtools::install_github("phuongquan/daiquiri@v0.7.0")

Or

# install.packages("pak")
pak::pkg_install("phuongquan/daiquiri@v0.7.0")
CONSIDER

Documentation:

Fit:

Installation instructions:

Tests:

semi-transparency is not supported on this device: reported only once per page
``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22)
#>  os       Ubuntu 20.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Guatemala
#>  date     2022-06-03
#>  pandoc   2.17.1.1 @ /usr/lib/rstudio/bin/quarto/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  brio          1.1.3   2021-11-30 [1] RSPM
#>  cachem        1.0.6   2021-08-19 [1] RSPM
#>  callr         3.7.0   2021-04-20 [1] RSPM
#>  cli           3.3.0   2022-04-25 [1] RSPM
#>  crayon        1.5.1   2022-03-26 [1] RSPM
#>  desc          1.4.1   2022-03-06 [1] RSPM
#>  devtools      2.4.3   2021-11-30 [1] RSPM
#>  digest        0.6.29  2021-12-01 [1] RSPM
#>  ellipsis      0.3.2   2021-04-29 [1] RSPM
#>  evaluate      0.15    2022-02-18 [1] RSPM
#>  fansi         1.0.3   2022-03-24 [1] RSPM
#>  fastmap       1.1.0   2021-01-25 [1] RSPM
#>  fs            1.5.2   2021-12-08 [1] RSPM
#>  glue          1.6.2   2022-02-24 [1] RSPM
#>  highr         0.9     2021-04-16 [1] RSPM
#>  htmltools     0.5.2   2021-08-25 [1] RSPM
#>  knitr         1.39    2022-04-26 [1] RSPM
#>  lifecycle     1.0.1   2021-09-24 [1] RSPM
#>  magrittr      2.0.3   2022-03-30 [1] RSPM
#>  memoise       2.0.1   2021-11-26 [1] RSPM
#>  pillar        1.7.0   2022-02-01 [1] RSPM
#>  pkgbuild      1.3.1   2021-12-20 [1] RSPM
#>  pkgconfig     2.0.3   2019-09-22 [1] RSPM
#>  pkgload       1.2.4   2021-11-30 [1] RSPM
#>  prettyunits   1.1.1   2020-01-24 [1] RSPM
#>  processx      3.5.3   2022-03-25 [1] RSPM
#>  ps            1.7.0   2022-04-23 [1] RSPM
#>  purrr         0.3.4   2020-04-17 [1] RSPM
#>  R.cache       0.15.0  2021-04-30 [1] RSPM
#>  R.methodsS3   1.8.1   2020-08-26 [1] RSPM
#>  R.oo          1.24.0  2020-08-26 [1] RSPM
#>  R.utils       2.11.0  2021-09-26 [1] RSPM
#>  R6            2.5.1   2021-08-19 [1] RSPM
#>  remotes       2.4.2   2021-11-30 [1] RSPM
#>  reprex        2.0.1   2021-08-05 [1] RSPM
#>  rlang         1.0.2   2022-03-04 [1] RSPM
#>  rmarkdown     2.14    2022-04-25 [1] RSPM
#>  rprojroot     2.0.3   2022-04-02 [1] RSPM
#>  rstudioapi    0.13    2020-11-12 [1] RSPM
#>  sessioninfo   1.2.2   2021-12-06 [1] RSPM
#>  stringi       1.7.6   2021-11-29 [1] RSPM
#>  stringr       1.4.0   2019-02-10 [1] RSPM
#>  styler        1.7.0   2022-03-13 [1] RSPM
#>  testthat      3.1.4   2022-04-26 [1] RSPM
#>  tibble        3.1.7   2022-05-03 [1] RSPM (R 4.2.0)
#>  usethis       2.1.6   2022-05-25 [1] RSPM (R 4.2.0)
#>  utf8          1.2.2   2021-07-24 [1] RSPM
#>  vctrs         0.4.1   2022-04-13 [1] RSPM
#>  withr         2.5.0   2022-03-03 [1] RSPM
#>  xfun          0.31    2022-05-10 [1] RSPM (R 4.2.0)
#>  yaml          2.3.5   2022-02-21 [1] RSPM
#>
#>  [1] /home/mauro/R/x86_64-pc-linux-gnu-library/4.2
#>  [2] /opt/R/4.2.0/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

Created on 2022-06-03 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)

License:

Other:

* Project '/cloud/project' loaded. [renv 0.12.2]
Warning message:
Project requested R version '4.1.2' but '4.2.0' is currently being used 
* The project may be out of sync -- use `renv::status()` for more details.

"avoid long code lines, it's bad for readability. Also, many people prefer editor windows that are about 80 characters wide. Try to make your lines shorter than 80 characters"

# devtools::install_github("mangothecat/goodpractice") 
goodpractice::goodpractice()
❯ checking dependencies in R code ... NOTE
  Namespace in Imports field not imported from: ‘reactable’
    All declared Imports should be used.
phuongquan commented 2 years ago

Hi @maurolepore,

Many thanks for your comments. I have pushed the following changes:

Other items:

Lastly, the R CMD check note

Namespace in Imports field not imported from: ‘reactable’ All declared Imports should be used

is erroneous. The 'reactable' package is in fact used (and so needs to be in the DESCRIPTION file) but is probably being missed because it is called from an Rmd file in the inst/rmd folder.

Please let me know if you need anything more from me.

Kind regards,

Phuong

maurolepore commented 2 years ago

Thanks! We are almost there :-)

✔  checking examples (36.8s)
   Examples with CPU (user + system) or elapsed time > 5s
                   user system elapsed
   create_report 17.695  0.679  18.859
   report_data   13.871  0.165  14.737
❯ checking dependencies in R code ... NOTE
  Namespace in Imports field not imported from: ‘reactable’
    All declared Imports should be used.
mpadge commented 2 years ago

@maurolepore We've updated the automated checking in the meantime, so it now flags any submission which has renv in activated mode, and requires that to be de-activated for reviews to proceed.

maurolepore commented 2 years ago

@ropensci-review-bot check package

ropensci-review-bot commented 2 years ago

Thanks, about to send the query.

ropensci-review-bot commented 2 years ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 years ago

Checks for daiquiri (v0.7.2)

git hash: 089bd85e

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>= 3)


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type       |package    | ncalls|
|:----------|:----------|------:|
|internal   |base       |    402|
|internal   |daiquiri   |    102|
|internal   |graphics   |      6|
|internal   |mgcv       |      1|
|imports    |ggplot2    |     37|
|imports    |stats      |     19|
|imports    |data.table |     14|
|imports    |scales     |      8|
|imports    |utils      |      4|
|imports    |readr      |      3|
|imports    |cowplot    |     NA|
|imports    |rmarkdown  |     NA|
|imports    |reactable  |     NA|
|suggests   |covr       |     NA|
|suggests   |knitr      |     NA|
|suggests   |testthat   |     NA|
|suggests   |codemetar  |     NA|
|linking_to |NA         |     NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (74), c (28), sum (21), format (20), names (19), vapply (16), for (12), is.na (12), length (12), inherits (11), with (11), message (9), structure (9), character (8), suppressWarnings (8), seq_along (7), class (6), labels (6), max (6), as.character (5), call (5), logical (5), min (5), paste0 (5), which (5), as.double (4), is.nan (4), mean (4), options (4), unique (4), by (3), lapply (3), nrow (3), vector (3), anyNA (2), as.integer (2), ceiling (2), data.frame (2), file.path (2), formals (2), grep (2), nchar (2), which.max (2), as.Date (1), as.list (1), as.numeric (1), col (1), do.call (1), dQuote (1), emptyenv (1), file (1), gsub (1), is.symbol (1), missing (1), ncol (1), new.env (1), open (1), paste (1), q (1), rle (1), row.names (1), seq (1), sort (1), strsplit (1), substring (1), sys.calls (1), system.file (1), typeof (1), unlist (1), warning (1)

daiquiri

fieldtypes (12), datafield (11), fieldtype (11), get_datafield_max (4), get_datafield_min (4), is.fieldtype_timepoint (3), timepoint_as_aggregationunit (3), aggregate_data (2), ft_allfields (2), ft_ignore (2), ft_timepoint (2), get_datafield_missing (2), identify_duplicaterows (2), is.fieldtype (2), aggregateallfields (1), aggregatefield (1), aggtype_friendlyname (1), create_report (1), export_aggregated_data (1), fieldtypes_template (1), fieldtypes_to_string (1), ft_categorical (1), ft_datetime (1), ft_duplicates (1), ft_freetext (1), ft_numeric (1), ft_simple (1), ft_uniqueidentifier (1), get_aggfunctions (1), get_collector (1), get_dataclass (1), get_datafield_basetype (1), get_datafield_count (1), get_datafield_fieldtype_name (1), get_datafield_validation_warnings_n (1), get_datafield_vector (1), get_fieldtype_name (1), is.aggregatedata (1), is.aggregatefield (1), is.datafield (1), is.fieldtype_calculated (1), is.fieldtype_datetime (1), is.fieldtype_ignore (1), is.fieldtype_numeric (1), is.fieldtypes (1), is.sourcedata (1), log_initialise (1), plot_overview_heatmap_static (1), plot_overview_totals_static (1), prepare_data (1), report_data (1), summarise_aggregated_data (1), summarise_source_data (1), yscale_breaks (1)

ggplot2

element_text (10), element_blank (5), ggplot (5), labs (4), aes_string (3), facet_grid (2), geom_point (2), theme (2), element_rect (1), geom_line (1), scale_y_continuous (1), unit (1)

stats

df (7), dt (7), median (3), heatmap (2)

data.table

data.table (9), as.data.table (2), fifelse (2), copy (1)

scales

label_date_short (5), breaks_pretty (3)

graphics

title (5), axis (1)

utils

data (3), object.size (1)

readr

cols (3)

mgcv

s (1)

**NOTE:** Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in R (100% in 8 files) and
- 1 authors
- 1 vignette
- no internal data file
- 9 imported packages
- 18 exported functions (median 10 lines of code)
- 130 non-exported functions in R (median 9 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html)

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                  | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R                  |     8|       50.7|           |
|files_vignettes          |     1|       68.4|           |
|files_tests              |     7|       86.4|           |
|loc_R                    |  1914|       83.4|           |
|loc_vignettes            |   143|       37.3|           |
|loc_tests                |   878|       85.1|           |
|num_vignettes            |     1|       64.8|           |
|n_fns_r                  |   148|       84.9|           |
|n_fns_r_exported         |    18|       64.2|           |
|n_fns_r_not_exported     |   130|       88.2|           |
|n_fns_per_file_r         |    11|       88.4|           |
|num_params_per_fn        |     2|       10.4|           |
|loc_per_fn_r             |    10|       27.7|           |
|loc_per_fn_r_exp         |    10|       22.2|           |
|loc_per_fn_r_not_exp     |     9|       27.1|           |
|rel_whitespace_R         |    12|       75.0|           |
|rel_whitespace_vignettes |    45|       47.5|           |
|rel_whitespace_tests     |    17|       79.9|           |
|doclines_per_fn_exp      |    68|       78.4|           |
|doclines_per_fn_not_exp  |     0|        0.0|TRUE       |
|fn_call_network_size     |   141|       84.4|           |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges

[![R-CMD-check](https://github.com/phuongquan/daiquiri/workflows/R-CMD-check/badge.svg)](https://github.com/phuongquan/daiquiri/actions)

**GitHub Workflow Results**

|         id|name                       |conclusion |sha    | run_number|date       |
|----------:|:--------------------------|:----------|:------|----------:|:----------|
| 2462421281|pages build and deployment |success    |089bd8 |          3|2022-06-08 |
| 2462421408|R-CMD-check                |success    |089bd8 |         19|2022-06-08 |
| 2462421399|test-coverage              |success    |089bd8 |         18|2022-06-08 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following note:

1. checking dependencies in R code ... NOTE
   Namespace in Imports field not imported from: ‘reactable’
   All declared Imports should be used.

R CMD check generated the following check_fails:

1. cyclocomp
2. rcmdcheck_imports_not_imported_from

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 96.12

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
validate_params_type | 84
aggregatefield | 56

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 277 potential issues:

message | number of times
--- | ---
Lines should not be more than 80 characters. | 277
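The line-length tally above can be approximated locally without lintr. The following is a minimal base R sketch, assuming a plain character count per line; the helper name `long_lines` is hypothetical, and lintr's own linter may count some lines differently.

```r
# Hypothetical helper: return the line numbers in a file whose length
# exceeds `limit` characters (80 by default).
long_lines <- function(path, limit = 80) {
  lines <- readLines(path, warn = FALSE)
  # allowNA = TRUE avoids an error on invalid multibyte strings
  which(nchar(lines, type = "chars", allowNA = TRUE) > limit)
}
```

Applying it to each file returned by `list.files("R", full.names = TRUE)` gives a per-file list of lines to shorten.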


4. Other Checks

Details of other checks (click to open)

:heavy_multiplication_x: The following 5 function names are duplicated in other packages:

- `aggregate_data` from simITS
- `create_report` from DataExplorer, prodigenr, reporter
- `log_close` from logr
- `prepare_data` from bbsBayes, bigstep, childsds, corporaexplorer, disaggregation, fHMM, ggasym, multigroup, multimorbidity, mutualinf, nmm, parsnip, PLNmodels, sglOptim, shapr, ssMousetrack
- `read_data` from creditmodel, deaR, deforestable, diverse, ecocomDP, GeodesiCL, logib, metrix, prepdat, qtlpoly, RTextTools, sjlabelled, whippr


Package Versions

|package  |version  |
|:--------|:--------|
|pkgstats |0.0.4.75 |
|pkgcheck |0.0.3.60 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

phuongquan commented 2 years ago

Hi @maurolepore,

I have pushed the following changes:

Please let me know if you need anything more.

Thanks, Phuong

maurolepore commented 2 years ago

Thanks! I'll run checks one last time and start searching for reviewers shortly.

Could you please name three people you think might be good reviewers and explain why?

(I would appoint only one but having more helps understand what kinds of skill you think are useful.)

Here are some more non-blocking comments.

library(daiquiri)

path <- daiquiri_example("raw_data.csv")
raw_data <- readr::read_csv(path)
raw_data

if (interactive()) {
  create_report(raw_data)
}

Here is a version with comments for you

library(daiquiri)

# Maybe write this helper (inspired by `readr::readr_example`)
path <- daiquiri_example("raw_data.csv")
# Maybe create a basic .csv that reads well directly with read_csv()?
# I think read_data() might be helpful but not the main goal of the package
raw_data <- readr::read_csv(path)

# Do show what the data looks like. Users may want to know if the data they
# have is a good fit for this package. Ensure to show just a bit. If it doesn't print
# as a tibble, then use `head()` or `str()`.
raw_data

# It won't run when you render the README. It will run if you copy-paste this code
if (interactive()) {
  # If you add this default: `fieldtypes = fieldtypes()`
  create_report(raw_data)
}
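
A minimal sketch of the kind of helper suggested above, modelled on `readr::readr_example()`. It assumes the example files are installed under `inst/extdata`, which is an assumption for illustration and not confirmed by this thread.

```r
# Sketch of an example-file helper, modelled on readr::readr_example().
# Assumes the example files live in the package's inst/extdata directory.
daiquiri_example <- function(file = NULL) {
  if (is.null(file)) {
    # With no argument, list the available example files
    dir(system.file("extdata", package = "daiquiri"))
  } else {
    # Otherwise return the full path, failing loudly if the file is missing
    system.file("extdata", file, package = "daiquiri", mustWork = TRUE)
  }
}
```
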
maurolepore commented 2 years ago

@ropensci-review-bot check package

ropensci-review-bot commented 2 years ago

Thanks, about to send the query.

ropensci-review-bot commented 2 years ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 2 years ago

Checks for daiquiri (v0.7.3)

git hash: 339a93e8

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>= 3)


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type       |package    | ncalls|
|:----------|:----------|------:|
|internal   |base       |    401|
|internal   |daiquiri   |    102|
|internal   |graphics   |      6|
|internal   |mgcv       |      1|
|imports    |ggplot2    |     37|
|imports    |stats      |     19|
|imports    |data.table |     14|
|imports    |scales     |      8|
|imports    |utils      |      4|
|imports    |readr      |      3|
|imports    |reactable  |      1|
|imports    |cowplot    |     NA|
|imports    |rmarkdown  |     NA|
|suggests   |covr       |     NA|
|suggests   |knitr      |     NA|
|suggests   |testthat   |     NA|
|suggests   |codemetar  |     NA|
|linking_to |NA         |     NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (74), c (28), sum (21), format (20), names (19), vapply (16), for (12), is.na (12), length (12), with (11), inherits (10), message (9), structure (9), character (8), suppressWarnings (8), seq_along (7), class (6), labels (6), max (6), as.character (5), call (5), logical (5), min (5), paste0 (5), which (5), as.double (4), is.nan (4), mean (4), options (4), unique (4), by (3), lapply (3), nrow (3), vector (3), anyNA (2), as.integer (2), ceiling (2), data.frame (2), file.path (2), formals (2), grep (2), nchar (2), which.max (2), as.Date (1), as.list (1), as.numeric (1), col (1), do.call (1), dQuote (1), emptyenv (1), file (1), gsub (1), is.symbol (1), missing (1), ncol (1), new.env (1), open (1), paste (1), q (1), rle (1), row.names (1), seq (1), sort (1), strsplit (1), substring (1), sys.calls (1), system.file (1), typeof (1), unlist (1), warning (1)

daiquiri

fieldtypes (12), datafield (11), fieldtype (11), get_datafield_max (4), get_datafield_min (4), is.fieldtype_timepoint (3), timepoint_as_aggregationunit (3), aggregate_data (2), ft_allfields (2), ft_ignore (2), ft_timepoint (2), get_datafield_missing (2), identify_duplicaterows (2), is.fieldtype (2), aggregateallfields (1), aggregatefield (1), aggtype_friendlyname (1), create_report (1), dummy_reactable_call (1), export_aggregated_data (1), fieldtypes_template (1), fieldtypes_to_string (1), ft_categorical (1), ft_datetime (1), ft_duplicates (1), ft_freetext (1), ft_numeric (1), ft_simple (1), ft_uniqueidentifier (1), get_aggfunctions (1), get_collector (1), get_dataclass (1), get_datafield_basetype (1), get_datafield_count (1), get_datafield_fieldtype_name (1), get_datafield_validation_warnings_n (1), get_datafield_vector (1), get_fieldtype_name (1), is.aggregatedata (1), is.aggregatefield (1), is.datafield (1), is.fieldtype_calculated (1), is.fieldtype_datetime (1), is.fieldtype_ignore (1), is.fieldtype_numeric (1), is.fieldtypes (1), log_initialise (1), plot_overview_heatmap_static (1), plot_overview_totals_static (1), prepare_data (1), report_data (1), summarise_aggregated_data (1), summarise_source_data (1), yscale_breaks (1)

ggplot2

element_text (10), element_blank (5), ggplot (5), labs (4), aes_string (3), facet_grid (2), geom_point (2), theme (2), element_rect (1), geom_line (1), scale_y_continuous (1), unit (1)

stats

df (7), dt (7), median (3), heatmap (2)

data.table

data.table (9), as.data.table (2), fifelse (2), copy (1)

scales

label_date_short (5), breaks_pretty (3)

graphics

title (5), axis (1)

utils

data (3), object.size (1)

readr

cols (3)

mgcv

s (1)

reactable

colDef (1)

**NOTE:** Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in R (100% in 8 files) and
- 1 authors
- 1 vignette
- no internal data file
- 9 imported packages
- 18 exported functions (median 10 lines of code)
- 132 non-exported functions in R (median 9 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html)

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                  | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R                  |     8|       50.7|           |
|files_vignettes          |     1|       68.4|           |
|files_tests              |     7|       86.4|           |
|loc_R                    |  1917|       83.4|           |
|loc_vignettes            |   143|       37.3|           |
|loc_tests                |   878|       85.1|           |
|num_vignettes            |     1|       64.8|           |
|n_fns_r                  |   150|       85.2|           |
|n_fns_r_exported         |    18|       64.2|           |
|n_fns_r_not_exported     |   132|       88.4|           |
|n_fns_per_file_r         |    11|       88.5|           |
|num_params_per_fn        |     2|       10.4|           |
|loc_per_fn_r             |     9|       24.3|           |
|loc_per_fn_r_exp         |    10|       22.2|           |
|loc_per_fn_r_not_exp     |     9|       27.1|           |
|rel_whitespace_R         |    13|       75.2|           |
|rel_whitespace_vignettes |    45|       47.5|           |
|rel_whitespace_tests     |    17|       79.9|           |
|doclines_per_fn_exp      |    69|       79.2|           |
|doclines_per_fn_not_exp  |     0|        0.0|TRUE       |
|fn_call_network_size     |   141|       84.4|           |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges

[![R-CMD-check](https://github.com/phuongquan/daiquiri/workflows/R-CMD-check/badge.svg)](https://github.com/phuongquan/daiquiri/actions)

**GitHub Workflow Results**

|         id|name                       |conclusion |sha    | run_number|date       |
|----------:|:--------------------------|:----------|:------|----------:|:----------|
| 2503730587|pages build and deployment |success    |339a93 |          7|2022-06-15 |
| 2503730618|R-CMD-check                |success    |339a93 |         23|2022-06-15 |
| 2503730623|test-coverage              |success    |339a93 |         22|2022-06-15 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following check_fail:

1. cyclocomp

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 96.06

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
validate_params_type | 84
aggregatefield | 56

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 279 potential issues:

message | number of times
--- | ---
Lines should not be more than 80 characters. | 279


4. Other Checks

Details of other checks (click to open)

:heavy_multiplication_x: The following 5 function names are duplicated in other packages:

- `aggregate_data` from simITS
- `create_report` from DataExplorer, prodigenr, reporter
- `log_close` from logr
- `prepare_data` from bbsBayes, bigstep, childsds, corporaexplorer, disaggregation, fHMM, ggasym, multigroup, multimorbidity, mutualinf, nmm, parsnip, PLNmodels, sglOptim, shapr, ssMousetrack
- `read_data` from creditmodel, deaR, deforestable, diverse, ecocomDP, GeodesiCL, logib, metrix, prepdat, qtlpoly, RTextTools, sjlabelled, whippr


Package Versions

|package  |version  |
|:--------|:--------|
|pkgstats |0.0.4.75 |
|pkgcheck |0.0.3.60 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

phuongquan commented 2 years ago

Hi @maurolepore,

I'd like to suggest the following reviewers:

These two people have worked on R packages looking at data quality in the same research field that daiquiri was originally developed for:

And this person comes from a more general area of reviewing data in a friendly/attractive way:

Thanks, Phuong

phuongquan commented 2 years ago

Hi @maurolepore,

I just read the Peer review policies and think the suggestions I made may have conflicts of interest in that they are significant contributors to what may be seen as "competitor projects". I would like to suggest the following reviewers instead, who are more from the user side:

Thanks! Phuong

phuongquan commented 2 years ago

Hi @maurolepore,

In response to your other comments:

  • ml01. Maybe you could minimize your example to something like this:
library(daiquiri)

path <- daiquiri_example("raw_data.csv")
raw_data <- readr::read_csv(path)
raw_data

if (interactive()) {
  create_report(raw_data)
}

Here is a version with comments for you

library(daiquiri)

# Maybe write this helper (inspired by `readr::readr_example`)
path <- daiquiri_example("raw_data.csv")
# Maybe create a basic .csv that reads well directly with read_csv()?
# I think read_data() might be helpful but not the main goal of the package
raw_data <- readr::read_csv(path)

# Do show what the data looks like. Users may want to know if the data they
# have is a good fit for this package. Ensure to show just a bit. If it doesn't print
# as a tibble, then use `head()` or `str()`.
raw_data

# It won't run when you render the README. It will run if you copy-paste this code
if (interactive()) {
  # If you add this default: `fieldtypes = fieldtypes()`
  create_report(raw_data)
}

I have condensed the example to remove arguments that have defaults and to show the head of the example data. I have also tried to make it clearer that the main purpose of the read_data() function is to read in the data without doing any datatype conversions, so that daiquiri can check for any non-conformant values. (If readr is used to read in the data, things like invalid dates get removed automatically, so daiquiri cannot then report on them.)
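The point about datatype conversion can be illustrated with a small base R sketch. This is not how read_data() is implemented; it only shows why reading every column as character keeps invalid values available for reporting. The inline CSV is made-up example data.

```r
# Made-up example data: the second row has an impossible date.
csv <- "event_date,value
2022-01-01,1
2022-13-45,2
"

# Read every column as character so nothing is converted or dropped.
raw <- read.csv(text = csv, colClasses = "character")

# Attempt the conversion separately; invalid dates become NA here.
parsed <- as.Date(raw$event_date, format = "%Y-%m-%d")

# Values that were present in the file but failed to parse.
nonconformant <- raw$event_date[is.na(parsed) & !is.na(raw$event_date)]
nonconformant
# "2022-13-45"
```

Had the column been converted while reading, only the `NA` would remain and the original string could no longer be reported.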

  • ml05: Would it run faster if you make the example raw_data smaller, like 1/4 or 1/10 of the size it has now?

No, reducing the size of the data won't make a difference as the thing that takes all the time is the knitting of the rmarkdown report which I can't make any faster.

Thanks again, Phuong

maurolepore commented 2 years ago

@ropensci-review-bot seeking reviewers

maurolepore commented 2 years ago

@ropensci-review-bot assign @brad-cannell as reviewer

ropensci-review-bot commented 2 years ago

@brad-cannell added to the reviewers list. Review due date is 2022-07-27. Thanks @brad-cannell for accepting to review! Please refer to our reviewer guide.

ropensci-review-bot commented 2 years ago

@brad-cannell: If you haven't done so, please fill this form for us to update our reviewers records.

phuongquan commented 2 years ago

Hi @maurolepore, I've noticed that the label is still showing as 1/editor-checks when it should be 2/seeking-reviewers. I imagine something went wrong with the bot? I just wanted to raise it in case it makes a difference in moving the process along.

Thanks, Phuong

maurolepore commented 2 years ago

Thanks! You're right. I just changed the label manually.

FYI I'm still in search of the second reviewer. Sometimes it's quick; sometimes it's not so quick. Once we reach out to someone we give them a few days to respond before we move to someone else.

rsangole commented 2 years ago

I really appreciate the request to review, and I'd love to do so on this tool too. Unfortunately, Apple (where I work) is pretty strict about open source contributions, so it could be months before I get approval to do so. Your timeline may benefit from finding another reviewer.

ropensci-review-bot commented 2 years ago

:calendar: @brad-cannell you have 2 days left before the due date for your review (2022-07-27).

maurolepore commented 2 years ago

@ropensci-review-bot assign @elinw as reviewer

ropensci-review-bot commented 2 years ago

@elinw added to the reviewers list. Review due date is 2022-08-17. Thanks @elinw for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 2 years ago

@elinw: If you haven't done so, please fill this form for us to update our reviewers records.

maurolepore commented 2 years ago

@phuongquan I'm happy we now have two amazing reviewers! Looking forward to their feedback.

mbcann01 commented 2 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 3


Review Comments

Thank you for the opportunity to review this package. It looks really cool! I don't personally use time series data that often, but I can see how this package would be useful if I did. I have a couple of comments below that I hope will be helpful to the authors.

ropensci-review-bot commented 2 years ago

:calendar: @elinw you have 2 days left before the due date for your review (2022-08-17).

maurolepore commented 2 years ago

Dear @elinw, just a friendly reminder about your submission. How is it going?

maurolepore commented 2 years ago

Dear @phuongquan, Sorry for the delay. I'm struggling to reach @elinw.

maurolepore commented 2 years ago

Dear @phuongquan,

I discussed this with the editorial board and I think it would be best to either a) find a new reviewer or b) review the package myself. Which one do you prefer?

My review would be faster and I have substantial experience building packages in general. But an external reviewer has greater potential for noticing details specific to your field, and thus enriching your package in a conceptually useful way.

It's very rare for @elinw to be unresponsive, so there is likely a good reason for it, and it seems best to take the pressure off her.