ropensci / software-review

rOpenSci Software Peer Review.

qualtdict: Generating Variable Dictionaries and Labelled Data Exports of Qualtrics Surveys #572

Closed lyh970817 closed 1 week ago

lyh970817 commented 1 year ago

Submitting Author Name: Yuhao Lin
Submitting Author Github Handle: @lyh970817
Repository: https://github.com/lyh970817/qualtdict
Version submitted: 0.0.0.9000
Submission type: Standard
Editor: @maurolepore
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

Package: qualtdict
Title: Generating Variable Dictionaries and Labelled Data Exports of Qualtrics
    Surveys
Version: 0.0.0.9000
Authors@R:
    person("Yuhao", "Lin", , "yuhao.lin@kcl.ac.uk", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0001-6357-5731"))
Description: Provides functions that generate variable dictionaries from
    'Qualtrics' <https://www.qualtrics.com/about/> surveys and labelled
    survey data based on the dictionary. This package is built upon the R
    package 'qualtRics' <https://github.com/ropensci/qualtRics/> which
    provides access to 'Qualtrics' survey data and metadata via the 'Qualtrics' API
    <https://api.qualtrics.com/>.
License: MIT + file LICENSE
URL: https://github.com/lyh970817/qualtdict
BugReports: https://github.com/lyh970817/qualtdict/issues
Imports:
    crul,
    dplyr,
    glue,
    haven,
    magrittr,
    openNLP,
    purrr,
    qualtRics,
    rlang,
    sjlabelled,
    slowraker,
    SnowballC,
    stringi,
    stringr,
    tibble,
    tidyr,
    xml2
Suggests:
    covr,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0),
    vcr (>= 0.6.0)
VignetteBuilder: 
    knitr
Config/testthat/edition: 3
Config/testthat/start-first: dict_generate, dict_validate, get_survey_data
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3

Scope

Qualtrics is an online survey and data collection software platform. While the qualtRics R package implements data retrieval from the Qualtrics platform, this package 'qualtdict' further processes its output to generate variable dictionaries and labelled data designed to be used for data analyses directly.

The target audience is those who use the Qualtrics survey platform to collect data. This package generates variable dictionaries and labelled data designed to be used for data analyses directly.

No, but there is the similar qualtRics R package, which retrieves a broader range of data from Qualtrics than this package utilises. The output formats from qualtRics are much less user-friendly. For example, it retrieves survey metadata in a nested-list, JSON-like format, while this package rearranges essential parts of this metadata (retrieved using qualtRics) into a publishable variable dictionary in a table format that can be visually inspected in, for example, Excel.
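To make the contrast concrete, here is a minimal base R sketch of turning nested-list metadata into a flat dictionary table. The `metadata` structure and the `question_to_rows()` helper are hypothetical and heavily simplified; they do not reproduce the actual qualtRics output or the qualtdict implementation.

```r
# Hypothetical, simplified survey metadata; the real structure returned by
# qualtRics is richer and is not reproduced here.
metadata <- list(
  QID1 = list(
    text = "How satisfied are you?",
    choices = list(`1` = "Not at all", `2` = "Somewhat", `3` = "Very")
  ),
  QID2 = list(
    text = "Would you recommend us?",
    choices = list(`1` = "Yes", `2` = "No")
  )
)

# Turn one question into one row per response level.
question_to_rows <- function(qid, question) {
  data.frame(
    qid = qid,
    question = question$text,
    value = as.integer(names(question$choices)),
    label = unlist(question$choices, use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

# Stack the per-question tables into a single dictionary.
dict <- do.call(
  rbind,
  Map(question_to_rows, names(metadata), metadata)
)
rownames(dict) <- NULL
print(dict)
```

The result has one row per question and response level, which is the kind of table that can be written to a spreadsheet and inspected by eye.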

Yes.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options

- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

ropensci-review-bot commented 1 year ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 1 year ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 1 year ago

Checks for qualtdict (v0.0.0.9000)

git hash: d31c0887

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type       |package    | ncalls|
|:----------|:----------|------:|
|internal   |base       |    179|
|internal   |qualtdict  |    118|
|internal   |utils      |      5|
|internal   |stats      |      1|
|imports    |magrittr   |     70|
|imports    |rlang      |      8|
|imports    |glue       |      7|
|imports    |qualtRics  |      3|
|imports    |tibble     |      3|
|imports    |openNLP    |      2|
|imports    |sjlabelled |      2|
|imports    |xml2       |      2|
|imports    |stringi    |      1|
|imports    |tidyr      |      1|
|imports    |crul       |     NA|
|imports    |dplyr      |     NA|
|imports    |haven      |     NA|
|imports    |purrr      |     NA|
|imports    |slowraker  |     NA|
|imports    |SnowballC  |     NA|
|imports    |stringr    |     NA|
|suggests   |covr       |     NA|
|suggests   |knitr      |     NA|
|suggests   |rmarkdown  |     NA|
|suggests   |testthat   |     NA|
|suggests   |vcr        |     NA|
|linking_to |NA         |     NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

list (66), length (9), names (7), c (6), unique (6), unlist (6), args (4), ifelse (4), is.null (4), max (4), min (4), paste0 (4), all (3), is.na (3), rownames (3), as.matrix (2), colnames (2), factor (2), for (2), grep (2), is.character (2), levels (2), seq_along (2), split (2), structure (2), table (2), vapply (2), which (2), any (1), as.logical (1), character (1), class (1), data.frame (1), do.call (1), if (1), is.function (1), is.logical (1), labels (1), lapply (1), mode (1), numeric (1), q (1), readRDS (1), return (1), sum (1), suppressWarnings (1), tempdir (1), vector (1)

qualtdict

item_or_level_qid (10), rep_level_qid (10), suf_level_qid (9), null_na (7), not_applicable_qid (6), questiontext_qid (6), suf_item_rep_level_qid (6), suf_item_suf_level_qid (6), collapse (5), file_upload_qid (5), rep_level (3), retry (3), calc_keyword_scores (2), check_item (2), check_json (2), check_names (2), easyname_gen (2), label_to_sfx (2), paste_narm (2), qid_recode (2), recode_json (2), rep_item (2), sbs_qid (2), suf_level_suf_item_qid (2), suf_text_qid (2), timing_qid (2), add_text (1), add_text_mc (1), checkarg_isfunction (1), checkarg_isname (1), checkarg_isqualtdict (1), convert_html (1), dict_generate (1), dict_validate (1), get_survey_data (1), is_onetoone (1), order_name (1), suf_nmlabel_qid (1), text (1), which_not_onetoone (1)

magrittr

%>% (70)

rlang

abort (7), hash (1)

glue

glue (7)

utils

txtProgressBar (4), getFromNamespace (1)

qualtRics

fetch_description (1), fetch_survey (1), metadata (1)

tibble

tibble (2), enframe (1)

openNLP

Maxent_POS_Tag_Annotator (1), Maxent_Word_Token_Annotator (1)

sjlabelled

set_label (1), set_labels (1)

xml2

read_html (1), xml_text (1)

stats

setNames (1)

stringi

stri_count_words (1)

tidyr

unite (1)

**NOTE:** Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in R (100% in 10 files) and
- 1 authors
- 1 vignette
- no internal data file
- 17 imported packages
- 3 exported functions (median 25 lines of code)
- 110 non-exported functions in R (median 10 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html)

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                  | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R                  |    10|       59.0|           |
|files_vignettes          |     1|       68.4|           |
|files_tests              |     7|       86.4|           |
|loc_R                    |  1152|       71.7|           |
|loc_vignettes            |   118|       30.8|           |
|loc_tests                |  1014|       87.2|           |
|num_vignettes            |     1|       64.8|           |
|n_fns_r                  |   113|       79.3|           |
|n_fns_r_exported         |     3|       12.9|           |
|n_fns_r_not_exported     |   110|       85.5|           |
|n_fns_per_file_r         |     6|       75.4|           |
|num_params_per_fn        |     5|       69.6|           |
|loc_per_fn_r             |    11|       32.3|           |
|loc_per_fn_r_exp         |    25|       55.9|           |
|loc_per_fn_r_not_exp     |    10|       31.3|           |
|rel_whitespace_R         |    17|       70.0|           |
|rel_whitespace_vignettes |    25|       21.4|           |
|rel_whitespace_tests     |     1|       14.7|           |
|doclines_per_fn_exp      |    43|       54.1|           |
|doclines_per_fn_not_exp  |     0|        0.0|TRUE       |
|fn_call_network_size     |    57|       69.0|           |

---

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges

[![check-standard.yaml](https://github.com/lyh970817/qualtdict/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/lyh970817/qualtdict/actions) [![test-coverage.yaml](https://github.com/lyh970817/qualtdict/actions/workflows/test-coverage.yaml/badge.svg)](https://github.com/lyh970817/qualtdict/actions)

**GitHub Workflow Results**

|         id|name          |conclusion |sha    | run_number|date       |
|----------:|:-------------|:----------|:------|----------:|:----------|
| 4076045888|R-CMD-check   |success    |d31c08 |         11|2023-02-02 |
| 4076045893|test-coverage |success    |d31c08 |         11|2023-02-02 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following check_fail:

1. no_import_package_as_a_whole

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 85.98

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

No functions have cyclocomplexity >= 15

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 1 potential issues:

message | number of times
--- | ---
Avoid library() and require() calls in packages | 1


Package Versions

|package  |version  |
|:--------|:--------|
|pkgstats |0.1.3    |
|pkgcheck |0.1.1.11 |


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

maurolepore commented 1 year ago

Dear @lyh970817, FYI I'm still searching for a handling editor. It shouldn't take much longer. Thanks for your patience.

lyh970817 commented 1 year ago

> Dear @lyh970817, FYI I'm still searching for a handling editor. It shouldn't take much longer. Thanks for your patience.

Thank you so much!

maurolepore commented 1 year ago

@ropensci-review-bot assign @maurolepore as editor

ropensci-review-bot commented 1 year ago

Assigned! @maurolepore is now the editor

maurolepore commented 1 year ago

Dear @lyh970817 I'm delighted to announce that I'll be the handling editor of this submission.

Semantic tags for my comments

To help you track my comments I tagged them with "ml" and numbered sequentially: ml01, ml02, and so on. Comments following bullets are for you to consider -- you may or may not respond to them. Comments following check-boxes are requests for some action -- please respond.

Reviewers

Checks

Here I list a few things that caught my attention. They are not blockers but the sooner we address them the better.

Package Dependencies

goodpractice and other checks

lyh970817 commented 1 year ago

Thank you so much for taking time to review this. These are my responses.

ml01. Unfortunately I'm not sure I could name any specific reviewers. Expertise-wise, I thought having someone with a psychology/social science background might be helpful, as qualtdict is centred around creating a variable dictionary that gives analysts an intuitive overview of survey data. The usefulness of such a dictionary is probably best judged by someone who analyses such data on a daily basis (in contrast to a data engineer who implements APIs for such data).

ml02. R CMD Check seems to fail without importing some of the packages that I don't actually use. For instance, without importing haven:

Error in `set_labels_helper(x = .dat, labels = labels, force.labels = force.labels, 
    force.values = force.values, drop.na = drop.na, var.name = NULL)`: Package 'haven' required for this function. Please install it.

ml03. I use dplyr, purrr and stringr extensively so I import them as a whole. Should I still import functions from them (which will be many) individually?
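For reference, the two usual alternatives to importing a whole namespace are selective imports via roxygen2's `@importFrom` tag and explicit `pkg::fun()` calls. The sketch below uses `stats` and `utils` as stand-ins so that it runs without extra packages; `label_levels()` is a hypothetical helper, not a qualtdict function.

```r
# Option 1: import selected functions via roxygen2 tags. In a package,
# roxygen2 turns these tags into importFrom() directives in NAMESPACE.
#' @importFrom stats setNames
#' @importFrom utils head
NULL

# Option 2: skip the NAMESPACE import and qualify each call instead.
# `label_levels()` is a hypothetical helper used only for illustration.
label_levels <- function(values, labels) {
  stats::setNames(values, labels)
}

label_levels(1:3, c("Low", "Medium", "High"))
```

With either option the package still has to be listed under Imports in DESCRIPTION. The trade-off is mostly between a longer NAMESPACE (option 1) and more verbose call sites (option 2).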

ml04. I think it comes from this line in the tests:

library(vcr) # *Required* as vcr is set up on loading

which is mandatory for vcr to work.

maurolepore commented 1 year ago
[ FAIL 0 | WARN 591 | SKIP 0 | PASS 4 ]
lyh970817 commented 1 year ago

ml02. I believe this is because in sjlabelled, haven is a package in the Suggests field. The function it calls from haven is not actually haven::read_xpt, but I needed to import an arbitrary function from haven for the set_labels function to see and load it.

Please see the DESCRIPTION file for sjlabelled: https://github.com/strengejacke/sjlabelled/blob/master/DESCRIPTION.

Package: sjlabelled
Type: Package
Encoding: UTF-8
Title: Labelled Data Utility Functions
Version: 1.2.0.3
Authors@R: c(
    person("Daniel", "Lüdecke", role = c("aut", "cre"), email = "d.luedecke@uke.de", comment = c(ORCID = "0000-0002-8895-3206")),
    person("David", "Ranzolin", role = "ctb", email = "daranzolin@gmail.com"),
    person("Jonathan", "De Troye", role = "ctb", email = "detroyejr@outlook.com")
    )
Maintainer: Daniel Lüdecke <d.luedecke@uke.de>
Description: Collection of functions dealing with labelled data, like reading and 
    writing data between R and other statistical software packages like 'SPSS',
    'SAS' or 'Stata', and working with labelled data. This includes easy ways 
    to get, set or change value and variable label attributes, to convert 
    labelled vectors into factors or numeric (and vice versa), or to deal with 
    multiple declared missing values.
License: GPL-3
Depends:
    R (>= 3.4)
Imports:
    insight,
    datawizard,
    stats,
    tools,
    utils
Suggests:
    dplyr,
    haven (>= 1.1.2),
    magrittr,
    sjmisc,
    sjPlot,
    knitr,
    rlang,
    rmarkdown,
    snakecase,
    testthat
URL: https://strengejacke.github.io/sjlabelled/
BugReports: https://github.com/strengejacke/sjlabelled/issues
RoxygenNote: 7.2.1
VignetteBuilder: knitr

And the specific lines where haven is loaded: https://github.com/strengejacke/sjlabelled/blob/548fa397bd013ec7e44b225dd971d19628fdc866/R/set_labels.R#L317.
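The error message quoted earlier is typical of a run-time guard around an optional (Suggests) dependency. A minimal sketch of that general pattern follows; `require_suggested()` is a hypothetical helper and is not the code used by sjlabelled.

```r
# `require_suggested()` is a hypothetical helper illustrating the pattern;
# it is not the code used by sjlabelled.
require_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' required for this function. Please install it.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this check passes.
require_suggested("stats")

# A package that is not installed triggers the error.
result <- tryCatch(
  require_suggested("aPackageThatDoesNotExist"),
  error = function(e) conditionMessage(e)
)
print(result)
```

Because the guard only checks whether the package is installed, a downstream package that needs the guarded function has to make sure the optional dependency is installed alongside it.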

What would be the best way to deal with this?

ml05-7. I was able to capture the outputs when drafting the package so I should be able to do that in the tests. The warnings are not intended and are due to package versions. I will resolve these and create an RStudio project and then update this comment. Thank you so much!
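When a test run reports hundreds of warnings, it can help to tally them by message before deciding how to fix them. The following base R sketch shows one way to do that; `collect_warnings()` and `noisy()` are hypothetical names, and `noisy()` merely stands in for a test body.

```r
# `collect_warnings()` is a hypothetical helper: it evaluates `expr`,
# records each warning message, and muffles the warning so that
# evaluation continues.
collect_warnings <- function(expr) {
  messages <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      messages <<- c(messages, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = messages)
}

# Stand-in for a test body that emits repeated warnings.
noisy <- function() {
  for (i in 1:3) warning("deprecated argument")
  as.integer("x")  # emits a coercion warning and returns NA
}

out <- collect_warnings(noisy())
# Tally warnings by message to see which ones dominate.
table(out$warnings)
```

If most of the tally comes from one or two messages, as is common with deprecation warnings triggered by newer package versions, fixing those call sites is likely to clear the bulk of the count.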

maurolepore commented 1 year ago

ml02. Thanks for explaining. The best solution will likely vary for each of the "unused" packages.

In the case of haven, the file you showed me has a single call of the type haven::<some function>, so it might be worth looking at the source code of that function to see if you can re-implement it and remove the dependency on haven.

https://github.com/strengejacke/sjlabelled/blob/548fa397bd013ec7e44b225dd971d19628fdc866/R/set_labels.R#L325

More generally, I think a great explanation of the trade-offs in dependencies is that of Jim Hester in his talk "It depends": https://www.youtube.com/watch?v=mum13N7CGUI . So as long as you understand those trade-offs you would be able to make an informed decision for each "unused" package and justify your decision if the reviewers ask.

maurolepore commented 1 year ago

Dear @lyh970817, Just checking. Would you be available to address the comments ml05-ml07? We can also put this submission on hold if you need more time. Let me know.

lyh970817 commented 1 year ago

> Dear @lyh970817,
>
> Just checking. Would you be available to address the comments ml05-ml07? We can also put this submission on hold if you need more time. Let me know.

Yes, sorry - would just need a couple more days to address these. Thanks.

maurolepore commented 5 months ago

@ropensci-review-bot put on hold

ropensci-review-bot commented 5 months ago

Submission on hold!

ropensci-review-bot commented 2 months ago

@maurolepore: Please review the holding status

maurolepore commented 2 months ago

@lyh970817, how would you like to proceed?

  1. Resume the submission.
  2. Continue on hold.
  3. Withdraw the submission.

The holding status will be revisited every 3 months, and after one year the issue will be closed. -- https://devdevguide.netlify.app/softwarereview_policies.html#policiesreviewprocess

maurolepore commented 1 week ago

Dear @lyh970817

I hope all is well. I totally understand priorities change. At this moment I believe this policy applies:

If the author hasn’t requested a holding label, but is simply not responding, we should close the issue within one month after the last contact intent. This intent will include a comment tagging the author, but also an email using the email address listed in the DESCRIPTION of the package which is one of the rare cases where the editor will try to contact the author by email. -- https://devdevguide.netlify.app/softwarereview_policies

FYI my next step is to confirm with the chief editor and if they agree I'll close the issue and let you know by email.

maurolepore commented 1 week ago

Dear @lyh970817 I confirmed with the chief editor and shared my next steps with the entire editorial board. I'll go ahead and close this issue and let you know by email.

Once again, I understand priorities change. Thanks a lot for contributing to rOpenSci. We look forward to more contributions whenever it's a good time.