Closed ezraporter closed 8 months ago
A few things here. In general I think the actual code and handling are sound, and I'll separate out my thoughts below.
Would it be possible to consolidate the warnings so that one warning encompasses all fields? Or at least one per type (the logical versus extra fields checks)? Right now the setup makes a separate warning for each field, which would make for a lot of warnings in most cases where this comes up. I added a second yesno field with an UNK MDC to show this below; you can remove it if you like:
> read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"),
+ token = Sys.getenv("REDCAPTIDIER_MDC_API"), raw_or_label = "label")$redcap_data[[1]]
# A tibble: 3 × 9
record_id yesno yesno2 text checkbox___1 checkbox___2 checkbox___3 dropdown form_status_complete
<dbl> <lgl> <lgl> <chr> <lgl> <lgl> <lgl> <fct> <fct>
1 1 TRUE NA text TRUE TRUE FALSE C Complete
2 2 NA NA UNK FALSE FALSE FALSE NA Complete
3 3 NA NA NA FALSE FALSE FALSE NA Incomplete
Warning messages:
1: In read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"), token = Sys.getenv("REDCAPTIDIER_MDC_API"), :
! `yesno` is type 'yesno' but contains non-logical values: UNK
ℹ These were converted to `NA` resulting in possible data loss
ℹ Does your REDCap project utilize missing data codes?
ℹ Silence this warning with `options(redcaptidier.allow.mdc = TRUE)` or set `raw_or_label = 'raw'` to access missing data codes
2: In read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"), token = Sys.getenv("REDCAPTIDIER_MDC_API"), :
! `yesno2` is type 'yesno' but contains non-logical values: UNK
ℹ These were converted to `NA` resulting in possible data loss
ℹ Does your REDCap project utilize missing data codes?
ℹ Silence this warning with `options(redcaptidier.allow.mdc = TRUE)` or set `raw_or_label = 'raw'` to access missing data codes
3: In read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"), token = Sys.getenv("REDCAPTIDIER_MDC_API"), :
! `dropdown` contains values with no labels: UNK
ℹ These were converted to `NA` resulting in possible data loss
ℹ Does your REDCap project utilize missing data codes?
ℹ Silence this warning with `options(redcaptidier.allow.mdc = TRUE)` or set `raw_or_label = 'raw'` to access missing data codes
I think aesthetically something similar to the warning we had for mixed-structure data would be what to aim for:
> read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"),
+ token = Sys.getenv("REDCAPTIDIER_MIXED_STRUCTURE_API"))
Error in `clean_redcap_long()` at REDCapTidieR/R/read_redcap.R:278:5:
✖ Instruments detected that have both repeating and nonrepeating instances defined in the project: mixed_structure_1 and
mixed_structure_form_complete
ℹ Set `allow_mixed_structure` to `TRUE` to override. See Mixed Structure Instruments for more information.
Run `rlang::last_trace()` to see where the error occurred.
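To illustrate the consolidation idea, here is a minimal base R sketch that collects the offending values per field first and then raises a single warning. The helper name `warn_mdc_once()` and the `problems` list are hypothetical and only for illustration; the option name is the one already used in this PR, and the message wording is an assumption.

```r
# Hypothetical helper (not part of the package): warn once for all fields.
# `problems` maps a field name to the character vector of offending values.
warn_mdc_once <- function(problems) {
  # Nothing to report, or the user opted out via the option from this PR
  if (length(problems) == 0 || isTRUE(getOption("redcaptidier.allow.mdc"))) {
    return(invisible(FALSE))
  }
  # One "`field`: values" entry per affected field
  details <- vapply(
    names(problems),
    function(field) {
      paste0("`", field, "`: ", paste(unique(problems[[field]]), collapse = ", "))
    },
    character(1)
  )
  warning(
    "Fields contain values that were converted to NA: ",
    paste(details, collapse = "; "),
    call. = FALSE
  )
  invisible(TRUE)
}

problems <- list(yesno = "UNK", yesno2 = "UNK")
# Capture the single warning message instead of printing it
msg <- tryCatch(warn_mdc_once(problems), warning = conditionMessage)
print(msg)
```

Collecting first and warning last keeps the per-field checks simple while still giving the user one message to read.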
I might be wrong here, but we may have the raw/label order backwards? See below for the output from REDCapR:
> redcap_read_oneshot(redcap_uri = Sys.getenv("REDCAP_URI"),
+ token = Sys.getenv("REDCAPTIDIER_MDC_API"), raw_or_label = "label")$data
3 records and 10 columns were read from REDCap in 1.7 seconds. The http status code was 200.
record_id yesno yesno2 text checkbox___1 checkbox___2 checkbox___3 checkbox___unk dropdown form_1_complete
1 1 Yes <NA> text Checked Checked Unchecked Unchecked C Complete
2 2 Unknown Unknown Unknown Unchecked Unchecked Unchecked Checked Unknown Complete
3 3 <NA> <NA> <NA> Unchecked Unchecked Unchecked Unchecked <NA> Incomplete
Currently the MDC is set as:
Would you also mind adding something to this PR that mentions this in one of the vignettes/articles? At the moment the only way a user would know about the options() is to encounter them in the wild from the warning since there's nothing in our supporting documentation. Something short in the "Get Started" at the end should be fine. This will also be a good place to note what isn't supported (text fields, etc.).
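A short snippet along these lines could serve as a starting point for the vignette. It is only a sketch: the option name comes from the warning text above, and the read_redcap() call is left as a comment because it needs a live project and token.

```r
# Opt in to silencing the missing data code warnings for this session.
# options() returns the previous values so they can be restored later.
old <- options(redcaptidier.allow.mdc = TRUE)
was_set <- isTRUE(getOption("redcaptidier.allow.mdc"))

# With the option set, a labelled read would convert missing data codes
# to NA without warning (requires credentials, so not run here):
# read_redcap(redcap_uri = Sys.getenv("REDCAP_URI"),
#             token = Sys.getenv("REDCAPTIDIER_MDC_API"),
#             raw_or_label = "label")

# Restore the previous setting
options(old)
```

Users who need the codes themselves rather than NA can instead set raw_or_label = "raw", as the warning suggests.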
@rsh52 give this another look once the CI passes. I consolidated the warnings and updated the vignette. As discussed earlier, there isn't a way of getting "Unknown" instead of UNK without reading the project metadata so I left that.
Description
This PR updates our handling of REDCap projects using missing data codes. The new behavior is:
raw_or_label = "raw": missing data codes kept in the data as it comes back from the API
NA in yesno, truefalse, and checkbox fields with a warning if this occurs
NA in dropdown and radio fields with a warning if this occurs
options(redcaptidier.allow.mdc = TRUE) can be set to silence these warnings
Benchmarks initially looked like we had increased run time for two of our REDCap projects, but that turned out to be a false positive after increasing the number of microbenchmark iterations.
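The behavior described above for logical fields can be sketched roughly as follows. This is not the code in this PR: `parse_logical_field()` is a hypothetical name, and the sketch assumes the raw values arrive as "1" and "0", with anything else treated as a missing data code.

```r
# Hypothetical sketch: convert a yesno/truefalse-style column to logical,
# turning anything other than "1"/"0" into NA and warning about it.
parse_logical_field <- function(x, field_name) {
  x <- as.character(x)
  out <- rep(NA, length(x))   # logical vector of NA
  out[x %in% "1"] <- TRUE
  out[x %in% "0"] <- FALSE

  # Non-missing values that are neither "0" nor "1" (assumed to be MDCs)
  bad <- unique(x[!is.na(x) & !(x %in% c("0", "1"))])
  if (length(bad) > 0 && !isTRUE(getOption("redcaptidier.allow.mdc"))) {
    warning(
      "`", field_name, "` contains non-logical values: ",
      paste(bad, collapse = ", "),
      ". These were converted to NA.",
      call. = FALSE
    )
  }
  out
}

yesno <- suppressWarnings(parse_logical_field(c("1", "UNK", NA, "0"), "yesno"))
print(yesno)
```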
Proposed Changes
check_field_is_logical() check and refactor multi_choice_to_labels() to incorporate it. This handles checking and parsing to logical
check_extra_field_values() check and add to multi_choice_to_labels()
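For the categorical case, the idea of checking for values without labels can be sketched like this. The function `labels_with_check()` and the `dropdown_labels` mapping are hypothetical and are not the implementation in this PR; the warning text mirrors the one shown in the review above.

```r
# Hypothetical sketch: map raw codes to labels and warn about codes that
# have no label (for example a missing data code such as "UNK").
labels_with_check <- function(x, labels, field_name) {
  x <- as.character(x)
  extra <- setdiff(unique(x[!is.na(x)]), names(labels))
  if (length(extra) > 0 && !isTRUE(getOption("redcaptidier.allow.mdc"))) {
    warning(
      "`", field_name, "` contains values with no labels: ",
      paste(extra, collapse = ", "),
      call. = FALSE
    )
  }
  # Codes not listed in `levels` become NA in the resulting factor
  factor(x, levels = names(labels), labels = unname(labels))
}

# Assumed code-to-label mapping for illustration only
dropdown_labels <- c("1" = "A", "2" = "B", "3" = "C")
dropdown <- suppressWarnings(
  labels_with_check(c("3", "UNK", NA), dropdown_labels, "dropdown")
)
print(dropdown)
```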
Screenshots
Logical field warning:
Categorical field warning:
Issue Addressed
Relates to #181
PR Checklist
Before submitting this PR, please check and verify below that the submission meets the below criteria:
.RDS) updated under inst/testdata/create_test_data.R
usethis::use_version()
Code Review
This section to be used by the reviewer and developers during Code Review after PR submission
Code Review Checklist