JBarsotti closed this issue 2 months ago
Just as a follow-up to my previous question, I found that a few of the fields in the REDCap project were not exported to the "data" dataframe, but were exported to the "dictionary" dataframe, when I called the redcap_data function. I'm not totally sure why. There were four fields, all radio buttons with only a single choice option.
Good morning, John.
The error you are experiencing is due to a safeguard within the rd_transform function, which is triggered when there are more variables in the dictionary than in the data. In these cases it is not possible to guarantee that the data will be split properly by event, so the transformation stops.
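One way to see which variables trip this safeguard is to compare the field names in the dictionary with the column names in the data. The following is only a minimal base R sketch on toy data frames: the objects `data` and `dic` and the example fields are made up for illustration, and `field_name` is assumed to be the dictionary column holding the variable names, as in the code further down this thread.

```r
# Toy stand-ins for the imported data and dictionary (hypothetical fields).
data <- data.frame(record_id = 1:2, age = c(34, 51))
dic <- data.frame(
  field_name = c("record_id", "age", "consent_complete"),
  stringsAsFactors = FALSE
)

# Variables that are described in the dictionary but absent from the data.
missing_in_data <- setdiff(dic$field_name, names(data))
print(missing_in_data)
```

Some field types (checkboxes, for example) are exported under different column names than their dictionary entry, so the result is a starting point for investigation rather than a definitive list.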
Could you please tell us whether you are importing your data into R through the API connection or from files exported from REDCap? And could you also confirm whether the names of those 4 variables, by any chance, end with "_complete"?
We will investigate it and attempt to resolve it as soon as possible.
Thank you for your message, João
Thanks for the reply! I am using the API, and yes, they do all end with complete!
We have identified the problem.
There was an error in the code where the redcap_data function eliminated variables ending in "_complete".
Initially we did this to eliminate the variables that REDCap creates by default to track the completion of each instrument, but we later moved this step into an argument of the rd_transform function (delete_pattern).
We are sorry for the inconvenience. We are updating the package version on GitHub now, and the CRAN version will follow soon.
You can install the new version with:
remotes::install_github('bruigtp/REDCapDM')
Thank you for reporting this issue and helping us improve the package.
Please try the new version and, if you are able to execute the rd_transform() function without any problems, we would appreciate it if you would close this issue.
Thank you so much for the aid! Unfortunately, that fix does not seem to help. I still get the same error.
In order to get it to work, I had to delete the variables in the data and data dictionary that contained "_complete" anywhere in the name of the variable. Then the transform would run.
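A rough sketch of that manual workaround in base R, on toy data frames. The objects `data` and `dic` and their fields are hypothetical, and `field_name` is assumed to be the dictionary column that holds the variable names.

```r
# Toy data and dictionary that both contain a "_complete" variable.
data <- data.frame(
  record_id = 1:2,
  visit_ok = c(1, 1),
  visit_ok_complete = c(2, 2)
)
dic <- data.frame(
  field_name = c("record_id", "visit_ok", "visit_ok_complete"),
  stringsAsFactors = FALSE
)

pattern <- "_complete"

# Drop every column whose name contains the pattern anywhere.
data_clean <- data[, !grepl(pattern, names(data), fixed = TRUE), drop = FALSE]

# Drop the matching rows from the dictionary so both stay in sync.
dic_clean <- dic[!grepl(pattern, dic$field_name, fixed = TRUE), , drop = FALSE]
```

The point is that the same pattern is removed from both objects, so the dictionary no longer lists variables that the data lacks.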
An alternative that also works is to edit the function directly. I edited the code of the function rd_transform so that it reads like this:
if (!is.null(delete_pattern)) {
  for (i in 1:length(delete_pattern)) {
    if (delete_pattern[i] == "_complete") {
      data <- data %>% dplyr::select(!tidyselect::contains(c("_complete",
        "_complete.factor")))
      dic <- dic %>% dplyr::filter(!grepl(delete_pattern[i],
        .data$field_name))
    }
    else if (delete_pattern[i] == "_timestamp") {
      data <- data %>% dplyr::select(!tidyselect::contains(c("_timestamp",
        "timestamp.factor")))
      dic <- dic %>% dplyr::filter(!grepl(delete_pattern[i],
        .data$field_name))
    }
    else {
      data <- data %>% dplyr::select(!tidyselect::contains(delete_pattern[i]))
      dic <- dic %>% dplyr::filter(!grepl(delete_pattern[i],
        .data$field_name))
    }
  }
}
Good morning John,
You are absolutely right: we only removed the variables with the _complete pattern from the data, not from both the dictionary and the data.
We have applied your alternative to the version of the package on GitHub with a slight modification:
if(!is.null(delete_pattern)){
  for(i in 1:length(delete_pattern)){
    if(delete_pattern[i] == "_complete"){
      # Drop variables ending in _complete from the data and the dictionary
      data <- data %>%
        dplyr::select(!tidyselect::ends_with(c("_complete", "_complete.factor")))
      dic <- dic %>%
        dplyr::filter(!grepl("_complete$", .data$field_name))
    }else if(delete_pattern[i] == "_timestamp"){
      # Drop variables ending in _timestamp from the data and the dictionary
      data <- data %>%
        dplyr::select(!tidyselect::ends_with(c("_timestamp", "timestamp.factor")))
      dic <- dic %>%
        dplyr::filter(!grepl("_timestamp$", .data$field_name))
    }else{
      # Any other pattern is matched anywhere in the variable name
      data <- data %>%
        dplyr::select(!tidyselect::contains(delete_pattern[i]))
      dic <- dic %>%
        dplyr::filter(!grepl(delete_pattern[i], .data$field_name))
    }
  }
}
Now it will eliminate all variables ending in _complete or _timestamp from both the dictionary and the data.
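The difference between matching the pattern anywhere in the name and matching it only at the end can be illustrated with a small base R sketch. The field names below are made up for the example and are not taken from any real project.

```r
# Hypothetical field names; "is_complete_case" contains the pattern
# but does not end with it.
fields <- c(
  "record_id",
  "demographics_complete",
  "is_complete_case",
  "survey_timestamp"
)

# Unanchored match: removes every name that contains "_complete".
keep_contains <- fields[!grepl("_complete", fields, fixed = TRUE)]

# Anchored match: removes only names that end in "_complete".
keep_ends_with <- fields[!grepl("_complete$", fields)]

print(keep_contains)
print(keep_ends_with)
```

With the anchored version, a user-defined field that merely contains "_complete" in the middle of its name is kept.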
Thank you so much for your contribution!!!
First off, this is a great library that makes data extraction way easier! I have a question about rd_transform(final_format = "by_event").
For one REDCap database I have, it works fine, but for a different database, I get the error:
Error: There're more variables in the dictionary than in the data base so it's not possible to split by event. Transformation stops.
I'm not totally sure what this means and would like to ask about it.
Thanks,
John