vubiostat / redcapAPI

R interface to REDCap (http://www.project-redcap.org/)

Ensuring Checkbox Integrity: Preventing Unwanted ‘Unselected’ Transformations #379

Closed jubilee2 closed 5 months ago

jubilee2 commented 6 months ago

I’ve encountered an issue where checkboxes are transforming from ‘NA’ or an empty string ("") to ‘Unselected’ even when the field is not included in the event. This behavior differs from the original REDCap output, which preserves the ‘NA’ or empty string values.

spgarbet commented 6 months ago

This is a known issue with checkboxes. A checkbox always has a value by definition. One could specify a custom casting override that leaves the checkbox in its raw form:

exportRecordsTyped(rcon, cast = list(checkbox = castRaw))

spgarbet commented 6 months ago

A Checkbox can never have an NA since it always has a value.

jubilee2 commented 6 months ago

It’s not a big deal, but I’d like to point out that the issue pertains to forms that do not exist within the event. Therefore, data that is not present in the event could be skipped during the post-processing to increase performance. This approach would ensure that only relevant forms are processed, thereby optimizing the system’s efficiency.
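The idea of skipping fields whose form is not designated for an event can be sketched in base R. This is only an illustration under stated assumptions, not how redcapAPI works internally: the helper `blank_undesignated_fields`, the `field_forms` lookup, and the toy data are all hypothetical, and the `mapping` data frame merely imitates a form-event mapping with `unique_event_name` and `form` columns.

```r
# Hypothetical helper (not part of redcapAPI): set a field to NA in every
# row whose event does not include the field's form.
blank_undesignated_fields <- function(records, mapping, field_forms) {
  for (field in names(field_forms)) {
    # Events in which this field's form is designated
    events <- mapping$unique_event_name[mapping$form == field_forms[[field]]]
    not_designated <- !(records$redcap_event_name %in% events)
    records[[field]][not_designated] <- NA
  }
  records
}

# Toy form-event mapping: one form per event (assumed data)
mapping <- data.frame(
  unique_event_name = c("baseline_arm_1", "followup_arm_1"),
  form              = c("demographics", "symptoms"),
  stringsAsFactors  = FALSE
)

# Assumed lookup from checkbox field name to the form it belongs to
field_forms <- c(race___1 = "demographics", symptom___1 = "symptoms")

# Toy export in which every checkbox already carries a label
records <- data.frame(
  record_id         = c(1, 1),
  redcap_event_name = c("baseline_arm_1", "followup_arm_1"),
  race___1          = c("Checked", "Unchecked"),
  symptom___1       = c("Unchecked", "Checked"),
  stringsAsFactors  = FALSE
)

cleaned <- blank_undesignated_fields(records, mapping, field_forms)
print(cleaned)
```

In this sketch, `race___1` stays labelled only on the baseline row and `symptom___1` only on the follow-up row; everywhere else the value goes back to NA, which is closer to what the raw export reports for forms outside the event.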

spgarbet commented 6 months ago

There's a fair bit of that in the existing code, but we're open to pull requests.

jubilee2 commented 6 months ago

Would you like me to create a PR for this, if you are open to one?

spgarbet commented 6 months ago

I'd like to understand what you are proposing. There are a lot of concerns around this sort of thing in the library already. Maybe start with pointing out where the modification would go in the code, and give some pseudocode here.

jubilee2 commented 6 months ago

I don't think I know any more about this package than you do. Is it related to casting? https://github.com/vubiostat/redcapAPI/blob/main/R/fieldCastingFunctions.R

spgarbet commented 6 months ago

If one uses exportBulkRecords, it splits by form by default (and has all the same options as exportRecordsTyped). This makes things a lot easier for filterEmptyRows. However, on forms with checkboxes it will still be unable to filter out empty rows. #349 is a user requesting a "semantic" missingness around checkboxes because they have a similar issue.

spgarbet commented 6 months ago

The problem is fairly difficult to solve in general because of the nature of checkboxes and the way REDCap treats the data. A checkbox has two states, "Checked" and "Unchecked". Converting these to NA would require that all other fields of the form be NA as well and that all checkboxes be "Unchecked". Thus the interpretation of a checkbox's value becomes dependent on the context of the set of all variables in a form. Right now NA detection, validation, and casting are all done on each variable in isolation, and the interpretation depends on nothing but its immediate value. Checkboxes have been one of the biggest challenges of writing the code, and a lot of workarounds have been put in over time. I think the solution to #349 is going to be the best possible unless REDCap changes how it stores and reports them.
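To make the context dependence concrete, here is a rough base R sketch of the rule described above: a form's checkboxes become NA only when every other field on the form is NA and no checkbox is checked. The helper name `blank_empty_form_checkboxes`, its arguments, and the toy data are hypothetical; this is not the package's code and not necessarily what #349 will settle on.

```r
# Hypothetical helper (not part of redcapAPI): blank a form's checkboxes in
# rows where all other form fields are NA and no checkbox is "Checked".
blank_empty_form_checkboxes <- function(form_data, checkbox_fields,
                                        id_fields = "record_id") {
  other_fields <- setdiff(names(form_data), c(id_fields, checkbox_fields))

  # TRUE for rows where every non-checkbox, non-identifier field is NA
  others_missing <- rowSums(!is.na(form_data[other_fields])) == 0

  # TRUE for rows where no checkbox on the form is checked
  none_checked <- rowSums(form_data[checkbox_fields] == "Checked",
                          na.rm = TRUE) == 0

  empty_row <- others_missing & none_checked
  for (field in checkbox_fields) {
    form_data[[field]][empty_row] <- NA
  }
  form_data
}

# Toy data for a single form (assumed field names)
form_data <- data.frame(
  record_id    = c(1, 2, 3),
  symptom_date = c("2024-01-05", NA, NA),
  symptom___1  = c("Checked", "Unchecked", "Checked"),
  symptom___2  = c("Unchecked", "Unchecked", "Unchecked"),
  stringsAsFactors = FALSE
)

result <- blank_empty_form_checkboxes(
  form_data,
  checkbox_fields = c("symptom___1", "symptom___2")
)
print(result)
```

Only record 2 has its checkboxes blanked: record 1 has a date, and record 3 has a checked box. Note that the decision for each checkbox now depends on the whole row, which is exactly what the per-variable casting cannot express, and the rule still cannot tell a form that was never filled in from one that was saved with nothing entered.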

spgarbet commented 5 months ago

Continue discussion on #349