ctsit / redcapcustodian

Simplified, automated data management on REDCap systems

Create functions to mirror dataframes into REDCap projects #135

Closed. ljwoodley closed this issue 8 months ago

ljwoodley commented 10 months ago

We need to mirror dataframes into REDCap.

Spec for mirror_data_to_redcap_project

Spec for dataframe_to_redcap_dictionary

If we can't find an equivalent function in another R package, we should write a helper function named something like dataframe_to_redcap_dictionary. It should output a data dictionary CSV that we can import into an empty REDCap project, so that we can then write the dataframe's data into that project. The function should do these things:

If we do this, make this function part of REDCap Custodian.

For reference, see https://github.com/kamclean/collaborator/blob/master/R/data_dict.R, which generates a different kind of data dictionary despite being in a REDCap-centric package. :-(
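A minimal sketch of what such a helper could look like, in base R. This is not the implementation that ended up in REDCap Custodian. The function name, its arguments, and the example dataframe are hypothetical. The column names follow the field names REDCap uses when exporting project metadata; whether the Data Dictionary upload accepts exactly these headers should be checked against the target REDCap instance. Every field is written as unvalidated text to keep the sketch short, and the dataframe's column names are assumed to already be valid REDCap field names.

```r
# Hypothetical sketch: build a REDCap-style data dictionary from a dataframe.
# Assumes column names are already valid REDCap field names
# (lowercase letters, digits and underscores).
dataframe_to_redcap_dictionary_sketch <- function(df, form_name,
                                                  record_id_col = names(df)[1]) {
  stopifnot(is.data.frame(df), record_id_col %in% names(df))

  # REDCap treats the first field in the dictionary as the record ID,
  # so the primary key column is moved to the front.
  field_names <- c(record_id_col, setdiff(names(df), record_id_col))
  n <- length(field_names)
  blank <- rep("", n)

  data.frame(
    field_name = field_names,
    form_name = rep(form_name, n),
    section_header = blank,
    field_type = rep("text", n),      # assumption: every field is plain text
    field_label = field_names,        # assumption: reuse the column name as label
    select_choices_or_calculations = blank,
    field_note = blank,
    text_validation_type_or_show_slider_number = blank,
    text_validation_min = blank,
    text_validation_max = blank,
    identifier = blank,
    branching_logic = blank,
    required_field = blank,
    custom_alignment = blank,
    question_number = blank,
    matrix_group_name = blank,
    matrix_ranking = blank,
    field_annotation = blank,
    stringsAsFactors = FALSE
  )
}

# Usage with a made-up dataframe
example_df <- data.frame(id = 1:3, name = c("a", "b", "c"), amount = c(1.5, 2, 3.25))
dictionary <- dataframe_to_redcap_dictionary_sketch(
  example_df,
  form_name = "mirrored_table",
  record_id_col = "id"
)
write.csv(dictionary, file.path(tempdir(), "data_dictionary.csv"),
          row.names = FALSE, na = "")
```

Returning a dataframe and leaving the CSV write to the caller keeps the helper easy to test.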

This issue was moved from https://github.com/ctsit/rcc.billing/issues/174

ljwoodley commented 10 months ago
  1. Would mirror_data_to_redcap_project be an upsert, overwrite or append? Trying to determine how best to create the record_id.

  2. Should we account for all these validation types?

date_dmy
date_mdy
date_ymd
datetime_dmy
datetime_mdy
datetime_ymd
datetime_seconds_dmy
datetime_seconds_mdy
datetime_seconds_ymd
email
integer
alpha_only
mrn_generic
number
number_1dp
number_2dp
number_3dp
number_4dp
phone
ssn
time_hh_mm_ss
time
time_mm_ss
zipcode

@pbchase

pbchase commented 10 months ago
  1. Would mirror_data_to_redcap_project be an upsert, overwrite or append? Trying to determine how best to create the record_id.

You may assume that every mirrored table has a PK (aka record_id) and that every write event to that table you are mirroring into REDCap has that PK field in it.
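As a rough illustration of that assumption, a mirroring function could check the PK before writing anything. The helper below is a hypothetical base R sketch, not part of REDCap Custodian. The uniqueness and non-missing checks are my reading of "PK", and moving the key to the first column reflects that REDCap uses the first field of a project as the record ID.

```r
# Hypothetical sketch: confirm a dataframe carries a usable primary key
# and return it with that key as the first column.
check_primary_key_sketch <- function(df, pk_col) {
  stopifnot(is.data.frame(df))

  if (!pk_col %in% names(df)) {
    stop("Primary key column '", pk_col, "' is missing from the dataframe")
  }

  pk <- df[[pk_col]]
  if (anyNA(pk)) {
    stop("Primary key column '", pk_col, "' contains missing values")
  }
  if (anyDuplicated(pk) > 0) {
    stop("Primary key column '", pk_col, "' contains duplicate values")
  }

  # Put the key first so it lines up with REDCap's record ID field.
  df[, c(pk_col, setdiff(names(df), pk_col)), drop = FALSE]
}

# Usage with a made-up dataframe
checked <- check_primary_key_sketch(
  data.frame(label = c("a", "b"), id = 1:2),
  pk_col = "id"
)
```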

pbchase commented 10 months ago

2. Should we account for all these validation types?

Please support these validations:

Everything else should probably be unvalidated text. I say this because REDCap is willing to do special things with columns of these types. The other validation types mostly exist to control data entry. I don't plan on any data entry for these forms, so there is no point in the additional complexity.
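One way to read this is as a small lookup from R column classes to REDCap validation types, with a blank (unvalidated text) fallback. The sketch below is hypothetical. The list of validations to support is not reproduced in this thread, so the four types chosen here (date_ymd, datetime_seconds_ymd, integer, number) are assumptions picked from the full list in the question, and the class-to-type pairing is a guess rather than the project's actual rule.

```r
# Hypothetical sketch: pick a REDCap text validation type from a column's class.
# Which validations to support is an assumption; anything unrecognised
# falls back to "" (unvalidated text).
guess_validation_type_sketch <- function(column) {
  if (inherits(column, "Date")) return("date_ymd")
  if (inherits(column, "POSIXct")) return("datetime_seconds_ymd")
  if (is.integer(column)) return("integer")
  if (is.numeric(column)) return("number")
  ""
}

# Usage with a made-up dataframe
example_df <- data.frame(
  id = 1:2,
  created = as.Date(c("2024-01-01", "2024-01-02")),
  updated = as.POSIXct(c("2024-01-01 10:00:00", "2024-01-02 11:30:00"), tz = "UTC"),
  amount = c(1.5, 2.25),
  note = c("a", "b"),
  stringsAsFactors = FALSE
)
validations <- vapply(example_df, guess_validation_type_sketch, character(1))
```

The resulting named character vector could fill the validation column of a data dictionary, one entry per field.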