Create functions to mirror dataframes into REDCap projects

pbchase commented 1 year ago

We need to mirror two rcc.billing tables into REDCap. This will allow REDCap staff and CSBT staff to see some of the data that drives the rcc.billing processes. The tables that need to be written are invoice_line_item and banned_owners.

Please implement this as a set of functions we can call from scripts that update these tables.

Spec for `mirror_data_to_redcap_project`

[ ] The function mirror_data_to_redcap_project take a single parameter, table_name
[ ] The function will read the environment to get the value of the environment variable named <table_name>_pid.
[ ] The function will also read the value of the env variables named CREDENTIALS_DB.
[ ] The function will search CREDENTIALS_DB for the REDCap API credentials for the current user in the PID pointed to by the value of <table_name>_pid.
[ ] The function will use REDCapR::redcap_write() to write the data frame to the REDCap project described by those credentials.
[ ] The function will return the complete results of REDCapR::redcap_write()

Spec for `dataframe_to_redcap_dictionary`

If we can't find one in another R package, we should write a helper function named something like dataframe_to_redcap_dictionary. The function should output a CSV we can import into an empty REDCap project so that we could write the data of that dataframe into that project. The function should do these things:

[ ] Transform the column names into a column of field_name
[ ] Add a column, form_name with a value of table_name
[ ] Add a column, section_header with a value of as.character(NA)
[ ] Add a column field_type with a value of "text"
[ ] Transform the column names into a column of field_label
[ ] Add a column, select_choices_or_calculations with a value of as.character(NA)
[ ] Add a column, field_note with a value of as.character(NA)
[ ] Transform the column classes valid text validation types and write that to a column named text_validation_type_or_show_slider_number
[ ] Return the constructed dataframe

If we do this, make this function part of REDCap Custodian.

For reference, see https://github.com/kamclean/collaborator/blob/master/R/data_dict.R which generates a different data dictionary despite being in a redcap-centric package. :-(

ljwoodley commented 1 year ago

Should the data pull from SQL also be in mirror_data_to_redcap_project. It's a mirror and an altered df will never be used as input for this function right?
```
con <- connect_to_db()
data_to_mirror <- dbGetQuery(con, paste0("SELECT * FROM ", table_name))
dbDisconnect(con)
```
Can we have test project in redcap for one of the tables mentioned?

pbchase commented 1 year ago

Should the data pull from SQL also be in mirror_data_to_redcap_project. It's a mirror and an altered df will never be used as input for this function right?
con <- connect_to_db()
data_to_mirror <- dbGetQuery(con, paste0("SELECT * FROM ", table_name))
dbDisconnect(con)

No, let's not do that. I can see now my spec was leaving out some of my assumptions and maybe "mirror_" is the wrong prefix on this function. My typical use case is to update or insert 1-200 rows of novel table data into the REDCap project corresponding to the table I was updating. e.g., I have a data frame of updates I use to update MySQL and I use that exact same dataframe to update the REDCap project. Each would be an efficient write. So the function would take two parameters: a dataframe and a table name.

So what of the name mirror_data_to_redcap_project? Is it the wrong name? How about write_to_redcap_by_table_name? I feel like this function is part of the credentials management feature @ChemiKyle created for REDCap Custodian. It's kind of a credentials management wrapper for REDCapR::redcap_write(). If that's true, then we should use ... to allow the caller to pass parameters on to redcap_write(). Now I wonder if the table name should be redcapcustodian::redcap_write_by_table_name()

Can we have test project in redcap for one of the tables mentioned?

Are you asking me to make you a test table to write to? If you're looking for that, maybe you should write dataframe_to_redcap_dictionary to allow you to make that target table.

ljwoodley commented 1 year ago

mirror_data_to_redcap_project is still valid as it's mirroring the input dataframe.

Oh right, data can't be uploaded without the table definition. I'll create that first.

Since both these functions are really for redcapcustodian should I move the issue there?

pbchase commented 1 year ago

Since both these functions are really for redcapcustodian should I move the issue there?

I see you already did this. Thanks for making that change. Sometimes it's hard to understand where an idea needs to go. It helps a lot to write it down, but plenty more flaws in the plan emerge when someone else has to read it and make sense of it.

ctsit / rcc.billing