ctsit / rcc.billing

Automated, data-driven service billing implemented on REDCap Custodian
https://ctsit.github.io/rcc.billing/
Apache License 2.0
0 stars 3 forks source link

Create functions to mirror dataframes into REDCap projects #174

Closed pbchase closed 1 year ago

pbchase commented 1 year ago

We need to mirror two rcc.billing tables into REDCap. This will allow REDCap staff and CSBT staff to see some of the data that drives the rcc.billing processes. The tables that need to be written are invoice_line_item and banned_owners.

Please implement this as a set of functions we can call from scripts that update these tables.

Spec for mirror_data_to_redcap_project

Spec for dataframe_to_redcap_dictionary

If we can't find one in another R package, we should write a helper function named something like dataframe_to_redcap_dictionary. The function should output a CSV we can import into an empty REDCap project so that we could write the data of that dataframe into that project. The function should do these things:

If we do this, make this function part of REDCap Custodian.

For reference, see https://github.com/kamclean/collaborator/blob/master/R/data_dict.R which generates a different data dictionary despite being in a redcap-centric package. :-(

ljwoodley commented 1 year ago
  1. Should the data pull from SQL also be in mirror_data_to_redcap_project. It's a mirror and an altered df will never be used as input for this function right?

    con <- connect_to_db()
    data_to_mirror <- dbGetQuery(con, paste0("SELECT * FROM ", table_name))
    dbDisconnect(con)
  2. Can we have test project in redcap for one of the tables mentioned?

pbchase commented 1 year ago
  1. Should the data pull from SQL also be in mirror_data_to_redcap_project. It's a mirror and an altered df will never be used as input for this function right?
con <- connect_to_db()
data_to_mirror <- dbGetQuery(con, paste0("SELECT * FROM ", table_name))
dbDisconnect(con)

No, let's not do that. I can see now my spec was leaving out some of my assumptions and maybe "mirror_" is the wrong prefix on this function. My typical use case is to update or insert 1-200 rows of novel table data into the REDCap project corresponding to the table I was updating. e.g., I have a data frame of updates I use to update MySQL and I use that exact same dataframe to update the REDCap project. Each would be an efficient write. So the function would take two parameters: a dataframe and a table name.

So what of the name mirror_data_to_redcap_project? Is it the wrong name? How about write_to_redcap_by_table_name? I feel like this function is part of the credentials management feature @ChemiKyle created for REDCap Custodian. It's kind of a credentials management wrapper for REDCapR::redcap_write(). If that's true, then we should use ... to allow the caller to pass parameters on to redcap_write(). Now I wonder if the table name should be redcapcustodian::redcap_write_by_table_name()

  1. Can we have test project in redcap for one of the tables mentioned?

Are you asking me to make you a test table to write to? If you're looking for that, maybe you should write dataframe_to_redcap_dictionary to allow you to make that target table.

ljwoodley commented 1 year ago

mirror_data_to_redcap_project is still valid as it's mirroring the input dataframe.

Oh right, data can't be uploaded without the table definition. I'll create that first.

Since both these functions are really for redcapcustodian should I move the issue there?

pbchase commented 1 year ago

Since both these functions are really for redcapcustodian should I move the issue there?

I see you already did this. Thanks for making that change. Sometimes it's hard to understand where an idea needs to go. It helps a lot to write it down, but plenty more flaws in the plan emerge when someone else has to read it and make sense of it.