ctsit / redcapcustodian

Simplified, automated data management on REDCap systems

Scrape `redcap_log_event*` for a project's data #86

Open pbchase opened 1 year ago

pbchase commented 1 year ago

Consider creating functions that could extract a project's redcap_data history from the redcap_log_event* tables and turn it into a valid dataset, ready for import into the same or a similar redcap project. Consider the approach used in https://github.com/OuhscBbmc/REDCapR/issues/450

Make sure to read my comments about the SQLite approach

ChemiKyle commented 1 year ago

Note that if a project is moved to production, all prior data entry logs are wiped from the log.

However, something like this is a good starting point. The `sql_log` column contains raw SQL; if the statements are simple enough, we could incrementally run them against an in-memory `redcap_data` table before porting the result into REDCap itself. How best to do that last step is up for discussion: the easy route would be to run those INSERTs against prod, but we should be good and use the API so the writes are logged.

```r
library(dplyr)
library(stringr)
library(lubridate)
library(redcapcustodian)

target_pid <- "123456"
source_conn <- connect_to_redcap_db()

# Each project logs to one of the redcap_log_event* tables;
# look up which table this project uses.
log_tbl_name <- tbl(source_conn, "redcap_projects") %>%
  filter(project_id == target_pid) %>%
  pull(log_event_table)

# Pull every logged statement that touched redcap_data, in timestamp order
tbl(source_conn, log_tbl_name) %>%
  filter(project_id == target_pid) %>%
  ## filter(event == "INSERT") %>%
  filter(str_detect(sql_log, "redcap_data")) %>%
  collect() %>%
  mutate(ts = ymd_hms(ts)) %>%
  arrange(ts)
```
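The replay idea above could be sketched roughly as follows. This is a minimal sketch, not a tested implementation: the `sql_log` statements here are hypothetical stand-ins for rows returned by the query above, and the `redcap_data` column list is abbreviated relative to the real schema.

```r
library(DBI)
library(RSQLite)

# Hypothetical sql_log fragments; in practice these come from the
# redcap_log_event* query above, ordered by timestamp.
sql_log <- c(
  "INSERT INTO redcap_data (project_id, event_id, record, field_name, value)
   VALUES (123456, 1, '1', 'age', '42')",
  "UPDATE redcap_data SET value = '43'
   WHERE project_id = 123456 AND record = '1' AND field_name = 'age'"
)

# Build a throwaway in-memory redcap_data table (simplified schema)
mem_conn <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(mem_conn, "CREATE TABLE redcap_data (
  project_id INTEGER, event_id INTEGER,
  record TEXT, field_name TEXT, value TEXT)")

# Replay each logged statement in order to reconstruct the final state
for (stmt in sql_log) dbExecute(mem_conn, stmt)

replayed <- dbGetQuery(mem_conn, "SELECT * FROM redcap_data")
dbDisconnect(mem_conn)
```

After the replay, `replayed` holds the reconstructed record state, never having touched the production database.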
pbchase commented 1 year ago

> easy would be to just run those INSERTs against prod

I never want to run the INSERTs against a REDCap DB. We should create files that are ready to be handed to the API for import, or written out as CSV for a human to review and import with great discretion.
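The CSV route might look something like this: pivot the reconstructed EAV rows (record / field_name / value) into the wide, one-row-per-record shape that REDCap's import tool and API expect. The `replayed` tibble here is hypothetical example data standing in for the output of the replay step.

```r
library(dplyr)
library(tidyr)
library(readr)

# Hypothetical reconstructed rows in REDCap's EAV (long) form
replayed <- tribble(
  ~record, ~field_name, ~value,
  "1", "age",  "43",
  "1", "name", "Alice"
)

# Pivot to the wide layout REDCap imports expect:
# one row per record, one column per field
import_ready <- replayed %>%
  pivot_wider(names_from = field_name, values_from = value)

# Write a file for human review, or hand import_ready to an
# API-based writer such as REDCapR::redcap_write()
out_file <- file.path(tempdir(), "pid123456_import.csv")
write_csv(import_ready, out_file)
```

Keeping the output as a file preserves the review step pbchase asks for: nothing reaches the database until a human (or the API, with logging) takes it the last mile.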