ctsit / redcapcustodian

Simplified, automated data management on REDCap systems

Use search parameters in `get_hipaa_disclosure_log_from_ehr_fhir_logs()` #161

Open pbchase opened 1 week ago

pbchase commented 1 week ago

Use the optional parameters `start_date` and `ehr_id` in `get_hipaa_disclosure_log_from_ehr_fhir_logs()`. For each parameter that is specified, filter the data returned by the SQL query underlying this code block:

```r
dplyr::tbl(conn, "redcap_ehr_fhir_logs") |>
    dplyr::filter(.data$resource_type == "Patient" & .data$mrn != "") |>
    dplyr::left_join(user_information, by = c("user_id" = "ui_id")) |>
    dplyr::left_join(projects, by = c("project_id")) |>
    dplyr::collect()
```

The tricky part is making it performant. A code block that starts with `dplyr::tbl` and ends with `dplyr::collect()` is converted into SQL and handed to the SQL server. In my experience, these queries are inefficient if they use more than one filter statement, so I advise you to use only one. That gets hard when some of the conditions are optional.

You'll probably need to separate the tbl-->collect stanza from the code that follows it, saving an intermediate result. Then you can use conditional logic statements in R to decide which of a few varying tbl-->collect stanzas to send.
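A minimal sketch of one way to do this, offered as a starting point rather than the intended implementation. Instead of writing several near-identical stanzas, it uses conditional logic to build one predicate in R and injects it into a single `dplyr::filter()` call. The helper names `build_fhir_log_predicate()` and `get_filtered_fhir_logs()` are hypothetical. The column names `created_at` and `ehr_id`, the `NA` defaults, and a scalar `ehr_id` are assumptions that should be checked against the real table and function signature.

```r
# Hypothetical helper: build ONE predicate for a single dplyr::filter() call.
# ASSUMPTION: redcap_ehr_fhir_logs has columns named `created_at` and `ehr_id`.
build_fhir_log_predicate <- function(start_date = NA, ehr_id = NA) {
  # Conditions that always apply (taken from the existing filter)
  conditions <- list(
    quote(resource_type == "Patient"),
    quote(mrn != "")
  )
  # Append each optional condition only when its parameter is specified.
  # bquote() inlines the parameter's value, so the local variable `ehr_id`
  # cannot be confused with the column of the same name.
  if (!is.na(start_date)) {
    conditions <- c(
      conditions,
      list(bquote(created_at >= .(as.character(start_date))))
    )
  }
  if (!is.na(ehr_id)) {
    conditions <- c(conditions, list(bquote(ehr_id == .(ehr_id))))
  }
  # Join all conditions with `&` into a single expression
  Reduce(function(lhs, rhs) bquote(.(lhs) & .(rhs)), conditions)
}

# Hypothetical wrapper showing where the predicate would be used.
# ASSUMPTION: `user_information` and `projects` are tables on the same
# connection, as in the existing stanza.
get_filtered_fhir_logs <- function(conn, user_information, projects,
                                   start_date = NA, ehr_id = NA) {
  predicate <- build_fhir_log_predicate(start_date, ehr_id)
  dplyr::tbl(conn, "redcap_ehr_fhir_logs") |>
    dplyr::filter(!!predicate) |> # still exactly one filter statement
    dplyr::left_join(user_information, by = c("user_id" = "ui_id")) |>
    dplyr::left_join(projects, by = c("project_id")) |>
    dplyr::collect()
}
```

With both parameters left at `NA`, the predicate reduces to the two existing conditions, so the generated query should match the current one. Whether this performs as well as hand-written stanzas would need to be checked against a real REDCap database.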