GEMINI-Medicine / Rgemini

A custom R package that provides a variety of functions to perform data analyses with GEMINI data
https://gemini-medicine.github.io/Rgemini/

Only include lab tests with valid results in `n_routine_bloodwork` #141

Open loffleraSMH opened 3 months ago

loffleraSMH commented 3 months ago

New Feature Request

After discussing this with Amol, we decided to revert to the previous version of n_routine_bloodwork, which excludes any blood tests that don't have valid results. The reason for this is that tests without valid results can be tests that were cancelled/not performed/not ordered at certain sites, so if we include those tests, we may be overcounting tests at certain hospitals (depending on whether they send us non-performed tests or not).

In the previous version of the code, the following code was used to extract valid numeric results:

  startwith.any <- function(x, prefix) {
    mat <- matrix(0, nrow = length(x), ncol = length(prefix))
    for (i in 1:length(prefix)) {
      mat[, i] <- startsWith(x, prefix[i])
    }
    return(as.vector(apply(mat, MARGIN = 1, FUN = sum)) > 0)
  }

  lab <-
    lab[, result_value := .(trimws(result_value))] %>%
    .[!is.na(as.numeric(result_value)) | startwith.any(result_value, c("<", ">"))] %>%
    .[, .(n_routine_bloodwork_derived = .N), .(genc_id)]

For the developer: The previous code likely needs to be updated! Please carefully double-check which entries in result_value can be converted to valid numeric results for sodium and hemoglobin tests. For hemoglobin tests, the following code is used in n_rbc_transfusions:

  hemoglobin[, result_value := as.numeric(
    stringr::str_replace_all(tolower(result_value), "@([a-z0-9]*)|<|>|less than|;", "")
    )]

Please check if this also works for sodium tests, or if this would miss certain entries that can be converted to numeric results with additional clean up.
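One way to run this check is to apply both rules to the same set of strings and see where they disagree. The sketch below is only an illustration: the `result_value` entries are made-up examples (not actual GEMINI data), the names `valid_prefix` and `valid_cleaned` are hypothetical, and base R `gsub()` stands in for `stringr::str_replace_all()` so that the snippet runs without extra packages.

```r
# Hypothetical example entries; real result_value strings may look different.
result_value <- c("140", " 138 ", "<120", "> 160", "less than 120",
                  "135@abc", "cancelled", NA, "")

# Rule 1 (previous n_routine_bloodwork): keep entries that are numeric
# after trimming, or that start with "<" or ">".
trimmed <- trimws(result_value)
is_num <- !is.na(suppressWarnings(as.numeric(trimmed)))
valid_prefix <- is_num | (!is.na(trimmed) & grepl("^[<>]", trimmed))

# Rule 2 (n_rbc_transfusions): strip known non-numeric pieces first,
# then keep entries that convert to a number.
cleaned <- gsub("@([a-z0-9]*)|<|>|less than|;", "", tolower(result_value))
valid_cleaned <- !is.na(suppressWarnings(as.numeric(cleaned)))

# Side-by-side view of where the two rules disagree.
comparison <- data.frame(result_value, valid_prefix, valid_cleaned)
print(comparison[valid_prefix != valid_cleaned, ])
```

On these made-up entries, the two rules disagree only for "less than 120" and "135@abc", which rule 2 accepts and rule 1 drops. Note that rule 1 would also accept a string such as "<abc" that has no numeric part, so tabulating the distinct non-numeric `result_value` entries for sodium tests in the actual data would show which patterns need handling.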