msberends / AMR

Functions to simplify and standardise antimicrobial resistance (AMR) data analysis and to work with microbial and antimicrobial properties by using evidence-based methods, as described in https://doi.org/10.18637/jss.v104.i03.
https://msberends.github.io/AMR/

extra factor level in class SIR (as.sir/is.sir) #151

Closed iDudeRPS closed 2 months ago

iDudeRPS commented 3 months ago

Hi Matthijs,

What are your thoughts on the following?

We have several lab results that contain a V instead of S, I or R. This is used for situations where there are no breakpoints for the antibiotic. See https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Guidance_documents/When_there_are_no_breakpoints_2024-02-29.pdf for the EUCAST guidance. In these instances guidance is given rather than an SIR conclusion. When using AMR (as.sir), these values are dropped because they do not fit into the SIR class, so we end up with a lot of empty cells. The new EUCAST documents have expanded the instances in which this is the case, so we find ourselves with an increasing amount of data containing a 'V'. I am sure that other labs have the same problem, perhaps with different letters.

Would it be possible to add a factor level to the SIR class that can contain these extra tokens, irrespective of the letter used, so that these results can be incorporated into analyses? Maybe a factor level above R?

msberends commented 3 months ago

Yes, by now I think there’s not much of a choice. Especially now that PK/PD breakpoints have been lifted.

I’ll think of something, but an additional factor level makes a lot of sense. Thanks for bringing it up!

msberends commented 3 months ago

I've started working on this here: https://github.com/msberends/AMR/commit/08a27922a8e153c3fc6e8ee0bd4a05e2b59d2eb5

I still need to test it, but the SIR factor has been expanded with SDD (to comply with CLSI, as requested by others) and N for non-interpretable. Following your nice suggestion, I added the option to define the regular expressions yourself, so you can tell the package which raw values belong to which level. The defaults are now:

S = "^(S|U)+$" # U = Susceptible for urines, common for some lab systems (and nice for e.g. AMC in EUCAST)
I = "^(I|H)+$" # H is used by some widely used systems to distinguish I (CLSI) from I (EUCAST); maybe it should be SDD, not sure yet
R = "^(R)+$"
N = "^(N|V)+$"
SDD = "^(SDD|D)+$" # D for dose-dependent, also used by NCBI

So with that, your 'V' values will be transformed to N, which is supported by all other functions in the package.
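
To make the mapping above concrete, here is a minimal base-R sketch of how such regular expressions could sort raw lab codes into levels. It is not the package's implementation: the helper `classify_sir_sketch()`, the upper-casing and trimming of the input, and the level order `S, SDD, I, R, N` are all assumptions made for illustration. Only the patterns are taken from the defaults quoted above.

```r
# Hypothetical helper, NOT part of the AMR package: a base-R sketch of
# regex-based mapping of raw lab codes onto an expanded set of levels.
sir_patterns <- c(
  S   = "^(S|U)+$",
  I   = "^(I|H)+$",
  R   = "^(R)+$",
  N   = "^(N|V)+$",
  SDD = "^(SDD|D)+$"
)

classify_sir_sketch <- function(x, patterns = sir_patterns) {
  # Normalise input (an assumption of this sketch)
  x <- toupper(trimws(as.character(x)))
  out <- rep(NA_character_, length(x))
  for (level in names(patterns)) {
    # Only fill values that are still unclassified and not missing
    hit <- is.na(out) & !is.na(x) & grepl(patterns[[level]], x)
    out[hit] <- level
  }
  # Level order is assumed here; values matching no pattern stay NA
  factor(out, levels = c("S", "SDD", "I", "R", "N"))
}

raw <- c("S", "U", "V", "R", "D", "H", "?", NA)
classified <- classify_sir_sketch(raw)
table(classified, useNA = "ifany")
```

In this sketch, 'V' ends up in N rather than becoming an empty cell, while an unrecognised code such as "?" still becomes NA.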

iDudeRPS commented 3 months ago

thumbs up!

iDudeRPS commented 3 months ago

Maybe add the option to put the U in a (separate) level below the S if needed? For each isolate we have separate rows of output for the same antibiotic, e.g. amc = U and amc = S (with explanations of what both mean). If both are assigned to the same level, we end up with duplicates and you lose information. If each is assigned to a separate level, it is still possible to aggregate or filter downstream if needed.
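
The duplicate problem described above can be illustrated with a small base-R sketch. The data frame, its column names and the isolate identifiers are made up for illustration and do not come from any real LIS export.

```r
# Made-up example data: one isolate reported twice for the same antibiotic,
# once with the urine-specific code U and once with S.
results <- data.frame(
  isolate    = c("iso1", "iso1", "iso2"),
  antibiotic = c("AMC", "AMC", "AMC"),
  raw        = c("U", "S", "R"),
  stringsAsFactors = FALSE
)

# Collapsing U into S makes the two iso1 rows indistinguishable
results$sir <- ifelse(results$raw == "U", "S", results$raw)
dup_after_merge <- duplicated(results[, c("isolate", "antibiotic", "sir")])

# Keeping the raw code alongside preserves the distinction
dup_with_raw <- duplicated(results[, c("isolate", "antibiotic", "raw")])

dup_after_merge
dup_with_raw
```

Once U is merged into S, the second `iso1` row is flagged as a duplicate. As long as the raw code is retained in its own column, no row is.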

msberends commented 2 months ago

I see, but I think it's better not to add additional 'S' levels for different circumstances. The outcome remains the same: it's S.

In your case, I think the better approach would've been to not have AMC = U and AMC = S, but to have AMC_U = S and AMC = S. Also for implementing breakpoints specifically for urine, it's much easier for a LIS to have them as a separate antibiotic. I understand that you cannot easily alter your LIS of course, but I think you can reach your goal for future aggregation and filtering if you map the U as SDD using this package. Or, just for your ad-hoc data analysis, add another column AMC_U to keep them apart:

library(AMR)
library(dplyr)

your_data <- tibble(AMC = c("S", "S", "U", "R", NA, "S"))
your_data
your_data %>%
  rename(AMC_raw = AMC) %>%
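  mutate(AMC = as.sir(if_else(AMC_raw == "U", "SDD", AMC_raw)),  # or make it NA instead of SDD here
         # NA_character_ keeps both if_else() branches the same type (required by older dplyr versions)
         AMC_U = as.sir(if_else(AMC_raw == "U", "S", NA_character_)))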