Elixhauser index - GitHub Issues

royarkaprava commented 3 years ago

Hi, is there a way to calculate the Elixhauser comorbidity index using the ricu package?

-Arkaprava

nbenn commented 3 years ago

This depends on the kind of data that is provided by the individual datasets. The HiRID dataset, for example, does not contain any diagnosis information, while AUMC does, but it's all in Dutch. On the MIMIC datasets as well as eICU, on the other hand, this should be fairly straightforward, as they contain ICD-coded diagnoses, which can, for example, be fed into the icd package.
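
For intuition, a comorbidity count of this kind boils down to mapping ICD codes to categories and counting the categories hit per admission. Below is a minimal base R sketch with made-up rows and a deliberately tiny prefix map. The three ICD-9 prefixes are chosen for illustration only, the names `dx`, `elix_prefixes`, and `flag_category` are hypothetical, and none of this is the icd package's API or the full Elixhauser definition.

```r
# Toy ICD-9 diagnoses keyed by admission (made-up rows, not MIMIC data).
dx <- data.frame(
  hadm_id   = c(1L, 1L, 1L, 2L, 3L, 3L),
  icd9_code = c("4280", "5859", "4280", "27800", "V3000", "5856"),
  stringsAsFactors = FALSE
)

# Deliberately incomplete prefix map (assumption for illustration); the real
# Elixhauser definition covers about 30 categories with many more codes each.
elix_prefixes <- list(
  chf     = "428",   # congestive heart failure
  renal   = "585",   # chronic kidney disease
  obesity = "2780"   # overweight and obesity
)

# For one category: does any code of an admission start with one of its prefixes?
flag_category <- function(prefixes) {
  hit <- Reduce(`|`, lapply(prefixes, function(p) startsWith(dx$icd9_code, p)))
  tapply(hit, dx$hadm_id, any)
}

# Logical matrix: one row per admission, one column per category.
flags <- sapply(elix_prefixes, flag_category)

# Unweighted count of flagged categories per admission.
elix_count <- rowSums(flags)
print(elix_count)
```

Repeated codes within an admission (such as the duplicated 4280 above) count once per category. For real analyses, a maintained implementation with the full code lists and weights is the safer choice.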

royarkaprava commented 3 years ago

I was talking about MIMIC. Thank you so much for letting me know about this package. This will be a very useful package. Thank you. Does there exist a list of packages provided somewhere?

nbenn commented 3 years ago

I'm sorry, I don't understand your question:

> Does there exist a list of packages provided somewhere?

What kind of list/packages are you referring to?

royarkaprava commented 3 years ago

Sorry, I am still learning these things, so I asked a rather weird question. I was thinking of a list of packages that can be used to compute all kinds of medical scores such as shock index, comorbidity indices, SIRS, etc. I know ricu computes some of those. Anyway, I now think it is very hard to get a comprehensive list of these.

I have another question regarding the function load_concepts(). Its output contains a lot of NAs, which, according to the messages it prints, are often due to unit mismatches or similar issues. Is it possible that a "typographical" mistake in the units in the data files causes important data to be discarded? For context, I am talking about the MIMIC-III and MIMIC-IV datasets that I downloaded from PhysioNet.

nbenn commented 3 years ago

Can you maybe give a concrete example?

We try to handle unit conversion as best we can and I would be surprised if this causes loss of loads of data (but there might very well be a mistake somewhere, so I'd be curious as to where exactly you see this behavior).

On eICU, units are a bit of a mess, but MIMIC (III and IV) are fairly well behaved in this respect.

royarkaprava commented 3 years ago

Thank you so much for your help with this. Sorry for the long output that follows. I am only showing the variables for which some rows were removed. This is with the MIMIC-IV dataset. The other variables that I tried to track did not have any issues. As you can see, the temperature variable had many entries discarded. It is a bit shocking.

temp ( ) removed 1005071 (34.94%) of rows due to out of range (or NA) entries ( ) not all units are in [C], [°C]: Â°C (14.78%)
hr ( ) removed 28 (0%) of rows due to out of range (or NA) entries
resp ( ) removed 429 (0.01%) of rows due to out of range (or NA) entries
pco2 ( ) removed 300 (0.07%) of rows due to out of range (or NA) entries
wbc ( ) removed 2087 (0.27%) of rows due to out of range (or NA) entries
bnd ( ) removed 8 (0.02%) of rows due to out of range (or NA) entries
adm ( ) removed 2 (0%) of rows due to NA values
sbp ( ) removed 130 (0%) of rows due to out of range (or NA) entries
resp ( ) removed 429 (0.01%) of rows due to out of range (or NA) entries
alb ( ) removed 67 (0.06%) of rows due to out of range (or NA) entries
ph ( ) removed 305 (0.06%) of rows due to out of range (or NA) entries
ca ( ) removed 333 (0.05%) of rows due to out of range (or NA) entries
glu ( ) removed 901 (0.09%) of rows due to out of range (or NA) entries
hgb ( ) removed 1968 (0.25%) of rows due to out of range (or NA) entries
mg ( ) removed 608 (0.08%) of rows due to out of range (or NA) entries
ptt ( ) removed 5399 (1.03%) of rows due to out of range (or NA) entries
k ( ) removed 584 (0.07%) of rows due to out of range (or NA) entries
alt ( ) removed 2012 (0.88%) of rows due to out of range (or NA) entries
bun ( ) removed 908 (0.11%) of rows due to out of range (or NA) entries
cl ( ) removed 2019 (0.23%) of rows due to out of range (or NA) entries
bicar ( ) removed 408 (0.05%) of rows due to out of range (or NA) entries
inr_pt ( ) removed 4275 (0.89%) of rows due to out of range (or NA) entries
na ( ) removed 811 (0.09%) of rows due to out of range (or NA) entries
lact ( ) removed 136 (0.05%) of rows due to out of range (or NA) entries
tco2 ( ) removed 475 (0.11%) of rows due to out of range (or NA) entries
crea ( ) removed 929 (0.11%) of rows due to out of range (or NA) entries
cai ( ) removed 462 (0.19%) of rows due to out of range (or NA) entries
pt ( ) removed 4215 (0.87%) of rows due to out of range (or NA) entries
plt ( ) removed 3209 (0.41%) of rows due to out of range (or NA) entries
ast ( ) removed 141 (0.06%) of rows due to out of range (or NA) entries
bili ( ) removed 2023 (0.89%) of rows due to out of range (or NA) entries
wbc ( ) removed 2087 (0.27%) of rows due to out of range (or NA) entries
dbp ( ) removed 794 (0.01%) of rows due to out of range (or NA) entries
sbp ( ) removed 130 (0%) of rows due to out of range (or NA) entries
map ( ) removed 7122 (0.1%) of rows due to out of range (or NA) entries
pco2 ( ) removed 300 (0.07%) of rows due to out of range (or NA) entries
po2 ( ) removed 25190 (5.86%) of rows due to out of range (or NA) entries
fio2 ( ) removed 4313 (0.45%) of rows due to out of range (or NA) entries ( ) not all units are in [%]: NA (93.81%)
resp ( ) removed 429 (0.01%) of rows due to out of range (or NA) entries
temp ( ) removed 1005071 (34.94%) of rows due to out of range (or NA) entries ( ) not all units are in [C], [°C]: Â°C (14.78%)
weight ( ) removed 15 (0.02%) of rows due to out of range (or NA) entries
hr ( ) removed 28 (0%) of rows due to out of range (or NA) entries
o2sat ( ) removed 2977 (0.04%) of rows due to out of range (or NA) entries
dobu_rate ( ) removed 3 (0.01%) of rows due to out of range (or NA) entries
dobu_rate ( ) removed 38 (0%) of rows due to out of range (or NA) entries
dopa_rate ( ) removed 35 (0.12%) of rows due to out of range (or NA) entries
epi_rate ( ) removed 567 (1.31%) of rows due to out of range (or NA) entries
epi_rate ( ) removed 567 (1.31%) of rows due to out of range (or NA) entries
norepi_rate ( ) removed 246 (0.05%) of rows due to out of range (or NA) entries ( ) not all units are in [mcg/kg/min], [mcgkgmin]: mg/kg/min (0%)
dopa_rate ( ) removed 35 (0.12%) of rows due to out of range (or NA) entries
phn_rate ( ) not all units are in [mcg/kg/min]: mcg/min (0%)
norepi_rate ( ) removed 246 (0.05%) of rows due to out of range (or NA) entries ( ) not all units are in [mcg/kg/min], [mcgkgmin]: mg/kg/min (0%)
phn_rate ( ) not all units are in [mcg/kg/min]: mcg/min (0%)
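
A side note on the unit message for temp: `Â°C` looks like `°C` whose UTF-8 bytes were read as Latin-1, which would explain why it fails to match the expected `[°C]`. This is an assumption about the cause, not something confirmed in this thread. A minimal base R sketch that reproduces the garbling (the variable names are illustrative):

```r
# "°C" written with an escape so the script itself is encoding-independent.
expected <- "\u00b0C"

# Take the UTF-8 bytes of "°C" and declare them to be Latin-1, which is what
# a mis-declared file encoding would do on import.
bytes   <- charToRaw(enc2utf8(expected))
garbled <- rawToChar(bytes)
Encoding(garbled) <- "latin1"
garbled <- enc2utf8(garbled)

# The garbled string gains a leading "Â" (U+00C2) and no longer equals "°C".
is_mojibake   <- garbled == "\u00c2\u00b0C"
matches_unit  <- garbled == expected
print(c(is_mojibake = is_mojibake, matches_unit = matches_unit))
```

If this is indeed the cause, the affected rows carry a valid Celsius unit that merely fails a string comparison, which is a different issue from rows with no numeric value at all.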

prockenschaub commented 2 years ago

Just in case someone comes across this, the large proportion of missingness in the MIMIC IV temperature concept is caused by the inclusion of itemid==224027 in the concept definition, which relates to a skin temperature measurement (rather than body temperature).

load_id(miiv$d_items, itemid == 224027)

> # An `id_tbl`: 1 x 9
> # Id var:      `itemid`
>   itemid label            abbreviation linksto     category          unitname param_type lownormalvalue highnormalvalue
>    <int> <chr>            <chr>        <chr>       <chr>             <chr>    <chr>               <dbl>           <dbl>
> 1 224027 Skin Temperature Skin Temp    chartevents Skin - Assessment NA       Text                   NA              NA

The value here isn't a numeric °C or °F measurement but a subjective categorical classification (cool, warm, hot, etc.). As the rows below show, valuenum is NA for this item, which accounts for the high number of missing entries.

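library(dplyr)  # provides %>%, select() and slice()

# First five chart rows for the skin temperature item
load_ts(miiv$chartevents, itemid == 224027) %>%
   select(itemid:valueuom) %>%
   slice(1:5)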

>    itemid value valuenum valueuom
> 1: 224027  Warm       NA     <NA>
> 2: 224027  Cool       NA     <NA>
> 3: 224027  Warm       NA     <NA>
> 4: 224027  Warm       NA     <NA>
> 5: 224027  Warm       NA     <NA>
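
The effect on the missingness figure can be reproduced on a toy table. The sketch below uses made-up rows, and the item ids 1 and 2 are placeholders for numeric temperature items (assumptions, not the actual MIMIC-IV ids); only 224027 is taken from the output above.

```r
# Made-up chart rows: numeric temperature items (placeholder ids 1 and 2)
# mixed with the categorical skin temperature item 224027.
chart <- data.frame(
  itemid = c(1L, 1L, 2L, 224027L, 224027L, 1L),
  value  = c("36.8", "37.2", "98.6", "Warm", "Cool", "38.1"),
  stringsAsFactors = FALSE
)

# Text values cannot be parsed as numbers and therefore become NA.
chart$valuenum <- suppressWarnings(as.numeric(chart$value))

# Share of rows without a numeric value, overall and per item.
overall_na <- mean(is.na(chart$valuenum))
na_by_item <- tapply(is.na(chart$valuenum), chart$itemid, mean)

# Excluding the categorical item leaves no missing numeric values here.
kept       <- chart[chart$itemid != 224027L, ]
na_after   <- mean(is.na(kept$valuenum))
print(list(overall = overall_na, by_item = na_by_item, after = na_after))
```

In this toy case all missing values come from the single categorical item, so dropping it from the selection brings the NA share to zero without touching any numeric measurement.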

@nbenn I would suggest removing itemid==224027 from the definition. I can write a PR if you want.

nbenn commented 2 years ago

@prockenschaub sure, go ahead if you like, sounds reasonable. Thanks for investigating! I'll try to get this merged a bit quicker than last time. This would probably also affect the other MIMIC datasets, i.e. mimic and potentially also mimic_demo, no?

prockenschaub commented 2 years ago

Created the PR. Sorry that it took so long, didn't do it right away and then forgot...

dplecko commented 10 months ago

This seems to be resolved, thanks @prockenschaub and @nbenn.

eth-mds / ricu

Elixhauser index #4