Closed royarkaprava closed 10 months ago
This depends on the kind of data provided by the individual datasets. The HiRID dataset, for example, does not contain any diagnosis information, while AUMC does, but it's all in Dutch. On the MIMIC datasets as well as eICU, on the other hand, this should be fairly straightforward, as they contain ICD-coded diagnoses, which can, for example, be fed into the icd package.
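As a rough illustration of what such a package does with ICD-coded diagnoses, the sketch below flags a few comorbidity categories per admission by matching ICD-9 code prefixes. The table `dx`, the helper `flag_comorbidities()` and the three-entry prefix map are assumptions made up for this example and are not part of ricu or icd. A real Elixhauser mapping covers about 30 categories and many more codes, so a dedicated package is the better choice in practice.

```r
# Toy diagnosis table: one row per (admission, ICD-9 code), with codes
# stored as character strings without the decimal point (hypothetical data).
dx <- data.frame(
  hadm_id   = c(1, 1, 2, 3, 3),
  icd9_code = c("4280", "25000", "5859", "4019", "25002"),
  stringsAsFactors = FALSE
)

# Deliberately tiny prefix map (NOT the full Elixhauser definition).
prefix_map <- list(
  chf      = "428",  # heart failure
  diabetes = "250",  # diabetes mellitus
  renal    = "585"   # chronic kidney disease
)

# Returns one row per admission with a logical column per category.
flag_comorbidities <- function(dx, prefix_map) {
  ids <- sort(unique(dx$hadm_id))
  out <- data.frame(hadm_id = ids)
  for (nm in names(prefix_map)) {
    hit <- startsWith(dx$icd9_code, prefix_map[[nm]])
    out[[nm]] <- ids %in% dx$hadm_id[hit]
  }
  out
}

flags <- flag_comorbidities(dx, prefix_map)
print(flags)
```

An index such as Elixhauser is then typically a (possibly weighted) sum over flags like these for each admission.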
I was talking about MIMIC. Thank you so much for letting me know about this package. This will be a very useful package. Thank you. Does there exist a list of packages provided somewhere?
I'm sorry, I don't understand your question:
"Does there exist a list of packages provided somewhere?"
What kind of list/packages are you referring to?
Sorry, I am still learning these things, so I asked a somewhat unclear question. I was thinking of a list of packages that can be used to compute all kinds of medical scores such as shock index, comorbidity, SIRS, etc. I know ricu computes some of those. Anyway, I now think it is very hard to get a comprehensive list of these.
I have another question regarding the function load_concepts(). Its output contains a lot of NAs, which, according to the messages it prints, are often due to unit mismatches. Is it possible that a typographical mistake in the units in the data files causes important data to be discarded? For context, I am talking about the MIMIC-III and MIMIC-IV datasets that I downloaded from PhysioNet.
Can you maybe make a concrete example?
We try to handle unit conversion as best we can and I would be surprised if this causes loss of loads of data (but there might very well be a mistake somewhere, so I'd be curious as to where exactly you see this behavior).
On eICU, units are a bit of a mess, but MIMIC (III and IV) are fairly well behaved in this respect.
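To judge whether a unit warning reflects real data loss, it can help to tabulate the recorded units for the affected items directly. The sketch below does this on a made-up table. The column names `valuenum` and `valueuom` follow the MIMIC chartevents layout, while the data, the helper `unit_share()` and the accepted unit set are hypothetical. This is not how ricu implements its own check.

```r
# Hypothetical measurements with a mix of unit labels, including a missing one.
obs <- data.frame(
  valuenum = c(36.8, 37.2, 98.6, 37.0, NA),
  valueuom = c("C", "C", "F", "C", NA),
  stringsAsFactors = FALSE
)

# Percentage of rows per unit label, keeping NA as its own category.
unit_share <- function(units) {
  tab <- table(units, useNA = "ifany")
  round(100 * tab / sum(tab), 2)
}

shares <- unit_share(obs$valueuom)
print(shares)

# Percentage of rows whose unit is not one of the accepted labels
# (a missing unit counts as not accepted).
accepted <- c("C")
pct_off <- round(100 * mean(!obs$valueuom %in% accepted), 2)
print(pct_off)
```

A large share under a single unexpected label, or under NA, usually points at one specific item rather than at scattered typos.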
Thank you so much for your help with this. Sorry for the long output that follows. I am only showing the variables for which some entries were removed. This is with the MIMIC-IV dataset. The other variables that I tried to track did not have any issues. As you can see, the temperature variable had many entries discarded, which is a bit shocking.
[load_concepts() console output; only the unit warnings are recoverable:]
not all units are in [C], [°C]: °C (14.78%)
not all units are in [%]: NA (93.81%)
not all units are in [mcg/kg/min], [mcgkgmin]: mg/kg/min (0%)
Just in case someone comes across this, the large proportion of missingness in the MIMIC IV temperature concept is caused by the inclusion of itemid==224027 in the concept definition, which relates to a skin temperature measurement (rather than body temperature).
library(ricu)
load_id(miiv$d_items, itemid == 224027)
> # An `id_tbl`: 1 x 9
> # Id var: `itemid`
> itemid label abbreviation linksto category unitname param_type lownormalvalue highnormalvalue
> <int> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
> 1 224027 Skin Temperature Skin Temp chartevents Skin - Assessment NA Text NA NA
The value here isn't a numeric C or F measurement, but a subjective categorical classification into cool, warm, hot, etc. As you can see below, the numeric value is NA, which accounts for the high number of missing entries.
library(dplyr)  # provides %>%, select() and slice()
load_ts(miiv$chartevents, itemid == 224027) %>%
  select(itemid:valueuom) %>%
  slice(1:5)
> itemid value valuenum valueuom
> 1: 224027 Warm NA <NA>
> 2: 224027 Cool NA <NA>
> 3: 224027 Warm NA <NA>
> 4: 224027 Warm NA <NA>
> 5: 224027 Warm NA <NA>
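A quick way to spot items like this is to tabulate, per itemid, the share of rows without a numeric value. The sketch below uses a made-up stand-in for chartevents. The itemid 224027 is the skin temperature item discussed above, whereas the second itemid (100001), all values and the helper `na_share_by_item()` are placeholders chosen for illustration.

```r
# Hypothetical stand-in for a few chartevents rows.
chart <- data.frame(
  itemid   = c(224027L, 224027L, 224027L, 100001L, 100001L, 100001L),
  value    = c("Warm", "Cool", "Warm", "36.9", "37.4", "38.1"),
  valuenum = c(NA, NA, NA, 36.9, 37.4, 38.1),
  stringsAsFactors = FALSE
)

# Fraction of rows with a missing numeric value, per itemid.
na_share_by_item <- function(x) {
  tapply(is.na(x$valuenum), x$itemid, mean)
}

na_share <- na_share_by_item(chart)
print(na_share)

# Dropping the text-only item leaves only rows with numeric values.
numeric_only <- chart[chart$itemid != 224027L, ]
print(mean(is.na(numeric_only$valuenum)))
```

An item whose share is 1 carries no numeric information at all, so including it in a numeric concept can only add missing entries.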
@nbenn I would suggest removing itemid==224027 from the definition. I can write a PR if you want.
@prockenschaub sure, go ahead if you like, sounds reasonable. Thanks for investigating! I'll try to get this merged a bit quicker than last time. This would probably also affect the other mimic datasets, i.e. mimic and potentially also mimic_demo, no?
Created the PR. Sorry that it took so long, didn't do it right away and then forgot...
This seems to be resolved, thanks @prockenschaub and @nbenn.
Hi, is there a way to calculate the Elixhauser comorbidity index using the ricu package?
-Arkaprava