Open wjxgeorge opened 5 years ago
Hi Nephalen, I think there are clearly structured instructions for the MIMIC database here: https://github.com/alistairewj/sepsis3-mimic
I'm talking about the code in this repository. I've verified the MIMIC-III installation.
Hi Nephalen, I am experiencing the same issue as you when using the provided MATLAB code.
For example, ICU stay IDs 200035 and 299994 do not have any antibiotic prescriptions in the database (and thus do not meet the sepsis criteria); however, they are included in patientIDs_MIMIC3.csv.
Hi @Nephalen and @shengpu1126, I have issues similar to the ones you reported.
patientIDs_MIMIC3.csv). I tried to take the intersection between these two cohorts and got 0 common IDs (with the addition of the translation of 200,000 as described in the README file).
icustays table with the ones in patientIDs_MIMIC3.csv. There are also 0 common IDs. The possible reason for no common IDs could be the translation of 200,000 added to the published IDs. The largest subject_id for patients in MIMIC's icustays table is 99999, which is smaller than the translation of 200,000.
patientIDs_MIMIC3.csv. This time I got 14965 common IDs, compared to 20944 unique icustays I got from running the provided code on the MIMIC-III dataset. This means that some IDs in the provided csv files are problematic.
icustays table with the ones in patientIDs_MIMIC3.csv. There are 17803 common IDs, which indicates that these values could really be icustay IDs.
In short, the PatientID refers to icustay IDs, and running the provided code on the current MIMIC-III database results in a slightly different selection in my experience. Please correct me if I did something wrong.
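The comparisons above boil down to a set intersection, tried once on the raw IDs and once with the 200,000 translation applied. Here is a minimal Python sketch of that check, not the repository's code: the function name `count_common_ids`, the direction of the offset (added to the published IDs), and the toy ID values are all assumptions for illustration.

```python
def count_common_ids(published_ids, database_ids, offset=0):
    """Count published IDs that, after adding `offset`, appear in database_ids."""
    shifted = {int(i) + offset for i in published_ids}
    return len(shifted & {int(i) for i in database_ids})


# Toy values only, not real MIMIC-III data.
published = [55, 35, 99994]
icustay_ids = [200035, 200055, 299994]

print(count_common_ids(published, icustay_ids))          # raw IDs, no offset
print(count_common_ids(published, icustay_ids, 200000))  # with the 200,000 offset
```

Running the same function against each candidate column (subject_id, icustay_id) with and without the offset should make it clear which ID the published file most plausibly refers to.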
I'm currently working on a Python version of the data preprocessing code, and I'm having trouble reproducing the cohort indicated by patientIDs_MIMIC3.csv.
Some hadmid values corresponding to an icustayid in patientIDs_MIMIC3.csv are not in the abx.csv file in the first place. For example, icustayid 55 corresponds to hadmid 147080, which is not returned even when I query PhysioNet's MIMIC-III database directly.
Can anyone reproduce it using the MATLAB code?
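A quick way to list which cohort IDs never show up in the antibiotics extract is a set difference against the relevant column. The sketch below uses only the standard library; the helper name `missing_ids`, the column header `hadm_id`, and the in-memory stand-in for abx.csv are hypothetical, since the actual layout of abx.csv may differ.

```python
import csv
import io


def missing_ids(cohort_ids, reference_csv, column):
    """Return the cohort IDs that never appear in `column` of the reference CSV."""
    reader = csv.DictReader(reference_csv)
    present = {int(row[column]) for row in reader if row[column]}
    return sorted(set(cohort_ids) - present)


# Toy stand-in for abx.csv; the header name and rows are made up.
abx_file = io.StringIO("hadm_id,startdate\n100001,2150-01-01\n100002,2150-02-01\n")
print(missing_ids([100001, 147080], abx_file, "hadm_id"))
```

With a real file, the `io.StringIO` object would be replaced by `open("abx.csv", newline="")` and the column name adjusted to match the header.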