Ackension closed this issue 3 years ago
We did a complete case analysis. To reduce the number of subjects excluded, for the covariates that the majority of subjects did not have recorded (cvp, bnp, troponin and creatinine kinase), the models used a flag indicating whether the subject had that covariate recorded, instead of the measurement itself.
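To illustrate the flag idea described above, here is a minimal R sketch. The data frame, the column names (`age`, `bnp`, `bnp_flag`) and the values are all made up for illustration; they are not taken from the study data or from the repository's code.

```r
# Toy data (hypothetical): bnp is missing for most subjects
df <- data.frame(
  age = c(64, 71, 58, 80, 45),
  bnp = c(NA, 350, NA, 1200, NA)
)

# Flag is 1 if bnp was recorded for the subject, 0 otherwise
df$bnp_flag <- as.integer(!is.na(df$bnp))

# Rows that survive a complete case analysis when the model uses
# the raw measurement versus the flag
n_with_measurement <- sum(complete.cases(df[, c("age", "bnp")]))
n_with_flag <- sum(complete.cases(df[, c("age", "bnp_flag")]))

print(df)
print(c(measurement = n_with_measurement, flag = n_with_flag))
```

Because the flag is never missing, a model that uses `bnp_flag` in place of `bnp` does not lose the subjects who had no bnp measurement.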
In R, for some models such as glm, you can pass na.exclude (recommended) or na.omit to the na.action argument for complete case analysis.
https://github.com/nus-mornin-lab/echo-mimiciii/blob/master/notebooks/02_primary.ipynb
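As a rough sketch of the na.action options mentioned above, the following R snippet fits the same logistic regression twice on toy data. The data and the names `df`, `fit_omit` and `fit_excl` are invented for illustration and are not from the notebook linked above.

```r
# Toy data (hypothetical): covariate x is missing for two subjects
df <- data.frame(
  y = c(1, 1, 0, 0, 1, 0, 1, 0),
  x = c(1.2, NA, 0.7, 2.5, 1.9, NA, 3.1, 0.4)
)

# Both settings drop the rows with a missing x before fitting
fit_omit <- glm(y ~ x, data = df, family = binomial, na.action = na.omit)
fit_excl <- glm(y ~ x, data = df, family = binomial, na.action = na.exclude)

# The estimated coefficients are the same in both fits
print(all.equal(coef(fit_omit), coef(fit_excl)))

# The difference shows up in fitted values and residuals:
# na.omit returns one value per row actually used,
# na.exclude pads the dropped rows with NA so the result lines up with df
print(length(fitted(fit_omit)))
print(length(fitted(fit_excl)))
```

Keeping the output the same length as the input data is the practical reason to prefer `na.exclude`: predictions and residuals can be attached back to the original data frame without re-matching rows.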
Thanks for your reply. It is really hard for me to understand these statistics. I looked back at your sql code as well, and it still confuses me: why are these lab results set into four categories? Actually, I can't even fully understand your sql code.
What 4 categories?
I mean, why set 'labname' as a two-level flag (0 and 1), and then classify it into first, min, max and abnormal?
Those are not categories. first, min and max mean the first, minimum and maximum measurements of a lab item for a certain patient. abnormal means whether the measurement is considered abnormal based on the FLAG column. Only cvp, bnp, troponin and creatinine kinase have the two-level flag 0 or 1. What they mean was explained in my previous comment.
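A small R sketch may make the first, min, max and abnormal summaries more concrete. Everything here is hypothetical: the table layout, the column names and the values are invented, and treating a patient as abnormal when any of their measurements is flagged is an assumption, not necessarily what the sql files do.

```r
# Toy lab measurements (hypothetical), several rows per patient
labs <- data.frame(
  patient_id = c(1, 1, 1, 2, 2),
  charttime = c(1, 2, 3, 1, 2),
  value = c(4.1, 5.6, 3.9, 7.2, 6.8),
  flag = c(NA, "abnormal", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Sort by time so the first row per patient is the first measurement
labs <- labs[order(labs$patient_id, labs$charttime), ]

# One summary row per patient
lab_summary <- do.call(rbind, lapply(split(labs, labs$patient_id), function(d) {
  data.frame(
    patient_id = d$patient_id[1],
    lab_first = d$value[1],
    lab_min = min(d$value),
    lab_max = max(d$value),
    # Assumption: abnormal if any measurement carries the "abnormal" flag
    lab_abnormal = as.integer(any(d$flag == "abnormal", na.rm = TRUE))
  )
}))

print(lab_summary)
```

Each patient ends up with one row holding four separate summary columns for the lab item, which is why these are columns rather than categories.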
The sql folder is not a good place to start. You should start from notebooks instead; then you'll have a better idea of what the sql files mean and their context, for example this file: 01_run_sql.ipynb.
Some parts of the code might look complicated. You could look at the outputs to get an idea of what it does.
Thank you so much!
I was excited to finish reading your paper published in ICM, and I was wondering how you dealt with the missing data. I am a college student and very interested in medical data processing, but Python and R are hard for me to understand right now. I hope to get details on this question; specific R code (I have just started learning) would be even better. I appreciate it.