sepsis_target vs pseudo_target

BorgwardtLab / mgp-tcn

Sepsis Prediction on MIMIC

BSD 3-Clause "New" or "Revised" License

69 stars, 25 forks

Hello,

First of all, thank you very much for this repo! Open-sourcing something like this has been extremely helpful and practical. I do have a couple of questions that I couldn't find answers to in the paper, and I hope they can be addressed here.

I was able to extract the sepsis cohort and the necessary files by running the scripts. In the cases_55h_hourly_vitals table, I can see that for sepsis patients (cases), sepsis_target starts at 0, becomes 1 once onset is identified, and eventually becomes 2. What does 2 represent in this case? Also, patients without sepsis (controls) have a column named pseudo_target, which also takes the values 0, 1, and 2. What does this column represent for the control group, and why does it use the same labels as the case group?

Please shed some light! Thank you very much!

@antranttu: From reading the paper, my impression is as follows: for cases, 2 indicates that the patient has recovered from sepsis. For controls, it helps to understand the procedure by which they were generated. First, 10 controls are matched to each sepsis case. Then the sepsis onset time of that case, say 12 hours after admission to the ICU, is used as the onset of the "pseudo_target" for its matched controls. Conceptually, a "pseudo_target" value of 1 marks the time at which the control would have had sepsis if they had developed sepsis at all. That sounds weird, but I think it is necessary to prevent the classifier from exploiting, e.g., the difference between the end of the ICU stay for a control and the beginning of the ICU stay for a sepsis case.

In other words, if you are training a classifier that uses the last 8 hours before sepsis onset to predict sepsis, which 8 hours do you use for control cases that never developed sepsis? That is what you use the pseudo_target label for.
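To make that alignment concrete, here is a minimal sketch in plain Python. It is not the repository's code: the column names icustay_id, hour, and heart_rate, the helper window_before_onset, and the toy data are all assumptions made for illustration, and it relies on my reading that the first row with a target value of 1 marks the (pseudo-)onset.

```python
from collections import defaultdict


def window_before_onset(rows, target_col, horizon=8):
    """For each stay, return the rows from the `horizon` hours before the
    first row whose `target_col` equals 1 (the assumed onset marker)."""
    by_stay = defaultdict(list)
    for row in rows:
        by_stay[row["icustay_id"]].append(row)  # hypothetical ID column

    windows = {}
    for stay_id, stay_rows in by_stay.items():
        stay_rows.sort(key=lambda r: r["hour"])  # hypothetical time column
        onset = next((r["hour"] for r in stay_rows if r[target_col] == 1), None)
        if onset is None:
            continue  # no (pseudo-)onset recorded for this stay
        windows[stay_id] = [
            r for r in stay_rows if onset - horizon <= r["hour"] < onset
        ]
    return windows


# Toy data: one case with onset at hour 12, and one matched control that
# inherits hour 12 as its pseudo-onset even though its stay is longer.
case_rows = [
    {"icustay_id": 1, "hour": h, "heart_rate": 80 + h,
     "sepsis_target": 1 if h >= 12 else 0}
    for h in range(20)
]
control_rows = [
    {"icustay_id": 2, "hour": h, "heart_rate": 70,
     "pseudo_target": 1 if h >= 12 else 0}
    for h in range(30)
]

case_windows = window_before_onset(case_rows, "sepsis_target")
control_windows = window_before_onset(control_rows, "pseudo_target")
```

With this toy data, both the case and the control contribute hours 4 through 11, so the classifier sees windows taken from the same point in the ICU stay for both groups.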

All of this is just my understanding of the paper and the code; I welcome any corrections.

sepsis_target vs pseudo_target #14