havakv / pycox

Survival analysis with PyTorch
BSD 2-Clause "Simplified" License

features in sac_admin5 #93

Closed berkuva closed 2 years ago

berkuva commented 2 years ago

Hello,

Thank you for the open source project!

I was reading one of the notebooks, and I noticed that the sac_admin5 dataset contains both event_true and event features. Can you please explain the difference between the two? Thank you!

havakv commented 2 years ago

So, I'll try to explain using the docs: https://github.com/havakv/pycox/blob/master/pycox/datasets/from_simulations.py#L97

duration and event are just the regular, potentially right-censored event time and event indicator. These are the target variables one typically encounters in a survival dataset.

duration_true and event_true, on the other hand, are the true uncensored counterparts to these variables. So duration_true is the simulated event time, censoring_true is the simulated censoring time, and we just set duration = min(duration_true, censoring_true) and event = duration_true <= censoring_true.
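To make that rule concrete, here is a minimal sketch in plain Python. The numbers are hand-picked for illustration and are not taken from sac_admin5; only the min and <= rules come from the explanation above.

```python
# Hand-picked values for illustration only; they are NOT from sac_admin5.
duration_true = [12.0, 40.0, 75.0, 30.0]   # simulated event times
censoring_true = [20.0, 25.0, 75.0, 90.0]  # simulated censoring times

# Observed time is whichever comes first: the event or the censoring.
duration = [min(t, c) for t, c in zip(duration_true, censoring_true)]

# The event is observed only if it happens no later than the censoring time.
event = [int(t <= c) for t, c in zip(duration_true, censoring_true)]

print(duration)  # [12.0, 25.0, 75.0, 30.0]
print(event)     # [1, 0, 1, 1]
```

The second subject is the only censored one: its event would have happened at 40, but censoring at 25 comes first, so duration is 25 and event is 0.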

Finally, duration_true can itself be censored, but that only occurs at the end of the experiment (max time of 100). So event_true should be 1 for every duration_true that is less than 100.
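A rough sketch of that end-of-experiment cutoff, again in plain Python. The raw_event_time values and the exact truncation step are assumptions made for illustration; the actual simulation code in pycox may produce these columns differently.

```python
MAX_TIME = 100.0  # end of the experiment, as stated above

# Hypothetical untruncated event times (illustrative, not real data).
raw_event_time = [35.0, 99.9, 140.0]

# Assumption: times past the end of the experiment are cut off at MAX_TIME ...
duration_true = [min(t, MAX_TIME) for t in raw_event_time]
# ... and flagged as censored (0), while everything earlier is an event (1).
event_true = [int(t < MAX_TIME) for t in raw_event_time]

# The property described above: event_true is 1 whenever duration_true < 100.
assert all(e == 1 for d, e in zip(duration_true, event_true) if d < MAX_TIME)

print(duration_true)  # [35.0, 99.9, 100.0]
print(event_true)     # [1, 1, 0]
```

The same check can be run on the real dataset by filtering the rows with duration_true below 100 and confirming that event_true is 1 for all of them.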