sebp / scikit-survival

Survival analysis built on top of scikit-learn
GNU General Public License v3.0

Ipcw estimation: Add small value for numerical stability #415

juliusge closed this issue 10 months ago

juliusge commented 1 year ago

I frequently run into situations where the censoring distribution estimator predicts a probability of zero at a few time points, and I end up with the error "censoring survival function is zero at one or more time points". https://github.com/sebp/scikit-survival/blob/32ae49b51865e1aecf72d0bf87c64e735bf5d548/sksurv/nonparametric.py#L583

I suggest adding a small value to prevent this, something like:

EPSILON = 1e-6
Ghat = np.minimum(Ghat + EPSILON, 1.0)
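A minimal sketch of what this clamp does to inverse-probability-of-censoring weights. The `ghat` values are made up for illustration and this is not the library's code; only the `EPSILON` clamp itself comes from the suggestion above.

```python
import numpy as np

EPSILON = 1e-6  # small constant from the suggestion above

# Hypothetical censoring survival estimates at four event times;
# the last one is exactly zero, which would normally raise the error.
ghat = np.array([1.0, 0.8, 0.5, 0.0])

# Shift every estimate up by EPSILON and cap at 1.0.
ghat_clamped = np.minimum(ghat + EPSILON, 1.0)

# Inverse-probability-of-censoring weights are 1 / G(t).
weights = 1.0 / ghat_clamped

print(ghat_clamped)
print(weights)
```

With these values the division no longer fails, but the sample whose estimate was zero receives a weight of roughly one million, while the other weights stay close to 1, 1.25 and 2.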

I can confirm that for the data I use (real-world data: conversion to an eye disease), the IBS and dynamic AUC metrics are the same for models trained and evaluated with or without the change.

Thank you very much for this great package!

sebp commented 1 year ago

I'm against ignoring a zero probability of being censored, because this indicates that estimation is not possible. Adding a small epsilon would indicate everything is fine, which it is not.
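For illustration, a minimal sketch of one way such a zero can arise: when the largest observed time is censored, a reverse Kaplan-Meier estimate of the censoring distribution drops to zero there, so the data carry no information about censoring at or beyond that time. This is plain NumPy, not scikit-survival's implementation; `censoring_survival`, `evaluate`, and the toy data are hypothetical, and the sketch assumes there are no tied times.

```python
import numpy as np


def censoring_survival(time, event):
    """Reverse Kaplan-Meier estimate of G(t) = P(C > t), assuming no tied times."""
    order = np.argsort(time)
    t = np.asarray(time, dtype=float)[order]
    # In the reverse estimator, censored samples play the role of events.
    censored = ~np.asarray(event, dtype=bool)[order]
    # Number of samples still at risk just before each sorted time.
    n_at_risk = len(t) - np.arange(len(t))
    return t, np.cumprod(1.0 - censored / n_at_risk)


def evaluate(t, g, query):
    """Evaluate the right-continuous step function G at a single time point."""
    idx = np.searchsorted(t, query, side="right") - 1
    return 1.0 if idx < 0 else float(g[idx])


# Toy data (hypothetical): the sample with the largest time is censored.
time = [1.0, 2.0, 3.0, 4.0, 5.0]
event = [True, False, True, True, False]

t, g = censoring_survival(time, event)
print(g)                    # [1.   0.75 0.75 0.75 0.  ]
print(evaluate(t, g, 4.5))  # 0.75
print(evaluate(t, g, 5.0))  # 0.0
```

In this toy case, an event time of 5.0 or later would need a weight of 1 / 0, which is the situation the error message reports.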