ryantibs / conformal

Tools for conformal inference in regression
GNU General Public License v2.0

Adding conformal extensions to right-censored data #22

Open ariane-cwi opened 11 months ago

This contribution applies conformal inference techniques to right-censored data. Details are given in "A Comprehensive Framework for Evaluating Time to Event Predictions using the Restricted Mean Survival Time" (Cwiling et al., 2023).

The extensions cover two things: prediction intervals computed with the split and rank-one-out (ROO) algorithms, and variable importance measures computed with the leave-one-covariate-out (LOCO) methodology.

Several existing R files gain new functions (check.R, split.R, roo.R, loco.R, loco.roo.R), and two new files are added (ipcw.R, rmst.pred.R).

The code for the illustrations of the article is in the file fig.loco.surv.R, in the folder cwiling23.

No automated tests are implemented. The package has been checked only by running the code in fig.loco.surv.R.

check.R

New function: check.args.surv

With standard data, the responses are stored in a single vector y. With censored data, they are stored in two vectors, t and d: t contains the observed times and d contains the censoring indicators. In addition to the checks performed in check.args, check.args.surv validates the additional argument d and the time horizon tau, which is the value needed to define the restricted mean survival time.
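As a rough illustration of the kind of checks described above, here is a language-agnostic sketch in Python. It is not the package's R code: the function name `check_args_surv`, the convention that `d` is 1 for an observed event and 0 for a censored one, and the exact conditions tested are all assumptions made for the example.

```python
import math


def check_args_surv(t, d, tau):
    """Validate observed times t, censoring indicators d, and horizon tau.

    Hypothetical helper for illustration; the checks actually performed by
    check.args.surv may differ.
    """
    if len(t) == 0 or len(t) != len(d):
        raise ValueError("t and d must be non-empty and of equal length")
    for v in t:
        if not isinstance(v, (int, float)) or not math.isfinite(v) or v < 0:
            raise ValueError("t must contain finite, non-negative numbers")
    for v in d:
        # Assumed coding: 1 = event observed, 0 = censored.
        if v not in (0, 1):
            raise ValueError("d must contain only 0 (censored) or 1 (event)")
    if not isinstance(tau, (int, float)) or not math.isfinite(tau) or tau <= 0:
        raise ValueError("tau must be a finite, positive number")
    return True


check_args_surv([2.0, 5.5, 3.1], [1, 0, 1], tau=4.0)
```

Passing an indicator outside {0, 1} or a non-positive `tau` raises a `ValueError` in this sketch.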

ipcw.R (new file)

New functions: ipcw, ipcw.km, ipcw.cox, ipcw.rfsrc

These functions compute Inverse Probability of Censoring Weights (IPCW), which are needed to handle right-censored data. The survival function of the censoring time can be estimated with Kaplan-Meier (ipcw.km), a Cox model (ipcw.cox), or Random Survival Forests (ipcw.rfsrc). ipcw wraps these three functions.
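To make the Kaplan-Meier variant concrete, the following Python sketch computes generic IPCW weights of the form d_i / G(t_i-), where G is the Kaplan-Meier estimate of the censoring survival function. It is an illustration only, not the code in ipcw.R: the function names are hypothetical, `d = 1` is assumed to mark an observed event, ties between event and censoring times are assumed absent, and the package may additionally truncate at the horizon tau.

```python
def km_censoring_survival(t, d):
    """Kaplan-Meier estimate of G(s) = P(C > s), treating censored
    observations (d == 0) as the 'events'. Assumes no tied times."""
    censor_times = sorted(set(ti for ti, di in zip(t, d) if di == 0))
    steps = []  # (time, value of G just after that time)
    surv = 1.0
    for s in censor_times:
        at_risk = sum(1 for ti in t if ti >= s)
        censored = sum(1 for ti, di in zip(t, d) if ti == s and di == 0)
        surv *= 1.0 - censored / at_risk
        steps.append((s, surv))

    def G(s, left_limit=False):
        value = 1.0
        for time, sv in steps:
            if time < s or (time == s and not left_limit):
                value = sv
            else:
                break
        return value

    return G


def ipcw_weights(t, d):
    """Weight d_i / G(t_i-) for each observation; censored ones get 0."""
    G = km_censoring_survival(t, d)
    weights = []
    for ti, di in zip(t, d):
        g = G(ti, left_limit=True)
        weights.append(1.0 / g if di == 1 and g > 0 else 0.0)
    return weights


# Four subjects; the second is censored at time 2.
weights = ipcw_weights([1, 2, 3, 4], [1, 0, 1, 1])
```

In this toy example the censored subject receives weight zero, and the two subjects observed after the censoring time are up-weighted to compensate.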

split.R

New function: conformal.pred.split.surv

Similar to conformal.pred.split, but adapted to right-censored data. The function weighted.quantile incorporates the censoring weights into the quantile estimate. Unlike conformal.pred.split, the function accepts more than one value of alpha, so prediction intervals at several confidence levels can be computed in a single call.
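The idea of a weighted quantile evaluated at several values of alpha can be sketched as follows. This Python example is a simplified illustration, not a port of conformal.pred.split.surv or weighted.quantile: the function names and signatures are hypothetical, the quantile is a plain weighted empirical quantile of absolute calibration residuals, and any finite-sample correction used by the package is omitted.

```python
def weighted_quantile(values, weights, prob):
    """Smallest value whose normalized cumulative weight reaches prob."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    cum = 0.0
    for v, w in pairs:
        cum += w
        if cum / total >= prob:
            return v
    return pairs[-1][0]


def split_intervals(pred_new, calib_resid, calib_weights, alphas):
    """Return {alpha: (lower, upper)} around one new prediction, using the
    weighted (1 - alpha) quantile of the calibration residuals."""
    intervals = {}
    for a in alphas:
        q = weighted_quantile(calib_resid, calib_weights, 1.0 - a)
        intervals[a] = (pred_new - q, pred_new + q)
    return intervals


# Absolute residuals on the calibration half, with censoring weights.
resid = [0.5, 1.0, 2.0, 4.0]
wts = [1.0, 0.0, 1.5, 1.5]
intervals = split_intervals(10.0, resid, wts, alphas=[0.1, 0.5])
```

Because the residuals and weights are computed once, looping over `alphas` adds only a quantile lookup per level, which is the practical benefit of accepting several levels in one call.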

roo.R

New function: conformal.pred.roo.surv

Similar to conformal.pred.roo, but adapted to right-censored data. The function weighted.quantile incorporates the censoring weights into the quantile estimate. Unlike conformal.pred.roo, the function accepts more than one value of alpha, so prediction intervals at several confidence levels can be computed in a single call.
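The leave-one-out step that distinguishes ROO from split conformal can be sketched in Python as below. The sketch covers a single fold whose predictions come from a model fitted on the other fold, and it is not the code in roo.R: the function names are hypothetical, the weights are assumed to be censoring weights supplied by the caller, and finite-sample rank adjustments are omitted.

```python
def weighted_quantile(values, weights, prob):
    """Smallest value whose normalized cumulative weight reaches prob."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    cum = 0.0
    for v, w in pairs:
        cum += w
        if cum / total >= prob:
            return v
    return pairs[-1][0]


def roo_intervals(preds, resids, weights, alphas):
    """For each point i of one fold, build intervals from the weighted
    quantile of the other residuals in that fold (point i left out)."""
    result = []
    for i, mu in enumerate(preds):
        others_r = resids[:i] + resids[i + 1:]
        others_w = weights[:i] + weights[i + 1:]
        per_alpha = {}
        for a in alphas:
            q = weighted_quantile(others_r, others_w, 1.0 - a)
            per_alpha[a] = (mu - q, mu + q)
        result.append(per_alpha)
    return result


# One fold of four points, predictions from a model fitted on the other fold.
fold = roo_intervals(
    preds=[10.0, 10.0, 10.0, 10.0],
    resids=[0.5, 1.0, 2.0, 4.0],
    weights=[1.0, 1.0, 1.0, 1.0],
    alphas=[0.5],
)
```

Every point in the data set gets an interval this way once the roles of the two folds are swapped, which is why the ROO variant is suited to producing intervals at the observed points.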

loco.R

New functions: loco.surv, print.loco.surv, my.surv.sign.test

loco.surv is similar to loco, but adapted to right-censored data. Only one test is available, the one implemented in my.surv.sign.test (details in Cwiling et al., 2023). The print method print.loco.surv is adapted to the corresponding object class.
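For readers unfamiliar with LOCO, the Python sketch below shows the basic quantity being tested: the increase in absolute prediction error when covariate j is dropped, followed by an ordinary one-sided exact sign test. This is the plain, uncensored version and is not my.surv.sign.test, whose handling of censoring is described in Cwiling et al. (2023); the function names are hypothetical.

```python
from math import comb


def loco_excess_errors(y, pred_full, pred_without_j):
    """Per-observation excess error: |y - mu_{-j}(x)| - |y - mu(x)|."""
    return [
        abs(yi - pj) - abs(yi - pf)
        for yi, pf, pj in zip(y, pred_full, pred_without_j)
    ]


def sign_test_pvalue(deltas):
    """One-sided exact sign test of H0: median <= 0 against H1: median > 0.

    Zeros are discarded; the p-value is P(Binomial(n, 1/2) >= k), where k
    is the number of positive values among the n non-zero ones.
    """
    nonzero = [x for x in deltas if x != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    k = sum(1 for x in nonzero if x > 0)
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


y = [3.0, 5.0, 7.0, 9.0, 11.0]
pred_full = [3.1, 4.8, 7.2, 9.1, 10.7]      # model using all covariates
pred_without_j = [4.0, 6.5, 5.0, 10.0, 13.0]  # model refitted without j
deltas = loco_excess_errors(y, pred_full, pred_without_j)
p_value = sign_test_pvalue(deltas)
```

A small p-value suggests that dropping covariate j tends to increase the prediction error, which is the sense in which LOCO calls a variable important.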

loco.roo.R

New function: loco.roo.surv

loco.roo.surv is similar to loco.roo, but adapted to right-censored data. It calls conformal.pred.roo.surv to construct the prediction intervals.

rmst.pred.R (new file)

New functions: rmst.pred, print.rmst.pred, plot.rmst.pred, wrss

The function rmst.pred is a wrapper for a comprehensive evaluation of a model that estimates the restricted mean survival time. First, the mean squared error is estimated with the WRSS estimator (see Cwiling et al., 2023), implemented in wrss, with or without cross-validation. Then prediction intervals are computed for all observations with the ROO algorithm (conformal.pred.roo.surv). Finally, variable importance is assessed with the LOCO methodology: loco.roo.surv is called for local variable importance and loco.surv for global variable importance. Results can be printed or plotted with print.rmst.pred and plot.rmst.pred.
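For reference, the target quantity and the general shape of a censoring-weighted squared error can be written as follows. The first line is the standard definition of the restricted mean survival time at horizon tau. The second line is only an assumed generic form of an IPCW-weighted residual sum of squares, with weights that vanish for observations censored before tau; the exact WRSS estimator and its weights are defined in Cwiling et al. (2023) and may differ.

```latex
% Restricted mean survival time at horizon \tau, given covariates x
\[
  \mathrm{RMST}(\tau \mid x)
    = \mathbb{E}\left[\min(T, \tau) \mid X = x\right]
    = \int_0^{\tau} S(t \mid x)\, dt
\]

% Assumed generic form of an IPCW-weighted squared error (illustrative only)
\[
  \frac{1}{n} \sum_{i=1}^{n} \hat{w}_i
    \left( \min(t_i, \tau) - \hat{\mu}(x_i) \right)^2
\]
```

Here T is the true event time, S is its conditional survival function, t_i is the observed time, and \hat{\mu} is the fitted RMST model.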