ModelOriented / survex

Explainable Machine Learning in Survival Analysis
https://modeloriented.github.io/survex
GNU General Public License v3.0

errors "Error in Ops.Surv(observed, predicted) : Invalid operation on a survival time" #59

Closed MosquitoFan closed 1 year ago

MosquitoFan commented 1 year ago

I used mlr3 and mlr3proba with surv.xgboost as follows. Since xgboost cannot output a survival matrix, I wrapped it in a pipeline:

```r
xgb_task = TaskSurv$new("train", backend = data.frame(train), time = "time", event = "event")

xgb_lrn = as_learner(ppl(
  c("distrcompositor"),
  lrn("surv.xgboost", objective = "survival:cox", nrounds = 300L, eta = 0.1),
  form = "ph"
))
```

Then I tried to explain the model. I set predict_survival_function and predict_cumulative_hazard_function, both of which run successfully when I call them on their own:

```r
explainer <- explain(
  model = xgb_lrn,
  data = train,
  y = Surv(time[!fold == i1], event[!fold == i1]),
  predict_function = function(model, newdata) {
    predict(model, newdata, predict_type = "<Prediction>")$crank
  },
  predict_survival_function = function(model, newdata, times) {
    t(predict(model, newdata, predict_type = "<Prediction>")$distr$survival(times))
  },
  predict_cumulative_hazard_function = function(model, newdata, times) {
    t(predict(model, newdata, predict_type = "<Prediction>")$distr$cumHazard(times))
  }
)
```

```
Preparation of a new explainer is initiated
  -> model label       :  R6  (  default  )
  -> data              :  1498  rows  62  cols
  -> target variable   :  1498  values
  -> predict function  :  function(model, newdata) { predict(model, newdata, predict_type = "<Prediction>")$crank }
  -> predicted values  :  No value for predict function target column. (  default  )
  -> model_info        :  package mlr3 , ver. 0.14.0 , task regression (  default  )
  -> predicted values  :  numerical, min =  -7.026592 , mean =  -0.8300239 , max =  11.02893
  -> residual function :  difference between y and yhat (  default  )
  -> residuals         :  the residual_function returns an error when executed (  WARNING  )
```

However, I get errors when I add output_type, and also when I call model_performance(explainer):

```
Error in predict_function(model, newdata, ...) : unused argument (output_type = "chf")

Error in Ops.Surv(observed, predicted) : Invalid operation on a survival time

model_performance(explainer)

Error in Ops.Surv(y, predict_function(model, data)) : Invalid operation on a survival time
```

I don't know what's wrong with my code.

mikolajsp commented 1 year ago

Hello @MosquitoFan,

I think the issue is that the wrong kind of explainer is created (a DALEX explainer rather than a survex one), because wrapping the learner in a pipeline doesn't give the resulting object the class we would expect (LearnerSurv).

However, the possible fix is simple:

I've tested it on some of my data and it works. Let me know if it solved your issue :)

P.S. If you see `residuals ...` in the last line of the output when creating an explainer, you're creating a DALEX explainer, not a survex one.
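For readers who land here with the same error: the snippet with the fix is not shown in this thread, so the following is only a sketch of one approach that fits the diagnosis above. It assumes that calling survex's `explain_survival()` directly, instead of the generic `explain()`, avoids the dispatch on the pipeline object's class. It also assumes that `xgb_task` and `xgb_lrn` exist as defined in the question and that `train` is a data frame with `time` and `event` columns. The helper `predict_full` and the label are made-up names for illustration.

```r
library(survival)  # Surv()
library(survex)    # explain_survival(), model_performance()

# Assumed to exist already, as in the question:
#   xgb_task, xgb_lrn : the survival task and the pipeline learner
#   train             : data frame with columns "time" and "event"
xgb_lrn$train(xgb_task)  # skip if the learner is already trained

# Hypothetical helper: return the full mlr3 Prediction object for new data
predict_full <- function(model, newdata) {
  predict(model, newdata, predict_type = "<Prediction>")
}

explainer <- explain_survival(
  model = xgb_lrn,
  # pass only the features, so time/event are not treated as predictors
  data = train[, setdiff(colnames(train), c("time", "event"))],
  y = Surv(train$time, train$event),
  predict_function = function(model, newdata) {
    predict_full(model, newdata)$crank
  },
  predict_survival_function = function(model, newdata, times) {
    # transpose so that rows are observations and columns are time points
    t(predict_full(model, newdata)$distr$survival(times))
  },
  predict_cumulative_hazard_function = function(model, newdata, times) {
    t(predict_full(model, newdata)$distr$cumHazard(times))
  },
  label = "xgboost (cox) + distrcompositor"  # illustrative label
)

# A survex explainer should carry the "surv_explainer" class;
# a plain DALEX explainer would fail this check.
stopifnot(inherits(explainer, "surv_explainer"))

model_performance(explainer)
```

If the `stopifnot()` check passes and the creation log no longer ends with a `residuals` line, the explainer is a survex one, and survival-specific functions such as `model_performance()` should no longer try to subtract predictions from a `Surv` object.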

MosquitoFan commented 1 year ago

Hi mikolaj,

Thank you so much, it works and the plot looks beautiful!