tidymodels / workflows

Modeling Workflows
https://workflows.tidymodels.org/

plumbing post-processing for survival analysis #229

Open simonpcouch opened 1 month ago

simonpcouch commented 1 month ago

From Hannah in the linked discussion:

Okay so here are my thoughts on what plumbing post-processing for survival analysis would need.

Basic assumptions

I have not yet validated either assumption. Max, do you have a sense of whether they are valid?

For the implementation, this implies

Specifying the predictions

Specifying eval_time

Originally posted by @hfrick in https://github.com/tidymodels/workflows/pull/225#pullrequestreview-2035569467

simonpcouch commented 1 month ago

More thoughts from Hannah in a follow-up comment on the same PR:

After chatting with Max:

  • Max agrees with the basic assumptions except for the single eval time point. He thinks we might want to calibrate/post-process at multiple time points. I agree we might want to do that in general; I'm just not sure whether that would be specified and done in one calibration operation (for example, if it requires fitting multiple calibration models). Either way, it doesn't change where we need eval time values, just how many, and that is a decision we can make later on.
  • In light of Simon's and my thoughts on how to specify the data split needed to fit a workflow whose post-processor must be fitted on a separate dataset, I've considered adding eval_time there (in add_tailor()). I still think specifying eval_time in tailor() directly is the right move: it's needed for fitting the post-processor, so it can't live only in the workflow.
  • Max and I agree we should have an idea of what infrastructure we'd need for post-processing for survival, but not include any placeholder arguments at this point. Hence, we can remove the time argument in tailors (tailor#16).
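
To make the first two points above concrete, here is a rough base-R sketch of why the eval times have to live on the post-processor itself. Everything in it is a toy stand-in: `toy_postprocessor()` and `fit_toy_postprocessor()` are hypothetical names, not part of tailor or workflows, and the "one logistic model per eval time" approach is only one of the options discussed. It also ignores censoring, which a real implementation could not do.

```r
# Toy stand-ins only: none of these functions exist in tailor or workflows.

# The eval times are stored on the post-processor itself, so they are
# available when it is fitted, with or without a workflow around it.
toy_postprocessor <- function(eval_time) {
  stopifnot(is.numeric(eval_time), length(eval_time) >= 1)
  list(eval_time = eval_time, fits = NULL)
}

# Fit one logistic calibration model per eval time.
# `preds` is assumed to have columns `.eval_time`, `.pred_survival`, and
# `event_free` (1 if the subject is known to be event-free at that time).
# Censoring is ignored here (an assumption made purely for brevity).
fit_toy_postprocessor <- function(x, preds) {
  x$fits <- lapply(x$eval_time, function(t) {
    at_t <- preds[preds$.eval_time == t, , drop = FALSE]
    glm(event_free ~ .pred_survival, data = at_t, family = binomial())
  })
  names(x$fits) <- as.character(x$eval_time)
  x
}

# Simulated predictions at two eval times (made-up data).
set.seed(1)
n <- 200
preds <- data.frame(
  .eval_time = rep(c(10, 20), each = n),
  .pred_survival = runif(2 * n, min = 0.05, max = 0.95)
)
preds$event_free <- rbinom(2 * n, size = 1, prob = preds$.pred_survival)

fitted_pp <- fit_toy_postprocessor(toy_postprocessor(eval_time = c(10, 20)), preds)
names(fitted_pp$fits)
```

Going from a single eval time to several only changes the length of `eval_time` and the number of fitted models; the place where the values are supplied stays the same.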