tidymodels / probably

Tools for post-processing class probability estimates
https://probably.tidymodels.org/

int_conformal_split(): Residuals should be sorted? #126

Open mdancho84 opened 11 months ago

mdancho84 commented 11 months ago

I looked at the int_conformal_split() function. One thing I did differently in modeltime's implementation is that I take the absolute value of the residuals and then sort them. This ensures that the intended share of residuals falls at or below the value picked at the selected level.
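To make the idea concrete, here is a minimal sketch of the absolute-value-then-sort step. It is not the probably or modeltime code: the function name `split_conformal_halfwidth`, the toy residuals, and the use of the common finite-sample rank ceil((n + 1) * level) are all assumptions made for illustration.

```python
import math

def split_conformal_halfwidth(residuals, level=0.90):
    """Half-width of a split-conformal interval (hypothetical helper)."""
    # Nonconformity scores: absolute residuals, sorted ascending.
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * level)-th smallest score.
    k = math.ceil((n + 1) * level)
    if k > n:
        # Too few calibration points for this level: the interval is unbounded.
        return math.inf
    return scores[k - 1]

# Hypothetical calibration residuals (observed minus predicted).
residuals = [-2.0, 0.5, -0.1, 1.5, -0.7, 0.3, -1.1, 0.9, 0.2]
q = split_conformal_halfwidth(residuals, level=0.80)

# Symmetric interval around a hypothetical point prediction.
prediction = 10.0
interval = (prediction - q, prediction + q)
```

Because the scores are sorted after taking absolute values, the selected score bounds at least the requested share of the calibration residuals in magnitude. Picking the same position from unsorted or signed residuals would not have that property.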

https://github.com/tidymodels/probably/blob/c46326651109fb2ebd1b3762b3cb086cfb96ac88/R/conformal_infer_split.R#L107

Probably Implementation:

image

What I implemented in modeltime:

image

brshallo commented 4 months ago

@mdancho84 that seems consistent with what Angelopoulos and Bates do in A Gentle Introduction to Conformal Prediction...

image

It looks like the same issue is in predict.int_conformal_quantile():

https://github.com/tidymodels/probably/blob/1d4b4c1f09bfa5a870dc6a7c6fb4334a34d67b4b/R/conformal_infer_quantile.R#L129

topepo commented 4 months ago

They are "pre-absoluted" and pre-sorted for split inference and only pre-sorted for quantile inference.

For the quantile method, I was working off of Ryan Tibshirani's notes (page 13, section 4.2), which do not use the absolute value (and probably don't need to, since the method operates in a completely different way).
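For readers wondering why the quantile method can skip the absolute value, here is a rough sketch of the score commonly used in conformalized quantile regression. It is an assumption that this matches the cited notes or the package code exactly; the function name `cqr_adjustment` and the toy data are hypothetical.

```python
import math

def cqr_adjustment(y, lower, upper, level=0.90):
    """Conformal adjustment for quantile-based intervals (hypothetical helper)."""
    # Signed score: positive when y falls outside [lower, upper],
    # negative when it falls inside. No absolute value is taken.
    scores = sorted(max(lo - yi, yi - hi) for yi, lo, hi in zip(y, lower, upper))
    n = len(scores)
    k = math.ceil((n + 1) * level)
    return math.inf if k > n else scores[k - 1]

# Hypothetical calibration outcomes and predicted lower/upper quantiles.
y     = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 1.5, 3.5, 2.0]
upper = [2.0, 2.5, 4.5, 3.0]

q = cqr_adjustment(y, lower, upper, level=0.50)

# Widen (or shrink, if q is negative) a new predicted quantile interval.
new_lower, new_upper = 4.0, 6.0
interval = (new_lower - q, new_upper + q)
```

The sign carries information here: a negative adjustment means the raw quantile intervals were too wide on the calibration set, so taking absolute values would discard the ability to shrink them.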

If it helps, I did do a pretty thorough simulation of these functions to check their coverage. Not that there are no errors in the code, but the current version does seem to do what it is intended to do (statistically, at least).
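As an illustration of what such a coverage check can look like, here is a toy Monte Carlo sketch. It is not the simulation referred to above: the setup (a model that always predicts 0, standard normal outcomes), the function names, and all sizes are assumptions chosen to keep the example small.

```python
import math
import random

def conformal_halfwidth(residuals, level):
    # Absolute residuals, sorted, then the ceil((n + 1) * level)-th smallest.
    scores = sorted(abs(r) for r in residuals)
    k = math.ceil((len(scores) + 1) * level)
    return math.inf if k > len(scores) else scores[k - 1]

def simulate_coverage(level=0.90, n_cal=100, n_test=50, n_trials=200, seed=1):
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_trials):
        # Toy setup: the model always predicts 0 and the truth is standard
        # normal noise, so each residual is just the outcome itself.
        cal = [rng.gauss(0.0, 1.0) for _ in range(n_cal)]
        q = conformal_halfwidth(cal, level)
        # Count new outcomes that land inside [-q, q].
        covered += sum(abs(rng.gauss(0.0, 1.0)) <= q for _ in range(n_test))
    return covered / (n_trials * n_test)

coverage = simulate_coverage()
```

If the interval construction is right, the empirical coverage should land close to the nominal level, here 0.90, and typically slightly above it because of the finite-sample rank.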

brshallo commented 3 months ago

> They are "pre-absoluted" and pre-sorted for split inference and only pre-sorted for quantile inference.

Ahh, I see now, thanks. (For the quantile-based method, the Angelopoulos & Bates paper I referenced also just sorts and does not take absolute values, consistent with your implementation and Tibshirani's notes.)