scikit-learn-contrib / MAPIE

A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
https://mapie.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

A measure of uncertainty for multioutput regression #437

Open AndreaPi opened 2 months ago

AndreaPi commented 2 months ago

I often need to perform active learning in the context of multioutput regression, so I need a measure of uncertainty for my regressor. However, MapieRegressor is not compatible with MultiOutputRegressor.

Describe the solution you'd like

I would like to obtain either valid prediction intervals for each output (that is, PIs with formal coverage for each variable marginally, but not jointly) or, even better, valid prediction regions (that is, hyperrectangles with formal coverage for all variables jointly), although I don't know whether the latter case is covered by conformal prediction. The implementation should be compatible with MultiOutputRegressor. In other words, I'm looking for a MultiOutputMapieRegression whose estimator parameter can accept an instance of the MultiOutputRegressor class.
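To make the two targets concrete, here is a minimal sketch using plain split conformal prediction with numpy and scikit-learn only. It does not use MAPIE's API, and it is not a proposal for how MAPIE should implement this. The helper `conformal_quantile`, the synthetic data, the choice of `Ridge`, and the residual scaling are all assumptions made for illustration. Marginal intervals take one conformal quantile of absolute residuals per output. The joint hyperrectangle uses a single score, the maximum scaled absolute residual across outputs, which is one simple way to get joint coverage under exchangeability.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor


def conformal_quantile(scores, alpha):
    """Hypothetical helper: finite-sample-corrected (1 - alpha) quantile."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]


# Synthetic 3-output regression problem (illustrative data only).
X, y = make_regression(
    n_samples=2000, n_features=10, n_targets=3, noise=10.0, random_state=0
)
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=0
)
X_cal, X_test, y_cal, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

model = MultiOutputRegressor(Ridge()).fit(X_train, y_train)
alpha = 0.1  # target miscoverage, i.e. 90% nominal coverage

# Absolute residuals on the held-out calibration set, shape (n_cal, n_outputs).
abs_res_cal = np.abs(y_cal - model.predict(X_cal))

# 1) Marginal intervals: one conformal quantile per output.
q_marginal = np.array(
    [conformal_quantile(abs_res_cal[:, j], alpha) for j in range(y.shape[1])]
)

# 2) Joint hyperrectangle: a single score per sample, the max over outputs
#    of scaled residuals. The scale comes from the training set so that it
#    is fixed with respect to the calibration and test points.
scale = np.abs(y_train - model.predict(X_train)).mean(axis=0)
q_joint = conformal_quantile((abs_res_cal / scale).max(axis=1), alpha)
half_width_joint = q_joint * scale

y_pred = model.predict(X_test)
lower_m, upper_m = y_pred - q_marginal, y_pred + q_marginal
lower_j, upper_j = y_pred - half_width_joint, y_pred + half_width_joint

# Empirical coverage: per output for (1), all outputs at once for (2).
marginal_cov = ((y_test >= lower_m) & (y_test <= upper_m)).mean(axis=0)
joint_cov = ((y_test >= lower_j) & (y_test <= upper_j)).all(axis=1).mean()
print("marginal coverage per output:", marginal_cov)
print("joint coverage of hyperrectangle:", joint_cov)
```

For active learning, the interval half-widths here are constant across inputs, so they would not rank candidate points by themselves. An input-dependent score (for example, residuals normalized by a per-sample dispersion estimate) would be needed for that, which this sketch does not attempt.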

LacombeLouis commented 2 months ago

Hey @AndreaPi, thank you for raising this issue. This is indeed something we have thought about (see #97 and #431). Our roadmap is quite busy at the moment, but we will have a half-year review soon.

Have you identified any papers describing the solution you want that could be implemented in MAPIE? Thank you!

AndreaPi commented 1 month ago

Hi @LacombeLouis,

thanks for the answer! Regarding papers, I'm definitely not an expert on this topic, but here are a few papers that seem relevant to me:

https://proceedings.mlr.press/v162/angelopoulos22a/angelopoulos22a.pdf
https://arxiv.org/abs/2110.00816
https://arxiv.org/abs/2101.12002

Alternatively, I wonder if you're in contact with Ryan Tibshirani? He didn't cover multioutput regression in his course this year (https://www.stat.berkeley.edu/~ryantibs/statlearn-s24/), but I'm sure he or Yaniv Romano would have tips on this. Finally, I may ask Anastasios Angelopoulos, though I guess he'll point me back to his paper above.