scikit-learn-contrib / MAPIE

A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
https://mapie.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Coverage validity not verified in MapieRegressor when the number of calibration data is very small #452

Closed thibaultcordier closed 5 months ago

thibaultcordier commented 6 months ago

Describe the bug: In MapieRegressor, the quantile calculation does not guarantee valid coverage when the number of calibration data is very small.

To Reproduce: Mahdi Torabi Rad tested the validity of the coverage rigorously (see his notebook: https://github.com/mtorabirad/MLBoost/blob/main/Episode15/Episode15Main.ipynb) in the special case where the number of calibration data is very small (6 in the following illustration).

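Expected behavior: Use the corrected quantile level $\lceil (n+1)(1-\alpha) \rceil / n$, where $n$ is the number of calibration data and $\alpha$ is the target miscoverage level, so that the coverage is valid for any number of calibration data.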
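To make the effect of the correction concrete, here is a minimal sketch that is independent of MAPIE's actual code. The function names `conformal_quantile` and `empirical_coverage` are hypothetical, and the uncorrected rank $\lceil n(1-\alpha) \rceil$ is used only as an illustrative baseline, not as a description of what MapieRegressor does internally. The scores are simulated absolute Gaussian residuals, which is also an assumption.

```python
import math

import numpy as np


def conformal_quantile(scores, alpha, corrected=True):
    """Pick a calibration score by rank (hypothetical helper, not MAPIE API).

    corrected=True  -> rank ceil((n + 1) * (1 - alpha)), i.e. level
                       ceil((n + 1) * (1 - alpha)) / n of the empirical distribution.
    corrected=False -> rank ceil(n * (1 - alpha)), an illustrative naive baseline.
    Returns +inf when the requested rank exceeds n (interval is unbounded).
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    m = n + 1 if corrected else n
    k = math.ceil(m * (1 - alpha))
    if k > n:
        return math.inf
    return scores[k - 1]


def empirical_coverage(n_calib, alpha, corrected=True, n_trials=20000, seed=0):
    """Monte Carlo estimate of P(test score <= quantile) over fresh draws."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        # n_calib calibration scores plus one exchangeable test score
        scores = np.abs(rng.normal(size=n_calib + 1))
        q = conformal_quantile(scores[:n_calib], alpha, corrected=corrected)
        hits += int(scores[n_calib] <= q)
    return hits / n_trials


if __name__ == "__main__":
    for flag in (True, False):
        cov = empirical_coverage(n_calib=6, alpha=0.2, corrected=flag)
        print(f"corrected={flag}: empirical coverage = {cov:.3f}")
```

With 6 calibration points and $\alpha = 0.2$, the corrected rank is 6, so the theoretical coverage for exchangeable continuous scores is $6/7 \approx 0.857 \geq 0.8$. The naive rank is 5, giving $5/7 \approx 0.714$, which falls below the target. The simulation should land close to these values.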

Screenshots: (image not included)

thibaultcordier commented 6 months ago

Issue #451 provides a link to the video presenting the notebook.