scikit-learn-contrib / MAPIE

A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
https://mapie.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
1.2k stars, 99 forks

Add CatBoostRegressor to MapieQuantileRegressor class #217

Open fjpa121197 opened 1 year ago

fjpa121197 commented 1 year ago

Is your feature request related to a problem? Please describe.
I'm currently using CatBoostRegressor for a regression problem, where this model seems to give the best performance. I now want to produce prediction intervals while keeping the same model, and I would like to use MAPIE for this rather than build separate models in my own code.

Describe the solution you'd like
Use CatBoostRegressor with CQR.

Describe alternatives you've considered
One alternative is to not use MAPIE: create two additional models, follow the same approach to calculate q, and then form the sets myself. The other is to modify the code base and add CatBoostRegressor support. I'm not really good at coding, though, and I'm still working through the quantile_regression.py file to see how CatBoost could be added. The difficulty, I think, is that the only way to set the quantile loss function and alpha in CatBoost is to put both values in the same string, like this: model = catboost.CatBoostRegressor(loss_function='Quantile:alpha=0.95', ...)
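For reference, the manual alternative described above (calculate q on a calibration set, then form the intervals) can be sketched as follows. This is a minimal illustration of the standard CQR calibration step, not MAPIE's implementation. The function names `cqr_correction` and `cqr_intervals` are hypothetical, and the lower and upper predictions are assumed to come from two already-trained quantile models (for example, two CatBoostRegressor instances).

```python
import math


def cqr_correction(y_calib, lower_calib, upper_calib, alpha):
    """Compute the CQR correction q from a held-out calibration set.

    lower_calib / upper_calib are the predictions of the lower and upper
    quantile models on the calibration inputs (hypothetical inputs here).
    """
    n = len(y_calib)
    # Conformity score: how far the true value falls outside the
    # predicted interval (negative when it lies inside).
    scores = sorted(
        max(lo - y, y - hi)
        for y, lo, hi in zip(y_calib, lower_calib, upper_calib)
    )
    # Finite-sample corrected rank of the (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: infinite interval.
        return math.inf
    return scores[k - 1]


def cqr_intervals(lower_pred, upper_pred, q):
    """Widen (or shrink, if q < 0) each predicted interval by q."""
    return [(lo - q, hi + q) for lo, hi in zip(lower_pred, upper_pred)]


# Toy numbers, only to show the call pattern.
q = cqr_correction(
    y_calib=[1.0, 2.0, 3.0, 4.0],
    lower_calib=[0.5, 2.5, 2.0, 3.0],
    upper_calib=[1.5, 3.5, 3.5, 3.5],
    alpha=0.5,
)
intervals = cqr_intervals([1.0], [2.0], q)
```

The calibration data must be separate from the data used to train the two quantile models, otherwise the correction q is too optimistic.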

I can maybe take a look at it and work on it, but it will take time. It would be a good first contribution for me :)

Thanks in advance,

vtaquet commented 1 year ago

Hi @fjpa121197! Indeed, MapieQuantileRegressor does not accept CatBoostRegressor at the moment. A workaround would be to modify the quantile_estimator_params attribute to include this class as a new key. The problem is that CatBoost defines both the "quantile" objective and the "alpha" value in the same loss_function argument, a case that MAPIE does not handle at the moment. @LacombeLouis, do you think it can be easily added?
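To make the mismatch concrete, here is a small sketch of how the same quantile setting is expressed for the two libraries. The helper `quantile_params` is hypothetical and is not part of MAPIE. It only builds keyword arguments: scikit-learn's GradientBoostingRegressor takes the loss name and the quantile level as two separate parameters, while CatBoost packs both into one string.

```python
def quantile_params(library, quantile):
    """Return constructor kwargs for a quantile regressor.

    Hypothetical helper for illustration only; `library` is either
    "sklearn" (GradientBoostingRegressor) or "catboost".
    """
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")
    if library == "sklearn":
        # Loss name and quantile level are two separate arguments.
        return {"loss": "quantile", "alpha": quantile}
    if library == "catboost":
        # Loss name and quantile level share a single string argument.
        return {"loss_function": f"Quantile:alpha={quantile}"}
    raise ValueError(f"unsupported library: {library}")


sklearn_kwargs = quantile_params("sklearn", 0.95)
catboost_kwargs = quantile_params("catboost", 0.95)
# Usage, assuming the libraries are installed:
#   GradientBoostingRegressor(**sklearn_kwargs)
#   catboost.CatBoostRegressor(**catboost_kwargs)
```

A mapping from one parameter name to one value is enough for the scikit-learn case, but the CatBoost case needs the value to be formatted into the string, which is why adding a new key alone is not sufficient.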

LacombeLouis commented 1 year ago

Hey @vtaquet and @fjpa121197, we are currently adding a cv="prefit" option that will let you pre-train your models and then use them directly with MapieQuantileRegressor() (PR#214). For the moment, a built-in solution will not be our main focus, but if you want to contribute, I suggest you follow the contributing guidelines, and we will be delighted to help you through the process.