scikit-learn-contrib / MAPIE

A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
https://mapie.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

[ENHANCEMENT] Covariate shift conformal #72

Open gmartinonQM opened 3 years ago

gmartinonQM commented 3 years ago

As described in this paper: https://arxiv.org/abs/1904.06019

RudrakshTuwani commented 2 years ago

Hi, I recently implemented this paper as part of a project and would love to discuss how it could be added to MAPIE.

gmartinonQM commented 2 years ago

Hi @RudrakshTuwani, thanks for your interest in MAPIE! Maybe the cleanest way to do this would be to:

  1. share a GitHub repo with your implementation so that we can give you feedback
  2. open a pull request on MAPIE, putting your own separate class in a dedicated module and following the template provided by the regression module
  3. write unit tests to maintain 100% coverage, and write a pedagogical example

RudrakshTuwani commented 2 years ago

Awesome, thanks for the guidance. I will keep you posted!

gmartinonQM commented 2 years ago

Any news on this, @RudrakshTuwani?

RudrakshTuwani commented 2 years ago

Hey @gmartinonQM, I had implemented the density ratio estimation class and a basic API for weighted conformal prediction before life got in the way. Could we meet virtually next week? I have some high-level questions, and I feel we could progress faster that way.

RudrakshTuwani commented 2 years ago

Hello @gmartinonQM, I have some good news! I have implemented the split conformal variant in my fork here: https://github.com/RudrakshTuwani/MAPIE

Changes in files:

  1. mapie/regression.py - Added class MapieCovShiftRegressor.
  2. mapie/dre.py - Added classes DensityRatioEstimator and ProbClassificationDRE.
  3. mapie/utils.py - Added a function empirical_quantile that computes a weighted empirical quantile.
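
For readers who have not seen the fork, a weighted empirical quantile can be sketched roughly as follows. This is only an illustration and not the fork's empirical_quantile: the name weighted_quantile, its signature, and the "smallest value whose cumulative weight reaches q" convention are all assumptions made for this example.

```python
import numpy as np


def weighted_quantile(values, weights, q):
    """Smallest v such that the normalized weight of {values <= v} is >= q.

    Hypothetical helper for illustration; not the fork's empirical_quantile.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if values.ndim != 1 or values.shape != weights.shape:
        raise ValueError("values and weights must be 1-D arrays of equal length")
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")

    # Sort the values and accumulate their normalized weights into a CDF.
    order = np.argsort(values)
    cdf = np.cumsum(weights[order]) / weights.sum()

    # First position where the CDF reaches q; clamp to guard against rounding.
    idx = min(int(np.searchsorted(cdf, q, side="left")), values.size - 1)
    return float(values[order][idx])


scores = [1.0, 2.0, 3.0, 4.0]
print(weighted_quantile(scores, [1, 1, 1, 1], 0.5))  # uniform weights
print(weighted_quantile(scores, [1, 1, 1, 7], 0.5))  # mass moved to the largest score
```

With uniform weights the median of these four scores is 2.0; once most of the weight sits on the largest score, the same quantile level returns 4.0, which is the mechanism that widens intervals under covariate shift.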

The replication of the paper's results is at examples/regression/4-covariate-shift/paper_replication.ipynb. There are a couple of differences, but they are mostly due to different model defaults between Python and R. In general, we see that MapieCovShiftRegressor is able to adapt to covariate shift with both oracle and estimated density ratios.
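
To make "estimated density ratios" concrete, here is a minimal sketch of density ratio estimation by probabilistic classification: train a classifier to tell source points from target points, then turn its predicted odds into a ratio. It is not the fork's ProbClassificationDRE; the function name fit_density_ratio, the choice of LogisticRegression, and the clipping constant are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def fit_density_ratio(X_source, X_target):
    """Return a callable estimating w(x) = p_target(x) / p_source(x).

    Hypothetical helper: a classifier separates source (label 0) from
    target (label 1), and its odds are rescaled by the sample-size ratio.
    """
    X_source = np.asarray(X_source, dtype=float)
    X_target = np.asarray(X_target, dtype=float)
    X = np.vstack([X_source, X_target])
    y = np.concatenate([
        np.zeros(len(X_source), dtype=int),
        np.ones(len(X_target), dtype=int),
    ])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    size_correction = len(X_source) / len(X_target)

    def ratio(X_new):
        # Column 1 holds P(target | x) because the classes are sorted as [0, 1].
        p_target = clf.predict_proba(np.asarray(X_new, dtype=float))[:, 1]
        p_target = np.clip(p_target, 1e-12, 1 - 1e-12)  # avoid division by zero
        return size_correction * p_target / (1.0 - p_target)

    return ratio


# Synthetic shift: the target covariates are moved one unit to the right.
rng = np.random.default_rng(0)
X_source = rng.normal(0.0, 1.0, size=(500, 1))
X_target = rng.normal(1.0, 1.0, size=(500, 1))
ratio = fit_density_ratio(X_source, X_target)
weights = ratio(np.array([[-2.0], [2.0]]))
print(weights)  # the point at x = 2 should receive the larger weight
```

In this toy setup the true ratio increases with x, so a reasonable estimate should give x = 2 a larger weight than x = -2. Only the ratio up to a constant factor matters once the weights are normalized.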

There's still work to be done. For now, I have limited the functionality of MapieCovShiftRegressor to cv="prefit" and method="base". I can try adapting the class to other scenarios. That will likely be non-trivial because of the density ratio estimation, and I think I would need to spend some time working out how to do it in a way that doesn't violate exchangeability. If you feel the current limitations are not significant, I can instead spend time on writing tests and documentation.
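
For context, the prefit case can be sketched as weighted split conformal prediction with absolute residuals as scores, in the spirit of the paper linked at the top of this issue. This is a simplified illustration rather than the code in the fork: the function weighted_split_conformal, its arguments, and the use of plain callables for the fitted model and the density ratio are assumptions.

```python
import numpy as np


def weighted_split_conformal(predict, w, X_cal, y_cal, X_test, alpha=0.1):
    """Prediction intervals for a prefit model under covariate shift.

    Hypothetical sketch: `predict` is an already fitted model's prediction
    function and `w` returns (possibly unnormalized) density ratios.
    """
    scores = np.abs(np.asarray(y_cal, dtype=float) - predict(X_cal))
    order = np.argsort(scores)
    # The test point contributes a point mass at +infinity.
    sorted_scores = np.append(scores[order], np.inf)
    w_cal = np.asarray(w(X_cal), dtype=float)[order]
    w_test = np.asarray(w(X_test), dtype=float)
    y_pred = predict(X_test)

    half_width = np.empty(len(w_test))
    for i, wt in enumerate(w_test):
        # Normalize calibration weights together with this test point's weight.
        probs = np.append(w_cal, wt) / (w_cal.sum() + wt)
        cdf = np.cumsum(probs)
        idx = min(int(np.searchsorted(cdf, 1 - alpha, side="left")),
                  len(sorted_scores) - 1)
        half_width[i] = sorted_scores[idx]
    return y_pred - half_width, y_pred + half_width


# Toy check: a "model" that always predicts 0 and residuals 1..9.
X_cal = np.arange(9.0)
y_cal = np.arange(1.0, 10.0)
predict = lambda X: np.zeros(len(X))
uniform = lambda X: np.ones(len(X))
lower, upper = weighted_split_conformal(
    predict, uniform, X_cal, y_cal, np.array([0.0]), alpha=0.25
)
print(lower, upper)
```

With uniform weights this reduces to ordinary split conformal prediction. When a test point's weight dominates the calibration weights, the quantile falls on the point mass at infinity and the interval becomes unbounded, which is the expected behaviour when the test point lies far from the calibration data.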

Let me know what you think! Thank you :)

gmartinonQM commented 2 years ago

Hi @RudrakshTuwani, thanks for your contribution! Could you open a pull request from your fork against master?

nilslacroix commented 2 years ago

I tested this out of curiosity. Unfortunately, it no longer works. I think some fitting parameters have to be added to the model, since it checks for residuals_.

vtaquet commented 2 years ago

Hi @RudrakshTuwani, thanks again for your great work and sorry about our silence over the past few weeks.

You may have noticed that we implemented a new ConformityScore class that lets the user define "custom" conformity scores (instead of residuals). Following this update, I adapted your implementation of the covariate shift method so that it can be used directly in MapieRegressor through a CovariateShiftConformityScore class. I pushed a branch here: https://github.com/scikit-learn-contrib/MAPIE/tree/add-covshift-in-conformityscore. The branch includes the modified files and your notebook, adapted to my suggested modifications (I obtain the same results).

Please let me know what you'd like us to do. If you have time in the coming weeks, I would be happy to help you finish the pull request you started by incorporating the suggested modifications (and adding some unit tests to make sure everything is tested and controlled). If you don't have time, I can also keep working on your PR and add you as a reviewer.

Thank you once again!

RudrakshTuwani commented 2 years ago

Hey @vtaquet, thanks for getting back to me and no worries! Yes, I would love to finish the pull request and make a contribution to MAPIE. :)

I can take a look at finishing up the tasks this weekend. Thanks!

RudrakshTuwani commented 2 years ago

Hey, I apologize for the radio silence on this! What's the cleanest way to do this? Should I just clone @vtaquet's branch and add a few tests? Thanks!

vtaquet commented 2 years ago

Hi @RudrakshTuwani! Yes, you can clone my branch, double-check it, and add the unit tests to keep the 100% test coverage. Please let me know if you have any questions about it. Thanks again!