scikit-learn-contrib / MAPIE

A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
https://mapie.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

[ENHANCEMENT] time series conformal prediction #74

Closed: gmartinonQM closed this issue 2 years ago

gmartinonQM commented 3 years ago

As described in this paper: https://arxiv.org/abs/1802.06300

Or the EnbPI method proposed by Xu & Xie (2021): http://proceedings.mlr.press/v139/xu21h.html https://arxiv.org/pdf/2010.09107.pdf

hamrel-cxu commented 3 years ago

Hi, I recommend taking a look at our ICML 2021 oral paper, Conformal Prediction Interval for Dynamic Time-series. It builds upon the ideas of the Jackknife+ algorithm (e.g. Jackknife+-after-bootstrap, my collaboration with Prof. Rina Barber) and is easy to implement and use.

Moreover, I have set up a GitHub repository for this paper, and I look forward to discussing it with you if you are interested in incorporating the method into the module.
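
For anyone who wants a quick feel for the mechanism before reading the paper, here is a heavily simplified sketch of the bootstrap / leave-one-out idea. It is only an illustration, not the reference implementation from the repository: it omits the sliding residual update and the quantile optimization described in the paper, and the helper name `enbpi_sketch` is made up for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def enbpi_sketch(X_train, y_train, X_test, alpha=0.1, B=30, seed=0):
    """Simplified EnbPI-style intervals: B bootstrap models, leave-one-out
    residuals on the training set, and a residual quantile as interval width.
    X_train, y_train, X_test are numpy arrays."""
    rng = np.random.default_rng(seed)
    T = len(X_train)
    boot_idx = [rng.integers(0, T, T) for _ in range(B)]
    models = [
        RandomForestRegressor(n_estimators=50, random_state=b).fit(X_train[idx], y_train[idx])
        for b, idx in enumerate(boot_idx)
    ]

    # Leave-one-out ensemble prediction for each training point i: average only
    # the models whose bootstrap sample does not contain i. With B around 20-30,
    # every point is left out by several models with overwhelming probability
    # (see the discussion further down this thread), so this mean is almost never empty.
    preds = np.stack([m.predict(X_train) for m in models])                 # (B, T)
    in_boot = np.stack([np.isin(np.arange(T), idx) for idx in boot_idx])   # (B, T)
    loo_pred = np.array([preds[~in_boot[:, i], i].mean() for i in range(T)])
    residuals = np.abs(y_train - loo_pred)

    # Point prediction on the test set plus/minus the (1 - alpha) residual quantile.
    y_pred = np.stack([m.predict(X_test) for m in models]).mean(axis=0)
    width = np.quantile(residuals, 1 - alpha)
    return y_pred - width, y_pred + width
```

The actual algorithm additionally updates the residual set as new observations arrive, which is what makes it suited to dynamic time series.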

vtaquet commented 3 years ago

Hi Chen,

Thanks for sharing your paper and your GitHub repo! I just read it and found it really exciting. It's a really smart adaptation of the Jackknife+ to sequential data. As this method seems to be model-agnostic, it could definitely be included in MAPIE, especially since adapting MAPIE to time series is key for us.

I have one little question regarding the bootstrap: does a B value of 20-30, as suggested in the paper, ensure that there is at least one "leave-one-out" model for every training point (so that $\hat{f}_i^{\phi}$, line 7 of Algorithm 1, can be computed)?

My colleagues are currently on vacation but we'll get back to you by the end of August regarding your great method.

hamrel-cxu commented 3 years ago

Hi Vianney,

Thanks for your message. I am very glad that you consider my work a good fit for MAPIE; it is really a great tool and I look forward to its growth over the coming years.

Regarding your question: yes, B = 20-30 is sufficient. More precisely, if we build B bootstrap models, then the LOO ensemble model for the i-th training datum aggregates a random number B_i ~ Binomial(B, (1-1/T)^T) of bootstrap models, namely those whose bootstrap sample does not contain point i. Since (1-1/T)^T is close to e^{-1}, the expected value of B_i is roughly B*e^{-1}, and by concentration of the binomial distribution it is very unlikely that any B_i is zero.
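
A quick back-of-the-envelope check of this argument, just plugging numbers into the binomial model above with an arbitrary training size T = 1000:

```python
T, B = 1000, 25              # training size and number of bootstrap models
p_out = (1 - 1 / T) ** T     # P(point i is absent from one bootstrap sample), close to e^{-1}
expected = B * p_out         # expected number of LOO models per point, roughly B / e
p_none = (1 - p_out) ** B    # P(B_i = 0): no bootstrap sample leaves point i out

print(f"p_out={p_out:.3f}  E[B_i]={expected:.1f}  P(B_i=0)={p_none:.1e}")
# p_out=0.368  E[B_i]=9.2  P(B_i=0)=1.1e-05
```

So with B = 25, each training point is left out by about 9 models on average, and the chance that any given point has no LOO model at all is around one in a hundred thousand.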

Meanwhile, I am refining some parts of the method so that the intervals are shorter without losing the coverage guarantee. It would therefore be great to collaborate near the end of August, by which time I will likely have more refinements on my end as well.

vtaquet commented 3 years ago

Hi Chen,

Thanks for your reply and your explanation regarding the B parameter; that makes sense. I look forward to seeing your update on this method.

Ethan-Harris0n commented 2 years ago

Hey there! Any update on this? I am using MAPIE for time series prediction intervals on panel data, and this method seems ideal! If not, I may have a go at extending the package with an implementation of the above.

Cheers!

vtaquet commented 2 years ago

Hi Ethan, thanks for your message. We have been implementing the Jackknife+-after-Bootstrap, the method on which Chen's EnbPI is based, and it should be merged into the main branch in the coming days. We plan to start implementing the EnbPI method in MAPIE in the coming weeks. We'll let you know when the implementation is ready, hopefully before the end of the year.
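
Once it is merged, usage should look roughly like the sketch below. This is based on the API of the development branch at the time of writing; the exact names (in particular `MapieRegressor`, `Subsample` and `agg_function`) may still change before release, so please check the documentation once it lands.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from mapie.regression import MapieRegressor
from mapie.subsample import Subsample   # bootstrap resampling "splitter"

# Toy trending series, just for illustration.
rng = np.random.default_rng(0)
X = np.arange(200, dtype=float).reshape(-1, 1)
y = 0.5 * X.ravel() + rng.normal(scale=5.0, size=200)

# Jackknife+-after-Bootstrap: 30 bootstrap resamplings aggregated with the mean.
mapie = MapieRegressor(
    estimator=LinearRegression(),
    method="plus",
    cv=Subsample(n_resamplings=30, random_state=0),
    agg_function="mean",
)
mapie.fit(X[:150], y[:150])
y_pred, y_pis = mapie.predict(X[150:], alpha=0.1)   # y_pis has shape (n_samples, 2, n_alpha)
```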

Ethan-Harris0n commented 2 years ago

Amazing! Thanks a bunch and big thanks as well for putting such a great package together!

valeman commented 2 years ago

Folks, any news on EnbPI by chance? Would be great to play with it.

Please also note the link to Awesome Conformal Prediction, the most comprehensive resource list on the subject:

https://github.com/valeman/awesome-conformal-prediction