scikit-learn-contrib / MAPIE

A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
https://mapie.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Does MAPIE support multi-step regression? I am using a TensorFlow model with KerasRegressor #431

Closed dani-vu closed 7 months ago

dani-vu commented 7 months ago

I want to use MAPIE to get prediction intervals. This works when the model predicts a single step. However, when I try to predict several steps ahead, e.g. 10 time steps into the future, I get "ValueError: y should be a 1d array, got an array of shape (8190, 10) instead."

I am using an LSTM for this example. With a single-step output, my model looks like this:

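    from tensorflow.keras import Sequential, layers

    # Single-step model: the final Dense layer has one output unit.
    model = Sequential([
        layers.Input(shape=(X_train.shape[1], X_train.shape[2])),
        layers.LSTM(64, return_sequences=True),
        layers.Flatten(),
        layers.Dense(32, activation='relu'),
        layers.Dense(32, activation='relu'),
        layers.Dense(1),
    ])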

For the multi-step approach, I changed the number of units in the last Dense layer to the number of time steps I want to predict, in this case 10:

    # Multi-step model: the final Dense layer has one output unit per future time step.
    model = Sequential([
        layers.Input(shape=(X_train.shape[1], X_train.shape[2])),
        layers.LSTM(64, return_sequences=True),
        layers.Flatten(),
        layers.Dense(32, activation='relu'),
        layers.Dense(32, activation='relu'),
        layers.Dense(10),
    ])

But as mentioned, the "ValueError: y should be a 1d array, got an array of shape (8190, 10) instead." occurs.

The shape of my X is (8190, 32, 23) and the shape of y is (8190, 10).

LacombeLouis commented 7 months ago

Hey @dani-vu,

That's something we have not implemented yet (see multi-output regression: #97). For now, we only make predictions one time step ahead. Just to get more information about your issue: is this something you would want to see implemented in MapieTimeSeriesRegressor?

For the moment, the alternative would be to iteratively do one prediction at a time.
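One way to read "one prediction at a time" is to treat each column of the (n_samples, 10) target as its own 1-D regression problem, so that every fit sees a 1-D y. The sketch below does not use MAPIE's API. It is a library-agnostic illustration in NumPy, with a least-squares model standing in for the Keras regressor and a hand-written split-conformal step. The function names (fit_linear, predict_linear, per_horizon_intervals) are hypothetical, and X is assumed to be 2-D, so windowed inputs such as (8190, 32, 23) would need flattening for this stand-in.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with intercept; a stand-in for any regressor with a 1-D target."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict with the coefficients returned by fit_linear."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef


def per_horizon_intervals(X_fit, Y_fit, X_cal, Y_cal, X_new, alpha=0.1):
    """Point forecasts and split-conformal bounds, computed one horizon at a time.

    Y_fit and Y_cal have shape (n_samples, n_horizons). Each column is
    fitted and calibrated separately, so every model only sees a 1-D target.
    """
    n_cal, n_horizons = Y_cal.shape
    # Rank of the calibration residual used as the interval half-width
    # (finite-sample correction of split conformal prediction).
    k = int(np.ceil((n_cal + 1) * (1 - alpha)))

    preds = np.empty((len(X_new), n_horizons))
    lower = np.empty_like(preds)
    upper = np.empty_like(preds)
    for h in range(n_horizons):
        coef = fit_linear(X_fit, Y_fit[:, h])  # 1-D target for horizon h
        residuals = np.sort(np.abs(Y_cal[:, h] - predict_linear(coef, X_cal)))
        # Too few calibration points for the requested alpha: infinite width.
        q = residuals[k - 1] if k <= n_cal else np.inf
        preds[:, h] = predict_linear(coef, X_new)
        lower[:, h] = preds[:, h] - q
        upper[:, h] = preds[:, h] + q
    return preds, lower, upper
```

Note that each horizon gets its own interval at level 1 - alpha. The ten intervals are not calibrated jointly, and the split-conformal argument assumes exchangeable data, which a time series may not satisfy.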

Thank you, Louis

dani-vu commented 7 months ago

Hey @LacombeLouis ,

thank you for the quick response. Yes, this is what I would want to be implemented in MapieTimeSeriesRegressor.

All the best, Daniel