salesforce / CoST

PyTorch code for CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting (ICLR 2022)
BSD 3-Clause "New" or "Revised" License

How to use your approach for downstream forecasting tasks #3

Closed StatMixedML closed 2 years ago

StatMixedML commented 2 years ago

Summary

Thanks for making the code available. I really like the idea of first learning the embeddings in a self-supervised manner and then using a simpler model for forecasting. However, I am struggling to understand how to use the learned embeddings for the forecasting part.

Problem Description

Say you are tasked with forecasting a monthly univariate time series Y = (y1, ..., yT), which is historically available from January 2010 until December 2020. The task is to forecast 2021, with a forecasting horizon of h=12 months. Based on the CoST framework, we use the TCN encoder (f) to learn the embeddings, V=f(Y), where V = [V_Trend, V_Seasonality], for January 2010 until December 2020. To train the downstream forecasting model, say a Ridge Regression model, we use the final timestamp of the learned representations. So far so good.

@gorold My question is now: given the representations and the trained Ridge model, how do we forecast 2021, since the representations are available until the end of 2020 only? More specifically, what are the features that the Ridge model uses for forecasting 2021?

gorold commented 2 years ago

Thanks for your interest in our work! The ridge regression model should have been trained for multi-horizon forecasting, i.e. p(x_{t+h}|x_t, ...). For further horizons, you can generate new features with the forecasts. Hope this helps.

StatMixedML commented 2 years ago

@gorold Thanks for your response.

I am afraid I still don't have a good understanding of how this actually works. Could you please relate your explanation to the example I described above:

Given the representations and the trained Ridge model, how do we forecast 2021, since the representations are available until the end of 2020 only?

Even if the model has been trained for multi-horizons, how do we forecast 2021, since none of the representations are available for 2021 that can be used for the Ridge model to forecast 2021?

gorold commented 2 years ago

If the ridge regression model has been trained to forecast 12 steps ahead, then it would take as input CoST features of Dec 2020 and is able to output forecasts for Jan 2021 - Dec 2021. If you would like to forecast into 2022, then you could create new CoST features by simply using the obtained forecasts of Jan 2021 - Dec 2021.
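A minimal sketch of the procedure described above, under loud assumptions: `encode` is a hypothetical stand-in for the trained encoder (plain trailing lags, not the CoST model or its API), `fit_ridge` and `predict` are a hand-written closed-form ridge regression, and the series, `repr_dims=8` and `alpha=1.0` are illustrative choices. It pairs the features at each month t with the next h=12 values, fits one multi-output ridge model, forecasts 2021 from the Dec 2020 features, and then re-encodes the series extended by those forecasts to reach 2022.

```python
import numpy as np

def encode(series, repr_dims=8):
    """Hypothetical stand-in for the trained encoder (NOT the CoST API).
    Row t holds the last `repr_dims` values up to and including t,
    so it only looks at the past, like a causal encoder would."""
    padded = np.concatenate([np.full(repr_dims - 1, series[0]), series])
    return np.stack([padded[t : t + repr_dims] for t in range(len(series))])

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form multi-output ridge with an unpenalised intercept."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # leave the intercept unregularised
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ Y)

def predict(W, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ W

h = 12                                    # forecast horizon in months
months = np.arange(132)                   # Jan 2010 .. Dec 2020
y = 0.05 * months + np.sin(2 * np.pi * months / 12)  # synthetic data

V = encode(y)                             # one feature row per month

# Training pairs: features at month t -> the following h observations.
# Only months with a complete h-step future inside the history qualify.
n_train = len(y) - h
X_train = V[:n_train]
Y_train = np.stack([y[t + 1 : t + 1 + h] for t in range(n_train)])
W = fit_ridge(X_train, Y_train)

# Forecast Jan..Dec 2021 from the features of Dec 2020 alone.
forecast_2021 = predict(W, V[-1:])[0]

# To go further (2022), append the forecasts and re-encode.
y_extended = np.concatenate([y, forecast_2021])
forecast_2022 = predict(W, encode(y_extended)[-1:])[0]
```

Note that the feature width (8 here) and the horizon (12) are unrelated: the horizon only determines the number of output columns of the ridge model.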

StatMixedML commented 2 years ago

@gorold Thanks for the clarification.

"If the ridge regression model has been trained to forecast 12 steps ahead, then it would take as input CoST features of Dec 2020 and is able to output forecasts for Jan 2021 - Dec 2021."

Does this mean that for each timestamp in the train data (January 2010 until December 2020), the dimension of the representations would be equal to the forecast horizon (h)? To be more specific, if we are to forecast the whole of 2021 and use the December 2020 representations, would these then be of length 12 (= h)?

Also looking at the forecasting code, do you use the entire data-set (train+valid+test) to create the representations? If so, does this mean you show the actual test-data to the model? Isn't this a simple interpolation of the test-data, using the sliced representations based on the entire history instead of the actuals, rather than forecasting?

gorold commented 2 years ago

Hey @StatMixedML, the dimension of the representation for each time step is independent of the forecast horizon. Instead, it is a hyperparameter, denoted repr_dims in the code.

Indeed, we use the entire dataset to create representations for the purpose of evaluation. The model is given the test data to construct forecasts for evaluation, not for training.
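One way to reason about whether encoding the full dataset leaks future values: if the feature extractor is causal, so that the representation at time t depends only on inputs up to t, then encoding the whole series and slicing yields the same training-period features as encoding the training data alone. The sketch below checks this with a hypothetical trailing-window `encode` and contrasts it with a hypothetical centered-window `encode_centered` that does peek ahead. Whether the actual CoST encoder and its inference settings are causal is an assumption to verify against the repository; this thread does not confirm it.

```python
import numpy as np

def encode(series, window=4):
    """Hypothetical causal stand-in: trailing mean and std up to time t."""
    rows = []
    for t in range(len(series)):
        past = series[max(0, t - window + 1) : t + 1]
        rows.append([past.mean(), past.std()])
    return np.array(rows)

def encode_centered(series, window=5):
    """Hypothetical NON-causal stand-in: centered mean, uses future values."""
    half = window // 2
    return np.array(
        [[series[max(0, t - half) : t + half + 1].mean()]
         for t in range(len(series))]
    )

rng = np.random.default_rng(0)
full = rng.normal(size=144)      # 132 training months + 12 test months
train = full[:132]

# Causal features: identical on the training period either way.
same_on_train = np.allclose(encode(full)[:132], encode(train))

# Centered features: the last training rows change once test data is visible.
leaks = not np.allclose(encode_centered(full)[:132], encode_centered(train))

print("causal features unchanged by test data:", same_on_train)
print("centered features changed by test data:", leaks)
```

Under the causal assumption, the features at the last training month are the same with or without the test period, so a forecast made from them uses no future information.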

StatMixedML commented 2 years ago

@gorold Thanks for the clarification.

"Indeed, we use the entire dataset to create representations for the purpose of evaluation. The model is given the test data to construct forecasts for evaluation, not for training."

Even though the test data is not used to train the Ridge model, the representations of the test set are used for forecasting. Hence, isn't there some sort of information leakage for the Ridge model, since it uses the learned representations of the test data as input features to forecast / interpolate the test data?

I still don't get how to forecast Jan 2021 - Dec 2021 using the train data from January 2010 until December 2020 in the above example. Is there any way you could create a simulated time-series dataset and show how to forecast 2021 in an example notebook? I can also provide the data if needed.

I would greatly appreciate your help on this.

gorold commented 2 years ago

Apologies for the late reply. Unfortunately, I do not have the bandwidth to assist in creating an example notebook. I note that you have asked a similar question on the TS2Vec repo and have received assistance regarding your issue, so I will close this issue.