yuqinie98 / PatchTST

An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers" (ICLR 2023). https://arxiv.org/abs/2211.14730
Apache License 2.0

Self Supervised vs Supervised #26

Closed by ikvision 1 year ago

ikvision commented 1 year ago

The paper is very dense and super informative, but I want to make sure I understand it and use the code correctly. Suppose I have the following forecasting task, with multivariate input predicting a univariate target:

  1. Multiple past metrics as input (~100+ metrics over ~96 past time stamps), including the past of the target as an input feature
  2. A single forecast target as output (1 metric over ~24 future time stamps)

If I read Table 4 correctly, the self-supervised embedding should probably outperform the supervised one. Questions:

  1. Is patchtst_pretrain with features='MS' the correct starting point for training your model on my dataset?
  2. Are there any other code parameters you would suggest setting for my first experiments?
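To make the setup above concrete, here is a minimal sketch (not the repo's data loader; all names and shapes are illustrative) of the multivariate-input / univariate-target windowing being described: ~100 metrics, a 96-step lookback, and a 24-step horizon, with the target assumed to be one of the input channels.

```python
import numpy as np

# Illustrative dimensions matching the question: 100 metrics,
# 96-step input window, 24-step forecast horizon.
n_metrics, seq_len, pred_len = 100, 96, 24

# Fake multivariate history: [time, metrics]
series = np.random.default_rng(0).standard_normal((1000, n_metrics))
target_idx = -1  # assume the forecasting target is the last metric

def make_window(t):
    """Build one (input, target) pair starting at time step t."""
    x = series[t : t + seq_len, :]                                # past of all metrics
    y = series[t + seq_len : t + seq_len + pred_len, target_idx]  # future of the target only
    return x, y

x, y = make_window(0)
print(x.shape, y.shape)  # (96, 100) (24,)
```

This is the shape contract implied by features='MS' in the repo's conventions: the encoder sees all channels, while the loss is computed on the single target channel.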
namctin commented 1 year ago

Hi @ikvision, thank you for your interest! Yes, that should be your setup. You can pretrain the model on multiple time series and then use the embedding for the single-output forecasting task. You can also do supervised training where the input is multiple time series and the output is the forecast of a single series. You would need to modify the head to make sure it outputs the right shape. Please feel free to let us know how it goes.
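The head modification mentioned above could look like the following sketch. This is not the repo's actual head code: it assumes a flatten-style head over an encoder output of shape [batch, n_vars, d_model, n_patches], and simply selects the target channel's forecast at the end (the weights here are random placeholders).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed encoder output shape: [batch, n_vars, d_model, n_patches].
batch, n_vars, d_model, n_patches = 2, 100, 16, 12
pred_len = 24
target_idx = -1  # hypothetical: target metric is the last channel

z = rng.standard_normal((batch, n_vars, d_model, n_patches))

# Flatten each variable's patch representations, as a flatten-style head would.
flat = z.reshape(batch, n_vars, d_model * n_patches)       # [batch, n_vars, D]

# Linear projection shared across variables (random stand-in weights).
W = rng.standard_normal((d_model * n_patches, pred_len)) * 0.01
per_var_forecast = flat @ W                                # [batch, n_vars, pred_len]

# Modified output: keep only the target series forecast.
univariate_forecast = per_var_forecast[:, target_idx, :]   # [batch, pred_len]
print(univariate_forecast.shape)  # (2, 24)
```

Slicing after a shared per-channel head (rather than building a separate single-channel head) keeps the channel-independent design of the backbone intact and only changes what the loss sees.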

ikvision commented 1 year ago

Thank you @namctin. I will start with the pretraining method and keep you posted.