DC-research / TEMPO

The official code for "TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting (ICLR 2024)". TEMPO (v1.0) is one of the first open-source Time Series Foundation Models for forecasting.

Baseline setting #5

Open oops343 opened 1 day ago

oops343 commented 1 day ago
> Thank you for your interest in our work! We've updated our experiments based on feedback received during the peer-review process. Our focus has shifted towards a time series foundation model with a zero-shot setting. This repo is published for reproducing our camera-ready version's results and for further research purposes.
>
> Best regards.

Originally posted by @idevede in https://github.com/DC-research/TEMPO/issues/3#issuecomment-2284787546

oops343 commented 1 day ago

Sorry about the open & reopen, I don't know how to reference a comment :( Good paper, thank you for your excellent work and congrats! But I have some questions, I hope you can help me with them...

  1. does "zero-shot", as I quoto from the paper and both the comment from issue#3, mean for all the baselines, they are also trained on all the other datasets and tested on the target dataset?
  2. and if so, did u optimize the hyper-parameters for these models for fair comparison or followed the original setting?
  3. what are your insights about the size of a foundation model? do we need to scale them when more data used?
idevede commented 6 hours ago

Hi oops343,

Thanks for your interest!

Regarding your questions:

  1. The term "zero-shot" in the context of this study refers to the evaluation methodology for all baseline models. These models are trained on a comprehensive dataset excluding the target dataset, and then evaluated on the target dataset without further fine-tuning. For additional comparative analysis, Table 7 in the appendix presents results where models are specifically trained on individual target datasets (ETTh1, ETTh2).

  2. To ensure a fair comparison, we employed a systematic approach to hyperparameter optimization.

  3. The scalability of our foundation model is primarily attributed to the transformer architecture's inherent capabilities, and the depth of the transformer blocks has a significant impact on performance. Given the growing interest in foundation models and access to larger datasets, we are continuing to conduct comprehensive scaling experiments: as the volume of available data increases, model size may need to scale accordingly to fully leverage the additional information (a rough illustration of how backbone size grows with depth appears below). We plan to release more model checkpoints at different scales, along with an analysis of Time Series Foundation Model (TSFM) scaling laws, in the very near future!
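
Point 1 above describes a leave-one-out protocol: train on a pool of datasets that excludes the target, then evaluate on the target with no further fine-tuning. Here is a minimal sketch of that loop, assuming an illustrative dataset list and stub `train_on`/`evaluate_on` helpers (these names are placeholders, not TEMPO's actual training or evaluation API):

```python
# Minimal sketch of the leave-one-out zero-shot protocol described in point 1.
# The dataset names and the train/evaluate helpers are illustrative stubs,
# not the actual TEMPO training or evaluation code.

DATASETS = ["ETTh1", "ETTh2", "ETTm1", "ETTm2", "Weather", "Electricity", "Traffic"]

def train_on(source_pool):
    """Stub: pretrain a forecaster on the pooled source datasets."""
    return {"trained_on": tuple(source_pool)}  # stand-in for a fitted model

def evaluate_on(model, target):
    """Stub: compute forecasting error on the held-out target dataset."""
    return {"mse": None, "mae": None}  # real metrics would be computed here

for target in DATASETS:
    source_pool = [d for d in DATASETS if d != target]  # exclude the target dataset
    model = train_on(source_pool)                        # train on everything else
    metrics = evaluate_on(model, target)                 # no fine-tuning on the target
    print(f"zero-shot on {target}: {metrics}")
```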
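
Point 3 mentions that the depth of the transformer blocks strongly affects performance and that more data may call for larger models. As a rough, hedged illustration only (the depths and `d_model` value below are hypothetical and not TEMPO's released configurations), here is a back-of-the-envelope estimate of how backbone parameter count grows with depth:

```python
# Rough parameter estimate for a GPT-2-style backbone at different depths.
# The per-block formula (attention projections + MLP) is the standard
# transformer estimate; the specific depth/width values are hypothetical
# and are not TEMPO's released configurations.

def approx_block_params(d_model: int, ff_mult: int = 4) -> int:
    attn = 4 * d_model * d_model           # Q, K, V and output projections
    mlp = 2 * ff_mult * d_model * d_model  # up- and down-projection of the MLP
    return attn + mlp

def approx_backbone_params(n_layers: int, d_model: int) -> int:
    return n_layers * approx_block_params(d_model)

for depth in (3, 6, 12, 24):
    params = approx_backbone_params(depth, d_model=768)
    print(f"{depth:>2} layers @ d_model=768: ~{params / 1e6:.0f}M backbone params")
```

With these assumptions, going from 3 to 24 blocks at `d_model=768` moves the backbone from roughly 21M to about 170M parameters, which is the kind of axis a scaling study would sweep as the pretraining corpus grows.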