KimMeen / Time-LLM

[ICLR 2024] Official implementation of " 🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
https://arxiv.org/abs/2310.01728
Apache License 2.0

Item representations in M/MS task #117

Closed: jexterliangsufe closed this issue 6 hours ago

jexterliangsufe commented 1 week ago

Great work. When I dove into the data-processing code for the Traffic and Electricity datasets, I found that seq_x has shape (seq_len, ), meaning the multivariate series are split into univariate series and handled by the model separately. I then noticed that the model actually accepts a (B, T, C) input with C=1, so the model does not know which variate it is being fed. My question is: have you considered extracting an item representation and feeding it into a new embedding layer, such as a static embedding layer? A rough sketch of what I mean follows below.
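To illustrate the idea (this is not the repository's actual code; `VariateEmbedding`, `n_vars`, and `d_model` are placeholder names I made up), one could tag each channel-independent sample with a learned per-variate vector:

```python
import torch
import torch.nn as nn

class VariateEmbedding(nn.Module):
    """Hypothetical static embedding: tags each univariate channel with a
    learned vector so the backbone knows which variate it is looking at."""
    def __init__(self, n_vars: int, d_model: int):
        super().__init__()
        self.embed = nn.Embedding(n_vars, d_model)

    def forward(self, tokens: torch.Tensor, var_ids: torch.Tensor) -> torch.Tensor:
        # tokens:  (B, N, d_model) patch/token embeddings of one univariate series
        # var_ids: (B,) integer id of the variate each series came from
        return tokens + self.embed(var_ids).unsqueeze(1)  # broadcast over patches


# Channel-independent reshaping, as described above:
B, T, C = 32, 96, 7                               # batch, seq_len, number of variates
x = torch.randn(B, T, C)
x_ci = x.permute(0, 2, 1).reshape(B * C, T, 1)    # (B*C, T, 1): each row is one variate
var_ids = torch.arange(C).repeat(B)               # which variate each row came from
```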

kwuking commented 5 days ago

Thank you very much for your interest in our work. Your understanding is correct. Currently, Time-LLM adopts a channel-independent approach and does not model the interactions between variates. The static embedding layer you mention is indeed a very interesting idea, and we have explored it as well. However, there is a small issue that we have not yet been able to solve adequately: how to handle unseen time series during inference, since new series are not covered by the original static encoding. This is something that may require further thought.
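For reference, one common workaround (not something we have adopted or validated in Time-LLM) is to reserve an extra "unknown" row in the embedding table and map any id not seen during training to it; a minimal sketch under that assumption:

```python
import torch
import torch.nn as nn

class StaticEmbeddingWithFallback(nn.Module):
    """Sketch of a fallback scheme: reserve index n_vars as an 'unknown' slot
    so series ids never seen during training still get a (generic) embedding."""
    def __init__(self, n_vars: int, d_model: int):
        super().__init__()
        self.n_vars = n_vars
        self.embed = nn.Embedding(n_vars + 1, d_model)   # last row = unknown

    def forward(self, var_ids: torch.Tensor) -> torch.Tensor:
        # Any id outside [0, n_vars) falls back to the reserved unknown row.
        safe_ids = torch.where(
            (var_ids >= 0) & (var_ids < self.n_vars),
            var_ids,
            torch.full_like(var_ids, self.n_vars),
        )
        return self.embed(safe_ids)
```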

jexterliangsufe commented 4 days ago

Thanks for your reply. In my application scenario, there are no unseen time series during inference. Interestingly, the MAE from Time-LLM is lower than that from TemporalFusionTransformer, which is fed multiple features and uses a static embedding layer. So the necessity of item representations should be reconsidered.

kwuking commented 6 hours ago


Time-LLM uses LLaMA-7B as the base model. If the dataset is relatively small, training can be challenging, and a large model may not be suitable. If you find a reasonable solution to your issue, that would be great.