amazon-science / chronos-forecasting

Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
https://arxiv.org/abs/2403.07815
Apache License 2.0

How to increase the context length? #70

Open Riofd opened 1 month ago

Riofd commented 1 month ago

Thanks for your great work. Chronos gets good performance on our own dataset with zero-shot prediction. But it seems that Chronos is unable to capture long-period patterns: for example, I have a feature with a weekly cycle, and my data has a step of 15 minutes, so a single cycle spans 96 * 7 = 672 steps, which exceeds the context_length of the model. Is there any way for the model to capture such periodic features, or can I only retrain the model?

lostella commented 1 month ago

Hi @Riofd. Indeed, the model internally caps the context length (this currently happens here).

You can increase the model's context length after loading it, as in the example below. However, the models were trained on windows of limited length, so they may not be able to make sense of the increased context: that is something experiments would have to verify. I believe that proper handling of longer contexts will require pretraining the models on longer windows of data, but let us know if the following works!

Here I use pipeline.embed so that I can inspect the encoder output shape, but the patch has the same effect if you call pipeline.predict instead:

import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-tiny")
context = torch.ones(size=(2000,))  # dummy series longer than the default 512-step context

# get encoder embeddings
embedding, _ = pipeline.embed(context=context)
print(embedding.shape)

# patch the context length
pipeline.tokenizer.config.context_length = 1024

# get encoder embeddings again
embedding, _ = pipeline.embed(context=context)
print(embedding.shape)

which outputs

torch.Size([1, 513, 256])
torch.Size([1, 1025, 256])

where 513 is 512 (the original model context length) plus one for the EOS token embedding, and similarly 1025 = 1024 + 1 after the patch.
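For the prediction path, the same patch applies. The sketch below is illustrative rather than from the thread: prediction_length=64 and the quantile summary at the end are arbitrary choices, and pipeline.predict returns sample paths of shape [batch, num_samples, prediction_length].

import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-tiny")
pipeline.tokenizer.config.context_length = 1024  # same patch as above

# dummy series longer than the original 512-step limit
context = torch.ones(size=(2000,))

forecast = pipeline.predict(context=context, prediction_length=64)
print(forecast.shape)  # torch.Size([1, num_samples, 64])

# summarize the sample paths into a median and an 80% interval
low, median, high = torch.quantile(forecast[0], torch.tensor([0.1, 0.5, 0.9]), dim=0)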

Riofd commented 1 month ago

Thank you for your reply. I tried your approach and got different results compared with context_length=512, but the forecasting performance decreased significantly. Maybe I should try to fine-tune it?

lostella commented 1 month ago

Yes, fine-tuning may be the best way to go. We added code and some instructions here; let me know if you run into any issues.
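Concretely, as a rough sketch (the script path and config fields follow the current repo layout and may change; the training README is the authoritative reference): copy one of the YAML configs under scripts/training/configs, raise its context_length (e.g. to 1024), point it at your data, and run

python scripts/training/train.py --config /path/to/your-config.yaml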

abdulfatir commented 1 month ago

Let's keep this open as a FAQ.