time-series-foundation-models / lag-llama

Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
Apache License 2.0

Tips on Fine-tuning #46

Open SpeeeedLee opened 2 months ago

SpeeeedLee commented 2 months ago

I tried zero-shot prediction on my own healthcare-related dataset, but I was not satisfied with the performance, so I performed fine-tuning. Then things got weird: the performance dropped a lot compared to zero-shot. Could this be because of one (or both) of the following?

  1. I only use data from a single univariate time series to fine-tune (the training set is only around 200–300 points).
  2. I fine-tune all of Lag-Llama's parameters.

When only one time series is available and needs to be predicted, should one fine-tune only the outer layers? Or should one prepare more time series from a related domain (in my case, healthcare) for fine-tuning?

Any suggestions would be helpful, thanks a lot!

ashok-arjun commented 2 months ago

Hi! It's definitely odd that performance drops after fine-tuning. It could be that the model is overfitting the training data. Are you using a training-validation-test split? That is recommended so you can monitor for overfitting on the validation set.
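As a library-agnostic illustration of the monitoring suggested above, here is a minimal early-stopping helper: track the validation loss each epoch and stop once it has not improved for `patience` epochs. This is a generic sketch, not code from the Lag-Llama repo; the class name and thresholds are made up for illustration.

```python
class EarlyStopper:
    """Stop training when validation loss stops improving.

    patience: number of consecutive non-improving epochs to tolerate.
    min_delta: minimum decrease in loss that counts as an improvement.
    """

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True to stop training."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a fine-tuning loop you would call `stopper.step(val_loss)` after each epoch and restore the checkpoint with the best validation loss when it returns True.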

If you are already doing that, you could try fine-tuning only the last layers of the model. That should overfit less.
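One generic way to fine-tune only the last layers of a PyTorch model is to toggle `requires_grad` by parameter name. The sketch below assumes the transformer blocks live under a module path like `transformer.layers.<index>.`; that prefix is an assumption for illustration, so check Lag-Llama's actual module names (e.g. via `model.named_parameters()`) before using it.

```python
import torch.nn as nn

def freeze_all_but_last(model: nn.Module, num_trainable_layers: int = 2,
                        layer_prefix: str = "transformer.layers") -> None:
    """Freeze every parameter except those in the last few transformer layers.

    `layer_prefix` is an assumed naming scheme ("transformer.layers.0...");
    adjust it to whatever `model.named_parameters()` actually reports.
    """
    # Collect the numeric layer indices that appear in parameter names.
    indices = set()
    for name, _ in model.named_parameters():
        if name.startswith(layer_prefix + "."):
            idx = name[len(layer_prefix) + 1:].split(".")[0]
            if idx.isdigit():
                indices.add(int(idx))
    trainable = set(sorted(indices)[-num_trainable_layers:])

    # Unfreeze only parameters belonging to the last `num_trainable_layers`.
    for name, param in model.named_parameters():
        keep = False
        if name.startswith(layer_prefix + "."):
            idx = name[len(layer_prefix) + 1:].split(".")[0]
            keep = idx.isdigit() and int(idx) in trainable
        param.requires_grad = keep
```

Afterwards, pass only the trainable parameters to the optimizer, e.g. `torch.optim.Adam(p for p in model.parameters() if p.requires_grad)`.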

SpeeeedLee commented 2 months ago

@ashok-arjun Thanks for your reply. Could you kindly point out where I can adjust the code so that I fine-tune only the last few layers?

Thanks a lot.