Open · arlorostirolla opened this issue 7 months ago
Hi, and thank you for your excellent contribution to the world of time series!
I am currently fine-tuning Lag-Llama and was wondering if you have any rules of thumb for fine-tuning yet. I have read that transformers generally require many epochs, and I noticed your early-stopping patience is fifty. Does this mean we should generally train for many epochs, or was that patience chosen for a very small dataset?
For context, my dataset has about 3 years' worth of price, energy-demand, air-temperature, and solar-output data at 5-minute intervals. I have set a long context length to try to capture seasonal effects, and I am wondering how many epochs I should train for. The base foundation model did not work very well on my data.
Hi, thanks for the kind words.
First, having a train/validation/test split is very important: it lets you monitor for overfitting with the validation set.
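For a time series, the split should be chronological rather than shuffled. A minimal sketch (the file name and column name are placeholders, not from the Lag-Llama repo):

```python
import pandas as pd

# Hypothetical 5-minute-interval dataset; adjust the path and columns to your data.
df = pd.read_csv("energy.csv", parse_dates=["timestamp"], index_col="timestamp")

# Chronological split (no shuffling), so validation and test data
# come strictly after the training data.
n = len(df)
train = df.iloc[: int(0.8 * n)]
val = df.iloc[int(0.8 * n) : int(0.9 * n)]
test = df.iloc[int(0.9 * n) :]
```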
Second, the number of epochs you need depends on the learning rate. We used a tiny learning rate, so we simply let the model train for as long as possible, and set the early-stopping patience to 50 epochs somewhat arbitrarily, based on the average validation loss. You could also try smaller patience values.
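As a rough illustration (not the exact fine-tuning script; the metric name and the estimator kwargs are assumptions based on the usual GluonTS/PyTorch Lightning setup), early stopping on validation loss with a patience of 50 epochs looks like:

```python
from pytorch_lightning.callbacks import EarlyStopping

# Stop training once the validation loss has not improved for 50 epochs.
# "val_loss" must match the metric name your training loop actually logs.
early_stopping = EarlyStopping(monitor="val_loss", patience=50, mode="min")

# This callback is then passed to the estimator's Lightning trainer, e.g.
# trainer_kwargs={"max_epochs": 500, "callbacks": [early_stopping]}.
```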