tniveej opened 4 months ago
Hi @tniveej. Here is some general info about why you get different results on the validation set between training and your prediction loop:

Your model uses `output_chunk_length=1`, meaning it is trained to predict only 1 step per forward pass. When evaluating during training, samples are made from all possible input (`input_chunk_length=10`) / output (`output_chunk_length=1`) windows, so the model is only ever evaluated on 1-step forecasts. This is equivalent to performing historical forecasts on the validation set with `forecast_horizon=1` and `stride=1`. In your example you call `predict()` with `n=len(val_ts)`, which is far larger than 1. The model then performs auto-regression: it consumes its own predictions, together with the future values of the covariates, to generate predictions further into the future. Worse performance is expected here, since your model hasn't been trained to forecast that way.
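To make the difference concrete, here is a sketch (assuming a trained `model`, the full target `series`, the validation series `val_ts` starting at timestamp `val_start`, and covariates `future_covs`; these names are placeholders):

```python
from darts.metrics import mae

# 1-step historical forecasts over the validation stretch: this matches
# what the model is evaluated on during training.
hist_preds = model.historical_forecasts(
    series,                       # full target series (train + validation)
    future_covariates=future_covs,
    start=val_start,              # first timestamp of the validation set
    forecast_horizon=1,           # same as output_chunk_length
    stride=1,
    retrain=False,                # reuse the already-trained weights
)
print("1-step MAE:", mae(series, hist_preds))

# Auto-regressive forecast of the whole validation set in one call:
# the model consumes its own predictions, so errors compound.
ar_pred = model.predict(
    n=len(val_ts),
    series=series.drop_after(val_start),  # history up to the validation start
    future_covariates=future_covs,
)
print("auto-regressive MAE:", mae(val_ts, ar_pred))
```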
Also some other tips: you can control how often the model is evaluated on the validation set with `val_check_interval` in the `pl_trainer_kwargs` dict, as in the example below. Beyond that, the PyTorch Lightning Trainer offers a lot of customization of the training process (see all parameters here).
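For example (a sketch; the values are just placeholders):

```python
from darts.models import TiDEModel

model = TiDEModel(
    input_chunk_length=10,
    output_chunk_length=1,
    pl_trainer_kwargs={
        # run validation twice per training epoch instead of once
        "val_check_interval": 0.5,
        # any other PyTorch Lightning Trainer argument can go here
        "gradient_clip_val": 1.0,
    },
)
```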
Hello @dennisbader. Thank you very much for the insight and tips.
From my data, I am trying to get the model to predict a single time series (I have many series) in its entirety from a cold start (essentially with `input_chunk_length=0`), using only the provided covariate data. So basically, transfer learning from the training data to new time series. However, I realize this is not possible with the torch forecasting models in the Darts library (p.s. check the edit below). Therefore, the idea was to first train a model with a short enough `input_chunk_length` and all the features (covariates). Then, at prediction time, I would provide an average value for the first `input_chunk_length` steps (e.g. the average over the training set of the time series to be predicted) and have the model auto-regressively predict the rest, in the hope that the information from the covariate data would be enough to correct the model's predictions. Roughly as in the sketch below.
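(A sketch of what I mean; `train_series`, `new_covs`, and `n_steps` are placeholder names, and `icl` is the `input_chunk_length` used at training time:)

```python
import numpy as np
from darts import TimeSeries

# Build a flat "warm-up" history of length input_chunk_length, filled
# with the average value of the corresponding training series.
icl = 10
warmup_values = np.full((icl, 1), train_series.values().mean())
warmup = TimeSeries.from_times_and_values(
    times=new_covs.time_index[:icl],
    values=warmup_values,
)

# Auto-regressively predict the rest of the new series, hoping the
# covariates steer the forecasts toward the right values.
pred = model.predict(
    n=n_steps - icl,
    series=warmup,
    future_covariates=new_covs,
)
```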
I realize that my model is overfitting very quickly, so I've tried a few things on top of your suggestions:

- Varied `input_chunk_length` and `output_chunk_length`, to no avail.
- Reduced the `future_covariates` from 44 features down to the 12 features I felt were most important to the model's prediction (also based on SHAP values obtained from training a basic MLP on the same dataset). Same result.

Do you think there is anything more I could be doing to improve how the training goes? Or would this indicate that the covariate data provided just does not carry enough information to predict the desired variable? And is my goal a bit too ambitious and unachievable with current methods?
Edit: I just found that covariate-only prediction is possible with `RegressionModel`s, from your comment in #2473. I will be trying that out next.
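Based on my reading of that comment, something like the following (untested; `forecast_len`, `history_stub`, and the exact `predict` call pattern are my assumptions):

```python
from darts.models import RegressionModel
from sklearn.linear_model import LinearRegression

# lags=None: no target history is used; each step is predicted purely
# from the covariate values at that step (lags_future_covariates=[0]).
model = RegressionModel(
    lags=None,
    lags_future_covariates=[0],
    model=LinearRegression(),
)
model.fit(train_series_list, future_covariates=train_covs_list)

# My understanding is that `series` here only anchors where the forecast
# starts on the time axis; its values should not be consumed with lags=None.
pred = model.predict(
    n=forecast_len,
    series=history_stub,
    future_covariates=new_covs,
)
```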
Hey guys, I'm facing a problem that's been driving me nuts. I am a beginner so please forgive me if there are any fundamental mistakes here. Any help is appreciated.
I am using the TiDE model to try to do some prediction (regression) on time series. I believe what I'm trying to do is transfer learning: I have multiple time series that I want to train the model on, and then make predictions on a different set of similar time series. When I run the training, the model has MAE ≈ 0.01. However, when I make the model predict the validation set of each training series and manually calculate the MAE, it's more like MAE ≈ 0.18. The model also struggles to make any proper predictions (as I show near the end). The nature of my data is as follows:

Now getting to the code. This is what I've done:
Model parameters:
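(Something along these lines; the values below are placeholders, not my exact configuration:)

```python
from darts.models import TiDEModel

model = TiDEModel(
    input_chunk_length=10,       # short history window
    output_chunk_length=1,       # single-step training target
    hidden_size=128,
    num_encoder_layers=2,
    num_decoder_layers=2,
    dropout=0.1,
    random_state=42,
)
```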
Then I create the time series from a CSV and fill in the missing values manually using forward fill, before turning them into a list of `TimeSeries`. I then normalize the data using the default MinMax scaler. Here is an example of a plotted time series.

Time series to predict:

Covariates for said time series:
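(Roughly what that preprocessing looks like; the file and column names below are made up:)

```python
import pandas as pd
from darts import TimeSeries
from darts.dataprocessing.transformers import Scaler

df = pd.read_csv("data.csv", parse_dates=["timestamp"])

# One TimeSeries per entity, with missing values forward-filled.
series_list = [
    TimeSeries.from_dataframe(
        group.set_index("timestamp").ffill(),
        value_cols=["target"],
    )
    for _, group in df.groupby("series_id")
]

# Darts' default Scaler wraps sklearn's MinMaxScaler.
scaler = Scaler()
series_list = scaler.fit_transform(series_list)
```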
Next, I find a learning rate. It's wrapped in a function that iterates until it doesn't fail, because I reused the code from my hyperparameter tuning of the model parameters and I didn't want a trial to fail just because it couldn't find a suitable learning rate. Roughly:
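(A sketch of that retry loop, assuming Darts' `lr_find()` wrapper around Lightning's LR finder; the fallback value is a placeholder:)

```python
def find_lr(model, series_list, covs_list, max_tries=5):
    """Retry the LR sweep a few times, since it can occasionally fail."""
    for _ in range(max_tries):
        try:
            results = model.lr_find(
                series=series_list,
                future_covariates=covs_list,
            )
            return results.suggestion()
        except Exception:
            continue  # re-run the sweep
    return 1e-3  # assumed fallback if every sweep fails
```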
Next, I go ahead with the training:
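(Along these lines; a sketch where the series/covariate names are placeholders, and the `EarlyStopping` callback is handed to the underlying Lightning Trainer via `pl_trainer_kwargs` at model construction:)

```python
from pytorch_lightning.callbacks import EarlyStopping

# Passed at model construction time, e.g.
# pl_trainer_kwargs={"callbacks": [early_stopper]}.
early_stopper = EarlyStopping(monitor="val_loss", patience=10, mode="min")

model.fit(
    series=train_list,              # list of training TimeSeries
    future_covariates=cov_list,
    val_series=val_list,            # held-out stretch of each series
    val_future_covariates=val_cov_list,
)
```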
The training is wacky because there seems to be no decrease in loss, just fluctuation. Also, for some reason it always stops at the number of epochs set by the `patience` parameter of the `EarlyStopping` callback; 10 epochs in this case. But I think we can ignore that for now (?). Here are pictures showing the training loss and the validation loss + MAE.

Because I noticed that even with a low MAE the model is not able to make any meaningful predictions, I went ahead and ran prediction with the trained model on the validation sets used during training:
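(Roughly like this; the list names are placeholders:)

```python
from darts.metrics import mae

maes = []
for train_ts, val_ts, covs in zip(train_list, val_list, cov_list):
    # Auto-regressive forecast over the whole validation stretch.
    pred = model.predict(
        n=len(val_ts),
        series=train_ts,            # history the forecast starts from
        future_covariates=covs,
    )
    maes.append(mae(val_ts, pred))

print("mean MAE over validation sets:", sum(maes) / len(maes))
```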
I get MAE = 0.1881085145800395, which is way off the values obtained during training. Here is an example of a prediction made by the model (for the same time series from the example dataset shown above).

I've been at this for some time and I still can't figure out what's going wrong. Can someone explain to me what I'm doing wrong here?