unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

[Question] Cannot overfit TCN and Transformer model #828

Closed leyp1 closed 2 years ago

leyp1 commented 2 years ago

I have been trying to overfit the darts models on my dataset by training on only one sample of the dataset. This worked for both the RNN model and the NBEATS model, but the TCN and Transformer models did not show the same behavior. I am training the models with the fit() function, using a training/validation split based on a set date.

TCN model on one data sample: [image]

Transformer model on one data sample: [image]

What I expected (RNN model): [image]

System:

dennisbader commented 2 years ago

TransformerModel and TCNModel have a default dropout > 0, whereas RNNModel has a default of 0.
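To illustrate why a nonzero dropout can get in the way of fitting a single sample exactly, here is a toy sketch in plain Python. It is not darts code: the `dropout` helper, the hidden values, and the weights are made up for illustration, and the helper assumes the common "inverted dropout" scheme (drop with probability `p`, rescale the survivors by `1 / (1 - p)`).

```python
import random


def dropout(values, p, rng):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    if p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)

# Hypothetical hidden activations and output weights of an already "perfect" model.
hidden = [0.5, -1.0, 2.0, 0.25]
weights = [1.0, 1.0, 1.0, 1.0]
target = sum(w * h for w, h in zip(weights, hidden))


def training_loss(p):
    """Squared error on the single sample with dropout active (training mode)."""
    output = sum(w * h for w, h in zip(weights, dropout(hidden, p, rng)))
    return (output - target) ** 2


losses_without_dropout = [training_loss(0.0) for _ in range(5)]
losses_with_dropout = [training_loss(0.2) for _ in range(5)]

print(losses_without_dropout)  # the model reproduces the target on every pass
print(losses_with_dropout)     # the training-mode output is perturbed on every pass
```

Even with weights that reproduce the target exactly, the training-mode loss stays above zero while dropout is active, so the training curve never settles at a perfect fit.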

I assume that setting the dropout to 0 for the two models should result in more overfitting.
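A minimal sketch of that suggestion, assuming a darts installation where `TCNModel` and `TransformerModel` accept `dropout`, `n_epochs`, and `random_state` as keyword arguments. The chunk lengths and epoch count are placeholder values, not taken from this thread, and `build_models` is a hypothetical helper.

```python
# Placeholder hyperparameters (assumptions): adjust to your own series.
COMMON_KWARGS = {
    "input_chunk_length": 24,
    "output_chunk_length": 12,
    "n_epochs": 200,
    "random_state": 0,
}

# Override the nonzero default dropout of both models.
TCN_KWARGS = {**COMMON_KWARGS, "dropout": 0.0}
TRANSFORMER_KWARGS = {**COMMON_KWARGS, "dropout": 0.0}


def build_models():
    """Create both models with dropout disabled (imports darts lazily)."""
    from darts.models import TCNModel, TransformerModel

    tcn = TCNModel(**TCN_KWARGS)
    transformer = TransformerModel(**TRANSFORMER_KWARGS)
    return tcn, transformer
```

With `tcn, transformer = build_models()`, each model can then be trained as before, for example `tcn.fit(train_series, val_series=val_series)`, where `train_series` and `val_series` stand for your own date-based split.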

leyp1 commented 2 years ago

Thank you for your quick answer. I get a "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation" error when setting dropout to 0 for the TCN model, but when I set the dropout to a very small value I do manage to overfit it.

However, the Transformer model still has trouble: [image]

I am only training it on one time series, so I don't think it could be a capacity problem. But I can't really explain the behavior ...

dennisbader commented 2 years ago

Regarding the TCN issue: you need to upgrade your darts version, then it should work.
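To see which version is currently installed before upgrading, something like the following should work. It is a sketch that assumes the package was installed under the distribution name `darts` or `u8darts`; `installed_darts_version` is a hypothetical helper. The thread does not say which release contains the fix, so the snippet only reports the version.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_darts_version():
    """Return the installed darts version string, or None if it is not installed."""
    for dist_name in ("darts", "u8darts"):  # assumed distribution names
        try:
            return version(dist_name)
        except PackageNotFoundError:
            continue
    return None


print(installed_darts_version())
```

Upgrading is then typically done with `pip install --upgrade darts` (or the equivalent for your package manager).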