unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

Dirichlet Likelihood TFT model #1813

Open matteoarcangeli99 opened 1 year ago

matteoarcangeli99 commented 1 year ago

Hello, I'm trying to use DirichletLikelihood with the TFT model, but I get the error below. How can I fix it?

`ValueError: Expected value argument (Tensor of shape (32, 12, 1)) to be within the support (Simplex()) of the distribution Dirichlet(concentration: torch.Size([12, 1])), but found invalid values:
tensor([[[0.5691],
         [0.8910],
         [0.6211],
         [0.0000],
         [0.3685],
         [0.3022],
         [0.5626],
         [0.3967],
         [0.6753],
         [0.6197],
         [0.4106],
         [0.6041]],
...
        [[0.0000],
         [0.0000],
         [0.0000],
         [0.0000],
         [0.0000],
         [0.0000],
         [0.4241],
         [0.3950],
         [0.3232],
         [0.2716],
         [0.3947],
         [0.2650]]])`

Thanks

dennisbader commented 1 year ago

Hi @Matteoa99, and thanks for writing. Could you add a minimal reproducible example so we can investigate this issue? Thanks.

madtoinou commented 1 year ago

Hi @Matteoa99, as mentioned on Gitter, this can happen when the model performs very poorly and generates values outside the support of the likelihood: the distribution is not defined for the model's output, which blocks the forward pass.

The solution would be to improve the model's performance by tweaking the parameters, adding covariates, or changing the model type altogether. Using another distribution, potentially more appropriate for your data, could also solve the problem.

fmorgens commented 1 year ago

I am having the same issue with DirichletLikelihood and the TCN model. The additional dimension seems to be the batch size.

With this example:

from darts.models import TCNModel
from darts.utils.likelihood_models import DirichletLikelihood
from darts.datasets import AirPassengersDataset

series = AirPassengersDataset().load()

model = TCNModel(input_chunk_length=30,
                 output_chunk_length=12,
                 batch_size=8,
                 likelihood=DirichletLikelihood())

model.fit(series)

I get this error:

ValueError: Expected value argument (Tensor of shape (8, 30, 1)) to be within the support (Simplex()) of the distribution Dirichlet(concentration: torch.Size([30, 1])), but found invalid values:
tensor([[[284.],

The shape (8, 30, 1) seems to be (batch_size, input_chunk_length, variant_count), while the Dirichlet likelihood only expects (input_chunk_length, variant_count). Since this happens with both TFT and TCN, this could be an issue with all models that use batched training.

madtoinou commented 1 year ago

Hi @fmorgens,

As mentioned above, the problem should come from the values of the forward pass output, not the size of the batch.

According to the Wikipedia article, the Dirichlet distribution is defined for $x_i \in [0,1]$ and $\sum_{i} x_i = 1$, where $i \in [1,N]$ indexes the components of the series. In the error message, you can see that the values predicted by the model are around 200.
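To make the support condition concrete, here is a minimal sketch in plain Python. It does not use darts or PyTorch. The helper name `is_on_simplex` and the tolerance are assumptions chosen for illustration. It checks whether the components at one time step are non-negative and sum to 1.

```python
import math

def is_on_simplex(components, tol=1e-6):
    """Return True if all components are non-negative and sum to 1 (within tol).

    Hypothetical helper for illustration; not part of darts or PyTorch.
    """
    non_negative = all(x >= 0.0 for x in components)
    sums_to_one = math.isclose(sum(components), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

# One time step of a univariate series has a single component,
# so it lies on the simplex only if that component equals 1.
print(is_on_simplex([284.0]))          # value on the scale of the raw passenger counts
print(is_on_simplex([0.5691]))         # within [0, 1], but a single component that is not 1
print(is_on_simplex([0.2, 0.3, 0.5]))  # three components that sum to 1
```

Note that a single value in [0, 1], such as those in the first error message of this thread, still fails the check: with one component, the only point on the simplex is 1.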

Nevertheless, it would probably be great to either systematically normalize the output of the model before passing it to the Dirichlet distribution or, alternatively, disable the argument validation at the PyTorch level.
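The normalization idea can be sketched in plain Python as follows. This is only an illustration of rescaling components so that they sum to 1. The helper `normalize_to_simplex` is hypothetical and does not reflect what darts does internally.

```python
def normalize_to_simplex(components):
    """Rescale non-negative components so that they sum to 1.

    Hypothetical helper for illustration; not part of darts or PyTorch.
    """
    if any(x < 0.0 for x in components):
        raise ValueError("components must be non-negative")
    total = sum(components)
    if total == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / total for x in components]

# Two components at one time step, rescaled into shares of their total.
print(normalize_to_simplex([2.0, 6.0]))  # [0.25, 0.75]

# A univariate value always collapses to 1.0, whatever its magnitude.
print(normalize_to_simplex([284.0]))     # [1.0]
```

The second call shows a limit of this approach: normalizing a single-component series maps every value to 1, so the result carries no information about the original magnitude.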

matteoarcangeli99 commented 1 year ago

> Hi @Matteoa99, and thanks for writing. Could you add a minimum reproducible example, so we can investigate this issue? Thanks.

Hello @dennisbader

`TFTModel(
    input_chunk_length=input_chunk_length,
    output_chunk_length=forecast_horizon,
    hidden_size=64,
    lstm_layers=1,
    num_attention_heads=4,
    batch_size=32,
    n_epochs=350,
    add_relative_index=False,
    likelihood=DirichletLikelihood(),
    optimizer_cls=torch.optim.RAdam,
    use_static_covariates=True,
    # optimizer_kwargs={'lr': 2e-3},
    save_checkpoints=True,
    force_reset=True,
    dropout=0.1,
    full_attention=True,
    add_encoders={
        'datetime_attribute': {'past': ['year']},
        'cyclic': {
            'future': ['month', 'quarter', 'dayofweek'],
            'past': ['month', 'quarter', 'dayofweek', 'dayofyear'],
        },
        'transformer': Scaler(),
    },
)`

matteoarcangeli99 commented 1 year ago

> Hi @Matteoa99, as mentioned on Gitter, this can happen when the model performs very poorly and generates values outside the support of the likelihood: the distribution is not defined for the model's output, which blocks the forward pass.
>
> The solution would be to improve the model's performance by tweaking the parameters, adding covariates, or changing the model type altogether. Using another distribution, potentially more appropriate for your data, could also solve the problem.

Is there any guidance for choosing the most appropriate probability distribution?