unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

[BUG] NBEATs predict() requires future_covariates #2068

Closed joshuajnoble closed 9 months ago

joshuajnoble commented 9 months ago

Describe the bug

When calling NBEATs predict() I get the following error:

ERROR:main_logger:ValueError: The model has been trained with future covariates. Some matching future_covariates have to be provided to `predict()`.

Stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-85a6cccd8de7> in <cell line: 10>()
      8     nbeats_test.append(ts.slice(split_point_2, len(ts) - test_length))
      9 
---> 10 tcn_predict_output = model_nbeats.predict(test_length, past_covariates=cov_series, series=nbeats_test)

3 frames
/usr/local/lib/python3.10/dist-packages/darts/utils/torch.py in decorator(self, *args, **kwargs)
    110         with fork_rng():
    111             manual_seed(self._random_instance.randint(0, high=MAX_TORCH_SEED_VALUE))
--> 112             return decorated(self, *args, **kwargs)
    113 
    114     return decorator

/usr/local/lib/python3.10/dist-packages/darts/models/forecasting/torch_forecasting_model.py in predict(self, n, series, past_covariates, future_covariates, trainer, batch_size, verbose, n_jobs, roll_size, num_samples, num_loader_workers, mc_dropout, predict_likelihood_parameters)
   1315                 future_covariates=future_covariates,
   1316             )
-> 1317         super().predict(
   1318             n,
   1319             series,

/usr/local/lib/python3.10/dist-packages/darts/models/forecasting/forecasting_model.py in predict(self, n, series, past_covariates, future_covariates, num_samples, verbose, predict_likelihood_parameters)
   2207             )
   2208         if self.uses_future_covariates and future_covariates is None:
-> 2209             raise_log(
   2210                 ValueError(
   2211                     "The model has been trained with future covariates. Some matching future_covariates "

/usr/local/lib/python3.10/dist-packages/darts/logging.py in raise_log(exception, logger)
    127     logger.error(exception_type + ": " + message)
    128 
--> 129     raise exception
    130 
    131 

ValueError: The model has been trained with future covariates. Some matching future_covariates have to be provided to `predict()`.

To Reproduce

The model configuration is as shown here:

from darts.models import NBEATSModel

model_nbeats = NBEATSModel(
    input_chunk_length=30,
    output_chunk_length=7,
    generic_architecture=False,
    num_blocks=3,
    num_layers=4,
    layer_widths=512,
    n_epochs=100,
    nr_epochs_val_period=1,
    batch_size=800,
    model_name="nbeats_interpretable_run",
)

Fitting is done as so:

model_nbeats.fit(series=past_target_series, 
                 past_covariates=past_cov_series,
                 val_series=future_target_series,
                 val_past_covariates=cov_series,
                 epochs=100)

Expected behavior

Since NBEATS models don't support future_covariates, this behavior is unexpected.

dennisbader commented 9 months ago

Hi @joshuajnoble, I believe you might call the wrong model in your code.

From your error message:

---> 10 tcn_predict_output = model_nbeats.predict(test_length, past_covariates=cov_series, series=nbeats_test)

Could it be that you actually trained a TCNModel (just assuming from the tcn_predict_output variable name)?

For me everything runs fine with the code below:

from darts.models import NBEATSModel
from darts.utils.timeseries_generation import sine_timeseries

series = sine_timeseries(length=30)
model_nbeats = NBEATSModel(
    input_chunk_length=10,
    output_chunk_length=10,
    n_epochs=1,
)
model_nbeats.fit(
    series=series,
    past_covariates=series,
    val_series=series,
    val_past_covariates=series
)
model_nbeats.predict(
    n=10,
    series=series,
    past_covariates=series
)

joshuajnoble commented 9 months ago

Thank you very much for getting back to me. The tcn_predict_output name was just a copy/paste error; it's only an array. Your code works for me, and I've tracked down the source of the problem: I was training on an array of TimeSeries generated by TimeSeries.from_group_dataframe(). Fitting the model with NBEATSModel.fit() on a single TimeSeries from that array makes the error go away. I don't think this is necessarily a bug, just an unclear error message. Should I close the bug?

dennisbader commented 9 months ago

Hi @joshuajnoble, TimeSeries.from_group_dataframe() should also work fine. NBEATSModel supports training/prediction on multiple TimeSeries:

import pandas as pd

from darts import TimeSeries
from darts.models import NBEATSModel
from darts.utils.timeseries_generation import sine_timeseries

# generate a DataFrame that has 2 groups with the same time index
df = pd.DataFrame(
    {
        "a": list(range(60)),
        "groups": [0] * 30 + [1] * 30,
    },
    index=pd.date_range("2000-01-01", periods=30, freq="D").tolist() * 2
)

# returns a list of 2 time series
series = TimeSeries.from_group_dataframe(df, group_cols="groups")

model_nbeats = NBEATSModel(
    input_chunk_length=10,
    output_chunk_length=10,
    n_epochs=1,
)
# for simplicity we just pass `series` as past covariates (in reality those should be different features)
# `past_covariates` expects a list of TimeSeries of the same length as `series`
model_nbeats.fit(
    series=series,
    past_covariates=series,
    val_series=series,
    val_past_covariates=series
)
# same here, returns a list of 2 forecasted TimeSeries
preds = model_nbeats.predict(
    n=10,
    series=series,
    past_covariates=series
)

joshuajnoble commented 9 months ago

Hmm, interesting. I guess I’m setting up my arrays of TimeSeries incorrectly for NBEATS then. The same arrays of TimeSeries objects are fine with TCN, TFT, RegressionModel, and CatBoost models so perhaps I need to try re-reading the documentation for NBEATS more closely to see how it might be different.
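
One quick way to rule out a mismatch between the two arrays is to compare them side by side before fitting. The helper below is only a sketch: `summarize_alignment` is a hypothetical name, not part of darts, and it relies on nothing more than `len()`, so it works on lists of TimeSeries as well as on the plain lists used here as stand-ins.

```python
def summarize_alignment(series, covariates):
    """Summarize how a list of target series lines up with a list of covariates.

    Hypothetical helper (not a darts function). Both arguments are sequences
    whose elements support len(), e.g. lists of TimeSeries.
    """
    return {
        "n_series": len(series),
        "n_covariates": len(covariates),
        # fit() expects one covariate series per target series
        "counts_match": len(series) == len(covariates),
        # (target length, covariate length) for each index both lists share
        "lengths": [(len(s), len(c)) for s, c in zip(series, covariates)],
    }


# Stand-in data: two targets but only one covariate series.
report = summarize_alignment([[1, 2, 3], [4, 5]], [[1, 2, 3]])
```

Here `report["counts_match"]` is False, which would point at the list setup rather than at the model.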

I guess at this point the bug is really just about the error message that the model is spitting out, i.e. "The model has been trained with future covariates."

dennisbader commented 9 months ago

Can you double check the type of model that raised this error just before calling predict? It really sounds like it's not actually an NBEATSModel.

isinstance(model, NBEATSModel)
# or
type(model)

NBEATSModel cannot be trained with future covariates, so a prediction that requires them can never be reached.

It will raise the following error already when calling model.fit(..., future_covariates=series):

ValueError: Some future_covariates have been provided to a PastCovariates model. These models support only past_covariates.

All the other models you listed except TCNModel support future covariates for training and prediction. After being trained with future covariates, the prediction expects future covariates as well and will raise the error you mentioned if none were provided.
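
To make the two checks concrete, here is a deliberately simplified sketch. The classes are hypothetical stand-ins, not darts code; only the two error messages and the `uses_future_covariates` attribute name are taken from this thread, and `supports_future_covariates` is used here as an illustrative class flag.

```python
class PastCovariatesOnlyModel:
    """Hypothetical stand-in for a model that accepts only past covariates."""

    supports_future_covariates = False

    def __init__(self):
        self.uses_future_covariates = False

    def fit(self, series, past_covariates=None, future_covariates=None):
        # First guard: reject future covariates at training time.
        if future_covariates is not None and not self.supports_future_covariates:
            raise ValueError(
                "Some future_covariates have been provided to a PastCovariates "
                "model. These models support only past_covariates."
            )
        self.uses_future_covariates = future_covariates is not None
        return self

    def predict(self, n, series=None, past_covariates=None, future_covariates=None):
        # Second guard: only reachable if training used future covariates.
        if self.uses_future_covariates and future_covariates is None:
            raise ValueError(
                "The model has been trained with future covariates. Some matching "
                "future_covariates have to be provided to `predict()`."
            )
        return [0.0] * n  # dummy forecast


class MixedCovariatesModel(PastCovariatesOnlyModel):
    """Hypothetical stand-in for a model that also accepts future covariates."""

    supports_future_covariates = True
```

In this sketch a `PastCovariatesOnlyModel` can never set `uses_future_covariates` to True, so its `predict()` cannot raise the second error. Only a `MixedCovariatesModel` fitted with future covariates can.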

Let me know if it is indeed an NBEATSModel. If yes, then it is a bug on our side. Could you then provide a minimal reproducible example so that we can investigate this?
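
For that double check, something along these lines could collect the relevant facts right before calling predict(). `describe_model` is a hypothetical helper, not a darts function. It reads `uses_future_covariates` (the attribute tested in the traceback above) via `getattr`, so it also works on objects that do not expose it.

```python
def describe_model(model):
    """Collect the details relevant to this error from any model-like object."""
    return {
        "class": type(model).__name__,
        "module": type(model).__module__,
        # None means the object does not expose this attribute at all.
        "uses_future_covariates": getattr(model, "uses_future_covariates", None),
    }


class DummyModel:
    """Stand-in object used only to demonstrate the helper."""

    uses_future_covariates = True


info = describe_model(DummyModel())
```

Pasting the resulting dictionary into the issue would show at a glance which class actually raised the error.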