sktime / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

AssertionError: Encoder and decoder variables have to be the same apart from target variable #553

Closed · CarloLucibello closed this issue 3 years ago

CarloLucibello commented 3 years ago

Hi, I'm trying to adapt the stallion example to use a RecurrentNetwork. As long as the target and the time_varying_unknown_reals match, as in

target=["volume"]
time_varying_unknown_reals = ["volume"]
# OR
target=["volume", "log_volume"]
time_varying_unknown_reals = ["volume", "log_volume"]

everything is fine. When the inputs differ from the targets, though, as in

target=["volume"]
time_varying_unknown_reals = ["volume", "log_volume"]

I get the following error:

Traceback (most recent call last):
  File "/home/carlo/Git/pytorch-forecasting/examples/stallion_rnn.py", line 123, in <module>
    model = RecurrentNetwork.from_dataset(
  File "/home/carlo/miniconda3/lib/python3.9/site-packages/pytorch_forecasting/models/rnn/__init__.py", line 157, in from_dataset
    return super().from_dataset(
  File "/home/carlo/miniconda3/lib/python3.9/site-packages/pytorch_forecasting/models/base_model.py", line 1355, in from_dataset
    return super().from_dataset(dataset, **new_kwargs)
  File "/home/carlo/miniconda3/lib/python3.9/site-packages/pytorch_forecasting/models/base_model.py", line 1637, in from_dataset
    return super().from_dataset(dataset, **kwargs)
  File "/home/carlo/miniconda3/lib/python3.9/site-packages/pytorch_forecasting/models/base_model.py", line 907, in from_dataset
    net = cls(**kwargs)
  File "/home/carlo/miniconda3/lib/python3.9/site-packages/pytorch_forecasting/models/rnn/__init__.py", line 99, in __init__
    assert set(self.encoder_variables) - set(to_list(target)) - set(lagged_target_names) == set(
AssertionError: Encoder and decoder variables have to be the same apart from target variable

So there is no way to perform single-target regression without dropping all the other covariates. Is this expected, or a bug?

Full example script:

```python
from pathlib import Path
import pickle
import warnings

import numpy as np
import pandas as pd
from pandas.core.common import SettingWithCopyWarning
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger
import torch

from pytorch_forecasting import GroupNormalizer, RecurrentNetwork, TimeSeriesDataSet
from pytorch_forecasting.data.examples import get_stallion_data
from pytorch_forecasting.metrics import MAE, RMSE, SMAPE, PoissonLoss, QuantileLoss
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import optimize_hyperparameters
from pytorch_forecasting.utils import profile

warnings.simplefilter("error", category=SettingWithCopyWarning)

data = get_stallion_data()

data["month"] = data.date.dt.month.astype("str").astype("category")
data["log_volume"] = np.log(data.volume + 1e-8)
data["time_idx"] = data["date"].dt.year * 12 + data["date"].dt.month
data["time_idx"] -= data["time_idx"].min()
data["avg_volume_by_sku"] = data.groupby(["time_idx", "sku"], observed=True).volume.transform("mean")
data["avg_volume_by_agency"] = data.groupby(["time_idx", "agency"], observed=True).volume.transform("mean")
# data = data[lambda x: (x.sku == data.iloc[0]["sku"]) & (x.agency == data.iloc[0]["agency"])]

special_days = [
    "easter_day",
    "good_friday",
    "new_year",
    "christmas",
    "labor_day",
    "independence_day",
    "revolution_day_memorial",
    "regional_games",
    "fifa_u_17_world_cup",
    "football_gold_cup",
    "beer_capital",
    "music_fest",
]
data[special_days] = data[special_days].apply(lambda x: x.map({0: "", 1: x.name})).astype("category")

training_cutoff = data["time_idx"].max() - 6
max_encoder_length = 36
max_prediction_length = 6

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    # target="volume",
    target=[
        "volume",
        # "log_volume",
        # "industry_volume",
        # "soda_volume",
        # "avg_max_temp",
        # "avg_volume_by_agency",
        # "avg_volume_by_sku",
    ],
    group_ids=["agency", "sku"],
    min_encoder_length=max_encoder_length // 2,  # allow encoder lengths from 0 to max_prediction_length
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,
    # static_categoricals=["agency", "sku"],
    # static_reals=["avg_population_2017", "avg_yearly_household_income_2017"],
    # time_varying_known_categoricals=["special_days", "month"],
    # variable_groups={"special_days": special_days},  # group of categorical variables can be treated as one variable
    # time_varying_known_reals=["time_idx", "price_regular", "discount_in_percent"],
    # time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[
        "volume",
        "log_volume",
        # "industry_volume",
        # "soda_volume",
        # "avg_max_temp",
        # "avg_volume_by_agency",
        # "avg_volume_by_sku",
    ],
    # target_normalizer=GroupNormalizer(
    #     groups=["agency", "sku"], transformation="softplus", center=False
    # ),  # use softplus with beta=1.0 and normalize by group
    # add_relative_time_idx=True,
    # add_target_scales=True,
    # add_encoder_length=True,
)

validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)
batch_size = 64
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=0)

# # save datasets
# training.save("training.pkl")
# validation.save("validation.pkl")

early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min")
lr_logger = LearningRateMonitor()
# logger = TensorBoardLogger(log_graph=True)

trainer = pl.Trainer(
    max_epochs=100,
    gpus=0,
    weights_summary="top",
    gradient_clip_val=0.1,
    limit_train_batches=30,
    # val_check_interval=20,
    # limit_val_batches=1,
    # fast_dev_run=True,
    # logger=logger,
    # profiler=True,
    callbacks=[lr_logger, early_stop_callback],
)

model = RecurrentNetwork.from_dataset(
    training,
    learning_rate=0.03,
    hidden_size=16,
    # attention_head_size=1,
    dropout=0.1,
    # hidden_continuous_size=8,
    # output_size=7,
    # loss=QuantileLoss(),
    # log_interval=10,
    # log_val_interval=1,
    # reduce_on_plateau_patience=3,
)
print(f"Number of parameters in network: {model.size()/1e3:.1f}k")

trainer.fit(
    model,
    train_dataloader=train_dataloader,
    val_dataloaders=val_dataloader,
)

# make a prediction on entire validation set
preds, index = model.predict(val_dataloader, return_index=True, fast_dev_run=True)
```
jdb78 commented 3 years ago

Most recurrent networks require covariates to be known in the future, apart from the target (you can lag the covariates manually to make them "known"). Otherwise you need to train a separate encoder and decoder (as, e.g., in the TFT). So this is expected behaviour.
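In TimeSeriesDataSet terms, the assertion means that for the RecurrentNetwork the encoder and decoder must see the same variables apart from the target: non-target covariates have to go into time_varying_known_reals (fed to both encoder and decoder), while time_varying_unknown_reals may hold only the target. A sketch of a configuration that should pass the assertion, reusing the variables from the script above and assuming a manually lagged column log_volume_lagged has been created (its construction is shown further down):

```python
from pytorch_forecasting import TimeSeriesDataSet

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="volume",
    group_ids=["agency", "sku"],
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    # lagged covariates count as "known": they feed encoder and decoder alike
    time_varying_known_reals=["log_volume_lagged"],
    # only the target itself may remain "unknown"
    time_varying_unknown_reals=["volume"],
)
```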

CarloLucibello commented 3 years ago

Right, that makes sense. I was thinking of my use case of one-step-ahead prediction, but of course for longer horizons you need the covariates as inputs.

CarloLucibello commented 3 years ago

I'll leave this open in case you think it's worth making an exception for the one-step-ahead case; otherwise feel free to close.

jdb78 commented 3 years ago

The targets are automatically lagged by 1, while other covariates are not. I suggest lagging them manually in your dataframe (groupby()[name].shift()).
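For example, a sketch using the stallion columns from the script above (log_volume_lagged is a name introduced here for illustration):

```python
# shift log_volume by one step within each (agency, sku) series, so the value
# used at time t was already observed at t-1 and is therefore "known"
data["log_volume_lagged"] = (
    data.sort_values("time_idx")
    .groupby(["agency", "sku"], observed=True)["log_volume"]
    .shift(1)
)
# shifting leaves NaNs at the start of every series; drop (or fill) those rows
data = data.dropna(subset=["log_volume_lagged"])
```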

owoshch commented 2 years ago

Hi @CarloLucibello and @jdb78! Thank you for the helpful conversation!

@CarloLucibello Would you mind sharing the code that uses groupby()[name].shift() to produce a valid dataframe to pass as a dataset?

My question comes down to not understanding what step the target should be shifted by if I use the settings max_encoder_length = 36; max_prediction_length = 6.

Thank you!
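Not an official answer, but one way to reason about the step size: a covariate shifted by k steps is genuinely observable k steps into the future, so shift(1) only covers one-step-ahead prediction. To use a covariate as a "known" input across the whole decoder with the settings above, you would shift by the full horizon. A sketch:

```python
max_prediction_length = 6

# shifted by the full horizon, the covariate is observable at every decoder step
data["log_volume_lag6"] = (
    data.sort_values("time_idx")
    .groupby(["agency", "sku"], observed=True)["log_volume"]
    .shift(max_prediction_length)
)
```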

owoshch commented 2 years ago

Hi @CarloLucibello and @jdb78! I'm still struggling to get a RecurrentNetwork running with a single real target and three covariates. I would be extremely grateful if you could share the code snippet that let you train one. Thank you!

tomtomtom995 commented 2 years ago

Has anyone gotten a recurrent network running with covariates? If so, please share your code; I've been stuck for too long :(
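For anyone else stuck here, a minimal, untested sketch stitching the suggestions above together: lag the covariates past the horizon, declare them known, keep only the target unknown. Column names come from the stallion example; the *_lagged names are invented for illustration, and the Trainer/fit arguments follow the pytorch-lightning version used earlier in the thread:

```python
import numpy as np
import pytorch_lightning as pl

from pytorch_forecasting import RecurrentNetwork, TimeSeriesDataSet
from pytorch_forecasting.data.examples import get_stallion_data

data = get_stallion_data()
data["time_idx"] = data["date"].dt.year * 12 + data["date"].dt.month
data["time_idx"] -= data["time_idx"].min()
data["log_volume"] = np.log(data.volume + 1e-8)

max_encoder_length, max_prediction_length = 36, 6

# make covariates "known" by lagging them past the forecast horizon
cov_cols = ["log_volume", "industry_volume", "soda_volume"]
for col in cov_cols:
    data[f"{col}_lagged"] = (
        data.sort_values("time_idx")
        .groupby(["agency", "sku"], observed=True)[col]
        .shift(max_prediction_length)
    )
data = data.dropna(subset=[f"{c}_lagged" for c in cov_cols])

training_cutoff = data["time_idx"].max() - max_prediction_length
training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="volume",
    group_ids=["agency", "sku"],
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    # lagged covariates are known in the future, so encoder and decoder
    # see the same variables apart from the target
    time_varying_known_reals=[f"{c}_lagged" for c in cov_cols],
    time_varying_unknown_reals=["volume"],  # target only, per the assertion
)
validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)

train_dl = training.to_dataloader(train=True, batch_size=64, num_workers=0)
val_dl = validation.to_dataloader(train=False, batch_size=64, num_workers=0)

model = RecurrentNetwork.from_dataset(training, learning_rate=0.03, hidden_size=16, dropout=0.1)
trainer = pl.Trainer(max_epochs=10, gradient_clip_val=0.1)
# note: newer pytorch-lightning versions use train_dataloaders= instead
trainer.fit(model, train_dataloader=train_dl, val_dataloaders=val_dl)
```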