Closed CarloLucibello closed 3 years ago
Most recurrent networks require known covariates apart from the target (you can lag the covariates manually to make them "known" in the future). Otherwise you need to train a different encoder and decoder (e.g. in the TFT). So this is expected behaviour.
Right, makes sense. I was thinking of my use case of one-step-ahead prediction, but of course for longer horizons you need the covariates as inputs.
I'll leave this open in case you think it is worth making an exception for the one-step-ahead case; otherwise feel free to close.
The targets are automatically lagged by 1, while other covariates are not. I suggest lagging manually in your dataframe (`groupby()[name].shift()`).
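A minimal sketch of the manual lagging suggested above, assuming a long-format dataframe with one row per series and time step. The toy values and the column name `log_volume_lagged` are made up for illustration; the group columns mirror the `group_ids` used in the stallion example.

```python
import pandas as pd

# Toy frame with the same column roles as the stallion data (values are made up).
df = pd.DataFrame(
    {
        "agency": ["A", "A", "A", "B", "B", "B"],
        "sku": ["s1"] * 6,
        "time_idx": [0, 1, 2, 0, 1, 2],
        "log_volume": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)

lag = 1  # assumption: one-step-ahead use case; longer horizons need a larger lag

# Sort so that shift() moves values forward in time within each series.
df = df.sort_values(["agency", "sku", "time_idx"])

# Shift within each series so values never leak from one group into another.
# "log_volume_lagged" is a hypothetical column name.
df["log_volume_lagged"] = df.groupby(["agency", "sku"])["log_volume"].shift(lag)

# The first `lag` rows of every series have no past value; drop them.
df = df.dropna(subset=["log_volume_lagged"]).reset_index(drop=True)
```

The lagged column could then presumably be listed under `time_varying_known_reals` instead of `time_varying_unknown_reals`, since its value at each step is already observed one step earlier. Dropping rows only at the start of each series keeps `time_idx` contiguous within a group.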
Hi @CarloLucibello and @jdb78! Thank you for a helpful conversation!
@CarloLucibello Would you mind sharing the code that uses `groupby()[name].shift()` to produce a valid dataframe to be passed as a dataset?
My question arises because I don't understand what step should be used to shift the target in the dataset if I use the following settings: `max_encoder_length = 36; max_prediction_length = 6`.
Thank you!
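On the shift step, one way to reason about it (a sketch of the general argument, not a rule confirmed by the maintainers in this thread): a covariate shifted by `lag` steps holds, at decoder step `k`, the original value from `k - lag` steps after the last encoder step. That value is available at forecast time only if `k - lag <= 0`. The helper names below are hypothetical.

```python
def known_at_forecast_time(lag: int, max_prediction_length: int) -> bool:
    """Return True if a covariate shifted by `lag` is observed for every decoder step."""
    # Decoder step k (1..max_prediction_length) reads the original value from
    # time t0 + k - lag, where t0 is the last encoder step. That value has been
    # observed by forecast time only if k - lag <= 0.
    return all(k - lag <= 0 for k in range(1, max_prediction_length + 1))


def smallest_valid_lag(max_prediction_length: int) -> int:
    """Smallest shift that makes the covariate known over the whole horizon."""
    lag = 1
    while not known_at_forecast_time(lag, max_prediction_length):
        lag += 1
    return lag


print(smallest_valid_lag(6))  # horizon from the settings quoted above
```

Under this reasoning, a shift of 1 is enough for one-step-ahead prediction, while `max_prediction_length = 6` would call for a shift of at least 6. The encoder length does not enter the argument.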
Hi @CarloLucibello and @jdb78!
Thank you for a helpful conversation! I'm still struggling to get a `RecurrentNetwork` to run with a single real target and three covariates. I would be extremely grateful if you could share a code snippet that let you train a `RecurrentNetwork`. Thank you!
Has anyone got a recurrent network running with covariates? If so, please share your code; I've been stuck for too long :(
Hi, I'm trying to adapt the stallion example to use a `RecurrentNetwork`. As long as the `target` and the `time_varying_unknown_reals` match, everything is fine. When the inputs differ from the targets, though, as in the script below (`target=["volume"]` with `time_varying_unknown_reals=["volume", "log_volume"]`), I get an error.
So there is no way to perform single-target regression without dropping all other covariates. Is this expected or a bug?
```python
from pathlib import Path
import pickle
import warnings

import numpy as np
import pandas as pd
from pandas.core.common import SettingWithCopyWarning
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
from pytorch_lightning.loggers import TensorBoardLogger
import torch

from pytorch_forecasting import GroupNormalizer, RecurrentNetwork, TimeSeriesDataSet
from pytorch_forecasting.data.examples import get_stallion_data
from pytorch_forecasting.metrics import MAE, RMSE, SMAPE, PoissonLoss, QuantileLoss
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import optimize_hyperparameters
from pytorch_forecasting.utils import profile

warnings.simplefilter("error", category=SettingWithCopyWarning)

data = get_stallion_data()

data["month"] = data.date.dt.month.astype("str").astype("category")
data["log_volume"] = np.log(data.volume + 1e-8)

data["time_idx"] = data["date"].dt.year * 12 + data["date"].dt.month
data["time_idx"] -= data["time_idx"].min()
data["avg_volume_by_sku"] = data.groupby(["time_idx", "sku"], observed=True).volume.transform("mean")
data["avg_volume_by_agency"] = data.groupby(["time_idx", "agency"], observed=True).volume.transform("mean")
# data = data[lambda x: (x.sku == data.iloc[0]["sku"]) & (x.agency == data.iloc[0]["agency"])]
special_days = [
    "easter_day",
    "good_friday",
    "new_year",
    "christmas",
    "labor_day",
    "independence_day",
    "revolution_day_memorial",
    "regional_games",
    "fifa_u_17_world_cup",
    "football_gold_cup",
    "beer_capital",
    "music_fest",
]
data[special_days] = data[special_days].apply(lambda x: x.map({0: "", 1: x.name})).astype("category")

training_cutoff = data["time_idx"].max() - 6
max_encoder_length = 36
max_prediction_length = 6

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    # target="volume",
    target=[
        "volume",
        # "log_volume",
        # "industry_volume",
        # "soda_volume",
        # "avg_max_temp",
        # "avg_volume_by_agency",
        # "avg_volume_by_sku",
    ],
    group_ids=["agency", "sku"],
    min_encoder_length=max_encoder_length // 2,  # allow encoder lengths from 0 to max_prediction_length
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,
    # static_categoricals=["agency", "sku"],
    # static_reals=["avg_population_2017", "avg_yearly_household_income_2017"],
    # time_varying_known_categoricals=["special_days", "month"],
    # variable_groups={"special_days": special_days},  # group of categorical variables can be treated as one variable
    # time_varying_known_reals=["time_idx", "price_regular", "discount_in_percent"],
    # time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[
        "volume",
        "log_volume",
        # "industry_volume",
        # "soda_volume",
        # "avg_max_temp",
        # "avg_volume_by_agency",
        # "avg_volume_by_sku",
    ],
    # target_normalizer=GroupNormalizer(
    #     groups=["agency", "sku"], transformation="softplus", center=False
    # ),  # use softplus with beta=1.0 and normalize by group
    # add_relative_time_idx=True,
    # add_target_scales=True,
    # add_encoder_length=True,
)

validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)
batch_size = 64
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=0)

# # save datasets
# training.save("training.pkl")
# validation.save("validation.pkl")

early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min")
lr_logger = LearningRateMonitor()
# logger = TensorBoardLogger(log_graph=True)

trainer = pl.Trainer(
    max_epochs=100,
    gpus=0,
    weights_summary="top",
    gradient_clip_val=0.1,
    limit_train_batches=30,
    # val_check_interval=20,
    # limit_val_batches=1,
    # fast_dev_run=True,
    # logger=logger,
    # profiler=True,
    callbacks=[lr_logger, early_stop_callback],
)

model = RecurrentNetwork.from_dataset(
    training,
    learning_rate=0.03,
    hidden_size=16,
    # attention_head_size=1,
    dropout=0.1,
    # hidden_continuous_size=8,
    # output_size=7,
    # loss=QuantileLoss(),
    # log_interval=10,
    # log_val_interval=1,
    # reduce_on_plateau_patience=3,
)
print(f"Number of parameters in network: {model.size()/1e3:.1f}k")

trainer.fit(
    model,
    train_dataloader=train_dataloader,
    val_dataloaders=val_dataloader,
)

# make a prediction on entire validation set
preds, index = model.predict(val_dataloader, return_index=True, fast_dev_run=True)
```