Unable to format tensors correctly when using multiple GPUs

PyTorch-Forecasting version: 0.9.2
PyTorch version: 1.10.2
Python version: 3.9.7
Operating System: Ubuntu Server

Expected behavior

I ran the code below to fit the model and expected training to start and run normally.

Actual behavior

Instead, the call failed with an error message stating that a tensor had no dimensions. I suspect this is related to using multiple GPUs, because I had no issue when training on a single GPU. I only switched to multiple GPUs to speed up training, since my dataset had grown.

Code to reproduce the problem

Train command

trainer.fit(
    tft,
    train_dataloader=train_dataloader,
    val_dataloaders=val_dataloader,
    ckpt_path="./checkpoints/model-checkpoint.ckpt",
)

Result

IndexError: dimension specified as -1 but tensor has no dimensions
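
For context, PyTorch raises this kind of message when code asks for a dimension of a 0-dim (scalar) tensor. The sketch below only illustrates that class of failure on a toy tensor. It is an assumption about what kind of tensor is involved, not a trace of where the error occurs inside pytorch-forecasting or the `dp` strategy. The helper `last_dim` is hypothetical and exists only for this illustration, and the exact wording of the message may differ between PyTorch versions.

```python
import torch


def last_dim(t: torch.Tensor):
    """Return the size of the last dimension, or None for a 0-dim tensor."""
    try:
        return t.size(-1)
    except IndexError as err:
        # A 0-dim tensor has no dimension -1 to look up.
        print(f"IndexError: {err}")
        return None


scalar = torch.tensor(1.0)    # 0-dim tensor, shape torch.Size([])
vector = torch.tensor([1.0])  # 1-dim tensor, shape torch.Size([1])

scalar_result = last_dim(scalar)  # prints the IndexError message
vector_result = last_dim(vector)  # size of the only dimension
```

If a tensor that normally carries a batch dimension reaches the model as a scalar on one of the devices, a lookup like this would fail in the same way. Whether that is what happens here is unconfirmed.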

TimeSeriesDataSet code

max_prediction_length = 2048
max_encoder_length = 512
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="Offer_price",
    group_ids=['Asset_id'],
    # keep encoder length long (as it is in the validation set)
    min_encoder_length=max_encoder_length // 2,
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,
    static_categoricals=[],
    static_reals=[],
    time_varying_known_categoricals=[],
    # group of categorical variables can be treated as one variable
    variable_groups={},
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[
        "Offer_price",
        "Bid_price",
        "Bid_size",
        "TotalTradeVolume_size",
        "Offer_size",
        "LowPrice_price",
        "Trade_price",
        "Trade_size",
        "OpenInterest_size",
        "OpeningPrice_price",
        "HighPrice_price",
        "SettlementPrice_price"
    ],
    target_normalizer=None,
    # GroupNormalizer(
    #     groups=["agency", "sku"], transformation="softplus"
    # ),  # use softplus and normalize by group
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)

# create validation set (predict=True) which means to predict the last max_prediction_length points in time
# for each series
validation = TimeSeriesDataSet.from_dataset(
    training, data, predict=True, stop_randomization=True)

# create dataloaders for model
batch_size = 128  # set this between 32 and 128
train_dataloader = training.to_dataloader(
    train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(
    train=False, batch_size=batch_size * 10, num_workers=0)

Model Code

# configure network and trainer
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min")
lr_logger = LearningRateMonitor()  # log the learning rate
logger = TensorBoardLogger("lightning_logs")  # logging results to a tensorboard

checkpoint_callback = ModelCheckpoint(
     dirpath='./checkpoints/',
     filename='model-checkpoint'
)

trainer = pl.Trainer(
    max_epochs=30,
    strategy='dp',
    accelerator='gpu',
    gpus=4,
    weights_summary="top",
    gradient_clip_val=0.1,
    limit_train_batches=30,  # comment in for training, running validation every 30 batches
    # fast_dev_run=True,  # comment in to check that the network or dataset has no serious bugs
    callbacks=[lr_logger, early_stop_callback, checkpoint_callback],
    logger=logger,
)

tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=6.30957344,
    hidden_size=16,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=8,
    output_size=7,  # 7 quantiles by default
    loss=QuantileLoss(),
    log_interval=10,  # log every 10 batches
    reduce_on_plateau_patience=4,
)

