Expected behavior
I executed the code to fit the model and expected it to train.
Actual behavior
However, the result was an error message stating that the tensor has no dimensions. I think it has to do with multiple GPUs being used, because I had no issue when training on one GPU. The only reason I switched to multiple GPUs was faster training, since my dataset size had increased.
IndexError: dimension specified as -1 but tensor has no dimensions
TimeSeriesDataSet code
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor, ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger
from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet
from pytorch_forecasting.metrics import QuantileLoss

max_prediction_length = 2048
max_encoder_length = 512
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="Offer_price",
    group_ids=["Asset_id"],
    # keep the encoder length long (as it is in the validation set)
    min_encoder_length=max_encoder_length // 2,
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,
    static_categoricals=[],
    static_reals=[],
    time_varying_known_categoricals=[],
    # a group of categorical variables can be treated as one variable
    variable_groups={},
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[
        "Offer_price",
        "Bid_price",
        "Bid_size",
        "TotalTradeVolume_size",
        "Offer_size",
        "LowPrice_price",
        "Trade_price",
        "Trade_size",
        "OpenInterest_size",
        "OpeningPrice_price",
        "HighPrice_price",
        "SettlementPrice_price",
    ],
    target_normalizer=None,
    # GroupNormalizer(
    #     groups=["agency", "sku"], transformation="softplus"
    # ),  # use softplus and normalize by group
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)
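As an aside, the commented-out normalizer references group columns from the pytorch-forecasting tutorial ("agency", "sku") that do not exist in this dataset. If a group normalizer were used here, a minimal sketch (assuming Asset_id is the only group column) would be:

from pytorch_forecasting.data import GroupNormalizer

# hypothetical: normalize the target per asset, with a softplus transformation
target_normalizer = GroupNormalizer(groups=["Asset_id"], transformation="softplus")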
# create validation set (predict=True), which means to predict the last
# max_prediction_length points in time for each series
validation = TimeSeriesDataSet.from_dataset(
    training, data, predict=True, stop_randomization=True
)

# create dataloaders for the model
batch_size = 128  # set this between 32 and 128
train_dataloader = training.to_dataloader(
    train=True, batch_size=batch_size, num_workers=0
)
val_dataloader = validation.to_dataloader(
    train=False, batch_size=batch_size * 10, num_workers=0
)
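Since the error complains about a zero-dimensional tensor, a quick sanity check (a debugging sketch on my part, not part of the original report) is to pull one batch from the dataloader and inspect the shapes pytorch-forecasting produces before they are split across GPUs:

# pull a single batch: x is a dict of tensors, y is a (target, weight) tuple
x, y = next(iter(train_dataloader))
for name, value in x.items():
    if hasattr(value, "shape"):
        print(name, tuple(value.shape))
print("target shape:", tuple(y[0].shape))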
Model Code
# configure network and trainer
early_stop_callback = EarlyStopping(
    monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min"
)
lr_logger = LearningRateMonitor()  # log the learning rate
logger = TensorBoardLogger("lightning_logs")  # log results to TensorBoard
checkpoint_callback = ModelCheckpoint(
    dirpath="./checkpoints/",
    filename="model-checkpoint",
)

trainer = pl.Trainer(
    max_epochs=30,
    strategy="dp",
    accelerator="gpu",
    gpus=4,
    weights_summary="top",
    gradient_clip_val=0.1,
    limit_train_batches=30,  # comment in for training, running validation every 30 batches
    # fast_dev_run=True,  # comment in to check that the network and dataset have no serious bugs
    callbacks=[lr_logger, early_stop_callback, checkpoint_callback],
    logger=logger,
)
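One thing worth trying (an assumption on my part, not something verified in this report): the "dp" strategy (DataParallel) splits each batch across GPUs and re-gathers the per-GPU outputs, which can be fragile with the dict-structured outputs that pytorch-forecasting models return, and Lightning itself recommends DDP over DP. A sketch of the same trainer using DDP instead:

trainer = pl.Trainer(
    max_epochs=30,
    strategy="ddp",  # one process per GPU instead of DataParallel's split/gather
    accelerator="gpu",
    gpus=4,
    gradient_clip_val=0.1,
    limit_train_batches=30,
    callbacks=[lr_logger, early_stop_callback, checkpoint_callback],
    logger=logger,
)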
tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=6.30957344,
    hidden_size=16,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=8,
    output_size=7,  # 7 quantiles by default
    loss=QuantileLoss(),
    log_interval=10,  # log example predictions every 10 batches
    reduce_on_plateau_patience=4,
)
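The report omits the actual train command; presumably it is the standard pytorch-forecasting fit call (a sketch, assuming the dataloaders defined above):

# fit the network; this is the step that raises the IndexError on multiple GPUs
trainer.fit(
    tft,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
)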