sktime / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License
4.03k stars 640 forks

time_idx in TimeSeriesDataSet should automatically be marked as a time_varying_known_reals variable. #1405

Open chododom opened 1 year ago

chododom commented 1 year ago

Expected behavior

I have a dataset with several time series: real-valued climatic variables such as temperature, wind_speed, etc., plus a static categorical identifier of the weather station, station_id. I create a TimeSeriesDataSet with time_idx as the time index, the climatic variables as time-varying unknown reals, and station_id as a static categorical. I did not think to also list time_idx under time_varying_known_reals: since the constructor already requires the name of the time index column, I assumed the class would treat it accordingly.

Here is the code:

training = TimeSeriesDataSet(
        ts[lambda x: x.time_idx <= training_cutoff],
        group_ids=["station_id"],
        target=["temperature", "pressure", "wind_dir", "wind_speed", "relHumid1"],
        time_idx="time_idx",
        min_encoder_length=max_encoder_length // 2, 
        max_encoder_length=max_encoder_length,
        min_prediction_length=1,
        max_prediction_length=max_prediction_length,
        static_categoricals=["station_id"],
        time_varying_unknown_reals=["temperature", "pressure", "wind_dir", "wind_speed", "relHumid1"],
        allow_missing_timesteps=True,
        add_relative_time_idx=True,
)

batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=16)
# "validation" is a TimeSeriesDataSet built from the hold-out period (definition not shown)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10, num_workers=16)

tft = TemporalFusionTransformer.from_dataset(
        training,
        learning_rate=0.03,
        hidden_size=16,
        attention_head_size=2,
        dropout=0.1,
        hidden_continuous_size=8,
        loss=MultiLoss(metrics=[RMSE(), RMSE(), RMSE(), RMSE(), RMSE()], weights=[1.0, 1.0, 1.0, 1.0, 1.0]),
        optimizer="Ranger",
        reduce_on_plateau_patience=4,
    )

# "trainer" is a pytorch_lightning.Trainer instantiated earlier (not shown)
trainer.fit(
        tft,
        train_dataloaders=train_dataloader,
        val_dataloaders=val_dataloader,
    )

predictions = tft.predict(val_dataloader, return_y=True, trainer_kwargs=dict(accelerator="gpu"))
print(predictions)

I expected this to train the model correctly and that I would then be able to use it for predictions.

Actual behavior

However, training looked odd: although the epoch counter kept increasing, the progress bar never advanced within an epoch; it always jumped straight from 0 % to the next epoch. Additionally, the predict method returned an empty list [].

Once I listed time_idx in the time_varying_known_reals argument, everything started working: the epochs progressed with a moving progress bar, and predict returned actual results.
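For reference, a minimal sketch of the change that made training work (assuming the same variable names as in the snippet above; the keyword arguments are collected in a plain dict here so the one difference is easy to see):

```python
# Hypothetical sketch: same dataset definition as above, with time_idx
# listed explicitly as a known future covariate.
dataset_kwargs = dict(
    group_ids=["station_id"],
    target=["temperature", "pressure", "wind_dir", "wind_speed", "relHumid1"],
    time_idx="time_idx",
    static_categoricals=["station_id"],
    # the fix: the time index is known in the future, so the decoder
    # receives at least one input variable
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_reals=["temperature", "pressure", "wind_dir",
                                "wind_speed", "relHumid1"],
    allow_missing_timesteps=True,
    add_relative_time_idx=True,
)
# training = TimeSeriesDataSet(ts[lambda x: x.time_idx <= training_cutoff],
#                              **dataset_kwargs)
```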

I would expect either that time_idx is automatically treated as one of the time-varying known reals (I did not find this stated anywhere in the docs), or, if there is a good reason not to do so, an informative error message. As it stands, the predict call propagates into the TemporalFusionTransformer method where, after this line:

embeddings_varying_decoder = {
    name: input_vectors[name][:, max_encoder_length:]  # select decoder
    for name in self.decoder_variables
}

the code ends abruptly without any message or exception. Perhaps there could be a sanity check that self.decoder_variables is not empty.
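A minimal sketch of what such a sanity check could look like (the helper name and message are hypothetical, not part of the pytorch-forecasting API):

```python
def check_decoder_variables(decoder_variables):
    """Raise a clear error instead of failing silently when the model
    has no decoder (known future) variables to work with."""
    if not decoder_variables:
        raise ValueError(
            "decoder_variables is empty: the dataset defines no "
            "time-varying known covariates for the decoder. Add at "
            "least one, e.g. time_varying_known_reals=['time_idx']."
        )
    return decoder_variables
```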

Perhaps I am missing something; if so, I apologize and kindly ask for an explanation. Thank you!

tRosenflanz commented 1 year ago

The error message is indeed hard to decode when no time-varying known values are provided. Note that the dataset can add extra features, such as relative_time_idx, that are not part of the time_varying_known_reals argument itself, so checking that argument alone is not sufficient. On the other hand, always adding time_idx is not necessarily desirable: it can be just an auxiliary feature that makes training less stable or leads to overfitting.
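To illustrate the point that a validation check must look beyond the time_varying_known_reals argument, here is a hypothetical helper (the function and the list of derived features are illustrative assumptions, not pytorch-forecasting code):

```python
def has_decoder_inputs(known_reals, known_categoricals, add_relative_time_idx):
    """Return True if the decoder would receive at least one input
    variable.  Features the dataset derives itself (here only
    relative_time_idx is modeled) count toward decoder inputs, so a
    check on time_varying_known_reals alone would miss them."""
    derived = ["relative_time_idx"] if add_relative_time_idx else []
    return bool(list(known_reals) + list(known_categoricals) + derived)
```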