sktime / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License
4.03k stars 640 forks

time_idx in TimeSeriesDataSet should automatically be marked as a time_varying_known_reals variable. #1405

Open chododom opened 1 year ago

chododom commented 1 year ago

Expected behavior

I have a dataset with several time series: real-valued climatic variables such as temperature, wind_speed, etc., plus a static categorical identifier of the weather station, station_id. I create a TimeSeriesDataSet with time_idx as the time index, the climatic variables as time-varying unknown reals, and station_id as a static categorical. I did not think to also list time_idx under time_varying_known_reals: since the constructor already requires the name of the time index column, I assumed the class would treat it accordingly.

Here is the code:

training = TimeSeriesDataSet(
        ts[lambda x: x.time_idx <= training_cutoff],
        group_ids=["station_id"],
        target=["temperature", "pressure", "wind_dir", "wind_speed", "relHumid1"],
        time_idx="time_idx",
        min_encoder_length=max_encoder_length // 2, 
        max_encoder_length=max_encoder_length,
        min_prediction_length=1,
        max_prediction_length=max_prediction_length,
        static_categoricals=["station_id"],
        time_varying_unknown_reals=["temperature", "pressure", "wind_dir", "wind_speed", "relHumid1"],
        allow_missing_timesteps=True,
        add_relative_time_idx=True,
)

batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=16)
# "validation" is a TimeSeriesDataSet built from the hold-out period (definition not shown)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10, num_workers=16)

tft = TemporalFusionTransformer.from_dataset(
        training,
        learning_rate=0.03,
        hidden_size=16,
        attention_head_size=2,
        dropout=0.1,
        hidden_continuous_size=8,
        loss=MultiLoss(metrics=[RMSE(), RMSE(), RMSE(), RMSE(), RMSE()], weights=[1.0, 1.0, 1.0, 1.0, 1.0]),
        optimizer="Ranger",
        reduce_on_plateau_patience=4,
    )

# "trainer" is a pytorch_lightning.Trainer instantiated earlier (not shown)
trainer.fit(
        tft,
        train_dataloaders=train_dataloader,
        val_dataloaders=val_dataloader,
    )

predictions = tft.predict(val_dataloader, return_y=True, trainer_kwargs=dict(accelerator="gpu"))
print(predictions)

I expected this to train the model correctly and that I would then be able to use it for predictions.

Actual behavior

However, training looked odd: although the epoch counter kept increasing, the progress bar never advanced within an epoch; it always jumped straight from 0 % to the next epoch. Additionally, the predict method returned an empty list [].

Once I listed time_idx in the time_varying_known_reals argument, everything started working: the epochs progressed with a moving progress bar, and predict returned actual results.
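For reference, a minimal sketch of the change that made training work (assuming the same variable names as in the snippet above; the keyword arguments are collected in a plain dict here so the one difference is easy to see):

```python
# Hypothetical sketch: same dataset definition as above, with time_idx
# listed explicitly as a known future covariate.
dataset_kwargs = dict(
    group_ids=["station_id"],
    target=["temperature", "pressure", "wind_dir", "wind_speed", "relHumid1"],
    time_idx="time_idx",
    static_categoricals=["station_id"],
    # the fix: the time index is known in the future, so the decoder
    # receives at least one input variable
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_reals=["temperature", "pressure", "wind_dir",
                                "wind_speed", "relHumid1"],
    allow_missing_timesteps=True,
    add_relative_time_idx=True,
)
# training = TimeSeriesDataSet(ts[lambda x: x.time_idx <= training_cutoff],
#                              **dataset_kwargs)
```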

I would expect either that time_idx is automatically treated as one of the time-varying known reals (I did not find this stated anywhere in the docs), or, if there is a good reason not to do so, an informative error message. As it stands, the predict call propagates into the TemporalFusionTransformer method where, after this line:

embeddings_varying_decoder = {
    name: input_vectors[name][:, max_encoder_length:]  # select decoder
    for name in self.decoder_variables
}

the code ends abruptly without any message or exception. Perhaps there could be a sanity check that self.decoder_variables is not empty.
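A minimal sketch of what such a sanity check could look like (the helper name and message are hypothetical, not part of the pytorch-forecasting API):

```python
def check_decoder_variables(decoder_variables):
    """Raise a clear error instead of failing silently when the model
    has no decoder (known future) variables to work with."""
    if not decoder_variables:
        raise ValueError(
            "decoder_variables is empty: the dataset defines no "
            "time-varying known covariates for the decoder. Add at "
            "least one, e.g. time_varying_known_reals=['time_idx']."
        )
    return decoder_variables
```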

Perhaps I am missing something; if so, I apologize and kindly ask for an explanation. Thank you!

tRosenflanz commented 1 year ago

The error message is indeed hard to decode when no time-varying known values are provided. Note that the dataset can add extra features, such as relative_time_idx, that are not part of the time_varying_known_reals argument itself, so checking that argument alone is not sufficient. On the other hand, always adding time_idx is not necessarily desirable: it can be just an auxiliary feature that makes training less stable or leads to overfitting.
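To illustrate the point that a validation check must look beyond the time_varying_known_reals argument, here is a hypothetical helper (the function and the list of derived features are illustrative assumptions, not pytorch-forecasting code):

```python
def has_decoder_inputs(known_reals, known_categoricals, add_relative_time_idx):
    """Return True if the decoder would receive at least one input
    variable.  Features the dataset derives itself (here only
    relative_time_idx is modeled) count toward decoder inputs, so a
    check on time_varying_known_reals alone would miss them."""
    derived = ["relative_time_idx"] if add_relative_time_idx else []
    return bool(list(known_reals) + list(known_categoricals) + derived)
```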