Serge9744 opened this issue 1 year ago
Your loss looks very big. Is your normalization done properly? Just a side comment.
Hi,
Yes 👍 for the exogenous variables I used:

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
df_train[[col for col in exog_var if col != 'crisis']] = sc.fit_transform(
    df_train[[col for col in exog_var if col != 'crisis']]
)

Then the target is normalized directly in the TimeSeriesDataSet. constant is the name of a column with 1 in every row, as there are no specific groups. Maybe I got it wrong:
training = TimeSeriesDataSet(
    df_train.loc[:, [endog_var] + ["time_index", "constant"] + exog_var],
    time_idx="time_index",
    target=endog_var,
    group_ids=["constant"],
    min_encoder_length=max_encoder_length,
    max_encoder_length=max_encoder_length,
    min_prediction_length=max_prediction_length,
    max_prediction_length=max_prediction_length,
    time_varying_unknown_categoricals=["crisis"],
    time_varying_unknown_reals=[endog_var] + [col for col in exog_var if col != 'crisis'],
    target_normalizer=TorchNormalizer(),
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)
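For context, the train_dataloader and val_dataloader passed to the hyperparameter study below are not shown in this thread; a minimal sketch of how they are typically built from such a dataset (the df_val split and batch size here are assumptions) would be:

# validation set reusing the training dataset definition, predicting the last window
validation = TimeSeriesDataSet.from_dataset(training, df_val, predict=True, stop_randomization=True)

# wrap the datasets into dataloaders
train_dataloader = training.to_dataloader(train=True, batch_size=64, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=64, num_workers=0)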
Hi @Serge9744, can you share how you retrieve the best model with the optimal hyperparameters after the hyperparameter study? Can it be retrieved from study?
Hi,
The optimization didn't work out of the box, so I just added the line
metrics_callback.on_validation_end(trainer)
after line 209.
I also modified the class
class MetricsCallback(Callback):
    """PyTorch Lightning metric callback."""
by removing the pl_module argument from the on_validation_end signature, which wasn't used.
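For reference, a minimal sketch of what that modified callback could look like, assuming the original simply records trainer.callback_metrics after each validation run:

from pytorch_lightning.callbacks import Callback

class MetricsCallback(Callback):
    """PyTorch Lightning metric callback."""

    def __init__(self):
        super().__init__()
        self.metrics = []

    def on_validation_end(self, trainer):  # pl_module argument removed, it was unused
        self.metrics.append(trainer.callback_metrics)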
I have a weird behaviour though, as the best trial's validation loss from the study doesn't correspond to the one I get when I train without the optimization.
For example, during the experiments I have:
Q_Loss = QuantileLoss([0.05,0.5,0.95])
checkpoint_callback = ModelCheckpoint(
    dirpath=".", filename="best-checkpoint", save_top_k=1, verbose=True, monitor="val_loss", mode="min"
)
early_stop_callback = EarlyStopping(
    monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min"
)
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import optimize_hyperparameters
# create study
study = optimize_hyperparameters(
    train_dataloader,
    val_dataloader,
    model_path="optuna_test",
    n_trials=200,
    max_epochs=200,
    gradient_clip_val_range=(0.01, 1.0),
    hidden_size_range=(8, 512),
    hidden_continuous_size_range=(8, 512),
    attention_head_size_range=(1, 4),
    lstm_layers_range=(1, 8),
    learning_rate_range=(1e-6, 0.1),
    dropout_range=(0.1, 0.3),
    trainer_kwargs=dict(
        limit_train_batches=30,
        callbacks=[early_stop_callback, checkpoint_callback],
    ),
    output_size=3,  # three quantiles, matching Q_Loss
    loss=Q_Loss,
    reduce_on_plateau_patience=4,
    use_learning_rate_finder=False,  # use Optuna to find the learning rate rather than the in-built learning rate finder
)
Trial 125 finished with value: 3930406.75 and parameters: {'gradient_clip_val': 0.15578493164317458, 'hidden_size': 69, 'lstm_layers': 5, 'dropout': 0.2761985832861955, 'hidden_continuous_size': 32, 'attention_head_size': 4, 'learning_rate': 0.07180574537769648}. Best is trial 125 with value: 3930406.75.
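Regarding the earlier question about retrieving the result: optimize_hyperparameters returns an Optuna Study, so the winning configuration can be read straight off it. A short sketch, assuming nothing beyond the standard Optuna API:

best_params = study.best_trial.params  # e.g. {'gradient_clip_val': 0.1557..., 'hidden_size': 69, ...}
best_value = study.best_trial.value    # best validation loss observed during the study
print(best_params, best_value)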
However, when I train it with the same hyperparameters:
early_stop_callback = EarlyStopping(
    monitor="val_loss", min_delta=1e-4, patience=10, verbose=False, mode="min"
)
lr_logger = LearningRateMonitor()  # log the learning rate
logger = TensorBoardLogger("lightning_logs")  # logging results to a tensorboard

trainer = pl.Trainer(
    max_epochs=200,
    gpus=0,
    enable_model_summary=True,
    gradient_clip_val=best_params['gradient_clip_val'],
    limit_train_batches=10,  # comment in for training, running validation every 30 batches
    ...
)

tft = TemporalFusionTransformer.from_dataset(
    training,
    # not meaningful for finding the learning rate but otherwise very important
    ...
)
I get:
Epoch 27: 100% 39/39 [00:12<00:00, 3.23it/s, loss=2.76e+06, v_num=29, train_loss_step=2.89e+6, val_loss=1.06e+7, train_loss_epoch=2.78e+6]
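One way to sanity-check the discrepancy is to reload the checkpoint saved for the best Optuna trial and evaluate it on the same validation dataloader; a sketch, where the checkpoint path is hypothetical and depends on model_path and the trial layout:

# load the model state saved for the best trial (path is illustrative)
best_tft = TemporalFusionTransformer.load_from_checkpoint("optuna_test/trial_125/epoch=XX.ckpt")

# compute val_loss with the same dataloader used for the manual training run
trainer.validate(best_tft, dataloaders=val_dataloader)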