I ran the optimize_hyperparameters function to find optimal hyperparameters for a TFT network on a multiclass classification task with the CrossEntropy() loss function:
```python
from pytorch_forecasting import TemporalFusionTransformer
from pytorch_forecasting.metrics import CrossEntropy

tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=0.03,
    lstm_layers=2,
    hidden_size=64,
    attention_head_size=4,
    dropout=0.15,
    hidden_continuous_size=48,
    output_size=7,
    loss=CrossEntropy(),  # <-- the loss I want the tuner to use
    log_interval=10,  # log every 10 batches
    reduce_on_plateau_patience=4,
)
```
```python
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import (
    optimize_hyperparameters,
)

study = optimize_hyperparameters(
    train_dataloader,
    val_dataloader,
    model_path="optuna_test",
    n_trials=200,
    max_epochs=20,
    gradient_clip_val_range=(0.01, 1.0),
    hidden_size_range=(16, 96),
    hidden_continuous_size_range=(16, 64),
    attention_head_size_range=(1, 4),
    learning_rate_range=(0.001, 0.1),
    dropout_range=(0.1, 0.3),
    trainer_kwargs=dict(limit_train_batches=20),
    reduce_on_plateau_patience=4,
    use_learning_rate_finder=False,  # let Optuna tune the learning rate instead of the built-in finder
)
```
Actual behavior
However, optimize_hyperparameters() ran its trials with QuantileLoss() instead of the CrossEntropy() I expected; I can see this in TensorBoard.
Any idea how to pass a different loss function?
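
Note that the `tft` built above is never handed to optimize_hyperparameters(), so the tuner constructs its own models per trial and never sees the loss set there. A possible workaround, assuming the installed pytorch-forecasting version forwards extra keyword arguments from optimize_hyperparameters() through to TemporalFusionTransformer.from_dataset() (I have not verified this for my version), would be to pass the loss to the tuning call directly:

```python
from pytorch_forecasting.metrics import CrossEntropy
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import (
    optimize_hyperparameters,
)

# Assumption: extra keyword arguments are forwarded to
# TemporalFusionTransformer.from_dataset() for every trial, overriding the
# QuantileLoss() default. This may fail if the tuner hardcodes its own loss.
study = optimize_hyperparameters(
    train_dataloader,
    val_dataloader,
    model_path="optuna_test",
    n_trials=200,
    max_epochs=20,
    loss=CrossEntropy(),
)
```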
Expected behavior
optimize_hyperparameters() trains its trials with the CrossEntropy() loss I specified, and TensorBoard shows CrossEntropy() as the loss.