nlp-with-transformers / notebooks

Jupyter notebooks for the Natural Language Processing with Transformers book
https://transformersbook.com/
Apache License 2.0

optuna hyperparameter optimization for NER task on knowledge distillation #115

Open Venkatesh3132003 opened 10 months ago

Venkatesh3132003 commented 10 months ago

Information

The problem arises in the knowledge-distillation chapter, applied to the NER task from Chapter 4.

Describe the bug

When training the distilled model directly, I get a reasonable F1 score of 0.755940 (screenshot attached).

However, while searching for the best values of alpha and temperature on the NER task, the F1 score drops to 0.096029, i.e. below 0.1 (screenshot attached).

To Reproduce

Steps to reproduce the behavior:

1. The compute_metrics function is the same as in Chapter 4 (NER).
2. The hyperparameter search space covers alpha and temperature:

def hp_space(trial):
    return {"alpha": trial.suggest_float("alpha", 0, 1),
            "temperature": trial.suggest_int("temperature", 2, 20)}

best_run = distil_roberta_trainer.hyperparameter_search(
    n_trials=12, direction="maximize", backend="optuna", hp_space=hp_space)
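As a sanity check on the search space itself, independent of the Trainer, here is a minimal stdlib stand-in that samples the same ranges the optuna hp_space above would. `run_trial` is a hypothetical placeholder for the real objective (the eval F1 returned by the Trainer), hard-coded here to mimic the reported symptom:

```python
import random

def sample_hp_space(rng):
    # Same ranges as the optuna hp_space above (stand-in sampler);
    # randint(2, 20) is inclusive on both ends, like suggest_int.
    return {"alpha": rng.uniform(0, 1),
            "temperature": rng.randint(2, 20)}

def run_trial(params):
    # Hypothetical placeholder for a full distillation run, hard-coded
    # to mimic the reported behavior (good F1 only near alpha = 1).
    return 0.7559 if params["alpha"] > 0.99 else 0.0960

rng = random.Random(0)
trials = [sample_hp_space(rng) for _ in range(12)]
best = max(trials, key=run_trial)
```

If the real search behaves like this stand-in, almost every sampled alpha will land in the region that scores below 0.1, which matches the numbers reported above.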

Expected behavior

After the hyperparameter search, the F1 score should be at least as high as the directly trained baseline (0.755940), not collapse below 0.1.

Venkatesh3132003 commented 10 months ago

When alpha is 1 the F1 score is good, but for any value of alpha strictly between 0 and 1 the F1 score is less than 0.1. Since alpha = 1 weights the hard cross-entropy loss exclusively, this would suggest the problem lies in the soft (teacher) part of the distillation loss.
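A quick way to see why alpha = 1 behaves differently is to write the distillation loss out. The sketch below (pure Python, single token, mirroring the book's alpha * CE + (1 - alpha) * T² * KL recipe) shows that at alpha = 1 the teacher term vanishes entirely, so a bug in the soft-loss path, e.g. mismatched logit shapes or padded NER tokens, would only surface when alpha < 1:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over a list of logits
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, true_idx, alpha, T):
    # Hard cross-entropy against the gold label
    ce = -math.log(softmax(student_logits)[true_idx])
    # Soft KL divergence KL(teacher || student) at temperature T,
    # scaled by T^2 to keep gradient magnitudes comparable
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    # alpha = 1 -> pure CE; the teacher logits are ignored entirely
    return alpha * ce + (1.0 - alpha) * (T ** 2) * kl
```

With alpha = 1 the loss is identical no matter what the teacher outputs, which is consistent with the observation that only alpha = 1 trains well here.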