
Turkish LM Tuner
https://boun-tabi-lmg.github.io/turkish-lm-tuner/
MIT License

Finetuning: Adaptive training strategy #6

Closed: onurgu closed this issue 10 months ago

onurgu commented 10 months ago

In the initial phase of fine-tuning, we've standardized learning-rate adjustment by using the Adafactor scheduler for all tasks. This strategy avoids having to test various learning rates and schedulers at this stage. The scheduler's behavior is controlled by the `adafactor_scheduler` parameter in the configuration file (`default.yaml`).
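As a minimal sketch (not the repository's actual code), such a flag could toggle between Adafactor's built-in relative-step schedule and a fixed learning rate, using the `Adafactor` and `AdafactorSchedule` classes from Hugging Face `transformers`; the helper name and the `lr` default below are illustrative assumptions:

```python
from transformers.optimization import Adafactor, AdafactorSchedule

def build_optimizer(model, adafactor_scheduler: bool = True, lr: float = 1e-3):
    """Hypothetical helper: wire up Adafactor with or without its
    relative-step schedule, depending on an `adafactor_scheduler` flag."""
    if adafactor_scheduler:
        # Let Adafactor derive the learning rate internally (relative_step)
        # and expose it via AdafactorSchedule for logging / Trainer use.
        optimizer = Adafactor(
            model.parameters(),
            lr=None,
            scale_parameter=True,
            relative_step=True,
            warmup_init=True,
        )
        scheduler = AdafactorSchedule(optimizer)
    else:
        # Fixed learning rate: the relative-step schedule must be disabled.
        optimizer = Adafactor(
            model.parameters(),
            lr=lr,
            scale_parameter=False,
            relative_step=False,
            warmup_init=False,
        )
        scheduler = None
    return optimizer, scheduler
```

With `relative_step=True`, Adafactor computes its own per-step learning rate, which is why no explicit `lr` is passed in that branch and why `AdafactorSchedule` serves only as a thin wrapper that reports the internally derived rate.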

Moreover, fine-tuning will run for 10 epochs with an early stopping mechanism. We'll monitor the model's performance on the validation set during training and save the checkpoint with the best validation performance.
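For illustration, this setup maps naturally onto Hugging Face's `Trainer` with an `EarlyStoppingCallback`; here is a minimal sketch, assuming `model`, `train_dataset`, and `eval_dataset` are already loaded. The patience value, output directory, and metric choice are assumptions, not the repository's actual settings:

```python
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

# Illustrative arguments matching the described setup (values are assumptions).
training_args = TrainingArguments(
    output_dir="outputs",
    num_train_epochs=10,              # fine-tune for 10 epochs
    evaluation_strategy="epoch",      # evaluate on the validation set each epoch
    save_strategy="epoch",            # checkpoint each epoch
    load_best_model_at_end=True,      # restore the best validation checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,                      # assumed: a loaded model
    args=training_args,
    train_dataset=train_dataset,      # assumed: preprocessed datasets
    eval_dataset=eval_dataset,
    # Stop if validation performance fails to improve for 3 evaluations.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)

trainer.train()
```

Note that `EarlyStoppingCallback` requires `load_best_model_at_end=True` and a `metric_for_best_model`, which also guarantees the best validation checkpoint is the one kept at the end of training.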

gokceuludogan commented 10 months ago

The implementation was introduced in #2 and fixed in #10.

gokceuludogan commented 10 months ago

By default, the number of training epochs is set to 10; see https://github.com/boun-llm/t5-tuner/commit/9b693b0ec8fabc9c3f6aa8d05035324d647bbe2b.