In the initial phase of fine-tuning, we've standardized learning-rate adjustment by using the AdaFactor scheduler for all tasks. This lets us avoid sweeping over learning rates and schedulers at this stage. The scheduler's implementation and configuration are controlled by the `adafactor_scheduler` parameter in the configuration file (see the AdaFactor Scheduler entry in `default.yaml`).
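As a minimal sketch of what this pairing looks like in code, assuming a Hugging Face `transformers` setup: the model name and the optimizer settings below are illustrative assumptions, not the values from `default.yaml`.

```python
# Sketch: Adafactor with its built-in time-dependent schedule.
# The model checkpoint and optimizer settings are assumptions for
# illustration, not the project's actual configuration.
from transformers import AutoModelForSeq2SeqLM
from transformers.optimization import Adafactor, AdafactorSchedule

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # placeholder model

optimizer = Adafactor(
    model.parameters(),
    lr=None,               # let Adafactor derive the step size internally
    scale_parameter=True,
    relative_step=True,
    warmup_init=True,
)
# AdafactorSchedule exposes the optimizer's internal learning rate, so
# training loops and loggers that expect a scheduler object keep working.
lr_scheduler = AdafactorSchedule(optimizer)
```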
Moreover, fine-tuning spans 10 epochs with an early stopping mechanism: we monitor the model's performance on the validation set during training, and the checkpoint with the best validation performance is saved, as sketched below.
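A hedged sketch of this training setup using the Hugging Face `Trainer`; the argument values, dataset names, and patience are illustrative assumptions rather than the project's actual settings.

```python
# Sketch: 10-epoch run with per-epoch validation, early stopping, and
# best-checkpoint restoration. Dataset variables are assumed placeholders.
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="checkpoints",
    num_train_epochs=10,
    evaluation_strategy="epoch",      # evaluate on the validation set each epoch
    save_strategy="epoch",            # must match the evaluation strategy
    load_best_model_at_end=True,      # restore the best validation checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,           # assumed to be defined elsewhere
    eval_dataset=val_dataset,              # assumed to be defined elsewhere
    optimizers=(optimizer, lr_scheduler),  # the Adafactor pair from above
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```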