In the initial phase of fine-tuning, we've standardized learning-rate adjustment by using the AdaFactor scheduler for all tasks. This lets us avoid sweeping over learning rates and schedulers at this stage. The scheduler's implementation and configuration are controlled by the `adafactor_scheduler` parameter in the configuration file (see the AdaFactor Scheduler entry in `default.yaml`).
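As a minimal sketch of what this pairing looks like in code, assuming a Hugging Face `transformers` setup: the model name and the optimizer settings below are illustrative assumptions, not the values from `default.yaml`.

```python
# Sketch: Adafactor with its built-in time-dependent schedule.
# The model checkpoint and optimizer settings are assumptions for
# illustration, not the project's actual configuration.
from transformers import AutoModelForSeq2SeqLM
from transformers.optimization import Adafactor, AdafactorSchedule

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # placeholder model

optimizer = Adafactor(
    model.parameters(),
    lr=None,               # let Adafactor derive the step size internally
    scale_parameter=True,
    relative_step=True,
    warmup_init=True,
)
# AdafactorSchedule exposes the optimizer's internal learning rate, so
# training loops and loggers that expect a scheduler object keep working.
lr_scheduler = AdafactorSchedule(optimizer)
```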
Moreover, fine-tuning spans 10 epochs with an early stopping mechanism: we monitor the model's performance on the validation set during training, and the checkpoint with the best validation performance is saved, as sketched below.
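A hedged sketch of this training setup using the Hugging Face `Trainer`; the argument values, dataset names, and patience are illustrative assumptions rather than the project's actual settings.

```python
# Sketch: 10-epoch run with per-epoch validation, early stopping, and
# best-checkpoint restoration. Dataset variables are assumed placeholders.
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="checkpoints",
    num_train_epochs=10,
    evaluation_strategy="epoch",      # evaluate on the validation set each epoch
    save_strategy="epoch",            # must match the evaluation strategy
    load_best_model_at_end=True,      # restore the best validation checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,           # assumed to be defined elsewhere
    eval_dataset=val_dataset,              # assumed to be defined elsewhere
    optimizers=(optimizer, lr_scheduler),  # the Adafactor pair from above
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```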