boun-tabi-LMG / turkish-lm-tuner

Turkish LM Tuner
https://boun-tabi-lmg.github.io/turkish-lm-tuner/

Implement Adam Optimizer #33

Closed · gokceuludogan closed this 6 months ago

gokceuludogan commented 6 months ago

This pull request introduces changes to the optimizer and learning rate scheduler configurations in trainer.py and finetune.py.

The current implementation supports the Adafactor optimizer with an optional scheduler, as well as the default transformers Trainer setup: AdamW with a learning rate of 5e-5 and a linear scheduler.
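For orientation, here is a minimal sketch of how this selection could be wired into the Hugging Face Trainer. The helper name create_optimizer, the exact Adafactor arguments, and the Seq2SeqTrainer wiring are assumptions for illustration, not code taken from trainer.py.

from transformers import Seq2SeqTrainer
from transformers.optimization import Adafactor, AdafactorSchedule

def create_optimizer(model, optimizer_params):
    """Return an (optimizer, scheduler) pair for the Hugging Face Trainer.

    Hypothetical helper; `optimizer_params` mirrors the YAML shown below.
    """
    if optimizer_params["optimizer_type"] == "adafactor":
        optimizer = Adafactor(
            model.parameters(),
            lr=None,                 # let Adafactor derive the step size internally
            relative_step=True,
            scale_parameter=True,
            warmup_init=True,
        )
        # AdafactorSchedule is a proxy that exposes Adafactor's internal
        # learning rate (useful for logging); omitted when scheduler is False.
        scheduler = AdafactorSchedule(optimizer) if optimizer_params["scheduler"] else None
        return optimizer, scheduler
    # Any other value: return (None, None) so the Trainer builds its defaults
    # (AdamW with lr=5e-5 and a linear scheduler).
    return None, None

# trainer = Seq2SeqTrainer(model=model, args=training_args,
#                          optimizers=create_optimizer(model, optimizer_params), ...)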

Usage instructions:

To employ Adafactor with a scheduler, configure as follows:

optimizer_params:
  optimizer_type: adafactor  # Options: adafactor or adam
  scheduler: True

To use Adafactor without a scheduler, configure as follows:

optimizer_params:
  optimizer_type: adafactor  # Options: adafactor or adam
  scheduler: False

To use the default parameters (AdamW with a linear scheduler), set optimizer_type to any value other than adafactor, for example:

optimizer_params:
  optimizer_type: adamw
  scheduler: linear
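
As a usage illustration, the snippet below shows one way such an optimizer_params block could be parsed from a YAML config; the inline config string and variable names are assumptions for the example, not code from finetune.py.

import yaml

# Stand-in for the relevant part of a finetuning config file
config_text = """
optimizer_params:
  optimizer_type: adafactor  # Options: adafactor or adam
  scheduler: True
"""

params = yaml.safe_load(config_text)["optimizer_params"]
use_adafactor = params["optimizer_type"] == "adafactor"
use_scheduler = bool(params["scheduler"])
print(use_adafactor, use_scheduler)  # -> True True: Adafactor with its schedule
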
gokceuludogan commented 6 months ago

The implementation was tested by fine-tuning mBART on paraphrasing datasets. The loss decreased to a value comparable to that of the T5 models and followed a reasonable curve.

Issue #34 has been opened for implementing support for custom optimizers, schedulers, and their parameters.