The implementation was tested by fine-tuning mBART on paraphrasing datasets. The loss decreased to a value comparable to that of T5 models, showing a reasonable curve.
Issue #34 has been opened for the implementation of custom optimizers, schedulers, and parameters.
This pull request introduces changes to the optimizer and linear scheduler configurations in `trainer.py` and `finetune.py`.
The current implementation supports the Adafactor optimizer with an optional scheduler, as well as the default transformers Trainer optimizer: AdamW with a learning rate of 5e-5 and a linear scheduler.
Usage instructions:
To employ Adafactor with a scheduler, configure as follows:
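The exact configuration keys used by this repository are not shown here, so the snippet below is only a hedged sketch of the equivalent setup with the transformers API: Adafactor in relative-step mode paired with `AdafactorSchedule`, passed to the Trainer through its `optimizers` argument.

```python
from transformers import Trainer
from transformers.optimization import Adafactor, AdafactorSchedule

# Adafactor computes its own time-dependent learning rate;
# lr must be None when relative_step=True.
optimizer = Adafactor(
    model.parameters(),          # `model` assumed to be defined elsewhere
    scale_parameter=True,
    relative_step=True,
    warmup_init=True,
    lr=None,
)
# AdafactorSchedule exposes the internally computed rate so the Trainer can log it.
lr_scheduler = AdafactorSchedule(optimizer)

trainer = Trainer(
    model=model,
    args=training_args,          # assumed to be defined elsewhere
    train_dataset=train_dataset, # assumed to be defined elsewhere
    optimizers=(optimizer, lr_scheduler),
)
```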
To use Adafactor without a scheduler, configure as follows:
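Again a sketch of the underlying transformers calls, not the repository's own config format: without a scheduler, Adafactor is typically run with a fixed learning rate and `relative_step` disabled, and the Trainer receives `None` in place of a scheduler.

```python
from transformers import Trainer
from transformers.optimization import Adafactor

# Fixed learning rate, no internal schedule: relative_step and
# warmup_init must be off when an explicit lr is given.
optimizer = Adafactor(
    model.parameters(),          # `model` assumed to be defined elsewhere
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
    lr=1e-3,                     # example value, not taken from this PR
)

trainer = Trainer(
    model=model,
    args=training_args,          # assumed to be defined elsewhere
    train_dataset=train_dataset, # assumed to be defined elsewhere
    optimizers=(optimizer, None),  # no scheduler
)
```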
For the default parameters (AdamW with a linear scheduler), set the optimizer option to any value other than `adafactor`:
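For reference, a sketch of the default path: when no custom optimizer is passed, the transformers Trainer builds AdamW with the `learning_rate` from `TrainingArguments` (5e-5 by default) and a linear decay schedule.

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=5e-5,          # Trainer default
    lr_scheduler_type="linear",  # Trainer default
)

# With no `optimizers` argument, the Trainer creates AdamW and a
# linear scheduler on its own.
trainer = Trainer(
    model=model,                 # assumed to be defined elsewhere
    args=training_args,
    train_dataset=train_dataset, # assumed to be defined elsewhere
)
```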