deepseek-ai / DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself
https://coder.deepseek.com/
MIT License

Align Scheduler Configuration with Finetuning Script #143

Open · richardodliu opened 3 months ago

richardodliu commented 3 months ago

Summary

This PR updates the scheduler configuration in finetune/configs/ds_config_zero3.json to be consistent with the sample shell script provided for finetuning. The alignment ensures that users who follow the script get the learning-rate schedule it specifies, rather than being surprised by a different schedule taking effect.

Changes
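
The exact diff is attached to this PR; as an illustration, the change amounts to keeping DeepSpeed from pinning its own scheduler so that the schedule requested by the shell script takes effect. A minimal sketch in Python, assuming (my assumption, not the authoritative diff) that the fix is to drop the DeepSpeed-level "scheduler" block rather than, say, switching it to a cosine variant such as WarmupCosineLR:

```python
import json

# Sketch only: the authoritative change is the diff attached to this PR.
# The path comes from this repo; removing the scheduler block (instead of
# replacing it) is an assumption.
CONFIG_PATH = "finetune/configs/ds_config_zero3.json"

with open(CONFIG_PATH) as f:
    ds_config = json.load(f)

# Drop the DeepSpeed-level scheduler so learning-rate scheduling is left
# to the HF Trainer, which the sample shell script configures with
# --lr_scheduler_type cosine.
ds_config.pop("scheduler", None)

with open(CONFIG_PATH, "w") as f:
    json.dump(ds_config, f, indent=4)
```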

Rationale

The previous version of finetune/configs/ds_config_zero3.json set "scheduler" to "WarmupLR", which conflicts with the parameters in the sample shell script: when the DeepSpeed config defines its own scheduler, the "cosine" lr_scheduler_type passed on the command line is silently overridden during fine-tuning. This contradicts the intent of shipping the configuration file alongside the script, since it misleads users about how the learning rate is actually updated. I therefore suggest updating the configuration file as proposed, so the learning-rate scheduling stays consistent.
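
To make the conflict concrete, here is an illustrative reconstruction; the scheduler type comes from the old config and the cosine flag from the sample script, while the parameter values are assumed:

```python
# What the sample shell script asks the HF Trainer to use:
trainer_args = {"lr_scheduler_type": "cosine"}

# What finetune/configs/ds_config_zero3.json previously pinned
# (parameter values assumed, not copied from the repo):
ds_config_excerpt = {
    "scheduler": {
        "type": "WarmupLR",
        "params": {
            "warmup_min_lr": "auto",
            "warmup_max_lr": "auto",
            "warmup_num_steps": "auto",
        },
    }
}

# When the DeepSpeed config defines a scheduler, the Trainer defers
# learning-rate scheduling to DeepSpeed, so WarmupLR wins and the
# requested cosine decay never takes effect.
```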

Testing
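
This is a configuration-only change, so no unit test is attached. A quick manual check is to confirm the config no longer defines its own scheduler and, on a short run, that the logged learning rate follows a cosine curve. A minimal sketch of the first check, again assuming the fix removes the scheduler block:

```python
import json

with open("finetune/configs/ds_config_zero3.json") as f:
    ds_config = json.load(f)

# If a "scheduler" key were still present, DeepSpeed would shadow the
# --lr_scheduler_type cosine flag from the sample script.
assert "scheduler" not in ds_config, "DeepSpeed scheduler would override cosine"
print("OK: LR scheduling is left to the HF Trainer (cosine).")
```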

Notes

Please review the changes and provide any feedback or approval at your earliest convenience. Thank you for considering this update to improve the finetuning process.