modelscope / ms-swift

Use PEFT or Full-parameter to finetune 300+ LLMs or 80+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0
3.4k stars 292 forks

lr_scheduler_type #1807

Open tbwang-clound opened 3 weeks ago

tbwang-clound commented 3 weeks ago

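Describe the feature: MiniCPM proposed the WSD (Warmup-Stable-Decay) learning rate schedule, which works better than cosine. I suggest adding it. The framework is so tightly coupled that adding it myself is a lot of work.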
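For readers unfamiliar with WSD, here is a minimal sketch of the shape of such a schedule, written as a multiplier on the base learning rate. It is not ms-swift code and not MiniCPM's exact formula: the function name `wsd_lr_factor`, its parameters, and the linear decay in the final phase are all assumptions made for illustration.

```python
def wsd_lr_factor(step, num_warmup_steps, num_stable_steps, num_decay_steps,
                  min_lr_ratio=0.0):
    """Return a multiplier for the base learning rate at `step`.

    Hypothetical helper: three phases (warmup, stable, decay).
    The linear decay used here is an assumption, not MiniCPM's formula.
    """
    # Phase 1: linear warmup from 0 to 1.
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)

    # Phase 2: hold the learning rate at its peak value.
    if step < num_warmup_steps + num_stable_steps:
        return 1.0

    # Phase 3: decay linearly from 1 down to min_lr_ratio, then stay there.
    decay_step = step - num_warmup_steps - num_stable_steps
    if decay_step >= num_decay_steps:
        return min_lr_ratio
    progress = decay_step / max(1, num_decay_steps)
    return 1.0 - (1.0 - min_lr_ratio) * progress
```

A function of this form could be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR` (for example via `functools.partial` to fix the step counts), which is one way to try the schedule without changing the framework.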

Jintao-Huang commented 3 weeks ago

If the transformers Trainer supports it, then ms-swift supports it too.

tastelikefeet commented 2 weeks ago

Customizing lr_scheduler involves refactoring the framework and making it pluggable. We will address this as soon as possible.