shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0
3.24k stars 492 forks

How much can the window length be increased by enabling shift_attn during SFT? #269

Closed sunshineyg2018 closed 10 months ago

sunshineyg2018 commented 10 months ago

Or does the window length with shift_attn enabled depend on one's own corpus?