baichuan-inc / Baichuan2

A series of large language models developed by Baichuan Intelligent Technology
https://huggingface.co/baichuan-inc
Apache License 2.0

baichuan2: LoRA fine-tuned model does not converge #79

Open tumanshu opened 1 year ago

tumanshu commented 1 year ago

With the same data and LoRA fine-tuning via llama-efficient-tunning, Baichuan 1 converges normally, but Baichuan 2's loss does not decrease. The parameter settings are as follows:

[screenshot: parameter settings]
Baichuan 1's loss curve:

[screenshot: Baichuan 1 loss curve]

Baichuan 2's loss curve:

[screenshot: Baichuan 2 loss curve]

Is there some special setting required for Baichuan 2?
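For reference, a minimal LoRA configuration sketch using Hugging Face `peft` (assumptions: the fused query/key/value projection in Baichuan models is named `W_pack`, and the rank/alpha/dropout values below are illustrative defaults, not the exact settings from the screenshot above):

```python
# Sketch of a LoRA config for a Baichuan-style causal LM with peft.
# Assumption: "W_pack" is the fused QKV projection module name in
# Baichuan's modeling code; the numeric hyperparameters are common
# LoRA defaults, not the reporter's actual settings.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # LoRA rank
    lora_alpha=32,              # scaling factor
    lora_dropout=0.05,
    target_modules=["W_pack"],  # assumed fused q/k/v projection name
)
# model = get_peft_model(base_model, lora_config)
```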

ly19970621 commented 1 year ago

I ran into the same problem: the loss hovers around 2.7 and never converges.

jeffchy commented 1 year ago

My eval loss behaves the same way, but the train loss converges (after turning off xformers).

cccgw commented 1 year ago

May I ask how much GPU memory you used for fine-tuning?

GuoxingY commented 1 year ago

> My eval loss behaves the same way, but the train loss converges (after turning off xformers).

How do you turn off xformers? I am also hitting the non-converging loss problem.
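One way to force the non-xformers attention path (a sketch, under the assumption that Baichuan2's modeling code enables xformers memory-efficient attention only when `import xformers` succeeds, which is how optional dependencies are typically gated): block the import before loading the model. Uninstalling xformers from the environment achieves the same effect.

```python
import sys

# Sketch: force `import xformers` to fail so model code that guards
# its memory-efficient attention behind a try/except ImportError
# falls back to the plain attention implementation.
# (Assumption: Baichuan2's modeling_baichuan.py uses such a guard.)
sys.modules["xformers"] = None  # a None entry makes the import raise ImportError

try:
    import xformers  # noqa: F401
    xformers_blocked = False
except ImportError:
    xformers_blocked = True

print(xformers_blocked)  # True: model loading after this sees no xformers
```

This must run before the model (and its remote modeling code) is imported, otherwise the attention path is already chosen.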