baichuan-inc / Baichuan2

A series of large language models developed by Baichuan Intelligent Technology
https://huggingface.co/baichuan-inc
Apache License 2.0
4.08k stars, 293 forks

What causes the error "RuntimeError: mat1 and mat2 shapes cannot be multiplied (8192x4096 and 1x25165824)" when fine-tuning the model? #218

Open Admiraljj opened 1 year ago

Admiraljj commented 1 year ago

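The model is Baichuan2-7B-Chat-4bits, fine-tuned with LoRA on two 3090 GPUs; without LoRA it runs out of memory (OOM). The training data is the officially provided belle_chat_ramdon_10k.json.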
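The second shape in the error message can be checked with some quick arithmetic. The sketch below assumes a hidden size of 4096 and a fused Q/K/V projection of width 3 x 4096 for Baichuan2-7B, and assumes the 4-bit weights are stored two values per byte; none of these figures are stated in the thread. Under those assumptions, 25165824 is exactly the size of a packed 4-bit projection weight, which would suggest that a still-packed weight reached an ordinary matrix multiply instead of being dequantized first.

```python
# Assumed dimensions for Baichuan2-7B (not given in the thread).
HIDDEN_SIZE = 4096
QKV_OUT = 3 * HIDDEN_SIZE  # fused query/key/value projection width

# Second dimension of mat2 as reported in the error message.
REPORTED_MAT2_COLS = 25165824


def packed_4bit_numel(in_features, out_features):
    """Bytes needed to store a 4-bit weight when two values share one byte."""
    return in_features * out_features // 2


packed = packed_4bit_numel(HIDDEN_SIZE, QKV_OUT)
matches = packed == REPORTED_MAT2_COLS
print(packed, matches)
```

If the two numbers agree, the problem is less likely to be in the training data and more likely to be in how the quantized layers are wrapped or moved between devices.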

rucieryi369 commented 11 months ago

Try:

pip uninstall peft
pip install -U git+https://github.com/huggingface/peft.git

luqy671 commented 10 months ago

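Are you using Jupyter + peft + DataParallel? Running as a .py script with peft + DataParallel works fine for me, but when I moved the code to Jupyter I got the same error as you. After I removed the .cuda() call in the Jupyter notebook, it worked. I'm not sure of the exact cause yet, but I suspect it has to do with how Jupyter invokes CUDA.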
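The workaround above amounts to not calling .cuda() on the quantized model. A minimal sketch of a guard for this is given below. The helper names has_4bit_params and to_gpu_unless_quantized are hypothetical, and the check assumes the quantized weights are instances of a class named Params4bit (as in bitsandbytes); if the checkpoint's custom loading code uses a different class, the check would need adjusting.

```python
def has_4bit_params(model):
    """Return True if any parameter looks like a packed 4-bit tensor.

    Matching on the class name (assumed to be "Params4bit") avoids
    importing bitsandbytes just for this check.
    """
    return any(type(p).__name__ == "Params4bit" for p in model.parameters())


def to_gpu_unless_quantized(model):
    """Hypothetical helper: skip .cuda() for models holding 4-bit weights."""
    if has_4bit_params(model):
        # Assumption: a 4-bit model was already placed on the GPU at load time.
        return model
    return model.cuda()
```

Calling to_gpu_unless_quantized(model) in place of model.cuda() in the notebook would leave a 4-bit model untouched while still moving an unquantized one.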