taishan1994 / qlora-chinese-LLM

Fine-tuning Chinese large language models with QLoRA, covering ChatGLM, Chinese-LLaMA-Alpaca, and BELLE

RuntimeError: mat1 and mat2 shapes cannot be multiplied #2

Open dsheng opened 1 year ago

dsheng commented 1 year ago

Trying to train on a 12 GB card:

python qlora.py --model_name="chinese_alpaca" --model_name_or_path="./model_hub/chinese-alpaca-7b" --trust_remote_code=False --dataset="msra" --source_max_len=128 --target_max_len=64 --do_train --save_total_limit=1 --padding_side="right" --per_device_train_batch_size=8 --do_eval --bits=4 --save_steps=10 --gradient_accumulation_steps=1 --learning_rate=1e-5 --output_dir="./output/alpaca/" --lora_r=8 --lora_alpha=32

It fails with:

  File "/mnt/data1ts/llm/training/qlora-chinese-LLM/qlora.py", line 1012, in train()
  File "/mnt/data1ts/llm/training/qlora-chinese-LLM/qlora.py", line 973, in train
    train_result = trainer.train(resume_from_checkpoint=checkpoint_dir)
  ...
    result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
  RuntimeError: mat1 and mat2 shapes cannot be multiplied (1536x4096 and 1x8388608)

What might be causing this? Thanks.

taishan1994 commented 1 year ago

(Quoting the error report above.)

The peft version needs to be 0.4.0.dev0.
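A quick way to confirm which versions the training script actually picks up (a minimal sketch; the install source is an assumption, since 0.4.0.dev0 was only available from the peft GitHub repository at the time, e.g. pip install git+https://github.com/huggingface/peft.git):

  import peft
  import bitsandbytes
  import transformers

  # Print the versions imported at runtime to rule out a stale install
  # in a different environment than the one running qlora.py.
  print("peft:", peft.__version__)                  # expected: 0.4.0.dev0
  print("bitsandbytes:", bitsandbytes.__version__)
  print("transformers:", transformers.__version__)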

dsheng commented 1 year ago

Thanks, that fixed it.

zlh1992 commented 1 year ago

/opt/conda/envs/tch/lib/python3.9/site-packages/peft/tuners/lora.py:619 in forward

    616                 self.unmerge()
    617             result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
    618         elif self.r[self.active_adapter] > 0 and not self.merged:
  ❱ 619             result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
    620
    621             x = x.to(self.lora_A[self.active_adapter].weight.dtype)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1600x4096 and 1x8388608)

My versions are peft 0.4.0.dev0, bitsandbytes 0.39.0, deepspeed 0.9.3, transformers 4.30.0.dev0, and I still hit this peft error. How can I resolve it with this version configuration?
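For context (my own reading of the error, not something stated in the thread): the second operand's size matches a 4096 x 4096 weight that has been 4-bit quantized and stored as a packed byte buffer, which suggests the LoRA layer is calling F.linear on the raw packed tensor instead of routing through the bitsandbytes 4-bit matmul. The arithmetic below only checks that the number in the error message is consistent with that reading.

  # Assumption for illustration: a 4096 x 4096 linear layer quantized to 4 bits.
  full_params = 4096 * 4096        # 16_777_216 weights in the original fp16 layer
  packed_bytes = full_params // 2  # two 4-bit values fit into one uint8 byte
  print(packed_bytes)              # 8388608 -- matches the "1x8388608" operand in the error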