hiyouga / LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
https://arxiv.org/abs/2403.13372
Apache License 2.0

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' during CLI inference on Huawei 910 #4622

Open apachemycat opened 3 months ago

apachemycat commented 3 months ago

Reminder

System Info

llamafactory-cli env

Reproduction

```
File "/usr/local/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 326, in forward
  query_states = self.q_proj(hidden_states)
File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
  return self._call_impl(*args, **kwargs)
File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
  return forward_call(*args, **kwargs)
File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
  return F.linear(input, self.weight, self.bias)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
```

Expected behavior

Inference completes normally.

Others

Assistant:

```
Exception in thread Thread-9:
Traceback (most recent call last):
  File "/usr/lib64/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/lib64/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib64/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/generation/utils.py", line 1914, in generate
    result = self._sample(
  File "/usr/local/lib/python3.9/site-packages/transformers/generation/utils.py", line 2651, in _sample
    outputs = self(
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
```

MengqingCao commented 3 months ago

Did you specify a device via ASCEND_RT_VISIBLE_DEVICES? From the traceback, the model appears to be running on the CPU, and the CPU does not seem to support fp16.

See https://github.com/THUDM/ChatGLM3/issues/177 for reference.