hiyouga / LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
https://arxiv.org/abs/2403.13372
Apache License 2.0

[Help] How to run model inference on a Huawei Ascend 910B environment #2848

Closed smilelight closed 6 months ago

smilelight commented 6 months ago

Reminder

Reproduction

After injecting the NPU replacement into the code, it still throws an error — it seems to be a problem with Qwen's code? Code changes:

import torch
import torch_npu

# Automatically map CUDA API calls to their NPU equivalents
from torch_npu.contrib import transfer_to_npu

from llmtuner import create_web_demo

def main():
    demo = create_web_demo()
    demo.queue()
    demo.launch(server_name="0.0.0.0", server_port=7860, share=False, inbrowser=True)

if __name__ == "__main__":
    main()
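As an aside, a launcher like the one above can be made portable between CUDA and Ascend hosts by gating the `torch_npu` import on availability. The helper below is a minimal sketch; the function name `select_device` is ours, not part of LLaMA-Factory or torch_npu:

```python
import importlib.util


def select_device() -> str:
    """Return "npu" when the Ascend torch_npu package is importable,
    otherwise fall back to "cpu".

    Hypothetical helper, shown only to illustrate guarding the NPU
    setup so the same script also runs on machines without torch_npu.
    """
    if importlib.util.find_spec("torch_npu") is not None:
        # Importing torch_npu registers the NPU backend with PyTorch;
        # transfer_to_npu can then remap CUDA calls as in the snippet above.
        return "npu"
    return "cpu"


print(select_device())
```

On an Ascend host this prints `npu`; elsewhere it prints `cpu`, so the launcher can skip the `transfer_to_npu` import cleanly.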

Error: [screenshot of the traceback; image not included]

Expected behavior

No response

System Info

No response

Others

No response

hiyouga commented 6 months ago

Yes — we recommend switching to Qwen 1.5.