hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs
Apache License 2.0

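Prediction inference is extremely slow; after the run finishes, GPU utilization drops to 0 and it stays stuck there, apparently while building the generation #4638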

Closed Harryjun closed 2 days ago

Harryjun commented 2 days ago

Reminder

System Info

image

Reproduction

As shown in the image.

Expected behavior

No response

Others

No response

hiyouga commented 2 days ago

Slow inference speed is normal.