QwenLM / Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Apache License 2.0

profile.py for measuring inference speed does not run #1037

Closed rabum closed 7 months ago

rabum commented 8 months ago

I am profiling Qwen-72B-Chat-Int4. After the model finishes loading, it gets stuck here: [image]

jklj077 commented 8 months ago

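Please provide your environment details: number of GPUs, GPU model, and the versions of transformers, pytorch, cuda, and autogptq. Also, profile.py should not be able to load a quantized model in the first place, so please provide your reproduction script.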
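
For anyone gathering the details requested above, here is a minimal sketch of a script that prints them. It is not part of the Qwen repo. The helper names `package_version` and `collect_environment` are hypothetical, and it assumes AutoGPTQ was installed from pip under the distribution name `auto-gptq`.

```python
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_environment(packages=("transformers", "torch", "auto-gptq")):
    """Collect package versions plus CUDA and GPU details into a dict."""
    report = {name: package_version(name) for name in packages}
    try:
        import torch
    except ImportError:
        # Without torch we cannot query the CUDA build or the visible GPUs.
        report["cuda"] = "unknown (torch not installed)"
        report["gpus"] = []
    else:
        # torch.version.cuda is None on CPU-only builds.
        report["cuda"] = torch.version.cuda or "none (CPU-only build)"
        if torch.cuda.is_available():
            report["gpus"] = [
                torch.cuda.get_device_name(i)
                for i in range(torch.cuda.device_count())
            ]
        else:
            report["gpus"] = []
    return report


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running the script on the machine used for profiling and pasting its output into the issue should cover most of the list. The length of the `gpus` entry gives the number of visible cards.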

rabum commented 7 months ago

Sorry, it does run after all. When the output looks stuck as in the screenshot, it is actually computing.