Closed bltcn closed 1 month ago
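The default num_prompts in profile_restful_api.py is 5000. I don't know why your result shows 10.
With four 2080 Ti cards and the qwen2-72b 4-bit model, it's best not to expect too much 😂
This topology is a bit odd. How many PCIe lanes is each card allocated?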
Because I modified the source code. Otherwise, a run with 5000 prompts simply times out.
Version 0.5.3 resolves this issue.
Checklist
Describe the bug
Testing with benchmark/profile_restful_api.py is extremely slow.
Reproduction
cd /opt/lmdeploy/benchmark ; /usr/bin/env /opt/pyroot@11433bdefe94:/opt/lmdeploy/benchmark# cd /opt/lmdeploroot/.vscode-server/extensions/ms-python.debugpy-2024.8.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher 44565 -- /opt/lmdeploy/benchmark/profile_restful_api.py http://127.0.0.1:23333 Qwen/Qwen2-72B-Instruct-AWQ ./ShareGPT_V3_unfiltered_cleaned_split.json
batch,num_prompts,RPS,RPM,FTL(ave)(s),FTL(min)(s),FTL(max)(s),throughput(out tok/s),throughput(total tok/s)
128,10,0.009,0.533,-,-,-,2.287,5.186
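As a rough sanity check on the numbers above, the sketch below parses the reported CSV row and extrapolates how long the default of 5000 prompts would take at the same rate. It assumes that RPM means requests per minute and that the request rate stays constant for a longer run; neither is confirmed by the report. The helper name `minutes_for` is hypothetical and not part of lmdeploy.

```python
import csv
import io

# Benchmark output copied verbatim from the report above.
REPORT = (
    "batch,num_prompts,RPS,RPM,FTL(ave)(s),FTL(min)(s),FTL(max)(s),"
    "throughput(out tok/s),throughput(total tok/s)\n"
    "128,10,0.009,0.533,-,-,-,2.287,5.186\n"
)


def minutes_for(num_prompts: int, rpm: float) -> float:
    """Minutes needed for num_prompts requests at a constant rate.

    Assumption: rpm is requests per minute and does not change with run length.
    """
    return num_prompts / rpm


# Read the single data row into a dict keyed by the header names.
row = next(csv.DictReader(io.StringIO(REPORT)))
rpm = float(row["RPM"])

# Time implied by the reported run (10 prompts), in seconds.
elapsed_s = minutes_for(int(row["num_prompts"]), rpm) * 60

# Extrapolated time for the script's default of 5000 prompts, in hours.
hours_for_default = minutes_for(5000, rpm) / 60

print(f"reported run: about {elapsed_s:.0f} s")
print(f"5000 prompts at the same rate: about {hours_for_default:.0f} h")
```

Under these assumptions the 10-prompt run took roughly 19 minutes, and 5000 prompts would need on the order of 156 hours, which is consistent with the full run timing out.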
Environment
Error traceback
No response