Open fengyang95 opened 3 months ago

Your current environment

🐛 Describe the bug

When running inference with 4 L40 cards, the following error occurs intermittently:
I encountered the same problem when running inference with DeepSeekCoder-V2 on 8 A100s with the latest vLLM Docker image.
same problem
Solved by reinstalling Triton from source.
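For anyone who wants to try the same fix, here is a minimal sketch of a from-source install; the repository URL and the `python/` subdirectory layout are assumptions based on Triton's usual setup, so check them against the Triton README first:

```bash
# Remove the wheel-installed Triton first so the source build takes over.
pip uninstall -y triton

# Clone and build Triton from source; setup.py traditionally lives in the
# python/ subdirectory of the repo. A working C++ toolchain and cmake are
# assumed to be available.
git clone https://github.com/triton-lang/triton.git
cd triton
pip install -e python
```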
Maybe try triton==2.2.0; for me, it solved this problem.
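If you want to try that pin inside the vLLM container, something like this should do it (assuming pip manages the environment):

```bash
# Swap whatever Triton version the image shipped with for the pinned one.
pip uninstall -y triton
pip install triton==2.2.0
```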
It works for me.
Also encountered this problem; triton==2.2.0 solves it. However, this seems to introduce a version mismatch with torch; I don't know whether that will cause other problems.
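To see whether a hand-pinned Triton conflicts with what the installed torch wheel declares, a quick pip-level check (it only inspects declared constraints, not runtime behavior):

```bash
# Show the installed versions side by side.
pip show torch triton | grep -E '^(Name|Version|Requires)'

# Ask pip to verify declared dependency constraints; a torch wheel that pins
# a different triton version will be flagged here.
pip check
```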
@jdf-prog #6140 has addressed this issue; you can update your vLLM version to try it out.
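Assuming a pip-managed environment, picking up the fix is just an upgrade to a release that includes that PR:

```bash
# Upgrade vLLM to a version that includes the fix from the PR above.
pip install --upgrade vllm
```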
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!