allenhaozi opened 9 months ago
Hi,
Did you solve this problem? I am encountering the same problem too.
This issue hasn't been resolved; I still encounter it occasionally.
I think there's an issue with the CuPy backend that's used for tensor parallelism.
If you use enforce_eager=True, it will likely be resolved (though it will affect performance). As for the error itself, I think https://github.com/cupy/cupy is probably a better place to report it.
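For reference, a minimal sketch of the suggested workaround when launching the vLLM OpenAI-compatible server with tensor parallelism. The model name and parallel size below are placeholders; the key part is the `--enforce-eager` flag, which disables CUDA graph capture:

```shell
# Hypothetical launch command; <your-model> and the TP size are placeholders.
# --enforce-eager disables CUDA graphs, which avoids the CuPy-backed code path.
python -m vllm.entrypoints.openai.api_server \
    --model <your-model> \
    --tensor-parallel-size 2 \
    --enforce-eager
```

With the offline Python API the equivalent would be passing `enforce_eager=True` to the `LLM(...)` constructor. Expect somewhat lower throughput, since eager mode skips CUDA graph optimizations.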
Hi,
Thanks for your suggestion. I will report it to CuPy. However, since vLLM must have run successfully in a multi-node environment before release, while I fail to deploy every time, I think there must be something related to my environment that hinders the deployment.
I think the CuPy backend was introduced recently for CUDA graph support (which is disabled by enforce_eager=True). My guess is that this backend does not work well in some environments, but it is pretty difficult to troubleshoot without reproducing the issue. If you can tell me your instance details, I can try to reproduce it.
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!