yihong1120 opened 9 months ago
Torch has officially supported Metal for a while now. Would adding support in vLLM be as simple as changing device="cuda" to "mps" on Macs? Are there any other dependencies on CUDA?
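For illustration, the device switch the question describes can be sketched in plain PyTorch. This is a minimal, hypothetical helper, not vLLM code: it falls back from CUDA to Apple's Metal (MPS) backend and then to CPU. The `getattr` guard is there because `torch.backends.mps` only exists in PyTorch 1.12+.

```python
import torch

def pick_device() -> torch.device:
    """Illustrative device selection: CUDA, then MPS (Metal), then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps is only present on PyTorch >= 1.12
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
x = torch.ones(2, 2, device=device)
print(x.device.type)
```

Whether this is sufficient for vLLM is exactly the open question: vLLM also ships custom CUDA kernels (paged attention, quantization, etc.) that have no Metal equivalents, so a one-line device swap alone would not be enough.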
Anyone? I'd be happy to rewrite the implementation without the hardcoded device name - just don't want to spend hours down a dead-end.
I'd like to see this work as well, lots of Metal out there
same here please
+1
Wish it could be implemented🥺
My offer still stands if someone on the project can answer the above questions.
@pathorn says they have an implementation that runs on M3 chips in https://github.com/vllm-project/vllm/issues/176#issuecomment-2023827553
Do you think it could be adapted to the new CPU backend that was added in #3634?
FYI for anyone who wants to see that PR: https://github.com/vllm-project/vllm/pull/2244#issuecomment-1868419884. @pathorn did some tremendous work on the PR. However, llama.cpp still performs faster - by a mile. This may not be a fruitful endeavour after all.
Dear vLLM Maintainers,
I hope this message finds you well. I am reaching out to inquire about the potential for integrating Mac Metal API support within the vLLM framework. As an avid user and advocate for vLLM's capabilities, I have been thoroughly impressed with its performance and flexibility across various platforms and hardware configurations.
Given the increasing prevalence of Mac devices in the machine learning community and the performance benefits offered by Apple's Metal API for GPU-accelerated computing, I am curious to know if there are any plans to extend vLLM's compatibility to include Metal support. This would undoubtedly be a significant boon for researchers and developers working on Mac environments who wish to leverage vLLM's impressive suite of features.
Could you please shed some light on whether Metal support is on the project's roadmap, and what the main technical obstacles would be?
I understand that integrating a new backend such as Metal may present a variety of challenges, but I believe the potential benefits to the user community could be substantial. I am keen to offer my assistance, whether it be through testing, development, or documentation, to help bring this capability to fruition.
Thank you for your time and consideration. I eagerly await your response and am excited about the prospect of further enhancing vLLM's accessibility and performance on Mac platforms.
Best regards, yihong1120