khayamgondal opened this issue 2 months ago (status: Open)
First we need to implement a CPU backend for ARM. Currently we only have CPU backend implementations for x86 and PowerPC.
Hi @mgoin, I hope you guys will release a version compatible with arm64 machines, such as the NVIDIA AGX Orin Developer Kit. Looking forward to it!
Yeah, especially now that NVIDIA is pushing harder toward ARM machines.
To be clear, you can use vLLM with CUDA on ARM machines; for example, it is easy to get it working on Grace-Hopper. We just don't have a vLLM CPU backend for ARM machines.
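To make the distinction concrete, here is a minimal sketch that classifies a host by architecture according to the support state described in this thread (CPU backend for x86 and PowerPC, none yet for ARM). The function name `cpu_backend_status` and the architecture sets are hypothetical and chosen for illustration; this is not vLLM's actual detection or build logic.

```python
import platform

# Assumption: these sets mirror what this thread says (x86 and PowerPC have a
# CPU backend, ARM does not). They are illustrative, not taken from vLLM.
CPU_BACKEND_ARCHES = {"x86_64", "amd64", "ppc64le", "ppc64"}
ARM_ARCHES = {"aarch64", "arm64"}


def cpu_backend_status(machine: str) -> str:
    """Classify an architecture string (hypothetical helper).

    Returns "supported", "unsupported-arm", or "unknown".
    """
    arch = machine.lower()
    if arch in CPU_BACKEND_ARCHES:
        return "supported"
    if arch in ARM_ARCHES:
        return "unsupported-arm"
    return "unknown"


if __name__ == "__main__":
    # platform.machine() reports the host architecture, e.g. "aarch64" on GH200.
    host = platform.machine()
    print(f"{host}: CPU backend {cpu_backend_status(host)}")
```

On an aarch64 host such as a GH200 this check would report the ARM case, which is the gap this issue asks to close; CUDA availability is a separate question and is not covered by the sketch.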
Yes, I am aware of that and actually use it. Part of my research is CPU-only, so I am interested in a vLLM ARM CPU implementation.
🚀 The feature, motivation and pitch
Please provide a Dockerfile.cpu for aarch64 systems. I have a GH200 and want to run CPU-only inference.
Alternatives
No response
Additional context
No response