Open xuedinge233 opened 2 months ago
Duplicate of #1606 and #6066
Thanks for your reply. We are ready to develop the Ascend adaptation for the vllm-project; once it is complete, we will provide you with a public Ascend prototype.
That's great to hear @xuedinge233! Having hardware instances available for CI would be great for maintaining support
Yeah, we're working on it, and we will upload it as soon as there is progress in the development.
Any update about this?
We are making steady progress and expect to deliver the first version of the code in a couple of weeks.
Is it planned to support NPU+CANN environments where the CPU is the aarch64 architecture?
Looking forward to NPU support on vLLM!
Yes, NPU+MindIE+CANN is planned, and both aarch64 and x86_64 will be supported.
Excuse me, is there any progress?
I mean NPU+vLLM+CANN on an aarch64 CPU architecture, where the NPU device is an Atlas 300I Duo or similar.
@xuedinge233 How long will it take to release the NPU version of vLLM?
The latest progress updates can be found here: https://github.com/vllm-project/vllm/pull/8054
🚀 The feature, motivation and pitch
Background
Currently, the project supports various hardware accelerators such as GPUs, but there is no support for NPUs. Adding NPU support could significantly benefit users who have access to these devices, enabling faster and more efficient computations.
Reference Materials
Ascend is a full-stack AI computing infrastructure for industry applications and services based on Huawei Ascend processors and software. For more information about Ascend, see Ascend Community.
CANN (Compute Architecture of Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI.
PyTorch has officially announced support for Ascend NPU (through the PrivateUse1 key); please see the PrivateUse1 tutorial here.
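As a rough illustration of what "NPU available" could mean on a user's machine, here is a minimal sketch that checks whether the Ascend PyTorch adapter package (`torch_npu`) is installed and whether the host CPU is one of the architectures mentioned in the discussion (aarch64, x86_64). The helper names `pick_device` and `detect_device` and the `SUPPORTED_ARCHS` set are hypothetical and exist only for this example; none of this is vLLM or Ascend API, and it uses only the Python standard library.

```python
import importlib.util
import platform

# Assumption for illustration: the two CPU architectures named in the discussion.
SUPPORTED_ARCHS = {"aarch64", "x86_64"}


def pick_device(torch_npu_installed: bool, arch: str) -> str:
    """Hypothetical helper: return "npu" only when the adapter package is
    present and the CPU architecture is in SUPPORTED_ARCHS, else "cpu"."""
    if torch_npu_installed and arch in SUPPORTED_ARCHS:
        return "npu"
    return "cpu"


def detect_device() -> str:
    """Inspect the current host without importing torch or torch_npu."""
    # find_spec only looks the package up; it does not import or initialize it.
    installed = importlib.util.find_spec("torch_npu") is not None
    # platform.machine() reports e.g. "aarch64" or "x86_64" on Linux;
    # other systems may use different names such as "arm64" or "AMD64".
    return pick_device(installed, platform.machine())


if __name__ == "__main__":
    print(detect_device())
```

This only tells you whether the package can be found; it says nothing about whether a working NPU driver or CANN installation is present.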
Specific Request
We would like to request the addition of support for NPUs within the project. To achieve this goal, we will contribute code and provide feedback. This request may require additional resources and effort on your part, but we hope to get your help if possible.
Alternatives
No response
Additional context
No response