vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0
30.45k stars 4.61k forks

[Feature]: Any thoughts about MI50 support ? #6565

Open linchen111 opened 4 months ago

linchen111 commented 4 months ago

🚀 The feature, motivation and pitch

The MI50 is comparable to a 2080 Ti, but much cheaper (about 1/4 of the price), and it has 16 GB of memory.

But when I tried to compile it on an MI50 machine, I got this:

[ 83%] Building HIP object CMakeFiles/_custom_C.dir/csrc/custom/paged_attention/attention_ll4mi.hip.o
/root/vllm/build/temp.linux-x86_64-cpython-310/csrc/custom/custom_kernels.hip:404:17: error: instruction not supported on this GPU
  404 | asm("v_dot2c_f32_f16 %0, %2, %3"
      | ^
<inline asm>:1:2: note: instantiated into assembly here
    1 | v_dot2c_f32_f16 v32, v4, v12
      | ^
/root/vllm/build/temp.linux-x86_64-cpython-310/csrc/custom/custom_kernels.hip:412:17: error: instruction not supported on this GPU
  412 | asm("v_dot2c_f32_f16 %0, %2, %3"
      | ^
<inline asm>:1:2: note: instantiated into assembly here
    1 | v_dot2c_f32_f16 v31, v4, v20
      | ^

Alternatives

No response

Additional context

No response

mawong-amd commented 4 months ago

Based on the log here, @linchen111 is building the ROCm fork of vLLM (https://github.com/ROCm/vllm).

It might be worth building the container here while specifying --build-arg "PYTORCH_ROCM_ARCH=gfx906". Note that MI50 support has been deprecated since ROCm 6.0, so while vLLM might work for now, future releases of ROCm (and hence vLLM) might not support MI50.
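
For reference, a rough sketch of what that build invocation could look like. This is an untested illustration, not an official recipe: it assumes you run it from the root of a vLLM checkout that ships a Dockerfile.rocm, and the image tag vllm-rocm-gfx906 is a made-up name. Only the PYTORCH_ROCM_ARCH=gfx906 build argument comes from the suggestion above.

```shell
#!/bin/sh
# Sketch only: build a ROCm vLLM image targeting the MI50 (gfx906).
# Assumptions (not from the thread): Dockerfile.rocm sits in the current
# directory, and the image tag below is a hypothetical name.
set -eu

ARCH="${PYTORCH_ROCM_ARCH:-gfx906}"        # MI50 architecture target
IMAGE_TAG="${IMAGE_TAG:-vllm-rocm-gfx906}" # hypothetical image name

if [ "${RUN:-0}" = "1" ]; then
  # Actually run the build.
  docker build \
    --build-arg "PYTORCH_ROCM_ARCH=${ARCH}" \
    -f Dockerfile.rocm \
    -t "${IMAGE_TAG}" \
    .
else
  # Dry run (default): only print the command that would be executed.
  echo "docker build --build-arg PYTORCH_ROCM_ARCH=${ARCH} -f Dockerfile.rocm -t ${IMAGE_TAG} ."
fi
```

By default the script only prints the command; set RUN=1 to execute it. Restricting the build to a single architecture does not by itself guarantee that every custom kernel compiles for that architecture.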

linchen111 commented 4 months ago

Based on the log here, @linchen111 is building the ROCm fork of vLLM (https://github.com/ROCm/vllm).

It might be worth building the container here while specifying --build-arg "PYTORCH_ROCM_ARCH=gfx906". Note that MI50 support has been deprecated since ROCm 6.0, so while vLLM might work for now, future releases of ROCm (and hence vLLM) might not support MI50.

Thanks! I'm trying its Docker build now.

linchen111 commented 4 months ago

Failed with this again:

[ 83%] Building HIP object CMakeFiles/_custom_C.dir/csrc/custom/paged_attention/attention_ll4mi.hip.o
/root/vllm/build/temp.linux-x86_64-cpython-310/csrc/custom/custom_kernels.hip:404:17: error: instruction not supported on this GPU
  404 | asm("v_dot2c_f32_f16 %0, %2, %3"
      | ^
<inline asm>:1:2: note: instantiated into assembly here
    1 | v_dot2c_f32_f16 v32, v4, v12
      | ^
/root/vllm/build/temp.linux-x86_64-cpython-310/csrc/custom/custom_kernels.hip:412:17: error: instruction not supported on this GPU
  412 | asm("v_dot2c_f32_f16 %0, %2, %3"
      | ^
<inline asm>:1:2: note: instantiated into assembly here
    1 | v_dot2c_f32_f16 v31, v4, v20
      | ^

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!