chunniunai220ml opened 1 week ago
cc @hongxiayang
Hi, I followed your instructions to build a Docker image (`docker build -f Dockerfile.rocm -t vllm-rocm .`), which succeeded. But I got errors when I used the vLLM backend, so I tried to modify some vLLM code, which means I have to reinstall it in editable mode, but that failed. When I tried to install from source:

git clone https://github.com/vllm-project/vllm.git
cd vllm
I would like to know the exact steps for your issue. I am not sure how you installed from source; was it inside the container or outside the container?
The error in your screenshot is "Permission denied". Is there any permission issue?
I think it is not only caused by the permission denial; please see the compile error above.
And I checked:
The same issue occurs when I do `pip install .`
Sorry, can you list step by step what you did so that I can reproduce it? I cannot help if I don't know how to reproduce the problem.
I have to pip install vLLM from source in a ROCm 6.1 Docker image, following the installation guide:

cd vllm
pip install -U -r requirements-rocm.txt
pip install -e .

Then I got the error shown in the screenshot I reported.
Here is my ROCm info:
Package: rocm-libs
Version: 6.1.2.60102-119~20.04
Priority: optional
Section: devel
Maintainer: ROCm Dev Support rocm-dev.support@amd.com
Installed-Size: 13.3 kB
And I have added

export ROCM_HOME=/opt/rocm
export PATH="${ROCM_HOME}/bin:$PATH"

to ~/.bashrc. Those are all the details.
What is your ROCm 6.1 image name?
The ROCm 6.0 Docker image (for the first question). I followed your instructions at https://docs.vllm.ai/en/stable/getting_started/amd-installation.html to build the image and install from source:

docker build -f Dockerfile.rocm -t vllm-rocm .
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-rocm.txt
pip install -e .
The ROCm 6.1 one is a private image, so I cannot share its Docker source, but both of them hit the same error: no ROCm runtime found in /opt/rocm.
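For reference, a quick way to sanity-check that the ROCm runtime is actually visible from inside the container (a rough sketch, assuming the default /opt/rocm layout):

```bash
# Check that the ROCm install is where the build expects it
ls /opt/rocm                  # should show bin/, lib/, include/, ...
ls /opt/rocm/lib | head       # runtime libraries such as libamdhip64.so

# Confirm the HIP tooling and runtime are reachable
which hipcc && hipcc --version
rocminfo | head               # should list the AMD GPU agents

# Confirm the installed PyTorch is a ROCm build (torch.version.hip is None on CUDA builds)
python -c "import torch; print(torch.__version__, torch.version.hip)"
```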
There is a comment here saying `pip install` does not work for AMD currently when building vLLM. Have you tried to use `python setup.py install` or `python setup.py develop`?
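For example, a rough sketch of that workflow, run from the cloned source tree (or /vllm-workspace inside the container):

```bash
cd vllm                                   # the cloned source tree, or /vllm-workspace in the container
pip install -U -r requirements-rocm.txt   # ROCm dependencies, as in the install guide
python setup.py install                   # build and install into site-packages
# or, to pick up local code edits without reinstalling each time:
python setup.py develop
```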
Yes, `pip install` doesn't work for some new models; see this issue, https://github.com/vllm-project/vllm/issues/5454, where the author advised installing with pip from source.
And I found that if I `pip install vllm==0.5.0`, it seems to target NVIDIA by default. See the information below: `pip install vllm` uninstalls my original torch 2.3.1+rocm6.0 (installed with pip from the official PyTorch index) and reinstalls torch 2.3.0 (the NVIDIA build).
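In case it helps, one way to put the ROCm torch build back after pip swaps it out (a rough sketch, assuming the standard PyTorch ROCm 6.0 wheel index):

```bash
# Remove the CUDA build that pip pulled in and reinstall the ROCm wheels
pip uninstall -y torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0

# Verify torch is a ROCm build again (torch.version.hip should not be None)
python -c "import torch; print(torch.__version__, torch.version.hip)"
```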
I just want to know how I can build vLLM from source to make some code changes. I can confirm that /opt/rocm/ exists.
Please run `python setup.py install` instead of `pip install .` to build from source.
Say you are already inside the vllm-rocm container (which you mentioned earlier) and you make some changes in /vllm-workspace; you can run `python setup.py install`, which will build vLLM from source. Alternatively, for easy experiments, you can change the Python code directly in the vLLM installation directory: find where it is installed using a command like `pip show vllm`, and then you don't need to run `python setup.py install` at all.
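For example, a rough sketch of that quick-experiment workflow (the module edited at the end is just an illustration):

```bash
# Find where the installed package lives (the printed path will vary per environment)
pip show vllm | grep Location

# Capture that site-packages directory and edit a vLLM module in place;
# pure-Python edits take effect the next time vllm is imported
VLLM_DIR=$(pip show vllm | awk '/^Location:/{print $2}')/vllm
${EDITOR:-vi} "$VLLM_DIR/engine/llm_engine.py"
```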
Your current environment
Versions of relevant libraries:
[pip3] numpy==1.26.3
[pip3] pytorch-triton-rocm==2.3.0
[pip3] sentence-transformers==3.0.1
[pip3] torch==2.3.0+rocm6.0
[pip3] torchaudio==2.3.0+rocm6.0
[pip3] torchvision==0.18.0+rocm6.0
[pip3] transformers==4.40.0
[conda] numpy 1.26.3 pypi_0 pypi
[conda] pytorch-triton-rocm 2.3.0 pypi_0 pypi
[conda] sentence-transformers 3.0.1 pypi_0 pypi
[conda] torch 2.3.0+rocm6.0 pypi_0 pypi
[conda] torchaudio 2.3.0+rocm6.0 pypi_0 pypi
[conda] torchvision 0.18.0+rocm6.0 pypi_0 pypi
[conda] transformers 4.40.0 pypi_0 pypi
ROCM Version: 6.0.32830-d62f6a171
Neuron SDK Version: N/A
vLLM Version: 0.5.0.post1
vLLM Build Flags: CUDA Archs: Not Set; ROCm: Enabled; Neuron: Disabled
GPU Topology: Could not collect
How you are installing vllm
Hi, I followed your instructions to build a Docker image (`docker build -f Dockerfile.rocm -t vllm-rocm .`), which succeeded. But I got errors when I used the vLLM backend, so I tried to modify some vLLM code, which means I have to reinstall it in editable mode, but that failed.
When I tried to install from source:

git clone https://github.com/vllm-project/vllm.git
cd vllm
export VLLM_INSTALL_PUNICA_KERNELS=1 # optionally build for multi-LoRA capability
pip install -e .
The error is as follows:
![image](https://github.com/vllm-project/vllm/assets/17668109/59d6136a-dd56-4a37-8982-7dc7799ab5c6)
I set .bashrc as follows:
export ROCM_HOME='/opt/rocm-6.0.0/'
export PATH="/opt/rocm-6.0.0/bin/:$PATH"
export PATH="/opt/rocm-6.0.0/lib/:$PATH"