huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

Install error when installing the vllm package #1862

Open · for-just-we opened this issue 2 months ago

for-just-we commented 2 months ago

System Info

Target: x86_64-unknown-linux-gnu
Cargo version: 1.75.0
Commit sha: N/A
Docker label: N/A
nvidia-smi:

  +---------------------------------------------------------------------------------------+  
   | NVIDIA-SMI 535.171.04             Driver Version: 535.171.04   CUDA Version: 12.2     |  
   |-----------------------------------------+----------------------+----------------------+  
   | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |  
   | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |  
   |                                         |                      |               MIG M. |  
   |=========================================+======================+======================|  
   |   0  NVIDIA RTX A6000               Off | 00000000:01:00.0 Off |                  Off |  
   | 52%   77C    P2             291W / 300W |  47471MiB / 49140MiB |     96%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  
   |   1  NVIDIA RTX A6000               Off | 00000000:25:00.0 Off |                  Off |  
   | 38%   67C    P2             266W / 300W |  47527MiB / 49140MiB |     97%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  
   |   2  NVIDIA RTX A6000               Off | 00000000:41:00.0 Off |                  Off |  
   | 48%   75C    P2             285W / 300W |  47483MiB / 49140MiB |     95%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  
   |   3  NVIDIA RTX A6000               Off | 00000000:61:00.0 Off |                  Off |  
   | 41%   69C    P2             285W / 300W |  47454MiB / 49140MiB |     98%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  
   |   4  NVIDIA RTX A6000               Off | 00000000:81:00.0 Off |                  Off |  
   | 49%   75C    P2             297W / 300W |  41417MiB / 49140MiB |    100%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  
   |   5  NVIDIA RTX A6000               Off | 00000000:A1:00.0 Off |                  Off |  
   | 39%   66C    P2             295W / 300W |  39745MiB / 49140MiB |    100%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  
   |   6  NVIDIA RTX A6000               Off | 00000000:C1:00.0 Off |                  Off |  
   | 47%   73C    P2             293W / 300W |  23995MiB / 49140MiB |    100%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  
   |   7  NVIDIA RTX A6000               Off | 00000000:E1:00.0 Off |                  Off |  
   | 39%   66C    P2             293W / 300W |  35729MiB / 49140MiB |    100%      Default |  
   |                                         |                      |                  N/A |  
   +-----------------------------------------+----------------------+----------------------+  

Information

Tasks

Reproduction

TGI version: 2.0.2

Create a new conda environment, then run:

Commands 1-5 execute normally, but command 6 ends with the error message below: vllm itself installs successfully, while its vllm-nccl-cu12 dependency fails:

...
creating 'dist/vllm-0.4.1+cu122-py3.11-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing vllm-0.4.1+cu122-py3.11-linux-x86_64.egg
removing '/server9/cbj/programming/anaconda3/envs/tgi_server/lib/python3.11/site-packages/vllm-0.4.1+cu122-py3.11-linux-x86_64.egg' (and everything under it)
creating /server9/cbj/programming/anaconda3/envs/tgi_server/lib/python3.11/site-packages/vllm-0.4.1+cu122-py3.11-linux-x86_64.egg
Extracting vllm-0.4.1+cu122-py3.11-linux-x86_64.egg to /server9/cbj/programming/anaconda3/envs/tgi_server/lib/python3.11/site-packages
Adding vllm 0.4.1+cu122 to easy-install.pth file

Installed /server9/cbj/programming/anaconda3/envs/tgi_server/lib/python3.11/site-packages/vllm-0.4.1+cu122-py3.11-linux-x86_64.egg
Processing dependencies for vllm==0.4.1+cu122
Searching for vllm-nccl-cu12<2.19,>=2.18
Reading https://pypi.org/simple/vllm-nccl-cu12/
Downloading https://files.pythonhosted.org/packages/41/07/c1be8f4ffdc257646dda26470b803487150c732aa5c9f532dd789f186a54/vllm_nccl_cu12-2.18.1.0.4.0.tar.gz#sha256=d56535da1b893ac49c1f40be9245f999e543c3fc95b4839642b70dd1d72760c0
Best match: vllm-nccl-cu12 2.18.1.0.4.0
Processing vllm_nccl_cu12-2.18.1.0.4.0.tar.gz
Writing /tmp/easy_install-c0u46qco/vllm_nccl_cu12-2.18.1.0.4.0/setup.cfg
Running vllm_nccl_cu12-2.18.1.0.4.0/setup.py -q bdist_egg --dist-dir /tmp/easy_install-c0u46qco/vllm_nccl_cu12-2.18.1.0.4.0/egg-dist-tmp-r7fzlcup
error: SandboxViolation: mkdir('/server9/cbj/.config/vllm/nccl/cu12', 511) {}

The package setup script has attempted to modify files on your system
that are not within the EasyInstall build area, and has been aborted.

This package cannot be safely installed by EasyInstall, and may not
support alternate installation locations even if you run its setup
script by hand.  Please inform the package's author and the EasyInstall
maintainers to find out if a fix or workaround is available.

After checking, I found the failing command is python setup.py install. But I don't understand why vllm_nccl_cu12 tries to install into the /server9/cbj/.config directory instead of my conda environment.
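For context, a minimal sketch of why the path lands outside the conda environment: judging from the mkdir call in the SandboxViolation, the vllm-nccl-cu12 setup script targets a per-user cache under ~/.config at install time rather than site-packages, and easy_install runs setup.py inside a sandbox that aborts any write outside its build area. (The path construction below is an assumed reconstruction, not the package's actual code.)

```python
import os

# Assumed reconstruction of the path logic in vllm-nccl-cu12's setup.py:
# it targets a per-user cache, not the active environment's site-packages.
nccl_cache = os.path.expanduser(os.path.join("~", ".config", "vllm", "nccl", "cu12"))

# Under easy_install (`python setup.py install`), setup.py runs inside a
# sandbox that aborts writes outside the build area, so creating this
# directory raises SandboxViolation. Under pip, no such sandbox applies.
print(nccl_cache.endswith(os.path.join(".config", "vllm", "nccl", "cu12")))  # → True
```

This is also why the target directory is under the user's home (/server9/cbj/.config here) regardless of which conda environment is active.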

Note: with TGI 2.0.1 I can successfully execute commands 1-6 following Faster-LLM-Survey, installing flash-attention, vllm, ...

Expected behavior

vllm-nccl-cu12 should install normally.

luoweiwei0908 commented 1 month ago

You need to install vllm another way, e.g. pip install vllm.
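After installing via pip as suggested, one way to sanity-check the result is to look for the per-user NCCL cache that the failing mkdir pointed at (the path is taken from the error message above; whether it is populated depends on the machine, so no particular output is guaranteed):

```python
import os

# The SandboxViolation named ~/.config/vllm/nccl/cu12; after a successful
# `pip install vllm`, the NCCL shared library is expected under this cache.
nccl_dir = os.path.expanduser("~/.config/vllm/nccl/cu12")
print("NCCL cache exists:", os.path.isdir(nccl_dir))
```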

github-actions[bot] commented 4 days ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.