PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: version 3.30.0
Libc version: glibc-2.31
Python version: 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-187-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 12.3.107
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: NVIDIA A40
GPU 1: NVIDIA A40
GPU 2: NVIDIA A40
GPU 3: NVIDIA A40
Nvidia driver version: 535.183.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 57 bits virtual
CPU(s): 96
On-line CPU(s) list: 0-95
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz
Stepping: 6
CPU MHz: 800.012
BogoMIPS: 4800.00
Virtualization: VT-x
...
How you are installing vllm
pip install -U vllm
It seems that over the past few weeks a number of crucial updates have been made to get vLLM working properly; these exist in version 0.6.1 but are missing in version 0.6.1.post2. However, the version available through pip is the old 0.6.1.post2.
For example, #8157 possibly fixes issue #8553, which I am also hitting.
Update: after installing version 0.6.1 via pip, I am still getting the error from issue #8553 when I try to initialize the model (which I had already downloaded through the Hugging Face interface):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/ido.amit/miniconda3/envs/benchmark/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ido.amit/miniconda3/envs/benchmark/lib/python3.11/multiprocessing/spawn.py", line 132, in _main
self = reduction.pickle.load(from_parent)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'transformers_modules.microsoft.Phi-3'
ERROR 09-24 00:33:22 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 3840118 died, exit code: 1
INFO 09-24 00:33:22 multiproc_worker_utils.py:123] Killing local vLLM worker processes
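For what it's worth, the failure mode looks like a generic pickling problem with dynamically generated modules: `transformers` creates `transformers_modules.*` packages at runtime for `trust_remote_code` models, and a spawned worker process that unpickles an object defined in such a module must be able to re-import it. The following self-contained sketch (not vLLM's actual code; the module name is invented for illustration) reproduces the same `ModuleNotFoundError` class of error:

```python
import pickle
import sys
import types

# Simulate a module created at runtime (analogous to the transformers_modules
# packages that `trust_remote_code` models generate on the fly).
mod = types.ModuleType("transformers_modules_demo")
sys.modules["transformers_modules_demo"] = mod

class Config:
    pass

# Make the class appear to live in the dynamic module, as remote-code
# model classes do.
Config.__module__ = "transformers_modules_demo"
mod.Config = Config

# Pickling stores only the module path and class name, not the code itself.
data = pickle.dumps(Config())

# A freshly spawned process has no such module to import, which we simulate
# by removing it before unpickling.
del sys.modules["transformers_modules_demo"]
try:
    pickle.loads(data)
except ModuleNotFoundError as e:
    print(e)  # the same kind of error the spawned vLLM worker hits
```

This matches the traceback above: the spawned `VllmWorkerProcess` calls `reduction.pickle.load(...)` and cannot import `transformers_modules.microsoft.Phi-3`, because that package only exists in the parent process.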
Thanks in advance for the help!