vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0
26.09k stars 3.82k forks source link

unable to run vllm model deployment #6464

Open riyajatar37003 opened 1 month ago

riyajatar37003 commented 1 month ago

Your current environment

Failed to import from vllm._C with ImportError("/usr/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_C.abi3.so)")

INFO 07-16 09:29:50 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager (VllmWorkerProcess pid=658) INFO 07-16 09:29:52 multiproc_worker_utils.py:215] Worker ready; awaiting tasks (VllmWorkerProcess pid=656) INFO 07-16 09:29:52 multiproc_worker_utils.py:215] Worker ready; awaiting tasks (VllmWorkerProcess pid=657) INFO 07-16 09:29:53 multiproc_worker_utils.py:215] Worker ready; awaiting tasks INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 (VllmWorkerProcess pid=656) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 (VllmWorkerProcess pid=658) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=657) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 (VllmWorkerProcess pid=656) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=658) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=657) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=658) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB (VllmWorkerProcess pid=656) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB (VllmWorkerProcess pid=657) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last): (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last): ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last): (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run() ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(args, kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run() (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, kwargs) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run() (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, *kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(args, kwargs) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, kwargs) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, *kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable( (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(args, kwargs) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable( (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(args, kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states, (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states, (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, *kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, *kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(args, kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, kwargs) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, *kwargs) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(args, kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, *kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(args, kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm( (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm( (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, *kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(args, kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError( (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] rank0: Traceback (most recent call last): rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/runpy.py", line 196, in _run_module_as_main rank0: return _run_code(code, main_globals, None, rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/runpy.py", line 86, in _run_code rank0: exec(code, run_globals) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 282, in

rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 224, in run_server rank0: if llm_engine is not None else AsyncLLMEngine.from_engine_args( rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 444, in from_engine_args rank0: engine = cls( rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 373, in init rank0: self.engine = self._init_engine(*args, *kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 520, in _init_engine rank0: return engine_class(args, **kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 263, in init

rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 362, in _initialize_kv_caches

rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/distributed_gpu_executor.py", line 38, in determine_num_available_blocks rank0: num_blocks = self._run_workers("determine_num_available_blocks", ) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_gpu_executor.py", line 135, in _run_workers rank0: driver_worker_output = driver_worker_method(*args, *kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context rank0: return func(args, **kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks

rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context rank0: return func(*args, kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run rank0: self.execute_model(model_input, kv_caches, intermediate_tensors) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context rank0: return func(*args, *kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model rank0: hidden_or_intermediate_states = model_executable( rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl rank0: return self._call_impl(args, kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl rank0: return forward_call(*args, kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward rank0: hidden_states = self.model(input_ids, positions, kv_caches, rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl rank0: return self._call_impl(*args, *kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl rank0: return forward_call(args, kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward rank0: hidden_states, residual = layer(positions, hidden_states, rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl rank0: return self._call_impl(*args, kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl rank0: return forward_call(*args, *kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward rank0: hidden_states = self.input_layernorm(hidden_states) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl rank0: return self._call_impl(args, kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl rank0: return forward_call(*args, *kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward rank0: return self._forward_method(args, **kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda

rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper rank0: raise e rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper rank0: return fn(*args, **kwargs) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm rank0: torch.ops._C.rms_norm(out, input, weight, epsilon) rank0: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr rank0: raise AttributeError(

ERROR 07-16 09:31:46 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 658 died, exit code: -15 INFO 07-16 09:31:46 multiproc_worker_utils.py:123] Killing local vLLM worker processes /tmp/.conda/envs/vllm_env/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown warnings.warn('resource_tracker: There appear to be %d '

🐛 Describe the bug

tried to install using pip install vllm

yumaofan commented 1 month ago

The same error when running any model. Install VLLM via pip directly.

riyajatar37003 commented 1 month ago

did the same only

wheresmyhair commented 1 month ago

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

riyajatar37003 commented 1 month ago

i am trying graphrag with vllm deployed model but i am getting this error

ERROR 07-16 12:08:18 api_server.py:247] Error in applying chat template from request: Conversation roles must alternate user/assistant/user/assistant/...

JaheimLee commented 1 month ago

Same issue. And vllm 0.5.1 works well.

Rogersiy commented 1 month ago

Same issue. And vllm 0.5.1 works well.

Thannnnnnk u

WMeng1 commented 1 month ago

Same issue.

vlsav commented 1 month ago

look at https://github.com/vllm-project/vllm/issues/6462#issuecomment-2234006925 it resolves my iisue

rzes commented 1 month ago

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

works well!!

zichaow commented 1 month ago

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

encounter the same issue and can confirm this works for me, too

AlexBlack2202 commented 1 month ago

hello , any one find any solution about this problem?

vlsav commented 1 month ago

hello , any one find any solution about this problem?

https://github.com/vllm-project/vllm/issues/6464#issuecomment-2235595670

heya5 commented 1 month ago

Delete the directory named "vllm" resolves my issue. I find the method from this comment https://github.com/vllm-project/vllm/issues/1814#issuecomment-1837122930

lonngxiang commented 3 weeks ago

0.5.4 same error

DreamerZhang11 commented 2 days ago

why the source build have so many problem, i meet the same error.. Have it fix