Open Lena-Jurkschat opened 8 months ago
I have the same problem (vllm built from source):
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspace/vllm/vllm/entrypoints/llm.py", line 109, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/workspace/vllm/vllm/engine/llm_engine.py", line 371, in from_engine_args
    engine = cls(*engine_configs,
  File "/workspace/vllm/vllm/engine/llm_engine.py", line 120, in __init__
    self._init_workers()
  File "/workspace/vllm/vllm/engine/llm_engine.py", line 143, in _init_workers
    from vllm.worker.worker import Worker
  File "/workspace/vllm/vllm/worker/worker.py", line 11, in <module>
    from vllm.model_executor import set_random_seed
  File "/workspace/vllm/vllm/model_executor/__init__.py", line 2, in <module>
    from vllm.model_executor.model_loader import get_model
  File "/workspace/vllm/vllm/model_executor/model_loader.py", line 10, in <module>
    from vllm.model_executor.weight_utils import (get_quant_config,
  File "/workspace/vllm/vllm/model_executor/weight_utils.py", line 18, in <module>
    from vllm.model_executor.layers.quantization import (get_quantization_config,
  File "/workspace/vllm/vllm/model_executor/layers/quantization/__init__.py", line 4, in <module>
    from vllm.model_executor.layers.quantization.awq import AWQConfig
  File "/workspace/vllm/vllm/model_executor/layers/quantization/awq.py", line 6, in <module>
    from vllm._C import ops
ModuleNotFoundError: No module named 'vllm._C'
Had the same issue too; it turned out to be because I had a folder named vllm in my working directory.
Whenever I imported vllm from there, I got this error, so the solution is to run/import vllm from a directory that does not contain a folder named vllm.
Unfortunately I then ran into another issue, so I have given up for now.
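For reference, a rough way to check which vllm Python is actually picking up (a minimal sketch only):

import importlib.util

# Locate "vllm" without executing it, so a broken vllm._C does not get in the
# way. If this points at a local ./vllm directory instead of site-packages,
# the installed package is being shadowed by the working directory.
spec = importlib.util.find_spec("vllm")
print(spec.origin if spec else "vllm not found")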
Installing vllm==0.2.6 at least solves the No module named 'vllm._C' error, but the downgrade is not nice.
When building vllm without the precompiled binaries (VLLM_USE_PRECOMPILED=false), vllm._C cannot be found. Is something missing in setup.py, and is there a way to fix that?
I installed using "python setup.py install" and got this error. I fixed it with "python setup.py develop".
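If it helps anyone debugging this, a rough way to confirm the build actually produced the compiled extension inside the package directory (a sketch; the _C*.so filename pattern assumes a Linux build):

import glob
import importlib.util
import os

# Locate the installed vllm package without importing it, then look for the
# compiled extension (on Linux it shows up as vllm/_C*.so). An empty list
# suggests the C/CUDA extension was never built or never got installed.
spec = importlib.util.find_spec("vllm")
assert spec is not None and spec.submodule_search_locations, "vllm is not installed"
pkg_dir = list(spec.submodule_search_locations)[0]
print(pkg_dir)
print(glob.glob(os.path.join(pkg_dir, "_C*.so")))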
@Lena-Jurkschat @george-kuanli-peng Did you find the solution?
Unfortunately not. "python setup.py develop" did not work either in combination with VLLM_USE_PRECOMPILED=false.
@liangfu I successfully installed vllm-0.4.0.post1+neuron213. In setup.py there is this logic:

if not _is_neuron():
    ext_modules.append(CMakeExtension(name="vllm._C"))

and

cmdclass={"build_ext": cmake_build_ext} if not _is_neuron() else {},

So on a Neuron build vllm._C is never built, which results in ModuleNotFoundError: No module named 'vllm._C'.
How can this be fixed?
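If that reading of setup.py is right, the extension target is simply skipped on Neuron, so the compiled submodule never exists. A minimal check (assuming import vllm itself still succeeds):

import importlib.util

# On a build where _is_neuron() was true, no CMake target for "vllm._C" is
# added, so the compiled submodule is never produced and find_spec sees nothing.
print(importlib.util.find_spec("vllm._C"))  # expected: None on such a build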
I just solved it this way.
The problem is with from vllm._C import ops while there is no vllm._C.
We need the ops module that exists in your_environment_name/lib/python3.10/site-packages/vllm/model_executor/layers/.
So what we have to do is change from vllm._C import ops to from vllm.model_executor.layers import ops in every single file of the package.
This solves the problem :)
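For what it's worth, a rough Python sketch of that bulk replacement (the site-packages path below is just the example path from above; back up the package first):

import pathlib

# Rewrite "from vllm._C import ops" to "from vllm.model_executor.layers import ops"
# in every .py file of the installed vllm package. Adjust PKG_DIR to your
# environment; this path is only an example.
PKG_DIR = pathlib.Path("your_environment_name/lib/python3.10/site-packages/vllm")

OLD = "from vllm._C import ops"
NEW = "from vllm.model_executor.layers import ops"

for path in PKG_DIR.rglob("*.py"):
    text = path.read_text()
    if OLD in text:
        path.write_text(text.replace(OLD, NEW))
        print(f"patched {path}")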
I too am getting this error and am not in a position to find and replace all the instances of 'vllm._C' in the code. Cc @liangfu
Hardware: inf2.8xlarge
AMI: Neuron DLAMI us-east-1 (ami-0e0f965ee5cfbf89b)
Versions:
aws-neuronx-runtime-discovery==2.9
libneuronxla==2.0.965
neuronx-cc==2.13.66.0+6dfecc895
torch-neuronx==2.1.2.2.1.0
transformers-neuronx==0.10.0.21
torch==2.1.2
torch-neuronx==2.1.2.2.1.0
torch-xla==2.1.2
Was also seeing the same error on an inf2 instance with the latest release.
Running the following in the vLLM directory before installing with pip solved the issue for me.
find . -type f -exec sed -i 's/from vllm\._C import ops/from vllm.model_executor.layers import ops/g' {} +
I'm not sure if this is a solution for all distributions.
diff --git a/benchmarks/kernels/benchmark_aqlm.py b/benchmarks/kernels/benchmark_aqlm.py
index 9602d20..02c816b 100644
--- a/benchmarks/kernels/benchmark_aqlm.py
+++ b/benchmarks/kernels/benchmark_aqlm.py
@@ -6,7 +6,7 @@ from typing import Optional
 import torch
 import torch.nn.functional as F

-from vllm._C import ops
+from vllm.model_executor.layers import ops
 from vllm.model_executor.layers.quantization.aqlm import (
     dequantize_weight, generic_dequantize_gemm, get_int_dtype,
     optimized_dequantize_gemm)
diff --git a/vllm/_custom_ops.py b/vllm/_custom_ops.py
index e4b16ed..a7ae8b4 100644
--- a/vllm/_custom_ops.py
+++ b/vllm/_custom_ops.py
@@ -4,7 +4,7 @@ import torch

 try:
     from vllm._C import cache_ops as vllm_cache_ops
-    from vllm._C import ops as vllm_ops
+    from vllm.model_executor.layers import ops as vllm_ops
 except ImportError:
     pass

diff --git a/vllm/model_executor/layers/quantization/aqlm.py b/vllm/model_executor/layers/quantization/aqlm.py
index 6115b1d..566a9cf 100644
--- a/vllm/model_executor/layers/quantization/aqlm.py
+++ b/vllm/model_executor/layers/quantization/aqlm.py
@@ -8,7 +8,7 @@ import torch
 import torch.nn.functional as F
 from torch.nn.parameter import Parameter

-from vllm._C import ops
+from vllm.model_executor.layers import ops
 from vllm.model_executor.layers.linear import (LinearMethodBase,
                                                set_weight_attrs)
 from vllm.model_executor.layers.quantization.base_config import (
Hi guys, I have one way to solve this problem. In my case Python was importing the vllm module from "/home/user/vllm/vllm" rather than from the env package "/home/user/miniconda3/envs/vllm/lib/pythonxx.xx/site-packages/vllm", so I just copied the compiled .so files into "/home/user/vllm/vllm".
After that it worked.
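Roughly, that copy step could look like this (a sketch only; both paths are the placeholder paths from the comment above and will differ on other machines):

import glob
import os
import shutil

# Copy the compiled extension modules (*.so) from the environment's installed
# vllm package into the source checkout that Python is actually importing.
SRC = "/home/user/miniconda3/envs/vllm/lib/pythonxx.xx/site-packages/vllm"  # placeholder path
DST = "/home/user/vllm/vllm"  # placeholder path

for so_file in glob.glob(os.path.join(SRC, "*.so")):
    shutil.copy2(so_file, DST)
    print(f"copied {os.path.basename(so_file)} -> {DST}")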
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
Hi, after building vllm from source, the following error occurs when running multi-GPU inference using a local Ray instance:
I already checked Issue #1814, which does not help; there is no additional vllm folder to delete that could lead to confusion.
I run the following to build vllm:
I run the inference using
However, building vllm via pip instead leads to an MPI error when running multi-GPU inference (probably due to a version incompatibility between MPI on my system and the prebuilt vllm binaries?), so I wanted to build it from source.
Some Specs: