vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Building VLLM from source and running inference: No module named 'vllm._C' #3061

Open Lena-Jurkschat opened 8 months ago

Lena-Jurkschat commented 8 months ago

Hi, after building vllm from source, the following error occurs when running multi-GPU inference using a local Ray instance:

File "vllm/vllm/model_executor/layers/quantization/awq.py", line 6, in <module>
    from vllm._C import ops
ModuleNotFoundError: No module named 'vllm._C'

I already checked issue #1814, which does not help: there is no additional vllm folder in my working directory that could cause the confusion described there.

I run the following to build vllm:

export VLLM_USE_PRECOMPILED=false
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .
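
As a sanity check after the editable install, here is a minimal sketch (assuming the build finished without errors) to confirm the compiled extension is importable:

import importlib

# If the editable install built the C extension, this import succeeds;
# otherwise it raises the same ModuleNotFoundError as in the traceback above.
try:
    importlib.import_module("vllm._C")
    print("vllm._C is importable")
except ModuleNotFoundError as exc:
    print("compiled extension missing:", exc)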

I run the inference using

from langchain_community.llms import VLLM
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

llm = VLLM(model=model_name,
           trust_remote_code=True,  # mandatory for hf models
           max_new_tokens=100,
           top_k=top_k,
           top_p=top_p,
           temperature=temperature,
           tensor_parallel_size=2)

prompt = PromptTemplate(template=template, input_variables=["ques"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
llm_chain.run(ques)
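
To check whether the failure comes from the LangChain wrapper or from vllm itself, here is a minimal sketch using the vllm API directly (model_name, prompt, ques, top_k, top_p and temperature are the same placeholders as above); it goes through the same import path, so it should fail identically if vllm._C is missing:

from vllm import LLM, SamplingParams

# Same settings as the LangChain example above, but through vllm directly.
llm = LLM(model=model_name,
          trust_remote_code=True,
          tensor_parallel_size=2)

sampling = SamplingParams(max_tokens=100,
                          top_k=top_k,
                          top_p=top_p,
                          temperature=temperature)

outputs = llm.generate([prompt.format(ques=ques)], sampling)
print(outputs[0].outputs[0].text)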

However, installing vllm via pip instead leads to an MPI error when running multi-GPU inference (probably due to a version incompatibility between the MPI on my system and the prebuilt vllm binaries?), which is why I wanted to build it from source.

(RayWorkerVllm pid=3391490) *** An error occurred in MPI_Init_thread
(RayWorkerVllm pid=3391490) *** on a NULL communicator
(RayWorkerVllm pid=3391490) *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
(RayWorkerVllm pid=3391490) ***    and potentially your MPI job)
(RayWorkerVllm pid=3391490) [i8006:3391490] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

Some Specs:

george-kuanli-peng commented 8 months ago

I have the same problem (vllm built from source):

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspace/vllm/vllm/entrypoints/llm.py", line 109, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/workspace/vllm/vllm/engine/llm_engine.py", line 371, in from_engine_args
    engine = cls(*engine_configs,
  File "/workspace/vllm/vllm/engine/llm_engine.py", line 120, in __init__
    self._init_workers()
  File "/workspace/vllm/vllm/engine/llm_engine.py", line 143, in _init_workers
    from vllm.worker.worker import Worker
  File "/workspace/vllm/vllm/worker/worker.py", line 11, in <module>
    from vllm.model_executor import set_random_seed
  File "/workspace/vllm/vllm/model_executor/__init__.py", line 2, in <module>
    from vllm.model_executor.model_loader import get_model
  File "/workspace/vllm/vllm/model_executor/model_loader.py", line 10, in <module>
    from vllm.model_executor.weight_utils import (get_quant_config,
  File "/workspace/vllm/vllm/model_executor/weight_utils.py", line 18, in <module>
    from vllm.model_executor.layers.quantization import (get_quantization_config,
  File "/workspace/vllm/vllm/model_executor/layers/quantization/__init__.py", line 4, in <module>
    from vllm.model_executor.layers.quantization.awq import AWQConfig
  File "/workspace/vllm/vllm/model_executor/layers/quantization/awq.py", line 6, in <module>
    from vllm._C import ops
ModuleNotFoundError: No module named 'vllm._C'

cocoderss commented 8 months ago

I have the same problem (vllm built from source):

Had the same issue too; it turned out to be because I had a folder named vllm in my working directory, so whenever I imported vllm, Python picked up that folder instead of the installed package. The solution is to run/import vllm from a directory that does not contain a folder named vllm.
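
A quick way to check which copy of vllm Python actually picks up (minimal sketch):

import vllm

# If this prints a path inside your working directory instead of
# .../site-packages/vllm/..., a local folder is shadowing the installed package.
print(vllm.__file__)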

Unfortunately, I then ran into another issue, so I have given up for now.

Lena-Jurkschat commented 8 months ago

Installing vllm==0.2.6 at least resolves the No module named 'vllm._C' error, but the downgrade is not ideal.

When building vllm without the precompiled binaries (VLLM_USE_PRECOMPILED=false), vllm._C ends up missing. Is there something missing in setup.py, then, and a way to fix it?

ghost commented 7 months ago

I installed using "python setup.py install" and got this error. I fixed it with "python setup.py develop"

MojHnd commented 7 months ago

@Lena-Jurkschat @george-kuanli-peng Did you find the solution?

Lena-Jurkschat commented 7 months ago

@Lena-Jurkschat @george-kuanli-peng Did you find the solution?

Unfortunately, not. "python setup.py develop" did not work either in combination with VLLM_USE_PRECOMPILED=false.

MojHnd commented 7 months ago

@liangfu

I successfully installed vllm-0.4.0.post1+neuron213.

In setup.py, there is this check:

if not _is_neuron():
    ext_modules.append(CMakeExtension(name="vllm._C"))

and

cmdclass={"build_ext": cmake_build_ext} if not _is_neuron() else {},

So, for a Neuron build, the vllm._C extension is never built, which results in ModuleNotFoundError: No module named 'vllm._C'.

How to fix it?

MojHnd commented 7 months ago

@Lena-Jurkschat @george-kuanli-peng Did you find the solution?

Unfortunately, not. "python setup.py develop" did not work either in combination with VLLM_USE_PRECOMPILED=false.

I just solved it this way.

The problem is the from vllm._C import ops statement while there is no vllm._C module. We need the ops module that exists in your_environment_name/lib/python3.10/site-packages/vllm/model_executor/layers/ (see the screenshot below).

So, what we have to do is to change from vllm._C import ops to from vllm.model_executor.layers import ops in every single file of the package. This solves the problem :)
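
A rough Python sketch of that bulk replacement (only a sketch: run it from the root of the installed vllm package and back up the files first):

from pathlib import Path

OLD = "from vllm._C import ops"
NEW = "from vllm.model_executor.layers import ops"

# Rewrite every .py file under the current directory that still imports
# ops from the missing vllm._C extension.
for path in Path(".").rglob("*.py"):
    text = path.read_text()
    if OLD in text:
        path.write_text(text.replace(OLD, NEW))
        print("patched", path)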

dongreenberg commented 7 months ago

I too am getting this error and am not in a position to find and replace all the instances of 'vllm._C' in the code. Cc @liangfu

Hardware: inf2.8xlarge
AMI: Neuron DLAMI us-east-1 (ami-0e0f965ee5cfbf89b)
Versions:

aws-neuronx-runtime-discovery==2.9
libneuronxla==2.0.965
neuronx-cc==2.13.66.0+6dfecc895
torch-neuronx==2.1.2.2.1.0
transformers-neuronx==0.10.0.21
torch==2.1.2
torch-neuronx==2.1.2.2.1.0
torch-xla==2.1.2
adamrb commented 6 months ago

Was also seeing the same error on an inf2 instance with the latest release.

Running the following in the vLLM directory before installing with pip solved the issue for me.

find . -type f -exec sed -i 's/from vllm\._C import ops/from vllm.model_executor.layers import ops/g' {} +

I'm not sure if this is a solution for all distributions.

diff --git a/benchmarks/kernels/benchmark_aqlm.py b/benchmarks/kernels/benchmark_aqlm.py
index 9602d20..02c816b 100644
--- a/benchmarks/kernels/benchmark_aqlm.py
+++ b/benchmarks/kernels/benchmark_aqlm.py
@@ -6,7 +6,7 @@ from typing import Optional
 import torch
 import torch.nn.functional as F

-from vllm._C import ops
+from vllm.model_executor.layers import ops
 from vllm.model_executor.layers.quantization.aqlm import (
     dequantize_weight, generic_dequantize_gemm, get_int_dtype,
     optimized_dequantize_gemm)
diff --git a/vllm/_custom_ops.py b/vllm/_custom_ops.py
index e4b16ed..a7ae8b4 100644
--- a/vllm/_custom_ops.py
+++ b/vllm/_custom_ops.py
@@ -4,7 +4,7 @@ import torch

 try:
     from vllm._C import cache_ops as vllm_cache_ops
-    from vllm._C import ops as vllm_ops
+    from vllm.model_executor.layers import ops as vllm_ops
 except ImportError:
     pass

diff --git a/vllm/model_executor/layers/quantization/aqlm.py b/vllm/model_executor/layers/quantization/aqlm.py
index 6115b1d..566a9cf 100644
--- a/vllm/model_executor/layers/quantization/aqlm.py
+++ b/vllm/model_executor/layers/quantization/aqlm.py
@@ -8,7 +8,7 @@ import torch
 import torch.nn.functional as F
 from torch.nn.parameter import Parameter

-from vllm._C import ops
+from vllm.model_executor.layers import ops
 from vllm.model_executor.layers.linear import (LinearMethodBase,
                                                set_weight_attrs)
 from vllm.model_executor.layers.quantization.base_config import (

scao0208 commented 3 months ago

Hi guys, I have one way to solve this problem. I found that Python was importing the vllm module from the source checkout ("/home/user/vllm/vllm") rather than from the env package ("/home/user/miniconda3/envs/vllm/lib/pythonxx.xx/site-packages/vllm"). So I just copied the compiled .so files into "/home/user/vllm/vllm".

This is the successful run (see the attached screenshot).
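
A rough sketch of that copy step (the two paths are the ones from the comment above; adjust them for your own environment):

import glob
import shutil

# The compiled extensions ended up in site-packages; copy them into the
# source checkout that Python is actually importing from.
site_pkg = "/home/user/miniconda3/envs/vllm/lib/pythonX.Y/site-packages/vllm"  # adjust the Python version
src_tree = "/home/user/vllm/vllm"

for so_file in glob.glob(f"{site_pkg}/*.so"):
    shutil.copy2(so_file, src_tree)
    print("copied", so_file)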

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!