
v0.3.3 - Module not found when deploying models to Inferentia2/NeuronSDK #3284

Open samir-souza opened 4 months ago

samir-souza commented 4 months ago

Error when running the sample python3 examples/offline_inference_neuron.py after installing v0.3.3 (from a cloned source tree or via pip install git+...).

Cause:

The directory vllm/model_executor/models/neuron/ is not copied to the expected path /opt/conda/lib/python3.10/site-packages/vllm/model_executor/models/neuron/ during package installation.

```
    module = importlib.import_module(
  File "/opt/conda/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'vllm.model_executor.models.neuron'
```
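For context, this matches setuptools' package discovery: the non-namespace find_packages() only picks up directories that contain an __init__.py, and the neuron directory has none. A quick hypothetical check, assuming vllm's setup.py uses find_packages() and that you run this from a source checkout:

```python
# Hypothetical check, run from the vllm source root: a directory lacking
# __init__.py is not discovered by setuptools' non-namespace find_packages(),
# so it never gets installed into site-packages.
from setuptools import find_packages

pkgs = find_packages()
print("vllm.model_executor.models.neuron" in pkgs)  # False until __init__.py is added
```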

Workaround:

Manually copy the file vllm/model_executor/models/neuron/llama.py to /opt/conda/lib/python3.10/site-packages/vllm/model_executor/models/neuron/llama.py after pip installing vllm. After that, everything works fine.
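For anyone who wants to script that copy step, here is a minimal sketch (my own, not part of vllm): it assumes you run it from a vllm source checkout and resolves the installed location via sysconfig instead of hard-coding the conda path:

```python
# Sketch of the manual workaround: copy the neuron model files from a vllm
# source checkout into the installed package. Run from the checkout root.
import shutil
import sysconfig
from pathlib import Path

src = Path("vllm/model_executor/models/neuron")
dst = Path(sysconfig.get_paths()["purelib"]) / "vllm/model_executor/models/neuron"

dst.mkdir(parents=True, exist_ok=True)
for py in src.glob("*.py"):
    shutil.copy2(py, dst / py.name)
print(f"copied {src} -> {dst}")
```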

Could you fix that in the package installation, please?

Installed packages

```
vllm 0.3.2+neuron212
torch 2.1.2
torch-model-archiver 0.9.0
torch-xla 2.1.1
torchserve 0.9.0
torchvision 0.16.2
aws-neuronx-runtime-discovery 2.9
libneuronxla 2.0.755
neuronx-cc 2.12.68.0+4480452af
neuronx-distributed 0.6.0
neuronx-hwm 2.12.0.0+422c9037c
optimum-neuron 0.0.20
torch-neuronx 2.1.1.2.0.1b0
transformers-neuronx 0.9.474
```

liangfu commented 4 months ago

From vllm 0.3.2+neuron212, I assume you are using v0.3.2, not v0.3.3, right?

I manually downloaded v0.3.3 from https://github.com/vllm-project/vllm/archive/refs/tags/v0.3.3.tar.gz, and the downloaded archive does contain the neuron/llama.py file; installing from it works fine.
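To double-check from the archive itself, a small sketch (assuming the tarball was saved as v0.3.3.tar.gz in the current directory):

```python
# Hypothetical verification that the v0.3.3 source tarball ships the file.
import tarfile

with tarfile.open("v0.3.3.tar.gz") as tar:
    print([n for n in tar.getnames() if n.endswith("models/neuron/llama.py")])
```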

Just need to upgrade to v0.3.3?

liangfu commented 4 months ago

For now, we need to manually copy the neuron/llama.py file into the corresponding location in the site-packages directory. In addition, we need to add an __init__.py file to the neuron directory so that the directory gets copied during the pip install process.
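Concretely, an empty __init__.py should be enough, since a non-namespace find_packages() only discovers directories that contain one. A minimal sketch (not vllm's actual setup.py) of the relevant part:

```python
# Minimal sketch, not vllm's actual setup.py: setuptools' non-namespace
# find_packages() only packages directories containing __init__.py, so adding
# an empty vllm/model_executor/models/neuron/__init__.py makes pip install
# the neuron directory along with the rest of the package.
from setuptools import find_packages, setup

setup(
    name="vllm",
    packages=find_packages(),
    # ... remaining metadata omitted ...
)
```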