NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

ModuleNotFoundError: No module named 'tensorrt_llm.bindings' after pip install uvicorn #1288

WuhanMonkey opened this issue 3 months ago (status: Open)

WuhanMonkey commented 3 months ago

System Info

- Architecture: x86_64
- OS: Ubuntu 20.04
- GPU: 8x A100
- TensorRT-LLM version: v0.9.0

Who can help?

No response

Reproduction

  1. Follow the setup guidance on the homepage to install TRT-LLM.
  2. Verify with python3 -c "import tensorrt_llm"; it prints the correct version.
  3. Follow the instructions at https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/server. Run python3 -m examples.server.server <path_to_tllm_engine_dir> <tokenizer_type> &
  4. The command fails because the uvicorn module is not found.
  5. Run pip install uvicorn.
  6. After installation, running the same command to start serving reports the following issue:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/powerop/work/rwq/TensorRT-LLM/tensorrt_llm/__init__.py", line 46, in <module>
    from .hlapi.llm import LLM, ModelConfig
  File "/home/powerop/work/rwq/TensorRT-LLM/tensorrt_llm/hlapi/__init__.py", line 1, in <module>
    from .llm import LLM, ModelConfig
  File "/home/powerop/work/rwq/TensorRT-LLM/tensorrt_llm/hlapi/llm.py", line 18, in <module>
    from ..executor import (GenerationExecutor, GenerationResult,
  File "/home/powerop/work/rwq/TensorRT-LLM/tensorrt_llm/executor.py", line 11, in <module>
    import tensorrt_llm.bindings as tllm
ModuleNotFoundError: No module named 'tensorrt_llm.bindings'

Expected behavior

The server should start with no import errors.

actual behavior

ModuleNotFoundError: No module named 'tensorrt_llm.bindings'

additional notes

Even after I uninstall uvicorn, the issue still persists.

HamidShojanazeri commented 3 months ago

It seems the following works:

Follow the README here and here.

python3 server.py --model_dir /TensorRT-LLM/examples/llama/trt_engines/bf16/1-gpu/ --tokenizer_type /TensorRT-LLM/examples/llama/Llama-2-7b-chat-hf

WuhanMonkey commented 3 months ago

It seems that apt-get install uvicorn causes no issues, but pip install uvicorn (which can also be pulled in via requirements.txt) triggers this error. Likely a version mismatch.
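One way to test the mismatch theory (a diagnostic sketch, not from the thread) is to compare which interpreter the shell resolves with where uvicorn would be imported from; apt installs into the system Python's dist-packages, while pip may target a different interpreter or environment:

```shell
# Which python3 the shell resolves, and its installation prefix
which python3
python3 -c "import sys; print(sys.prefix)"
# Where uvicorn resolves from for that interpreter, if at all
python3 -c "import importlib.util; s = importlib.util.find_spec('uvicorn'); print(s.origin if s else 'uvicorn not found')"
```

If the uvicorn path and the interpreter prefix point at different Python installations, the two package managers are feeding different environments.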

github-actions[bot] commented 4 weeks ago

This issue is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 15 days.