NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

V0.6.1 conversion script failed for python3.9 #680

Open liyu-tan opened 8 months ago

liyu-tan commented 8 months ago

I was trying to convert the Llama model using Python 3.9, but I keep getting this error:


```
Traceback (most recent call last):
  File "/home/liyu.tan/uber-triton-workspace/src/tensorrtllm_backend/tensorrt_llm/examples/llama/build.py", line 33, in <module>
    from weight import (get_scaling_factors, load_from_awq_llama, load_from_binary,
  File "/home/liyu.tan/uber-triton-workspace/src/tensorrtllm_backend/tensorrt_llm/examples/llama/weight.py", line 24, in <module>
    import tensorrt_llm
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/__init__.py", line 15, in <module>
    import tensorrt_llm.functional as functional
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/functional.py", line 29, in <module>
    from . import graph_rewriting as gw
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/graph_rewriting.py", line 11, in <module>
    from .logger import logger
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/logger.py", line 147, in <module>
    logger = Logger()
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/logger.py", line 33, in __call__
    cls._instances[cls] = super(Singleton,
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/logger.py", line 66, in __init__
    self.mpi_rank = mpi_rank()
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/_utils.py", line 221, in mpi_rank
    return mpi_comm().Get_rank()
  File "/home/liyu.tan/trt_12_16/lib/python3.9/site-packages/tensorrt_llm/_utils.py", line 216, in mpi_comm
    from mpi4py import MPI
ImportError: /usr/lib/libmpi.so.40: undefined symbol: mca_common_sm_fini
```

After I switched to Python 3.10, it works. Is this a known issue?

byshiue commented 8 months ago

You could try reinstalling mpi4py, but that might affect other libraries. We therefore recommend using the Docker image we provide to avoid this kind of environment setup issue.
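
Before reinstalling anything, it can help to confirm that the failure comes from `mpi4py` itself and not from TensorRT-LLM. The following is a minimal sketch, not part of TensorRT-LLM: `check_import` is a hypothetical helper that runs an import statement in a fresh interpreter and reports the last line of any error. It assumes `mpi4py` is installed in the environment being tested. Running it under each Python environment (3.9 and 3.10) shows which one has a broken MPI binding.

```python
import subprocess
import sys


def check_import(statement, timeout=60.0):
    """Run `statement` in a fresh interpreter and report whether it succeeded.

    Hypothetical helper for illustration only. Returns (ok, detail), where
    detail is the captured stdout on success or the last stderr line on failure.
    """
    proc = subprocess.run(
        [sys.executable, "-c", statement],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return True, proc.stdout.strip()
    lines = proc.stderr.strip().splitlines()
    return False, lines[-1] if lines else "unknown error"


if __name__ == "__main__":
    # Same import that fails in tensorrt_llm/_utils.py, isolated from TensorRT-LLM.
    ok, detail = check_import(
        "from mpi4py import MPI; print(MPI.Get_library_version())"
    )
    status = "OK" if ok else "FAILED"
    print(f"Python {sys.version.split()[0]}: mpi4py import {status}")
    print(detail)
```

If the import fails here with the same `undefined symbol` message, the problem lies in how `mpi4py` and the system MPI library fit together in that environment, which is consistent with the suggestion to reinstall `mpi4py` or use the provided Docker image.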