NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

Issue building TensorRT-LLM code in Python #840

Open CHesketh76 opened 8 months ago

CHesketh76 commented 8 months ago

I am trying to set up TensorRT-LLM in a Google Colab environment, but after I run this line of code

!python3 ./scripts/build_wheel.py --trt_root /usr/local/tensorrt

I get this error message.

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com/
Requirement already satisfied: build in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 1)) (1.0.3)
Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 2)) (2.1.0+cu121)
Requirement already satisfied: transformers==4.31.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 3)) (4.31.0)
Requirement already satisfied: diffusers==0.15.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 4)) (0.15.0)
Requirement already satisfied: accelerate==0.20.3 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 5)) (0.20.3)
Requirement already satisfied: colored in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 6)) (2.2.4)
Requirement already satisfied: polygraphy in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 7)) (0.49.0)
Requirement already satisfied: onnx>=1.12.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 8)) (1.15.0)
Requirement already satisfied: mpi4py in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 9)) (3.1.5)
Requirement already satisfied: tensorrt>=8.6.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 10)) (8.6.1.post1)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 11)) (1.23.5)
Requirement already satisfied: cuda-python==12.2.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 12)) (12.2.0)
Requirement already satisfied: sentencepiece>=0.1.99 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 13)) (0.1.99)
Requirement already satisfied: wheel in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 14)) (0.42.0)
Requirement already satisfied: lark in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 15)) (1.1.8)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (3.13.1)
Requirement already satisfied: huggingface-hub<1.0,>=0.14.1 in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (0.20.1)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (23.2)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (2023.6.3)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (2.31.0)
Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (0.13.3)
Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (0.4.1)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers==4.31.0->-r requirements.txt (line 3)) (4.66.1)
Requirement already satisfied: Pillow in /usr/local/lib/python3.10/dist-packages (from diffusers==0.15.0->-r requirements.txt (line 4)) (9.4.0)
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.10/dist-packages (from diffusers==0.15.0->-r requirements.txt (line 4)) (7.0.0)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate==0.20.3->-r requirements.txt (line 5)) (5.9.5)
Requirement already satisfied: cython in /usr/local/lib/python3.10/dist-packages (from cuda-python==12.2.0->-r requirements.txt (line 12)) (3.0.7)
Requirement already satisfied: pyproject_hooks in /usr/local/lib/python3.10/dist-packages (from build->-r requirements.txt (line 1)) (1.0.0)
Requirement already satisfied: tomli>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from build->-r requirements.txt (line 1)) (2.0.1)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch->-r requirements.txt (line 2)) (4.5.0)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch->-r requirements.txt (line 2)) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->-r requirements.txt (line 2)) (3.2.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->-r requirements.txt (line 2)) (3.1.2)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->-r requirements.txt (line 2)) (2023.6.0)
Requirement already satisfied: triton==2.1.0 in /usr/local/lib/python3.10/dist-packages (from torch->-r requirements.txt (line 2)) (2.1.0)
Requirement already satisfied: protobuf>=3.20.2 in /usr/local/lib/python3.10/dist-packages (from onnx>=1.12.0->-r requirements.txt (line 8)) (3.20.3)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.10/dist-packages (from importlib-metadata->diffusers==0.15.0->-r requirements.txt (line 4)) (3.17.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->-r requirements.txt (line 2)) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.31.0->-r requirements.txt (line 3)) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.31.0->-r requirements.txt (line 3)) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.31.0->-r requirements.txt (line 3)) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.31.0->-r requirements.txt (line 3)) (2023.11.17)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch->-r requirements.txt (line 2)) (1.3.0)
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- NVTX is disabled
-- Importing batch manager
-- Building PyTorch
-- Building Google tests
-- Building benchmarks
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- CUDA compiler: /usr/local/cuda/bin/nvcc
-- GPU architectures: 70-real;80-real;86-real;89-real;90-real
-- The CUDA compiler identification is NVIDIA 12.2.140
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.2.140") 
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- ========================= Importing and creating target nvinfer ==========================
-- Looking for library nvinfer
-- Library that was found nvinfer_LIB_PATH-NOTFOUND
-- ==========================================================================================
-- ========================= Importing and creating target nvuffparser ==========================
-- Looking for library nvparsers
-- Library that was found nvparsers_LIB_PATH-NOTFOUND
-- ==========================================================================================
-- CUDAToolkit_VERSION 12.2 is greater or equal than 11.0, enable -DENABLE_BF16 flag
-- CUDAToolkit_VERSION 12.2 is greater or equal than 11.8, enable -DENABLE_FP8 flag
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- COMMON_HEADER_DIRS: /content/TensorRT-LLM/cpp;/usr/local/cuda/include
-- TORCH_CUDA_ARCH_LIST: 7.0;8.0;8.6;8.9;9.0
-- Found Python3: /usr/local/bin/python (found version "3.10.12") found components: Interpreter Development Development.Module Development.Embed 
-- Found Python executable at /usr/local/bin/python
-- Found Python libraries at /usr/lib/x86_64-linux-gnu
-- Found CUDA: /usr/local/cuda (found version "12.2") 
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.2.140") 
-- Caffe2: CUDA detected: 12.2
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 12.2
-- /usr/local/cuda-12.2/targets/x86_64-linux/lib/libnvrtc.so shorthash is 000ca627
-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
-- Added CUDA NVCC flags for: -gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90
CMake Warning at /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
  CMakeLists.txt:281 (find_package)

-- Found Torch: /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch.so  
-- TORCH_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=0
CMake Error at CMakeLists.txt:288 (file):
  file STRINGS file "/usr/local/tensorrt/include/NvInferVersion.h" cannot be
  read.

CMake Error at CMakeLists.txt:291 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:293 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:291 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:293 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:291 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:293 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:291 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:293 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:297 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:299 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:297 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:299 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:297 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:299 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

-- Building for TensorRT version: .., library version: 
-- Using MPI_CXX_INCLUDE_DIRS: /usr/lib/x86_64-linux-gnu/openmpi/include;/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi
-- Using MPI_CXX_LIBRARIES: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- USE_CXX11_ABI: False
CMake Error at tensorrt_llm/plugins/CMakeLists.txt:106 (set_target_properties):
  set_target_properties called with incorrect number of arguments.

-- The C compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Python: /usr/local/bin/python (found version "3.10.12") found components: Interpreter 
-- ========================= Importing and creating target nvonnxparser ==========================
-- Looking for library nvonnxparser
-- Library that was found nvonnxparser_LIB_PATH-NOTFOUND
-- ==========================================================================================
-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/content/TensorRT-LLM/./scripts/build_wheel.py", line 248, in <module>
    main(**vars(args))
  File "/content/TensorRT-LLM/./scripts/build_wheel.py", line 149, in main
    build_run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake -DCMAKE_BUILD_TYPE="Release" -DBUILD_PYT="ON"  -DTRT_LIB_DIR=/usr/local/tensorrt/targets/x86_64-linux-gnu/lib -DTRT_INCLUDE_DIR=/usr/local/tensorrt/include -S "/content/TensorRT-LLM/cpp"' returned non-zero exit status 1.

Gitnameisname commented 8 months ago

Did you install NCCL? https://developer.nvidia.com/nccl

byshiue commented 8 months ago

Could you try the docker image we suggest in the README?

CHesketh76 commented 8 months ago

@byshiue I am hoping to get this working without using Docker. I eventually want to move this work over to my company computer, but the company I work for doesn't allow Docker.

pipul commented 4 months ago

@CHesketh76 Have you solved this problem?

CHesketh76 commented 4 months ago

@pipul Not yet.

byshiue commented 4 months ago

Can you find the TensorRT header at /usr/local/tensorrt/include/NvInferVersion.h? If TensorRT is installed but the header is not at that path, you may need to point the build at your actual TensorRT location with --trt_root, like

python3 ./scripts/build_wheel.py --trt_root /usr/local/tensorrt
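Before rerunning the build, it can help to confirm that the directory passed to --trt_root has the layout the build expects. The sketch below is only an illustration: the function name `check_trt_root` is hypothetical, and the paths it checks are taken from the failing cmake command in the log above, which passes `<trt_root>/include` as TRT_INCLUDE_DIR and `<trt_root>/targets/x86_64-linux-gnu/lib` as TRT_LIB_DIR.

```python
from pathlib import Path


def check_trt_root(trt_root):
    """Report which of the paths the build derives from --trt_root exist.

    The layout is an assumption read off the cmake command in the log,
    not an official description of a TensorRT install.
    """
    root = Path(trt_root)
    expected = {
        "include_dir": root / "include",
        "version_header": root / "include" / "NvInferVersion.h",
        "lib_dir": root / "targets" / "x86_64-linux-gnu" / "lib",
    }
    return {name: path.exists() for name, path in expected.items()}


if __name__ == "__main__":
    # Same root as in the original command; change it to your install.
    for name, found in check_trt_root("/usr/local/tensorrt").items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If `version_header` is reported as missing, the "file STRINGS ... cannot be read" error in the log is expected, and --trt_root should point at a directory that actually contains the TensorRT headers and libraries.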

As for the error

CMake Error at CMakeLists.txt:299 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

it might be caused by a different version of CMake. You could install the CMake version used by TensorRT-LLM, or change the related CMake code yourself.
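The original log ends its configure step with "Building for TensorRT version: .., library version:", so the version values came back empty. To see what the header itself reports, independent of CMake, a minimal sketch follows. It assumes the header defines `NV_TENSORRT_MAJOR`, `NV_TENSORRT_MINOR` and `NV_TENSORRT_PATCH` as plain integer macros. Headers that define them differently will yield `None`, and the helper name `read_trt_version` is hypothetical.

```python
import re
from pathlib import Path

# Matches lines such as: #define NV_TENSORRT_MAJOR 8
_MACRO = re.compile(
    r"^\s*#define\s+NV_TENSORRT_(MAJOR|MINOR|PATCH)\s+(\d+)", re.MULTILINE
)


def read_trt_version(header_path):
    """Return 'major.minor.patch' from NvInferVersion.h, or None.

    None means the file is missing or the three macros were not found
    as plain integers (an assumption about the header format).
    """
    path = Path(header_path)
    if not path.is_file():
        return None
    text = path.read_text(errors="replace")
    found = {name: int(value) for name, value in _MACRO.findall(text)}
    if not {"MAJOR", "MINOR", "PATCH"} <= found.keys():
        return None
    return f"{found['MAJOR']}.{found['MINOR']}.{found['PATCH']}"


if __name__ == "__main__":
    print(read_trt_version("/usr/local/tensorrt/include/NvInferVersion.h"))
```

A `None` result for the path used in the build points to a missing or unexpected header rather than to the CMake version, which helps narrow down which of the two suggestions above applies.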

letmerecall commented 2 months ago

Any luck @CHesketh76? Facing a similar issue.

jasonngap1 commented 2 weeks ago

Hi, I am receiving the same error as well.