NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

build_wheel.py error #2145

Closed Beatlesso closed 1 month ago

Beatlesso commented 1 month ago
python3 ./scripts/build_wheel.py --cuda_architectures "80-real;86-real" --trt_root /usr/local/tensorrt
[  0%] Generating .check_symbol
[  0%] Generating .check_symbol_executor
[  1%] Built target fb_gemm_src
[  1%] Built target gemm_swiglu_sm90_src
[  1%] Built target cutlass_src
[  1%] Built target check_symbol
[  1%] Built target check_symbol_executor
[  2%] Built target selective_scan_src
[  2%] Built target layers_src
[  3%] Built target common_src
[  4%] Built target moe_gemm_src
[  5%] Built target fpA_intB_gemm_src
[  6%] Built target runtime_src
[ 26%] Built target context_attention_src
[ 28%] Built target decoder_attention
[ 42%] Built target decoder_attention_src
[100%] Built target kernels_src
[100%] Built target tensorrt_llm
[  0%] Generating .check_symbol_executor
[  0%] Generating .check_symbol
[  0%] Built target check_symbol_executor
[  1%] Built target gemm_swiglu_sm90_src
[  1%] Built target selective_scan_src
[  1%] Built target check_symbol
[  1%] Built target layers_src
[  1%] Built target cutlass_src
[  2%] Built target fb_gemm_src
[  3%] Built target common_src
[  4%] Built target moe_gemm_src
[  5%] Built target runtime_src
[  6%] Built target fpA_intB_gemm_src
[ 26%] Built target context_attention_src
[ 28%] Built target decoder_attention
[ 42%] Built target decoder_attention_src
[ 98%] Built target kernels_src
[ 98%] Built target tensorrt_llm
[100%] Built target nvinfer_plugin_tensorrt_llm
gmake: *** No rule to make target 'th_common'.  Stop.
Traceback (most recent call last):
  File "/home/lyc/TensorRT-LLM/./scripts/build_wheel.py", line 404, in <module>
    main(**vars(args))
  File "/home/lyc/TensorRT-LLM/./scripts/build_wheel.py", line 195, in main
    build_run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake --build . --config Release --parallel 80 --target tensorrt_llm nvinfer_plugin_tensorrt_llm th_common bindings   executorWorker  ' returned non-zero exit status 2.

Why is this happening? Can anyone help me? Thank you very much.

yuhengxnv commented 1 month ago

What are the version numbers of TensorRT, CUDA, and GCC? Can you post your full build log?
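For anyone gathering the same information, here is a minimal sketch that collects the three versions in one go. It assumes `gcc` and `nvcc` are on `PATH` and that the TensorRT Python package is importable as `tensorrt`; the helper names (`tool_version`, `tensorrt_version`, `collect_versions`) are made up for this example and are not part of TensorRT-LLM.

```python
import shutil
import subprocess


def tool_version(cmd):
    """Return the output of `cmd --version`, or None if the tool is not on PATH."""
    exe = shutil.which(cmd)
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    output = (result.stdout or result.stderr).strip()
    return output or None


def tensorrt_version():
    """Return the version of the TensorRT Python package, or None if it cannot be imported."""
    try:
        import tensorrt
    except Exception:
        return None
    return getattr(tensorrt, "__version__", None)


def collect_versions():
    """Gather the versions a maintainer is likely to ask for (hypothetical helper)."""
    return {
        "gcc": tool_version("gcc"),
        "nvcc": tool_version("nvcc"),
        "tensorrt (python)": tensorrt_version(),
    }


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"--- {name} ---")
        print(version or "not found")
```

Note that the Python package version and the version of the system `libnvinfer.so` can differ, so it is worth reporting both if they come from different installs.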

Beatlesso commented 1 month ago

The missing logs:

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple, https://pypi.nvidia.com
Requirement already satisfied: accelerate>=0.25.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 2)) (0.33.0)
Requirement already satisfied: build in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 3)) (1.2.1)
Requirement already satisfied: colored in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 4)) (2.2.4)
Requirement already satisfied: cuda-python in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 5)) (12.6.0)
Requirement already satisfied: diffusers>=0.27.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 6)) (0.30.0)
Requirement already satisfied: lark in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 7)) (1.2.2)
Requirement already satisfied: mpi4py in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 8)) (4.0.0)
Requirement already satisfied: numpy<2 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 9)) (1.26.4)
Requirement already satisfied: onnx>=1.12.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 10)) (1.16.2)
Requirement already satisfied: polygraphy in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 11)) (0.49.9)
Requirement already satisfied: psutil in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 12)) (6.0.0)
Requirement already satisfied: pynvml>=11.5.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 13)) (11.5.3)
Requirement already satisfied: pulp in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 14)) (2.9.0)
Requirement already satisfied: pandas in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 15)) (2.2.2)
Requirement already satisfied: h5py==3.10.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 16)) (3.10.0)
Requirement already satisfied: StrEnum in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 17)) (0.4.15)
Requirement already satisfied: sentencepiece>=0.1.99 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 18)) (0.2.0)
Requirement already satisfied: tensorrt~=10.3.0 in /usr/lib/python3.10/dist-packages (from -r requirements.txt (line 19)) (10.3.0)
Requirement already satisfied: torch<=2.4.0,>=2.4.0a0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 22)) (2.4.0)
Requirement already satisfied: nvidia-modelopt~=0.15.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 23)) (0.15.1)
Requirement already satisfied: transformers<=4.42.4,>=4.38.2 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 24)) (4.42.4)
Requirement already satisfied: pillow==10.3.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 25)) (10.3.0)
Requirement already satisfied: wheel in /usr/lib/python3/dist-packages (from -r requirements.txt (line 26)) (0.37.1)
Requirement already satisfied: optimum in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 27)) (1.21.4)
Requirement already satisfied: evaluate in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 28)) (0.4.2)
Requirement already satisfied: janus in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 29)) (1.0.0)
Requirement already satisfied: mpmath>=1.3.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 30)) (1.3.0)
Requirement already satisfied: click in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 31)) (8.1.7)
Requirement already satisfied: click_option_group in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 32)) (0.5.6)
Requirement already satisfied: aenum in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements.txt (line 33)) (3.1.15)
Requirement already satisfied: datasets==2.19.2 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 2)) (2.19.2)
Requirement already satisfied: einops in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 3)) (0.8.0)
Requirement already satisfied: graphviz in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 4)) (0.20.3)
Requirement already satisfied: mypy in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 5)) (1.11.1)
Requirement already satisfied: parameterized in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 6)) (0.9.0)
Requirement already satisfied: pre-commit in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 7)) (3.8.0)
Requirement already satisfied: pybind11 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 8)) (2.13.5)
Requirement already satisfied: pybind11-stubgen in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 9)) (2.5.1)
Requirement already satisfied: pytest-cov in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 10)) (5.0.0)
Requirement already satisfied: pytest-forked in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 11)) (1.6.0)
Requirement already satisfied: pytest-xdist in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 12)) (3.6.1)
Requirement already satisfied: rouge_score in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 13)) (0.1.2)
Requirement already satisfied: cloudpickle in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 14)) (3.0.0)
Requirement already satisfied: typing-extensions==4.8.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 15)) (4.8.0)
Requirement already satisfied: bandit==1.7.7 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 16)) (1.7.7)
Requirement already satisfied: jsonlines==4.0.0 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 17)) (4.0.0)
Requirement already satisfied: jieba==0.42.1 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 18)) (0.42.1)
Requirement already satisfied: rouge==1.0.1 in /home/lyc/.local/lib/python3.10/site-packages (from -r requirements-dev.txt (line 19)) (1.0.1)
Requirement already satisfied: multiprocess in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (0.70.16)
Requirement already satisfied: fsspec[http]<=2024.3.1,>=2023.1.0 in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (2024.3.1)
Requirement already satisfied: huggingface-hub>=0.21.2 in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (0.24.6)
Requirement already satisfied: pyyaml>=5.1 in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (6.0.2)
Requirement already satisfied: aiohttp in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (3.10.5)
Requirement already satisfied: pyarrow>=12.0.0 in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (17.0.0)
Requirement already satisfied: xxhash in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (3.5.0)
Requirement already satisfied: packaging in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (24.1)
Requirement already satisfied: filelock in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (3.15.4)
Requirement already satisfied: pyarrow-hotfix in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (0.6)
Requirement already satisfied: tqdm>=4.62.1 in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (4.66.5)
Requirement already satisfied: requests>=2.32.1 in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (2.32.3)
Requirement already satisfied: dill<0.3.9,>=0.3.0 in /home/lyc/.local/lib/python3.10/site-packages (from datasets==2.19.2->-r requirements-dev.txt (line 2)) (0.3.8)
Requirement already satisfied: stevedore>=1.20.0 in /home/lyc/.local/lib/python3.10/site-packages (from bandit==1.7.7->-r requirements-dev.txt (line 16)) (5.3.0)
Requirement already satisfied: rich in /home/lyc/.local/lib/python3.10/site-packages (from bandit==1.7.7->-r requirements-dev.txt (line 16)) (13.7.1)
Requirement already satisfied: attrs>=19.2.0 in /home/lyc/.local/lib/python3.10/site-packages (from jsonlines==4.0.0->-r requirements-dev.txt (line 17)) (24.2.0)
Requirement already satisfied: six in /home/lyc/.local/lib/python3.10/site-packages (from rouge==1.0.1->-r requirements-dev.txt (line 19)) (1.16.0)
Requirement already satisfied: safetensors>=0.3.1 in /home/lyc/.local/lib/python3.10/site-packages (from accelerate>=0.25.0->-r requirements.txt (line 2)) (0.4.4)
Requirement already satisfied: pyproject_hooks in /home/lyc/.local/lib/python3.10/site-packages (from build->-r requirements.txt (line 3)) (1.1.0)
Requirement already satisfied: tomli>=1.1.0 in /home/lyc/.local/lib/python3.10/site-packages (from build->-r requirements.txt (line 3)) (2.0.1)
Requirement already satisfied: importlib-metadata in /home/lyc/.local/lib/python3.10/site-packages (from diffusers>=0.27.0->-r requirements.txt (line 6)) (8.4.0)
Requirement already satisfied: regex!=2019.12.17 in /home/lyc/.local/lib/python3.10/site-packages (from diffusers>=0.27.0->-r requirements.txt (line 6)) (2024.7.24)
Requirement already satisfied: protobuf>=3.20.2 in /home/lyc/.local/lib/python3.10/site-packages (from onnx>=1.12.0->-r requirements.txt (line 10)) (5.27.3)
Requirement already satisfied: python-dateutil>=2.8.2 in /home/lyc/.local/lib/python3.10/site-packages (from pandas->-r requirements.txt (line 15)) (2.9.0.post0)
Requirement already satisfied: tzdata>=2022.7 in /home/lyc/.local/lib/python3.10/site-packages (from pandas->-r requirements.txt (line 15)) (2024.1)
Requirement already satisfied: pytz>=2020.1 in /home/lyc/.local/lib/python3.10/site-packages (from pandas->-r requirements.txt (line 15)) (2024.1)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (11.4.5.107)
Requirement already satisfied: jinja2 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (3.1.4)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (10.3.2.106)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (12.1.3.1)
Requirement already satisfied: nvidia-nccl-cu12==2.20.5 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (2.20.5)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (12.1.105)
Requirement already satisfied: triton==3.0.0 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (3.0.0)
Requirement already satisfied: nvidia-cudnn-cu12==9.1.0.70 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (9.1.0.70)
Requirement already satisfied: sympy in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (1.13.2)
Requirement already satisfied: networkx in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (3.3)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (12.1.105)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (11.0.2.54)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (12.1.105)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (12.1.0.106)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /home/lyc/.local/lib/python3.10/site-packages (from torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (12.1.105)
Requirement already satisfied: nvidia-nvjitlink-cu12 in /home/lyc/.local/lib/python3.10/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (12.6.20)
Requirement already satisfied: ninja in /home/lyc/.local/lib/python3.10/site-packages (from nvidia-modelopt~=0.15.0->-r requirements.txt (line 23)) (1.11.1.1)
Requirement already satisfied: scipy in /home/lyc/.local/lib/python3.10/site-packages (from nvidia-modelopt~=0.15.0->-r requirements.txt (line 23)) (1.14.1)
Requirement already satisfied: pydantic>=2.0 in /home/lyc/.local/lib/python3.10/site-packages (from nvidia-modelopt~=0.15.0->-r requirements.txt (line 23)) (2.8.2)
Requirement already satisfied: tokenizers<0.20,>=0.19 in /home/lyc/.local/lib/python3.10/site-packages (from transformers<=4.42.4,>=4.38.2->-r requirements.txt (line 24)) (0.19.1)
Requirement already satisfied: coloredlogs in /home/lyc/.local/lib/python3.10/site-packages (from optimum->-r requirements.txt (line 27)) (15.0.1)
Requirement already satisfied: mypy-extensions>=1.0.0 in /home/lyc/.local/lib/python3.10/site-packages (from mypy->-r requirements-dev.txt (line 5)) (1.0.0)
Requirement already satisfied: nodeenv>=0.11.1 in /home/lyc/.local/lib/python3.10/site-packages (from pre-commit->-r requirements-dev.txt (line 7)) (1.9.1)
Requirement already satisfied: cfgv>=2.0.0 in /home/lyc/.local/lib/python3.10/site-packages (from pre-commit->-r requirements-dev.txt (line 7)) (3.4.0)
Requirement already satisfied: virtualenv>=20.10.0 in /home/lyc/.local/lib/python3.10/site-packages (from pre-commit->-r requirements-dev.txt (line 7)) (20.26.3)
Requirement already satisfied: identify>=1.0.0 in /home/lyc/.local/lib/python3.10/site-packages (from pre-commit->-r requirements-dev.txt (line 7)) (2.6.0)
Requirement already satisfied: coverage[toml]>=5.2.1 in /home/lyc/.local/lib/python3.10/site-packages (from pytest-cov->-r requirements-dev.txt (line 10)) (7.6.1)
Requirement already satisfied: pytest>=4.6 in /home/lyc/.local/lib/python3.10/site-packages (from pytest-cov->-r requirements-dev.txt (line 10)) (8.3.2)
Requirement already satisfied: py in /home/lyc/.local/lib/python3.10/site-packages (from pytest-forked->-r requirements-dev.txt (line 11)) (1.11.0)
Requirement already satisfied: execnet>=2.1 in /home/lyc/.local/lib/python3.10/site-packages (from pytest-xdist->-r requirements-dev.txt (line 12)) (2.1.1)
Requirement already satisfied: absl-py in /home/lyc/.local/lib/python3.10/site-packages (from rouge_score->-r requirements-dev.txt (line 13)) (2.1.0)
Requirement already satisfied: nltk in /home/lyc/.local/lib/python3.10/site-packages (from rouge_score->-r requirements-dev.txt (line 13)) (3.9.1)
Requirement already satisfied: aiosignal>=1.1.2 in /home/lyc/.local/lib/python3.10/site-packages (from aiohttp->datasets==2.19.2->-r requirements-dev.txt (line 2)) (1.3.1)
Requirement already satisfied: frozenlist>=1.1.1 in /home/lyc/.local/lib/python3.10/site-packages (from aiohttp->datasets==2.19.2->-r requirements-dev.txt (line 2)) (1.4.1)
Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /home/lyc/.local/lib/python3.10/site-packages (from aiohttp->datasets==2.19.2->-r requirements-dev.txt (line 2)) (2.4.0)
Requirement already satisfied: yarl<2.0,>=1.0 in /home/lyc/.local/lib/python3.10/site-packages (from aiohttp->datasets==2.19.2->-r requirements-dev.txt (line 2)) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in /home/lyc/.local/lib/python3.10/site-packages (from aiohttp->datasets==2.19.2->-r requirements-dev.txt (line 2)) (4.0.3)
Requirement already satisfied: multidict<7.0,>=4.5 in /home/lyc/.local/lib/python3.10/site-packages (from aiohttp->datasets==2.19.2->-r requirements-dev.txt (line 2)) (6.0.5)
Requirement already satisfied: annotated-types>=0.4.0 in /home/lyc/.local/lib/python3.10/site-packages (from pydantic>=2.0->nvidia-modelopt~=0.15.0->-r requirements.txt (line 23)) (0.7.0)
Requirement already satisfied: pydantic-core==2.20.1 in /home/lyc/.local/lib/python3.10/site-packages (from pydantic>=2.0->nvidia-modelopt~=0.15.0->-r requirements.txt (line 23)) (2.20.1)
Requirement already satisfied: iniconfig in /home/lyc/.local/lib/python3.10/site-packages (from pytest>=4.6->pytest-cov->-r requirements-dev.txt (line 10)) (2.0.0)
Requirement already satisfied: pluggy<2,>=1.5 in /home/lyc/.local/lib/python3.10/site-packages (from pytest>=4.6->pytest-cov->-r requirements-dev.txt (line 10)) (1.5.0)
Requirement already satisfied: exceptiongroup>=1.0.0rc8 in /home/lyc/.local/lib/python3.10/site-packages (from pytest>=4.6->pytest-cov->-r requirements-dev.txt (line 10)) (1.2.2)
Requirement already satisfied: urllib3<3,>=1.21.1 in /home/lyc/.local/lib/python3.10/site-packages (from requests>=2.32.1->datasets==2.19.2->-r requirements-dev.txt (line 2)) (2.2.2)
Requirement already satisfied: idna<4,>=2.5 in /home/lyc/.local/lib/python3.10/site-packages (from requests>=2.32.1->datasets==2.19.2->-r requirements-dev.txt (line 2)) (3.7)
Requirement already satisfied: certifi>=2017.4.17 in /home/lyc/.local/lib/python3.10/site-packages (from requests>=2.32.1->datasets==2.19.2->-r requirements-dev.txt (line 2)) (2024.7.4)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/lyc/.local/lib/python3.10/site-packages (from requests>=2.32.1->datasets==2.19.2->-r requirements-dev.txt (line 2)) (3.3.2)
Requirement already satisfied: pbr>=2.0.0 in /home/lyc/.local/lib/python3.10/site-packages (from stevedore>=1.20.0->bandit==1.7.7->-r requirements-dev.txt (line 16)) (6.0.0)
Requirement already satisfied: distlib<1,>=0.3.7 in /home/lyc/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit->-r requirements-dev.txt (line 7)) (0.3.8)
Requirement already satisfied: platformdirs<5,>=3.9.1 in /home/lyc/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit->-r requirements-dev.txt (line 7)) (4.2.2)
Requirement already satisfied: humanfriendly>=9.1 in /home/lyc/.local/lib/python3.10/site-packages (from coloredlogs->optimum->-r requirements.txt (line 27)) (10.0)
Requirement already satisfied: zipp>=0.5 in /home/lyc/.local/lib/python3.10/site-packages (from importlib-metadata->diffusers>=0.27.0->-r requirements.txt (line 6)) (3.20.0)
Requirement already satisfied: MarkupSafe>=2.0 in /home/lyc/.local/lib/python3.10/site-packages (from jinja2->torch<=2.4.0,>=2.4.0a0->-r requirements.txt (line 22)) (2.1.5)
Requirement already satisfied: joblib in /home/lyc/.local/lib/python3.10/site-packages (from nltk->rouge_score->-r requirements-dev.txt (line 13)) (1.4.2)
Requirement already satisfied: markdown-it-py>=2.2.0 in /home/lyc/.local/lib/python3.10/site-packages (from rich->bandit==1.7.7->-r requirements-dev.txt (line 16)) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /home/lyc/.local/lib/python3.10/site-packages (from rich->bandit==1.7.7->-r requirements-dev.txt (line 16)) (2.18.0)
Requirement already satisfied: mdurl~=0.1 in /home/lyc/.local/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->bandit==1.7.7->-r requirements-dev.txt (line 16)) (0.1.2)

GCC:

lyc@0694873fff7f:~/TensorRT-LLM$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

yuhengxnv commented 1 month ago

Weird, did you do a clean build?

Beatlesso commented 1 month ago

I'm sorry, that was an oversight. The error log is now:

-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- NVTX is disabled
-- Importing batch manager
-- Importing executor
-- Importing nvrtc wrapper
-- Building PyTorch
-- Building Google tests
-- Building benchmarks
-- Not building C++ micro benchmarks
-- TensorRT-LLM version: 0.13.0.dev2024082000
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda-12.2/bin/nvcc
-- CUDA compiler: /usr/local/cuda-12.2/bin/nvcc
-- GPU architectures: 80-real;86-real
-- The C compiler identification is GNU 11.4.0
-- The CUDA compiler identification is NVIDIA 12.2.140
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-12.2/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /usr/local/cuda-12.2/include (found version "12.2.140") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- CUDA library status:
--     version: 12.2.140
--     libraries: /usr/local/cuda-12.2/lib64
--     include path: /usr/local/cuda-12.2/targets/x86_64-linux/include
-- ========================= Importing and creating target nvinfer ==========================
-- Looking for library nvinfer
-- Library that was found /usr/lib/x86_64-linux-gnu/libnvinfer.so
-- ==========================================================================================
-- CUDAToolkit_VERSION 12.2 is greater or equal than 11.0, enable -DENABLE_BF16 flag
-- CUDAToolkit_VERSION 12.2 is greater or equal than 11.8, enable -DENABLE_FP8 flag
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so (found version "3.1") 
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- COMMON_HEADER_DIRS: /home/lyc/TensorRT-LLM/cpp;/usr/local/cuda-12.2/include
-- Found Python3: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter Development Development.Module Development.Embed 
-- USE_CXX11_ABI is set by python Torch to 0
-- TORCH_CUDA_ARCH_LIST: 8.0;8.6
-- Found Python executable at /usr/bin/python3.10
-- Found Python libraries at /usr/lib/x86_64-linux-gnu
-- Found CUDA: /usr/local/cuda-12.2 (found version "12.2") 
-- Found CUDAToolkit: /usr/local/cuda-12.2/include (found version "12.2.140") 
-- Caffe2: CUDA detected: 12.2
-- Caffe2: CUDA nvcc is: /usr/local/cuda-12.2/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda-12.2
-- Caffe2: Header version is: 12.2
-- /usr/local/cuda-12.2/lib64/libnvrtc.so shorthash is 000ca627
-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
-- Added CUDA NVCC flags for: -gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
CMake Warning at /home/lyc/.local/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /home/lyc/.local/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:120 (append_torchlib_if_found)
  CMakeLists.txt:499 (find_package)

-- Found Torch: /home/lyc/.local/lib/python3.10/site-packages/torch/lib/libtorch.so  
-- TORCH_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=0
CMake Error at CMakeLists.txt:524 (file):
  file STRINGS file "/usr/local/tensorrt/include/NvInferVersion.h" cannot be
  read.

CMake Error at CMakeLists.txt:527 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:529 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:527 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:529 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:527 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:529 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:527 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

CMake Error at CMakeLists.txt:529 (string):
  string sub-command REGEX, mode MATCH needs at least 5 arguments total to
  command.

-- Building for TensorRT version: .., library version: 
CMake Error at CMakeLists.txt:543 (if):
  if given arguments:

    "LESS" "10"

  Unknown arguments specified

-- Configuring incomplete, errors occurred!
See also "/home/lyc/TensorRT-LLM/cpp/build/CMakeFiles/CMakeOutput.log".
Traceback (most recent call last):
  File "/home/lyc/TensorRT-LLM/./scripts/build_wheel.py", line 404, in <module>
    main(**vars(args))
  File "/home/lyc/TensorRT-LLM/./scripts/build_wheel.py", line 190, in main
    build_run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake -DCMAKE_BUILD_TYPE="Release" -DBUILD_PYT="ON" -DBUILD_PYBIND="ON" -DNVTX_DISABLE="ON" -DBUILD_MICRO_BENCHMARKS=OFF "-DCMAKE_CUDA_ARCHITECTURES=80-real;86-real" -DTRT_LIB_DIR=/usr/local/tensorrt/targets/x86_64-linux-gnu/lib -DTRT_INCLUDE_DIR=/usr/local/tensorrt/include  -S "/home/lyc/TensorRT-LLM/cpp"' returned non-zero exit status 1.

The location of "NvInferVersion.h" is shown below. How should I change the settings? There are also a couple of errors that I don't quite understand.

lyc@0694873fff7f:/usr$ find -name "NvInferVersion.h"
./include/x86_64-linux-gnu/NvInferVersion.h

yuhengxnv commented 1 month ago

Looks like /usr/local/tensorrt is still being respected. Remove /usr/local/tensorrt and add --clean to your build_wheel.py args.

Beatlesso commented 1 month ago

Thank you very much, but there still seem to be some errors at the end:

python3 ./scripts/build_wheel.py --clean
...
...
...
[100%] Built target fpA_intB_gemm_src
[100%] Linking CUDA device code CMakeFiles/kernels_src.dir/cmake_device_link.o
[100%] Linking CXX static library libkernels_src.a
[100%] Built target kernels_src
gmake[2]: *** [CMakeFiles/Makefile2:1180: tensorrt_llm/kernels/cutlass_kernels/CMakeFiles/moe_gemm_src.dir/all] Error 2
gmake[2]: *** Waiting for unfinished jobs....
[100%] Built target decoder_attention_src
[100%] Linking CUDA device code CMakeFiles/decoder_attention.dir/cmake_device_link.o
[100%] Linking CUDA shared library libdecoder_attention.so
[100%] Built target decoder_attention
gmake[1]: *** [CMakeFiles/Makefile2:1057: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/rule] Error 2
gmake: *** [Makefile:205: tensorrt_llm] Error 2
Traceback (most recent call last):
  File "/home/lyc/TensorRT-LLM/./scripts/build_wheel.py", line 404, in <module>
    main(**vars(args))
  File "/home/lyc/TensorRT-LLM/./scripts/build_wheel.py", line 195, in main
    build_run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake --build . --config Release --parallel 80 --target tensorrt_llm nvinfer_plugin_tensorrt_llm th_common bindings   executorWorker  ' returned non-zero exit status 2.

yuhengxnv commented 1 month ago

"python3 ./scripts/build_wheel.py --clean" I didn't mean such a simple command line.

The issue seems to be that build_wheel.py (or CMake) can't find your NvInferVersion.h. If you look into build_wheel.py, there is a --trt_root option to specify its location. But since --trt_root simply appends "/include" to trt_root, and your NvInferVersion.h is in /usr/include/x86_64-linux-gnu, I guess it still won't find the header even if you specify --trt_root=/usr. I think -D XXX=XXX is needed here.

Please do a CLEAN git clone, and run: python3 ./scripts/build_wheel.py -j 12 --cuda_architectures "80-real;86-real" -D TRT_LIB_DIR=/usr/lib -D TRT_INCLUDE_DIR=/usr/include/x86_64-linux-gnu (Please change TRT_LIB_DIR to the directory where libnvinfer.so can be found.)

If it still runs into an error, what is the full build log?
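To illustrate why --trt_root does not help with a distro-packaged TensorRT, here is a rough sketch of the path derivation. It is not the actual code of build_wheel.py; it only mirrors the two paths visible in the failing cmake command in the log above (`<trt_root>/include` and `<trt_root>/targets/x86_64-linux-gnu/lib`). The function names and the `arch_triplet` parameter are assumptions made for this example.

```python
from pathlib import Path


def paths_from_trt_root(trt_root, arch_triplet="x86_64-linux-gnu"):
    """Mimic the directories that showed up in the cmake command for a given --trt_root."""
    root = Path(trt_root)
    return {
        "TRT_LIB_DIR": root / "targets" / arch_triplet / "lib",
        "TRT_INCLUDE_DIR": root / "include",
    }


def check_trt_paths(paths):
    """Report whether the header and library are present in the derived directories."""
    header = paths["TRT_INCLUDE_DIR"] / "NvInferVersion.h"
    library = paths["TRT_LIB_DIR"] / "libnvinfer.so"
    return {
        "NvInferVersion.h": header.is_file(),
        "libnvinfer.so": library.exists(),
    }


if __name__ == "__main__":
    for candidate in ("/usr/local/tensorrt", "/usr"):
        derived = paths_from_trt_root(candidate)
        print(candidate, {k: str(v) for k, v in derived.items()})
        print("  found:", check_trt_paths(derived))
```

With --trt_root=/usr the derived include directory is /usr/include, one level above /usr/include/x86_64-linux-gnu where the header actually lives, which is why passing the two -D values explicitly is the more direct route.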

Beatlesso commented 1 month ago

Thank you so much, that solved my problem! This is actually a simple question, but the documentation doesn't explain what TRT_LIB_DIR means, so I didn't know what path to specify. Explanations like the one below are very helpful: (Please change the path of TRT_LIB_DIR to where libnvinfer.so can be found.)
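For readers landing here with the same problem, the following sketch searches for the two files and prints the matching -D arguments. It assumes TensorRT is installed somewhere under /usr (as in this thread); `find_dirs_containing` and `suggest_flags` are hypothetical helper names, and the first match is picked arbitrarily, so check the output if several TensorRT installs exist.

```python
import os
from pathlib import Path


def find_dirs_containing(filename, search_roots):
    """Return the sorted directories under search_roots that contain filename."""
    found = set()
    for root in search_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if filename in filenames:
                found.add(Path(dirpath))
    return sorted(found)


def suggest_flags(search_roots=("/usr",)):
    """Build -D arguments for build_wheel.py from the first matching directories."""
    lib_dirs = find_dirs_containing("libnvinfer.so", search_roots)
    include_dirs = find_dirs_containing("NvInferVersion.h", search_roots)
    flags = []
    if lib_dirs:
        flags += ["-D", f"TRT_LIB_DIR={lib_dirs[0]}"]
    if include_dirs:
        flags += ["-D", f"TRT_INCLUDE_DIR={include_dirs[0]}"]
    return flags


if __name__ == "__main__":
    # Walking /usr can take a little while on a large system.
    print("python3 ./scripts/build_wheel.py", " ".join(suggest_flags()))
```

In short, TRT_LIB_DIR is the directory that holds libnvinfer.so and TRT_INCLUDE_DIR is the directory that holds NvInferVersion.h.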