NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

build from source fail #914

Closed: park12sj closed this issue 7 months ago

park12sj commented 7 months ago

### System Info

```
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
```

- Commit: c89653021e66ca78c55f02b366f404455bc12e8d

### Who can help?

@byshiue 

### Information

- [X] The official example scripts
- [ ] My own modified scripts

### Tasks

- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

1. build image for building whl https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/build_from_source.md#on-systems-without-gnu-make
2. create container and whl build https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/build_from_source.md#build-tensorrt-llm

### Expected behavior

Build Success

### Actual behavior

1. An error occurred when running `pip install nvidia-ammo`:

```
Collecting nvidia-ammo~=0.5.0 (from -r requirements.txt (line 18))
  Downloading nvidia-ammo-0.5.1.tar.gz (6.9 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-cwksnwhn/nvidia-ammo_375eeb4e0f7248fea8e87aaffc9f0eec/setup.py", line 90, in <module>
          raise RuntimeError("Bad params")
      RuntimeError: Bad params
      [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
```

2. If I skip installing nvidia-ammo, the build still fails:

```
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention64_half.cu.o
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention80_bf16.cu.o
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention80_float.cu.o
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention80_half.cu.o
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention96_bf16.cu.o
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention96_float.cu.o
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention96_half.cu.o
nvcc error : 'ptxas' died due to signal 9 (Kill signal)
gmake[3]: *** [tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/build.make:8149: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention144_half.cu.o] Error 9
gmake[3]: *** Waiting for unfinished jobs....
nvcc error : 'ptxas' died due to signal 9 (Kill signal)
gmake[3]: *** [tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/build.make:8164: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention160_bf16.cu.o] Error 9
gmake[2]: *** [CMakeFiles/Makefile2:865: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/all] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:790: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/rule] Error 2
gmake: *** [Makefile:192: tensorrt_llm] Error 2
Traceback (most recent call last):
  File "/workspace/storage/cephfs-personal/git/pai/paip-TensorRT-LLM/./scripts/build_wheel.py", line 306, in <module>
    main(**vars(args))
  File "/workspace/storage/cephfs-personal/git/pai/paip-TensorRT-LLM/./scripts/build_wheel.py", line 164, in main
    build_run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake --build . --config Release --parallel 40 --target tensorrt_llm tensorrt_llm_static nvinfer_plugin_tensorrt_llm th_common bindings ' returned non-zero exit status 2.
```



### Additional notes

I don't know if it's related, but the error that occurs when installing version 0.7.1 with pip is the same as the nvidia-ammo error that occurs during the build.

https://github.com/NVIDIA/TensorRT-LLM/issues/777#issuecomment-1895562944
Tlntin commented 7 months ago

How much CPU memory do you have? 64 GB may be needed to compile.
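Signal 9 in the build log usually means the kernel OOM-killer stopped `ptxas`: the build ran with `--parallel 40`, which leaves very little memory per compile job on a 50 GB host. A rough sanity-check sketch along those lines (the ~1.5 GB-per-job figure is an illustrative guess, not a measured number):

```python
import os

GB = 1024 ** 3


def safe_parallel_jobs(total_mem_bytes, mem_per_job_gb=1.5):
    """Suggest a compile-job count that fits in physical memory.

    mem_per_job_gb is an assumed peak nvcc/ptxas footprint per parallel job,
    chosen for illustration; measure your own build to tune it.
    """
    return max(1, int(total_mem_bytes / (mem_per_job_gb * GB)))


if __name__ == "__main__":
    # Physical RAM on Linux via sysconf.
    phys = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    print(f"physical memory: {phys / GB:.1f} GiB")
    print(f"suggested parallel jobs: {safe_parallel_jobs(phys)}")
```

With these assumptions, a 50 GB machine supports roughly 33 jobs rather than 40; passing a smaller job count to the build (check `python3 scripts/build_wheel.py --help` for the flag your checkout exposes) is one way to avoid the OOM kill without adding memory.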

park12sj commented 7 months ago

@Tlntin

Oh, it's 50 GB... I'll raise it above 64 GB and try again. Do you have any thoughts on the nvidia-ammo installation error?

Tlntin commented 7 months ago

> Do you have any thoughts on the nvidia-ammo installation error?

What's your Python version? nvidia-ammo only supports Python 3.10.
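A tiny preflight check for that constraint. The "Python 3.10 only" requirement is taken from this thread, not from official AMMO documentation, so treat it as an assumption:

```python
import sys


def ammo_python_ok(version_info=sys.version_info):
    """True when the interpreter matches the Python 3.10 requirement
    reported above for the nvidia-ammo 0.5.x wheels."""
    return (version_info[0], version_info[1]) == (3, 10)


if __name__ == "__main__":
    status = "supported" if ammo_python_ok() else "unsupported"
    print(f"Python {sys.version.split()[0]} is {status} for nvidia-ammo~=0.5.0")
```

Running this before `pip install` makes the failure mode explicit instead of surfacing as an opaque `setup.py egg_info` error.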

park12sj commented 7 months ago

@Tlntin

I am using version 3.10.

```
root@personal-sangjune-trt-llm-0:/workspace/storage/cephfs-personal/git/pai/paip-TensorRT-LLM# python --version
Python 3.10.12
```

I'm trying to build the wheel inside the build container, following the guide: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/build_from_source.md#on-systems-without-gnu-make

Tlntin commented 7 months ago

Oh, then it may be a different error. I'll look at the nvidia-ammo setup.py source code in a moment.

Tlntin commented 7 months ago

Maybe you can try installing nvidia-ammo with this command:

```shell
pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com nvidia-ammo~=0.5.0
```

When you install nvidia-ammo, torch, tensorrt-llm, and mpi4py must already be installed.

park12sj commented 7 months ago

Thank you for the guidance.

By the way, nvidia-ammo is listed as a requirement of tensorrt-llm. If tensorrt-llm must already be installed before installing nvidia-ammo, isn't that a circular dependency?

https://github.com/NVIDIA/TensorRT-LLM/blob/c89653021e66ca78c55f02b366f404455bc12e8d/requirements.txt#L18

Tlntin commented 7 months ago

> Thank you for the guidance.
>
> By the way, nvidia-ammo is listed as a requirement of tensorrt-llm. If tensorrt-llm must already be installed before installing nvidia-ammo, isn't that a circular dependency?

https://github.com/NVIDIA/TensorRT-LLM/blob/c89653021e66ca78c55f02b366f404455bc12e8d/requirements.txt#L18

This may be a bug. In the previous 0.7.0/0.6.0/0.5.0 versions, tensorrt-llm was installed first and then nvidia-ammo. I haven't tried compiling 0.7.1 though, so I can't be sure yet.

park12sj commented 7 months ago

Thank you for your answer. I'll close the issue.