NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

Error Code 9: API Usage Error (Target GPU SM 70 is not supported by this TensorRT release.) #2400

Closed aliencaocao closed 3 weeks ago

aliencaocao commented 3 weeks ago

System Info

TensorRT-LLM version: 0.15.0.dev2024102900

I am using a Tesla V100 SXM2 16GB, following the official instructions with the official wheel. Building BLIP2-OPT fails with:

[11/01/2024-03:33:08] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 9: API Usage Error (Target GPU SM 70 is not supported by this TensorRT release.)
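When scanning a long build log, the rejected architecture can be pulled out of this message with a small regular expression. The following is a minimal sketch, assuming the log line has the same shape as the one quoted above; `extract_unsupported_sm` is a hypothetical helper name, not part of TensorRT or TensorRT-LLM.

```python
import re
from typing import Optional

# Pattern matched against the error text quoted in this issue. Other TensorRT
# releases may word the message differently (assumption).
_SM_ERROR = re.compile(r"Target GPU SM (\d+) is not supported")


def extract_unsupported_sm(log_line: str) -> Optional[int]:
    """Return the SM number named in the error line, or None if absent."""
    match = _SM_ERROR.search(log_line)
    return int(match.group(1)) if match else None


line = (
    "[11/01/2024-03:33:08] [TRT] [E] IBuilder::buildSerializedNetwork: "
    "Error Code 9: API Usage Error (Target GPU SM 70 is not supported "
    "by this TensorRT release.)"
)
print(extract_unsupported_sm(line))  # prints 70
```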

I also tried building from source, specifying "70-real":

    python3 ./scripts/build_wheel.py --trt_root /home/ubuntu/TensorRT-LLM/TensorRT-10.5.0.18 --cuda_architectures "70-real;75-real"

The produced wheel is 900MB, and it still gives the same error.

The same wheel and command work on a T4.
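The V100 reports compute capability 7.0 (SM 70) and the T4 reports 7.5 (SM 75), which is consistent with one card being rejected and the other accepted. A pre-flight check along these lines could catch the mismatch before a long build. This is a sketch under stated assumptions: the SM 75 threshold is inferred from this thread rather than taken from TensorRT documentation, the names `MIN_SUPPORTED_SM`, `sm_from_capability`, `is_supported`, and `query_compute_caps` are hypothetical, and the `compute_cap` query field of `nvidia-smi` is assumed to be available (it needs a reasonably recent driver).

```python
import subprocess
from typing import List

# Assumption inferred from this thread: the T4 (SM 75) works, the V100 (SM 70) does not.
MIN_SUPPORTED_SM = 75


def sm_from_capability(cap: str) -> int:
    """Convert a compute capability string such as '7.0' to an SM number (70)."""
    major, minor = cap.strip().split(".")
    return int(major) * 10 + int(minor)


def is_supported(cap: str, min_sm: int = MIN_SUPPORTED_SM) -> bool:
    """True if the capability meets the assumed minimum architecture."""
    return sm_from_capability(cap) >= min_sm


def query_compute_caps() -> List[str]:
    """Ask nvidia-smi for the compute capability of each visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [row.strip() for row in out.splitlines() if row.strip()]


print(is_supported("7.0"))  # V100 -> False
print(is_supported("7.5"))  # T4 -> True
```

On a machine with a GPU, `[is_supported(c) for c in query_compute_caps()]` would apply the same check to the installed cards.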

Who can help?

No response

Reproduction

Building BLIP2-OPT fails with the official command shown in the example README, plus some of my own arguments:

path redacted
    --gemm_plugin float16 \
    --max_beam_width 5 \
    --max_batch_size 16 \
    --max_seq_len 100 \
    --max_input_len 48 \
    --context_fmha disable \
    --multiple_profiles disable \
    --max_multimodal_len 512 \
    --opt_num_tokens 576 \
    --profiling_verbosity detailed \
    --workers 8 \
    --log_level verbose

Expected behavior

Works on V100

Actual behavior

Does not work

Additional notes

NIL

aliencaocao commented 3 weeks ago

Wow, OK... then is there any way to get the old version's wheel? Or which is the commit right before the removal of SM70, so I can build from source?

nv-guomingz commented 3 weeks ago

> Wow, OK... then is there any way to get the old version's wheel? Or which is the commit right before the removal of SM70, so I can build from source?

You may try this commit https://github.com/NVIDIA/TensorRT-LLM/commit/f14d1d433c8082a0e1c935f274ac4c0348d05060

aliencaocao commented 3 weeks ago

You mean https://github.com/NVIDIA/TensorRT-LLM/tree/3c46c2794e7f6df48250a68de6240994a77a26a7? I see that most of the code changes are after this

aliencaocao commented 3 weeks ago

Another related question: is it possible to build the previous commit with TensorRT 10.5 instead of 10.4?

nv-guomingz commented 3 weeks ago

> You mean https://github.com/NVIDIA/TensorRT-LLM/tree/3c46c2794e7f6df48250a68de6240994a77a26a7? I see that most of the code changes are after this

Yes, we release the code on a weekly basis, so there are lots of changes.

nv-guomingz commented 3 weeks ago

> Another related question: is it possible to build the previous commit with TensorRT 10.5 instead of 10.4?

I haven't tried that, and I wouldn't recommend it, since it may cause unknown issues.
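The source-build route discussed in this thread can be sketched as a small script that assembles the two commands involved: checking out a commit from before the SM70 removal, then running `scripts/build_wheel.py` with the flags quoted in the original report. This is only a sketch. `build_commands` is a hypothetical helper, and both the commit and the TensorRT root are left as placeholders because the thread does not settle them. In line with the advice above, the TensorRT root would presumably point at the version that commit was developed against rather than 10.5.

```python
import shlex
from typing import List


def build_commands(commit: str, trt_root: str,
                   archs: List[str]) -> List[List[str]]:
    """Return argv lists for checking out `commit` and building the wheel."""
    return [
        ["git", "checkout", commit],
        [
            "python3", "./scripts/build_wheel.py",
            "--trt_root", trt_root,
            # Semicolon-separated list, e.g. "70-real;75-real" as in this issue
            "--cuda_architectures", ";".join(archs),
        ],
    ]


cmds = build_commands(
    commit="<commit-before-SM70-removal>",  # placeholder, not a real hash
    trt_root="/path/to/TensorRT",           # placeholder path
    archs=["70-real", "75-real"],
)
for argv in cmds:
    print(shlex.join(argv))  # shell-quoted form for copy and paste
```

Depending on the state of the checkout, submodules and LFS files may also need updating before the build.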

aliencaocao commented 3 weeks ago

Thank you.