NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Error Code 4: Internal Error of TensorRT 9.1.0 when running blip-large on GPU Tesla T4 #3551

Open LoveNingBo opened 11 months ago

LoveNingBo commented 11 months ago

Description

I followed https://github.com/NVIDIA/TensorRT/blob/release/9.1/demo/HuggingFace/notebooks/blip.ipynb to convert a blip-large model into a TensorRT engine.

Points to note about the blip-large model: the original pre-trained model is Salesforce/blip-image-captioning-large.

Error obtained while running the Jupyter script: "[E] 4: [network.cpp::validate::3640] Error Code 4: Internal Error (image_embeds: for dimension number 2 in profile 0 does not match network definition (got min=768, opt=768, max=768), expected min=opt=max=1024).)"

Points to note about the Jupyter script: it works as expected when I use the blip-base model.
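For context, the 768 vs. 1024 mismatch in the error lines up with the vision hidden sizes of the two checkpoints (blip-base uses a ViT-B backbone, blip-large a ViT-L one). A minimal sketch to confirm this from the Hugging Face configs; the attribute names come from the transformers library, not from the demo code or this issue:

```python
# Compare the vision hidden sizes of the two BLIP checkpoints.
from transformers import BlipConfig

for name in ("Salesforce/blip-image-captioning-base",
             "Salesforce/blip-image-captioning-large"):
    cfg = BlipConfig.from_pretrained(name)
    # Expected, per the error message: 768 for base, 1024 for large.
    print(name, cfg.vision_config.hidden_size)
```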

Environment

TensorRT Version: 9.1.0

NVIDIA GPU: Tesla T4

NVIDIA Driver Version: 525.85.12

CUDA Version: 12.2

CUDNN Version: 8.9.5.29

Operating System: ubuntu-20.04 (built from ubuntu-20.04.Dockerfile)

Python Version (if applicable): 3.8.10

Tensorflow Version (if applicable): 2.9.1

PyTorch Version (if applicable): 2.0.1+cu118

Baremetal or Container (if so, version): Container (built from ubuntu-20.04.Dockerfile)

Relevant Files

Model link: https://huggingface.co/Salesforce/blip-image-captioning-large/tree/main

Steps To Reproduce

  1. I clone the TensorRT repo (https://github.com/NVIDIA/TensorRT.git, release/9.1) and run step 1 in the "Downloading TensorRT Build" section

  2. I build the TensorRT OSS container with "./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.2"

  3. Then, I launch the container with "./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.2 --gpus all"

  4. Then, I install the requirements from the TensorRT/demo/HuggingFace directory with "pip3 install -r requirements.txt"

  5. I then copy the model files from the instance into the docker container

  6. Then I run the steps in the attached script blip_test.py (uploaded as blip_test.py.zip)

  7. I get the following error at the step "trt_model.models = trt_model.setup_tokenizer_and_model()". Error message:

[E] 4: [network.cpp::validate::3640] Error Code 4: Internal Error (image_embeds: for dimension number 2 in profile 0 does not match network definition (got min=768, opt=768, max=768), expected min=opt=max=1024).)
[!] Invalid Engine. Please ensure the engine was built correctly
Traceback (most recent call last):
  File "blip_test.py", line 80, in <module>
    trt_model.models = trt_model.setup_tokenizer_and_model()
  File "/workspace/TensorRT/demo/HuggingFace/Vision2Seq/trt.py", line 204, in setup_tokenizer_and_model
    self.setup_engines_from_onnx()
  File "/workspace/TensorRT/demo/HuggingFace/Vision2Seq/trt.py", line 317, in setup_engines_from_onnx
    self.decoder_engine = self.onnx_decoder.as_trt_engine(
  File "/workspace/TensorRT/demo/HuggingFace/NNDF/models.py", line 553, in as_trt_engine
    return converter.onnx_to_trt(
  File "/workspace/TensorRT/demo/HuggingFace/NNDF/models.py", line 153, in onnx_to_trt
    trt_engine = engine_from_network(
  File "<string>", line 3, in engine_from_network
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/base/loader.py", line 40, in __call__
    return self.call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/util/util.py", line 710, in wrapped
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/trt/loader.py", line 604, in call_impl
    return engine_from_bytes(super().call_impl, runtime=self._runtime)
  File "<string>", line 3, in engine_from_bytes
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/base/loader.py", line 40, in __call__
    return self.call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/util/util.py", line 710, in wrapped
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/trt/loader.py", line 633, in call_impl
    buffer, owns_buffer = util.invoke_if_callable(self._serialized_engine)
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/util/util.py", line 678, in invoke_if_callable
    ret = func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/util/util.py", line 710, in wrapped
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/trt/loader.py", line 537, in call_impl
    G_LOGGER.critical("Invalid Engine. Please ensure the engine was built correctly")
  File "/usr/local/lib/python3.8/dist-packages/polygraphy/logger/logger.py", line 605, in critical
    raise ExceptionType(message) from None
polygraphy.exception.exception.PolygraphyException: Invalid Engine. Please ensure the engine was built correctly
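Before rebuilding, it can help to confirm what image_embeds shape the exported decoder ONNX actually declares, so the optimization profile can be matched against it. A minimal sketch, not part of the original report; "decoder.onnx" is a placeholder for whichever ONNX file the demo exported:

```python
# Print the declared input shapes of the exported ONNX file.
import onnx

model = onnx.load("decoder.onnx")  # placeholder path for the exported decoder
for inp in model.graph.input:
    dims = [d.dim_value if d.HasField("dim_value") else d.dim_param
            for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
```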

zerollzeng commented 10 months ago

"[E] 4: [network.cpp::validate::3640] Error Code 4: Internal Error (image_embeds: for dimension number 2 in profile 0 does not match network definition (got min=768, opt=768, max=768), expected min=opt=max=1024).)"

It means the dynamic shape you provided is invalid for the network; for that dimension you can only use min=opt=max=1024.
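A minimal sketch of such a profile with Polygraphy; only the last dimension (1024) comes from the error message, while the other dimensions and the ONNX path are placeholder assumptions:

```python
# Build the engine with an image_embeds profile whose last dimension is 1024,
# matching the blip-large network definition (min = opt = max = 1024).
from polygraphy.backend.trt import (
    CreateConfig, Profile, engine_from_network, network_from_onnx_path,
)

profile = Profile().add(
    "image_embeds",
    min=(1, 577, 1024),   # batch and sequence dims here are placeholders
    opt=(1, 577, 1024),
    max=(1, 577, 1024),
)

engine = engine_from_network(
    network_from_onnx_path("decoder.onnx"),   # placeholder path
    config=CreateConfig(profiles=[profile]),
)
```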

ixtora commented 10 months ago

I have the same problem with the Salesforce/blip-image-captioning-large model. Is there any solution?