NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

DLA standalone: Safety certified DLA should only have one graph for the whole network #4165

Open mayulin0206 opened 1 month ago

mayulin0206 commented 1 month ago

Description

Problems with building cuDLA models using EngineCapability::kDLA_STANDALONE. We want to use kDLA_STANDALONE mode to run the model, but we encounter the following error when compiling it. The command is:

/usr/src/tensorrt/bin/trtexec --maxAuxStreams=1 --onnx=No_upsample.onnx --verbose --int8 --useDLACore=0 --buildDLAStandalone --inputIOFormats=int8:dla_linear --outputIOFormats=int8:dla_linear

[09/27/2024-17:35:27] [E] Error[2]: [foreignNode.cpp::determineCandidateForeignNodes::895] Error Code 2: Internal Error (Safety certified DLA should only have one graph for the whole network.)
[09/27/2024-17:35:27] [E] Engine could not be created from network
[09/27/2024-17:35:27] [E] Building engine failed
[09/27/2024-17:35:27] [E] Failed to create engine from model or file.
[09/27/2024-17:35:27] [E] Engine set up failed

Our model doesn't contain loops or conditionals.

Environment

TensorRT Version: 8.6.1.2

NVIDIA GPU:

NVIDIA Driver Version: orin 6090

CUDA Version:

CUDNN Version:

Operating System:

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link:

Steps To Reproduce

/usr/src/tensorrt/bin/trtexec --maxAuxStreams=1 --onnx=No_upsample.onnx --verbose --int8 --useDLACore=0 --buildDLAStandalone --inputIOFormats=int8:dla_linear --outputIOFormats=int8:dla_linear

Commands or scripts:

Have you tried the latest release?:

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

moraxu commented 1 month ago

Can you share your No_upsample.onnx?

lix19937 commented 1 month ago

DLA_STANDALONE: TensorRT flow with restrictions, targeting DLA runtimes external to TensorRT. See the DLA documentation for the list of supported layers and formats. This flow supports only DeviceType::kDLA.

EngineCapability.SAFETY provides a restricted subset of network operations that are safety certified and the resulting serialized engine can be executed with TensorRT’s safe runtime APIs in the tensorrt.safe namespace. EngineCapability.DLA_STANDALONE provides a restricted subset of network operations that are DLA compatible and the resulting serialized engine can be executed using standalone DLA runtime APIs.
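For reference, a minimal C++ sketch (assuming the TensorRT 8.x builder API; names match the official headers, but this is not from the original issue) of how the trtexec flags above map onto the builder configuration for a standalone DLA loadable:

```cpp
#include <NvInfer.h>

// Sketch: build a DLA standalone loadable, mirroring
// --buildDLAStandalone --useDLACore=0 --int8 from trtexec.
nvinfer1::IHostMemory* buildDlaLoadable(nvinfer1::IBuilder& builder,
                                        nvinfer1::INetworkDefinition& network)
{
    nvinfer1::IBuilderConfig* config = builder.createBuilderConfig();

    // Equivalent of --buildDLAStandalone: the serialized result is a
    // DLA loadable for the cuDLA runtime, not a regular TensorRT engine.
    config->setEngineCapability(nvinfer1::EngineCapability::kDLA_STANDALONE);

    // Equivalent of --useDLACore=0 and --int8.
    config->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    config->setDLACore(0);
    config->setFlag(nvinfer1::BuilderFlag::kINT8);

    // Note: with kDLA_STANDALONE there is no GPU fallback, so every layer
    // must be DLA-compatible and form a single contiguous subgraph --
    // otherwise the build fails with the "only one graph" error above.
    return builder.buildSerializedNetwork(network, *config);
}
```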

See sampleCudla for an example of integrating cuDLA APIs with TensorRT APIs.

Maybe some layers of your ONNX model are not supported on DLA. Any unsupported layer splits the network into multiple DLA subgraphs, and a standalone DLA engine accepts only a single graph, which matches the error you see.
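One way to narrow this down is a diagnostic sketch like the following (assuming the TensorRT 8.x C++ API; `IBuilderConfig::canRunOnDLA` is the documented per-layer check, the rest is illustrative):

```cpp
#include <NvInfer.h>
#include <cstdio>

// Walk the parsed network and report every layer the builder cannot place
// on DLA. Each such layer breaks the network into an extra subgraph, which
// is what triggers the "only one graph" error under kDLA_STANDALONE.
void reportNonDlaLayers(nvinfer1::IBuilderConfig const& config,
                        nvinfer1::INetworkDefinition const& network)
{
    for (int i = 0; i < network.getNbLayers(); ++i)
    {
        nvinfer1::ILayer const* layer = network.getLayer(i);
        if (!config.canRunOnDLA(layer))
        {
            std::printf("Layer not supported on DLA: %s\n", layer->getName());
        }
    }
}
```

Running trtexec with --verbose (as in your command) also prints which layers fall back to the GPU during partitioning, which points at the same offending layers.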