NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

NeMo fastpitch onnx convert to tensorrt failure of TensorRT 10.3.0 #4098

Open yuananf opened 2 months ago

yuananf commented 2 months ago

Description

Environment

I'm using this docker image: nvcr.io/nvidia/tensorrt:24.08-py3

TensorRT Version: 10.3.0

NVIDIA GPU: L40S

NVIDIA Driver Version: 535.183.01

CUDA Version: 12.6

CUDNN Version: 9.3.0

Operating System: Ubuntu 22.04

Python Version (if applicable): 3.10.12

PyTorch Version (if applicable): 2.4.0

Relevant Files

Model link:

Steps To Reproduce

First, install NeMo:

```shell
pip install nemo_toolkit['tts']
```

1. Code used to generate the ONNX model:

```python
from nemo.collections.tts.models.fastpitch import FastPitchModel

spec_model = FastPitchModel.from_pretrained("tts_en_fastpitch")
spec_model.export('ljspeech.onnx', onnx_opset_version=20)
```

2. Command that reproduces the error:

```shell
trtexec --onnx=ljspeech.onnx --minShapes=text:1x32,pitch:1x32,pace:1x32 --optShapes=text:1x768,pitch:1x768,pace:1x768 --maxShapes=text:1x1664,pitch:1x1664,pace:1x1664 --shapes=text:1x768,pitch:1x768,pace:1x768 --memPoolSize=workspace:4096 --noTF32 --saveEngine=ljspeech.engine
```


3. The error is:

```
[E] Error[7]: IExecutionContext::enqueueV3: Error Code 7: Internal Error (/decoder/layers.0/dec_attn/MatMul_1: attempt to multiply two matrices with mismatching dimensions Condition '==' violated: 0 != 1. Instruction: CHECK_EQUAL 0 1.)
[E] Error occurred during inference
```



**Commands or scripts**:

**Have you tried [the latest release](https://developer.nvidia.com/tensorrt)?**: yes

**Can this model run on other frameworks?** For example run ONNX model with ONNXRuntime (`polygraphy run <model.onnx> --onnxrt`): yes
akhilg-nv commented 2 months ago

Can you verify that the shapes provided to the trtexec call are valid? The error could be caused by an invalid shape profile being passed in.
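To illustrate the check being asked about: each requested shape must have the same rank as the optimization profile, and every dimension must lie within the profile's [min, max] bounds. A minimal, hypothetical sketch of that validation (not TensorRT's actual implementation):

```python
# Hypothetical helper: check a requested shape against an optimization
# profile's min/max bounds, dimension by dimension. TensorRT performs an
# equivalent validation internally when input shapes are set at runtime.
def shape_in_profile(shape, min_shape, max_shape):
    if not (len(shape) == len(min_shape) == len(max_shape)):
        return False  # rank mismatch
    return all(lo <= d <= hi for d, lo, hi in zip(shape, min_shape, max_shape))

# The profile from the original trtexec call for input "text":
# min 1x32, opt 1x768, max 1x1664.
print(shape_in_profile((1, 768), (1, 32), (1, 1664)))   # True: within bounds
print(shape_in_profile((1, 2048), (1, 32), (1, 1664)))  # False: exceeds max
```

A shape that passes this check can still fail at build or inference time if the model's internal ops (here, the decoder attention MatMul) impose stricter constraints than the declared profile.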

yuananf commented 2 months ago

> Can you verify that the shapes provided to the trtexec call are valid? The error could be caused due to invalid shape profile passed in.

Thanks for the response. After changing the input shapes, I am able to run the trtexec command and generate the engine file. But there is another problem now.

```shell
trtexec --onnx=ljspeech.onnx --minShapes=text:1x32,pitch:1x32,pace:1x32 --optShapes=text:1x128,pitch:1x128,pace:1x128 --maxShapes=text:1x128,pitch:1x128,pace:1x128 --shapes=text:1x128,pitch:1x128,pace:1x128 --memPoolSize=workspace:4096 --noTF32 --saveEngine=ljspeech.engine
```

After setting the dynamic input shapes, the reported output shape did not change, so I cannot determine the real output shape.

Correct me if I'm wrong.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.ERROR)
trt.init_libnvinfer_plugins(TRT_LOGGER, '')

engine_filepath = 'ljspeech.engine'

with open(engine_filepath, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    context.set_input_shape('text', (1, 33))
    context.set_input_shape('pitch', (1, 33))
    context.set_input_shape('pace', (1, 33))

    print('all_binding_shapes_specified: ', context.all_binding_shapes_specified)

    print('spect shape: ', context.get_tensor_shape('spect'))
    print('num_frames', context.get_tensor_shape('num_frames'))
    print('durs_predicted', context.get_tensor_shape('durs_predicted'))
    print('log_durs_predicted', context.get_tensor_shape('log_durs_predicted'))
    print('pitch_predicted', context.get_tensor_shape('pitch_predicted'))
```

The output is:

```
all_binding_shapes_specified:  True
spect shape:  (1, 80, -1)
num_frames (1,)
durs_predicted (1, 33)
log_durs_predicted (1, 33)
pitch_predicted (1, 33)
```
zhenhuaw-me commented 1 month ago

@yuananf Which are the specific dims that have the wrong value? IIUC, some of them have changed.

Can you confirm whether the "specific dims" are the ones mentioned in the API doc?

> A dimension in an output tensor will have a -1 wildcard value if the dimension depends on values of execution tensors, OR if all of the following are true:
>
> - It is a runtime dimension.
> - setInputShape() has NOT been called for some input tensor(s) with a runtime shape.
> - setTensorAddress() has NOT been called for some input tensor(s) with isShapeInferenceIO() = true.
>
> An output tensor may also have -1 wildcard dimensions if its shape depends on values of tensors supplied to enqueueV3().

yuananf commented 1 month ago

> @yuananf Which are the specific dims that have wrong value? IIUC, some of them have changed.
>
> Can you confirm that if the "specific dims" are the ones that mentioned in the API doc?
>
> A dimension in an output tensor will have a -1 wildcard value if the dimension depends on values of execution tensors OR if all the following are true: It is a runtime dimension. setInputShape() has NOT been called for some input tensor(s) with a runtime shape. setTensorAddress() has NOT been called for some input tensor(s) with isShapeInferenceIO() = true. An output tensor may also have -1 wildcard dimensions if its shape depends on values of tensors supplied to enqueueV3().

As you can see from the previous comment, the output shape of `spect` is still (1, 80, -1) after all input shapes are set.

I can confirm `set_input_shape` is called for all input tensors.

So the reason might be: "An output tensor may also have -1 wildcard dimensions if its shape depends on values of tensors supplied to enqueueV3()."

What does this mean?
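For context (my reading, not an official answer): in FastPitch the spectrogram length is the sum of the predicted per-token durations, so the last dimension of `spect` depends on tensor *values* computed during inference, not just on input shapes. That is exactly the "depends on values of tensors supplied to enqueueV3()" case from the API doc, which is why the dimension stays -1 until after execution. A toy illustration in plain Python:

```python
# Toy illustration (not NeMo or TensorRT code): the decoder's output
# length equals the sum of the durations the duration predictor emits at
# runtime, so it cannot be known from the input shape (1, 33) alone.
def spectrogram_frames(predicted_durations):
    # Each input token is repeated for `duration` spectrogram frames.
    return sum(predicted_durations)

# Two inputs with the SAME shape can yield different output lengths:
durs_a = [2, 3, 1, 4]   # hypothetical per-token durations
durs_b = [5, 5, 5, 5]
print(spectrogram_frames(durs_a))  # 10
print(spectrogram_frames(durs_b))  # 20
```

In practice this means the concrete `spect` shape can only be read back after `enqueueV3()` completes, e.g. by allocating a worst-case buffer, or by using TensorRT's `IOutputAllocator` interface, which is notified of the final output shape during execution.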

yuananf commented 1 month ago


Any update on this issue?