Tensor sizes differ between the ONNX model and the DLA loadable engine outputs

AnhPC03 commented 10 months ago

Hello, I was able to run your repo. When I printed the input and output tensor sizes, the values were:

[hybrid mode] create cuDLA device SUCCESS
[hybrid mode] load cuDLA module from memory SUCCESS
[hybrid mode] cuDLA module get number of input tensors SUCCESS
[hybrid mode] cuDLA module get number of output tensors SUCCESS
[hybrid mode] cuDLA module get input tensors descriptors SUCCESS
[hybrid mode] cuDLA module get output tensors descriptors SUCCESS
Input tensor size: 1806336 (1x4x672x672)
Output tensor size 0: 3612672 (1x?x?x?)
Output tensor size 1: 903168 (1x?x?x?)
Output tensor size 2: 225792 (1x?x?x?)
[hybrid mode] register cuda input tensor memory to cuDLA SUCCESS
[hybrid mode] register cuda output tensor memory to cuDLA SUCCESS
[hybrid mode] register cuda output tensor memory to cuDLA SUCCESS
[hybrid mode] register cuda output tensor memory to cuDLA SUCCESS

But the converted ONNX model has the following values, and Netron shows the same ones:

Input tensor size: 1354752 (1x3x672x672)
Output tensor size 0: 1799280 (1x255x84x84)
Output tensor size 1: 449820 (1x255x42x42)
Output tensor size 2: 112455 (1x255x21x21)
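
One possible reading of the two sets of numbers, offered as a sketch and not confirmed anywhere in this thread: the ONNX/Netron values are plain element counts, while the DLA loadable values are byte sizes of channel-padded buffers. The assumptions are that the DLA input is int8 with channels padded from 3 to 4 (consistent with the printed 1x4x672x672), and that the outputs are fp16 with channels padded to a multiple of 16 (as in the fp16:chw16 output format used in the trtexec command in this thread). The helper name `padded_bytes` is hypothetical.

```python
from math import prod


def padded_bytes(shape, channel_multiple, bytes_per_element):
    """Byte size of an NCHW tensor after rounding C up to a multiple.

    Assumption: the buffer pads the channel dimension and stores
    bytes_per_element bytes per value. This is an illustrative model,
    not an API of cuDLA or TensorRT.
    """
    n, c, h, w = shape
    padded_c = -(-c // channel_multiple) * channel_multiple  # ceiling to multiple
    return n * padded_c * h * w * bytes_per_element


# Shapes reported for the ONNX model in this thread.
onnx_input = (1, 3, 672, 672)
onnx_outputs = [(1, 255, 84, 84), (1, 255, 42, 42), (1, 255, 21, 21)]

# Plain element counts match the ONNX/Netron numbers.
print("input elements:", prod(onnx_input))  # 1354752

# Assumed DLA input layout: int8, channels padded to 4.
print("input padded bytes:", padded_bytes(onnx_input, 4, 1))  # 1806336

# Assumed DLA output layout: fp16 (2 bytes), channels padded to a multiple of 16.
for shape in onnx_outputs:
    print(shape, "elements:", prod(shape),
          "padded bytes:", padded_bytes(shape, 16, 2))
```

Under these assumptions the computed byte sizes are 1806336, 3612672, 903168, and 225792, which match the values printed by the DLA loadable, and the element counts match the ONNX values. If that reading holds, the two sets of sizes describe the same logical tensors in different memory layouts.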

If I convert the ONNX model to a .engine for inference on the GPU only:

${TRTEXEC} --shapes=images:1x3x672x672 --onnx=data/model/yolov5_trimmed_qat_noqdq.onnx --saveEngine=data/gpu/yolov5.int8.int8chw32in.fp16chw16out.engine --inputIOFormats=int8:chw32 --outputIOFormats=fp16:chw16 --int8 --fp16 --calib=data/model/qat2ptq.cache --precisionConstraints=prefer --layerPrecisions="/model.24/m.0/Conv":fp16,"/model.24/m.1/Conv":fp16,"/model.24/m.2/Conv":fp16

This engine has the same tensor sizes as shown in Netron, but I couldn't run inference with this .engine on the GPU.

How can I convert the model for GPU-only inference while keeping the same tensor sizes as your DLA loadable? Thank you very much.

lynettez commented 9 months ago

Hey @AnhPC03, sorry for the late reply. Did you try "--minShapes=images:1x3x672x672 --maxShapes=images:1x3x672x672 --optShapes=images:1x3x672x672 --shapes=images:1x3x672x672" to specify the input shape range?

lynettez commented 2 months ago

Closing since there has been no activity for several months, thanks!

NVIDIA-AI-IOT / cuDLA-samples #17