NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

TensorRT 7 loading ONNX Resize ops error #386

Closed: syshensyshen closed this issue 3 years ago

syshensyshen commented 4 years ago

Description

When I load my ONNX model, the FPN's F.interpolate op fails:


While parsing node number 209 [Resize]: ERROR: builtin_op_importers.cpp:2412 In function importResize: [8] Assertion failed: scales.is_weights() && "Resize scales must be an initializer!"


This error occurs in onnx-tensorrt.

Environment

TensorRT Version: 7.0
GPU Type: 1060
Nvidia Driver Version: 441.22
CUDA Version: 10.2
CUDNN Version: 7.6.5.32
Operating System + Version: Windows 10

shen338 commented 4 years ago

The same thing happens to me. TensorRT 7.0.0 may not support bilinear resizing yet; it only supports linear and nearest.

syshensyshen commented 4 years ago

It seems to be an error in the onnx-tensorrt code, not a TensorRT 7.0.0 export error.

rmccorm4 commented 4 years ago

Hi @syshensyshen ,

Can you share the ONNX model that reproduces this error?

anhtu812 commented 4 years ago

Hi @rmccorm4, I have the same error with an ONNX model converted from a TensorFlow model on Ubuntu. My model: https://1drv.ms/u/s!AhFk3ICqlZI2iro5kv2QaNjMIa7k4A?e=7PZ8ic

zhangyilalala commented 4 years ago

I've met the same problem. Has anyone solved it?

zhangyilalala commented 4 years ago

@syshensyshen Hi, have you solved this problem? It seems like the resize scale is not defined in the ONNX model.
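
One way to check is to inspect whether each Resize node's scales input is an initializer, e.g. with a minimal sketch using the onnx Python package (the filename is a placeholder):

import onnx

model = onnx.load("model.onnx")  # placeholder filename
initializers = {init.name for init in model.graph.initializer}
for node in model.graph.node:
    if node.op_type == "Resize":
        # Opset-11 Resize inputs are: X, roi, scales, (optional) sizes.
        scales = node.input[2] if len(node.input) > 2 else ""
        print(node.name, "scales is an initializer:", scales in initializers)

If the scales come from an upstream node (e.g. a Concat) rather than an initializer, the parser's "scales.is_weights()" assertion fires.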

rmccorm4 commented 4 years ago

For my own future reference:

----------------------------------------------------------------
Input filename:   tensorflow_model.onnx
ONNX IR version:  0.0.6
Opset version:    10
Producer name:    tf2onnx
Producer version: 1.5.3
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
==== SUMMARY ====
Time taken: 928.83 seconds
❌  FAIL: nvcr.io/nvidia/tensorrt:19.12-py3 - /bin/bash -c 'cd /mnt && trtexec --explicitBatch --onnx=tensorflow_model.onnx': 
        [E] [TRT] Parameter check failed at: ../builder/Network.cpp::addInput::671, condition: isValidDims(dims, hasImplicitBatchDimension())
        ERROR: ModelImporter.cpp:80 In function importInput:
        [8] Assertion failed: *tensor = importer_ctx->network()->addInput( input.name().c_str(), trt_dtype, trt_dims)
        [E] Failed to parse onnx file
        [E] Parsing model failed
        [E] Engine could not be created
❌  FAIL: nvcr.io/nvidia/tensorrt:19.12-py3 - /bin/bash -c 'wget https://raw.githubusercontent.com/rmccorm4/tensorrt-utils/master/OSS/build_OSS.sh && source build_OSS.sh 19.12 && cd /mnt && trtexec --explicitBatch --onnx=tensorflow_model.onnx': 
        ERROR: /workspace/TensorRT/parsers/onnx/builtin_op_importers.cpp:286 In function importCast:
        [8] Assertion failed: trt_dtype == nvinfer1::DataType::kHALF && cast_dtype == ::ONNX_NAMESPACE::TensorProto::FLOAT
        [E] Failed to parse onnx file
        [E] Parsing model failed
        [E] Engine creation failed
        [E] Engine set up failed
❌  FAIL: nvcr.io/nvidia/tensorrt:20.02-py3 - /bin/bash -c 'cd /mnt && trtexec --explicitBatch --onnx=tensorflow_model.onnx': 
        ERROR: builtin_op_importers.cpp:2412 In function importResize:
        [8] Assertion failed: scales.is_weights() && "Resize scales must be an initializer!"
        [E] Failed to parse onnx file
        [E] Parsing model failed
        [E] Engine creation failed
        [E] Engine set up failed
❌  FAIL: nvcr.io/nvidia/tensorrt:20.02-py3 - /bin/bash -c 'wget https://raw.githubusercontent.com/rmccorm4/tensorrt-utils/master/OSS/build_OSS.sh && source build_OSS.sh && cd /mnt && trtexec --explicitBatch --onnx=tensorflow_model.onnx': 
        ERROR: /workspace/TensorRT/parsers/onnx/builtin_op_importers.cpp:2473 In function importResize:
        [8] Assertion failed: scales.is_weights() && "Resize scales must be an initializer!"
        [E] Failed to parse onnx file
        [E] Parsing model failed
        [E] Engine creation failed
        [E] Engine set up failed
rmccorm4 commented 4 years ago

Hi @anhtu812 ,

I see your model was exported with ONNX opset 10.

Can you try exporting the model with ONNX opset 11?
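
For tf2onnx the corresponding flag is --opset 11 (as used later in this thread). For PyTorch models like the original poster's FPN, a minimal export sketch (the toy module and shapes are placeholders):

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUpsample(nn.Module):
    """Toy stand-in for an FPN upsampling path."""
    def forward(self, x):
        return F.interpolate(x, scale_factor=2, mode="nearest")

# opset_version=11 emits Resize with the opset-11 attribute set, which
# the TRT7 ONNX parser (especially the OSS build) handles better than
# opset 10.
torch.onnx.export(TinyUpsample().eval(), torch.randn(1, 3, 320, 320),
                  "model_opset11.onnx", opset_version=11)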

anhtu812 commented 4 years ago

Hi @rmccorm4

my model with opset 11: https://1drv.ms/u/s!AhFk3ICqlZI2irt0eND7KCAWsj6RFw?e=okuC31

It outputs a new error message:

ONNX IR version:  0.0.6
Opset version:    11
Producer name:    tf2onnx
Producer version: 1.5.5
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
ERROR: ModelImporter.cpp:92 In function parseGraph:
[8] Assertion failed: convertOnnxWeights(initializer, &weights, ctx)
rmccorm4 commented 4 years ago

Hi @anhtu812 ,

Thanks for sharing. I repro'd your error with the opset11 model above with TRT7.

However, I was able to parse your model using TRT7 + OSS components:

# Start TensorRT 7 container
nvidia-docker run -it -v $PWD:/mnt nvcr.io/nvidia/tensorrt:20.02-py3

# Build OSS Components from https://github.com/NVIDIA/TensorRT
wget https://raw.githubusercontent.com/rmccorm4/tensorrt-utils/master/OSS/build_OSS.sh
source build_OSS.sh

# Parse model
trtexec --explicitBatch --onnx=/mnt/386-resize/tf_model_opset11.onnx
...
&&&& PASSED TensorRT.trtexec # trtexec --explicitBatch --onnx=/mnt/386-resize/tf_model_opset11.onnx

So the issue was likely fixed in the upstream ONNX parser. Hope this helps.

anhtu812 commented 4 years ago

Hi @rmccorm4

Parsing the model works now, but I cannot build a CUDA engine with dynamic shapes for this model. When I run builder->buildEngineWithConfig(network, config), the function runs without stopping and never returns. My program works fine with a model that has no Resize ops.

I set the profile: "fts_input_images:0":[[[1,320,320,3],[1,320,320,3],[1,320,320,3]]] (a multiple of 32, so that after the resize (upsample) the tensor has the same shape and can be summed with the other tensor).

I use CUDA 10.0 and cuDNN 7.6.5.

rmccorm4 commented 4 years ago

> When I run buildEngineWithConfig(), the function never returns.

What does this mean?

rmccorm4 commented 4 years ago

Are you saying the program hangs? It may just be taking a long time to build the engine. Try setting the logger to verbose mode, and you might see what's actually going on inside the function call. Some users have reported long build times for certain models.
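
For example, a minimal sketch of a verbose build using the TensorRT Python API (the model path is a placeholder, and build_engine is the TRT7-era call):

import tensorrt as trt

# VERBOSE makes the builder log every layer and tactic it times,
# so a genuine hang can be told apart from a slow build.
logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        raise RuntimeError(str(parser.get_error(0)))

config = builder.create_builder_config()
engine = builder.build_engine(network, config)  # TRT7-era API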

anhtu812 commented 4 years ago

@rmccorm4 I know it can take a long time and use the GPU (shown in GPU utilization). In my case, GPU utilization is 0% and I waited more than one hour. Can you build a CUDA engine with dynamic shapes for this model?

The function runs and then stalls with this log (never completing, never returning):

...
PE fts_input_images:0)) 2) 2) 2) 2)))) 2) (* 2 (CEIL_DIV (CEIL_DIV (BROADCAST_SIZE (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) (* 2 (BROADCAST_SIZE (* 2 (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2) 2)) (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2)))) 2) 2))))) 2) (* 2 (CEIL_DIV (CEIL_DIV (BROADCAST_SIZE (BROADCAST_SIZE (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) (* 2 (BROADCAST_SIZE (* 2 (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2) 2)) (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2)))) (* 2 (BROADCAST_SIZE (CEIL_DIV (BROADCAST_SIZE (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) (* 2 (BROADCAST_SIZE (* 2 (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2) 2)) (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2)))) 2) (* 2 (CEIL_DIV (CEIL_DIV (BROADCAST_SIZE (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) (* 2 (BROADCAST_SIZE (* 2 (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2) 2)) (CEIL_DIV (CEIL_DIV (CEIL_DIV (CEIL_DIV (# 2 (SHAPE fts_input_images:0)) 2) 2) 2) 2)))) 2) 2))))) 2) 2))))) 2) 2))))) 2) 2))))) 2) 2)))))))) ***************
[V] [TRT] --------------- Timing Runner: Transpose__669 (Shuffle)
[V] [TRT] Tactic: 0 time 0.02224
[V] [TRT] Fastest Tactic: 0 Time: 0.02224
[V] [TRT] --------------- Timing Runner: <reformat> (Reformat)
[V] [TRT] Tactic: 0 time 0.015168
[V] [TRT] Fastest Tactic: 0 Time: 0.015168
deeptf commented 4 years ago

@rmccorm4 Thanks for your suggestion. I followed the steps in your comment; however, my ONNX model from PyTorch, which uses F.interpolate('nearest'), still fails with an error on a Resize node. Do you have any thoughts? Thanks!

The two types of error I receive (for two different ONNX files with the Interpolate op) are:

While parsing node number 104 [Resize]: ERROR: ModelImporter.cpp:124 In function parseGraph: [5] Assertion failed: ctx->tensors().count(inputName)

(OR)

ERROR: ModelImporter.cpp:92 In function parseGraph: [8] Assertion failed: convertOnnxWeights(initializer, &weights, ctx)

rmccorm4 commented 4 years ago

Hi @deeptf,

  1. Did you build the OSS components like in my comment above?
  2. If it's still not working, can you try running onnx-simplifier on your model and parsing the simplified model? (See the sketch below.)

Check out this issue's discussion: https://github.com/NVIDIA/TensorRT/issues/439#issuecomment-604155733
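
Regarding point 2, a minimal sketch of onnx-simplifier's Python API (the filenames are placeholders):

import onnx
from onnxsim import simplify

model = onnx.load("model.onnx")
# Constant folding often turns the Resize scales input into an
# initializer, which is what the failing assertion expects.
model_simp, ok = simplify(model)
assert ok, "simplified model failed the ONNX checker"
onnx.save(model_simp, "model_simplified.onnx")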

anhtu812 commented 4 years ago

Hi @rmccorm4, I am waiting for your answer. Thanks.

rmccorm4 commented 4 years ago

Hi @anhtu812,

Sorry I missed your reply above.

I can try to run your code when I get a chance tomorrow.

rmccorm4 commented 4 years ago

Hi @anhtu812 ,

I tried the following with a V100 GPU in the NGC TensorRT 20.03 container, and it took about 5 minutes for each engine to build:

$ time trtexec \
--explicitBatch \
--onnx=tf_model_opset11.onnx \
--minShapes='fts_input_images:0':1x320x320x3 \
--optShapes='fts_input_images:0':1x320x320x3 \
--maxShapes='fts_input_images:0':1x320x320x3 \
--shapes='fts_input_images:0':1x320x320x3 \
--verbose > verbose.log 2>&1
...
&&&& PASSED TensorRT.trtexec # trtexec --explicitBatch --onnx=tf_model_opset11.onnx --minShapes='fts_input_images:0':1x320x320x3 --optShapes='fts_input_images:0':1x320x320x3 --maxShapes='fts_input_images:0':1x320x320x3 --shapes='fts_input_images:0':1x320x320x3 --verbose

real    4m48.807s
user    4m29.848s
sys     0m8.748s

Can you try letting it run to completion with --verbose if possible (maybe overnight), prepend time to your command so you know how long it took, and then share the verbose logs here?

anhtu812 commented 4 years ago

Thanks @rmccorm4, on my machine it takes 26 minutes for each profile.

  1. Can I build the CUDA engine with multiple threads? I see it uses only one CPU thread for 25 of those minutes.
  2. I see it builds 3 similar profiles repeatedly (to be used with 3 contexts). Can I build one and clone it?
rmccorm4 commented 4 years ago

Hi @anhtu812 ,

> Can I build the CUDA engine with multiple threads? I see it uses only one CPU thread.

I don't think this is currently possible.

> I see it builds 3 similar profiles repeatedly (to be used with 3 contexts). Can I build one and clone it?

I'm not quite sure what you mean by this. Can you clarify, or post an example of the builder repeating 3 similar profiles?

anhtu812 commented 4 years ago

Hi @rmccorm4, for example:

config = builder->createBuilderConfig();
int n_profile = 3;
for (int j1 = 0; j1 < n_profile; ++j1) {  // one profile for one context
    IOptimizationProfile* profile = builder->createOptimizationProfile();
    nvinfer1::Dims4 dims(1, 320, 320, 3);
    auto tensor_name = "fts_input_images:0";
    profile->setDimensions(tensor_name, OptProfileSelector::kMIN, dims);
    profile->setDimensions(tensor_name, OptProfileSelector::kOPT, dims);
    profile->setDimensions(tensor_name, OptProfileSelector::kMAX, dims);
    config->addOptimizationProfile(profile);
}
engine = builder->buildEngineWithConfig(*network, *config);

Looking at the verbose log, it takes 26 minutes per profile, so building the engine takes 26 x 3 = 78 minutes.

rmccorm4 commented 4 years ago

Why are you creating 3 identical profiles? Off the top of my head, I'm pretty sure you can create several execution contexts from the same profile.

anhtu812 commented 4 years ago

@rmccorm4, I read somewhere that in dynamic-shape mode, to run multiple threads with different contexts, each context must be assigned a different profile.

rmccorm4 commented 4 years ago

I believe you must explicitly define the profile that each execution context is using, but I think it's okay for each execution context to use the same profile in this example.

Try it out.

anhtu812 commented 4 years ago

@rmccorm4 , my test code:

#include "NvInfer.h"
#include <iostream>
#include "NvUtils.h"
#include "NvOnnxParser.h"
using namespace nvinfer1;
#include <thread>

// #include "common/logger.h"
// #include "common/buffers.h"
// std::string model_path = "detection_model.onnx";
#include "tensorrt_util/logger.h"
#include "tensorrt_util/buffers.h"
std::string model_path = "model.onnx";

std::string input_name = "fts_input_images:0";
Dims4 dims1(1,320,320,3);
Dims4 dims2(1,320,320,3);
Dims4 dims3(1,320,320,3);

int main(int argc, char** argv) {
  auto builder = createInferBuilder(gLogger);

  auto config = builder->createBuilderConfig();
  int n_profile = 2;
  for (int i=0; i<n_profile; ++i){
    auto profile = builder->createOptimizationProfile();
    profile->setDimensions(input_name.c_str(), OptProfileSelector::kMIN, dims1);
    profile->setDimensions(input_name.c_str(), OptProfileSelector::kOPT, dims2);
    profile->setDimensions(input_name.c_str(), OptProfileSelector::kMAX, dims3);
    config->addOptimizationProfile(profile);
  }
  auto network = builder->createNetworkV2(1U << static_cast<int>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
  auto parser = nvonnxparser::createParser(*network, gLogger);
  parser->parseFromFile(model_path.c_str(), 3);
  auto engine = builder->buildEngineWithConfig(*network,*config);

  std::vector<IExecutionContext*> contexts;
  for (int i=0; i<2; ++i){
    contexts.emplace_back(engine->createExecutionContext());
    auto context = contexts.back();
    context->setOptimizationProfile(0);
    std::cout<<"allInputDimensionsSpecified: "<<context->allInputDimensionsSpecified()<<"\n";
  }
}

It errors out when both contexts are set to profile 0 and context->allInputDimensionsSpecified() is called; there is no error if context->setOptimizationProfile(0) is replaced with context->setOptimizationProfile(i).
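
For reference, the same per-context profile assignment through the TensorRT Python API, consistent with the finding above (a sketch assuming an engine serialized with one profile per intended context):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("model.engine", "rb") as f:  # placeholder: a serialized engine
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())

# Each context gets its own profile index; pointing two contexts
# at profile 0 triggered the error described above.
contexts = []
for i in range(engine.num_optimization_profiles):
    ctx = engine.create_execution_context()
    ctx.active_optimization_profile = i
    contexts.append(ctx)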

DJMeng commented 4 years ago

> TensorRT 7.0.0 may not support bilinear resizing yet; it only supports linear and nearest.

The same thing happens to me. Did you solve it? Is bilinear resize supported on TensorRT 7.0? Thanks.

lovejing0306 commented 4 years ago

> However, I was able to parse your model using TRT7 + OSS components ... So the issue was likely fixed in the upstream ONNX parser.

I used your method, but got this error:

Assertion failed: (transformationMode == "asymmetric") && "This version of TensorRT only supports asymmetric resize!

Can you help me?

gentlebreeze1 commented 4 years ago

@deeptf I've met the same problem. Did you solve it? The two types of error I receive (for two different ONNX files with the Interpolate op) are:

While parsing node number 104 [Resize]: ERROR: ModelImporter.cpp:124 In function parseGraph: [5] Assertion failed: ctx->tensors().count(inputName)

(OR)

ERROR: ModelImporter.cpp:92 In function parseGraph: [8] Assertion failed: convertOnnxWeights(initializer, &weights, ctx)

lovejing0306 commented 4 years ago

> @deeptf I've met the same problem. Did you solve it? ...

Using a fixed input shape works for me.

gentlebreeze1 commented 4 years ago

@lovejing0306 I am a beginner at this. What do you mean by fixed shape? My model is YOLOv4, and some people say the scale value of the Resize layer comes from a Concat layer, so it is not fixed.

lovejing0306 commented 4 years ago

> @lovejing0306 I am a beginner at this. What do you mean by fixed shape?

If your input shape is [None, None, 3], it is a dynamic shape; if your input shape is [640, 640, 3], it is a fixed shape.
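
To illustrate on the export side, a minimal PyTorch sketch (the module and axis names are placeholders); dynamic_axes controls whether dimensions are baked in:

import torch
import torch.nn as nn

model = nn.Conv2d(3, 8, 3).eval()
x = torch.randn(1, 3, 640, 640)

# Fixed shape: every dimension is baked in as 1x3x640x640.
torch.onnx.export(model, x, "fixed.onnx", opset_version=11)

# Dynamic shape: height and width become symbolic in the graph.
torch.onnx.export(model, x, "dynamic.onnx", opset_version=11,
                  input_names=["input"],
                  dynamic_axes={"input": {2: "height", 3: "width"}})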

mdhexplorer commented 3 years ago

> However, I was able to parse your model using TRT7 + OSS components:

Hi! I've met a similar problem when parsing an ONNX model. What is the specific version of TRT7 you mentioned?

mdhexplorer commented 3 years ago

For various reasons, the latest CUDA version I have available is 10.0. According to the official support matrix (https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-713/support-matrix/index.html), the latest TRT version I can use is 7.0.0.11. Does this mean that the Upsample/Resize problem cannot be resolved in my current environment?

rajeevsrao commented 3 years ago

@mdhexplorer which platform are you using (limited to CUDA 10.0)? We are adding enhancements to our ONNX Resize implementation that resolve some of the issues above. They will be part of our next major RC. @ttyio FYI.

mdhexplorer commented 3 years ago

@rajeevsrao @ttyio Glad to hear from you! I'm working on CentOS 7. The driver version in my current working environment is 410.79, and for various reasons I cannot use the latest driver. (This is not because I don't know how to upgrade the driver; in fact, I have used the latest TensorRT to solve this problem on other machines where upgrading the driver is convenient.) Can this problem be solved by modifying the source code of 7.0.0.11?

ttyio commented 3 years ago

Hello @mdhexplorer, the enhancement planned for the next RC also touches the TensorRT core, which is not in the open-source repo. As for the gap between 7.0 and 7.2, I'm not sure which issue you are hitting; maybe try building the ONNX parser from 7.2, https://github.com/NVIDIA/TensorRT/tree/release/7.2/parsers, especially the Resize op importer in https://github.com/onnx/onnx-tensorrt/blob/eb559b6cdd1ec2169d64c0112fab9b564d8d503b/builtin_op_importers.cpp#L2563

mdhexplorer commented 3 years ago

Hello @ttyio, according to https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html, the latest CUDA version that my current driver (410.79) can support is 10.0, which is obviously not enough to run TRT 7.2 (at least CUDA 11.0 is required). Do you mean to download the 7.0 release version and compile it with the 7.2 open-source code?

ttyio commented 3 years ago

Hello @mdhexplorer, I mean TRT 7.0 plus the 7.0 open-source code, modified with the ONNX Resize operator changes from https://github.com/onnx/onnx-tensorrt/blob/eb559b6cdd1ec2169d64c0112fab9b564d8d503b/builtin_op_importers.cpp#L2563

mdhexplorer commented 3 years ago

I'll give it a try. Thanks!

Egozjuer commented 3 years ago

> TRT 7.0 plus the 7.0 open-source code, modified with the ONNX Resize operator changes from https://github.com/onnx/onnx-tensorrt/blob/eb559b6cdd1ec2169d64c0112fab9b564d8d503b/builtin_op_importers.cpp#L2563

@ttyio Hi, I have used the newest onnx-tensorrt and encountered the same error: While parsing node number 119 [Resize -> "575"]: ERROR: /home/ego/software/onnx-tensorrt/builtin_op_importers.cpp:2616 In function importResize: [8] Assertion failed: (transformationMode == "asymmetric" || transformationMode == "pytorch_half_pixel" || transformationMode == "half_pixel") && "TensorRT only supports half pixel, pytorch half_pixel, and asymmetric tranformation mode for linear resizes when scales are provided!"

ttyio commented 3 years ago

Sorry @Egozjuer, support for the full set of coordinate transformation modes in the Resize op is part of our next major RC.

Egozjuer commented 3 years ago

> Support for the full set of coordinate transformation modes in the Resize op is part of our next major RC.

@ttyio Hi, I only used bilinear mode in PyTorch, such as up_level1 = F.interpolate(out_layer4, scale_factor=2, mode='bilinear', align_corners=True). Does this mean that bilinear Resize is not supported in TensorRT?

ttyio commented 3 years ago

@Egozjuer Hmmm, that gap comes from TF (some details in https://hackernoon.com/how-tensorflows-tf-image-resize-stole-60-days-of-my-life-aba5eb093f35); I'm not sure why you see a failure for a PyTorch-exported model. Are you using TRT 7.2?

Egozjuer commented 3 years ago

@ttyio Yes, I am very sure the TRT version is 7.2.1.6:

Input filename:   lidar_ma_nodecode12.4_fixUnsample_sim.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    pytorch
Producer version: 1.7
Domain:
Model version:    0
Doc string:

Parsing model
[2020-12-04 09:41:45 WARNING] ~/GPU/onnx-tensorrt/onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
While parsing node number 119 [Resize -> "575"]: ERROR: /home/ego/GPU/onnx-tensorrt/builtin_op_importers.cpp:2616 In function importResize: [8] Assertion failed: (transformationMode == "asymmetric" || transformationMode == "pytorch_half_pixel" || transformationMode == "half_pixel") && "TensorRT only supports half pixel, pytorch half_pixel, and asymmetric tranformation mode for linear resizes when scales are provided!"

ttyio commented 3 years ago

Hello @Egozjuer, I just checked: the fix is still in the internal repo and will be available in the next major release. Could you work around (WAR) this issue by providing scales instead of the output size in the Resize op? Thanks!
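
For instance, a minimal PyTorch sketch of that workaround (a hypothetical module; note that align_corners=False also makes the exporter emit pytorch_half_pixel, which the assertion above accepts, at the cost of slightly different numerics):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Up2x(nn.Module):
    def forward(self, x):
        # scale_factor (rather than size=...) lets the exporter record
        # the Resize scales as constants; align_corners=False exports
        # coordinate_transformation_mode=pytorch_half_pixel.
        return F.interpolate(x, scale_factor=2, mode="bilinear",
                             align_corners=False)

torch.onnx.export(Up2x().eval(), torch.randn(1, 8, 16, 16),
                  "up2x.onnx", opset_version=11)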

mminervini commented 3 years ago

Hi,

I get a similar error when I try to convert to TensorRT a semantic segmentation model trained with Python and TensorFlow (specifically, a Unet with a ResNet18 backbone from qubvel/segmentation_models). My ultimate goal is to run inference on a Jetson device using the C++ TensorRT API. This seemingly straightforward task is proving extremely elusive!

My initial attempts were with CUDA 10.1, cuDNN 7.6.5, TensorRT 6.0.1, and TF 2.2 from PyPI. Following guidelines I found online, I first converted the TF saved model to ONNX (I have already tried different opset values):

python3 -m tf2onnx.convert --saved-model ~/Unet-ResNet18 --output model.onnx --opset 11

Then I tried to convert the ONNX model to TensorRT using trtexec:

/usr/src/tensorrt/bin/trtexec --verbose --explicitBatch --onnx=/home/massimo/Desktop/trt_cpp_tests/Unet-ResNet18_20201202T1625/model.onnx

However, this resulted in the following error at the very beginning of the model:

While parsing node number 1 [BatchNormalization]:
ERROR: ModelImporter.cpp:296 In function importModel:
[5] Assertion failed: tensors.count(input_name)

After googling the error from trtexec (or similarly from the C++ API), I understood from other issues that I had to upgrade to TensorRT 7.

Therefore, I upgraded to CUDA 11.1, cuDNN 8.0.5, and TensorRT 7.2.1, and finally spent long hours compiling TensorFlow 2.3.1 (unfortunately, it is not readily available on PyPI for CUDA versions supporting TensorRT 7). With the new environment, I got this error:

ERROR: builtin_op_importers.cpp:3569 In function importUpsample:
[8] Assertion failed: scales_input.is_weights()

A comment in another issue (#555) led me here, to compile and install TensorRT OSS, which brought me a little further. Currently, parsing of the ONNX model stops in the middle of the graph. The verbose error log is:

[12/08/2020-17:51:24] [V] [TRT] ModelImporter.cpp:103: Parsing node: Resize__191 [Resize]
[12/08/2020-17:51:24] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/functional_3/relu1/Relu:0
[12/08/2020-17:51:24] [V] [TRT] ModelImporter.cpp:119: Searching for input: roi__279
[12/08/2020-17:51:24] [V] [TRT] ModelImporter.cpp:119: Searching for input: roi__279
[12/08/2020-17:51:24] [V] [TRT] ModelImporter.cpp:119: Searching for input: Concat__190:0
[12/08/2020-17:51:24] [V] [TRT] ModelImporter.cpp:125: Resize__191 [Resize] inputs: [StatefulPartitionedCall/functional_3/relu1/Relu:0 -> (-1, 512, 8, 14)], [roi__279 -> ()], [roi__279 -> ()], [Concat__190:0 -> (4)], 
[12/08/2020-17:51:24] [V] [TRT] ImporterContext.hpp:154: Registering layer: Resize__191 for ONNX node: Resize__191
ERROR: builtin_op_importers.cpp:2586 In function importResize:
[8] Assertion failed: (transformationMode == "asymmetric" || transformationMode == "align_corners" || transformationMode == "half_pixel" || transformationMode == "pytorch_half_pixel") && "This version of TensorRT only supports asymmetric, align_corners, half_pixel, and pytorch_half_pixel resize!"
[12/08/2020-17:51:24] [E] Failed to parse onnx file
[12/08/2020-17:51:24] [E] Parsing model failed
[12/08/2020-17:51:24] [E] Engine creation failed
[12/08/2020-17:51:24] [E] Engine set up failed

I opened the ONNX model in Netron to take a look at node Resize__191 in the error message. It says that coordinate transform mode is tf_half_pixel_for_nn, so I guess this could be firing the assertion.

I couldn't find much by googling the keywords in this error message, and I'm stuck trying to make things work. I'm not sure what else I can do next to see this through.

@ttyio, do you think the fix you mentioned in your last comment could be related to my case too? If so, what timeline do you envision to have it released?

Please let me know if you need further details, I will be happy to run any additional tests.

Best regards, Massimo

ttyio commented 3 years ago

Hello @mminervini, I believe tf_half_pixel_for_nn was introduced in TF 1.x, which used a wrong formula (more details in https://hackernoon.com/how-tensorflows-tf-image-resize-stole-60-days-of-my-life-aba5eb093f35). TF 2.x fixes it, but you can still get the legacy formula with half_pixel = 0 for backward compatibility. The main difference between tf_half_pixel_for_nn and half_pixel is that the rounding mode after the coordinate transformation is ceil instead of half-down. We have added support for it in the internal repository recently, and it will be available publicly in the next major release. Before that, you might try one of these approaches:
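
For reference, the two coordinate-transformation formulas as defined in the ONNX Resize spec, compared side by side in a small illustrative sketch:

import numpy as np

def half_pixel(x_out, scale):
    # The corrected formula (TF 2.x, PyTorch): ONNX "half_pixel".
    return (x_out + 0.5) / scale - 0.5

def tf_half_pixel_for_nn(x_out, scale):
    # The TF 1.x legacy formula (no -0.5), paired with ceil rounding
    # for nearest neighbor: ONNX "tf_half_pixel_for_nn".
    return (x_out + 0.5) / scale

x = np.arange(4)
print(half_pixel(x, 2.0))            # [-0.25  0.25  0.75  1.25]
print(tf_half_pixel_for_nn(x, 2.0))  # [0.25 0.75 1.25 1.75]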

mminervini commented 3 years ago

Hello @ttyio,

The model is built with Keras and I couldn't easily find how to change the image resize operation to use half_pixel = 0 as you advised.

Eventually, as a workaround, I replaced upsampling with transposed convolution to obtain a similar effect: the UpSampling2D layer that was causing the tf_half_pixel_for_nn issue was replaced by Conv2DTranspose layers. The new model should be computationally more expensive, but at least it is parsed successfully by trtexec. I look forward to the new major release so I can revert to the cheaper upsampling operation.
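
In case it helps others making the same swap, roughly what the substitution looks like (a minimal sketch; the filter count is a placeholder, and a fixed bilinear kernel would mimic upsampling more closely):

import tensorflow as tf
from tensorflow.keras import layers

inp = tf.keras.Input(shape=(None, None, 256))
# Before: layers.UpSampling2D(size=2)(inp) exports a Resize node with
# the tf_half_pixel_for_nn mode that the parser rejects above.
# After: a transposed convolution gives the same 2x spatial effect
# without emitting a Resize op.
out = layers.Conv2DTranspose(256, kernel_size=2, strides=2,
                             padding="same")(inp)
model = tf.keras.Model(inp, out)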

Thank you for the support!

ttyio commented 3 years ago

Thanks for confirming the workaround. Closing.