onnx / onnx-tensorrt

ONNX-TensorRT: TensorRT backend for ONNX
Apache License 2.0

Pad opset 11 not supported #378

Closed axeldavy closed 2 years ago

axeldavy commented 4 years ago

Hi there,

Apparently 'Pad' had changes for opset 11.

I use the tensorflow -> onnx generator.

I use zero padding at some point of my network (not followed by a Conv).

With opset 10, I get

node {
    input: "model_3/conv2d_3/Sigmoid:0"
    output: "model_3/zero_padding2d_1/Pad:0"
    name: "model_3/zero_padding2d_1/Pad"
    op_type: "Pad"
    attribute {
      name: "pads"
      ints: 0
      ints: 8
      ints: 8
      ints: 0
      ints: 0
      ints: 8
      ints: 8
      ints: 0
      type: INTS
    }
  }

With opset 11, I get

 node {
    input: "model_3/conv2d_3/Sigmoid:0"
    input: "model_3/zero_padding2d_2/Pad__188:0"
    output: "model_3/zero_padding2d_1/Pad:0"
    name: "model_3/zero_padding2d_1/Pad"
    op_type: "Pad"
  }
[....]
initializer {
    dims: 8
    data_type: 7
    name: "model_3/zero_padding2d_2/Pad__188:0"
    raw_data: "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000\010\000\000\000\000\000\000\000"
  }

As onnx-tensorrt expects the "pads" field to be present, the import fails with IndexError: Attribute not found: pads
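The opset 11 dump can be checked by hand: with dims: 8 and data_type: 7 (INT64), the raw_data should hold eight 8-byte integers. The sketch below decodes them with Python's standard library. It assumes little-endian byte order and that the byte string, rewritten compactly here, is transcribed accurately from the dump; neither is confirmed in this thread.

```python
import struct

# The 64 bytes of raw_data from the dump, written compactly
# (assumption: transcribed accurately; \010 is octal for 8).
raw = (
    b"\000" * 16
    + b"\010" + b"\000" * 7
    + b"\010" + b"\000" * 23
    + b"\010" + b"\000" * 7
    + b"\010" + b"\000" * 7
)

# Eight little-endian signed 64-bit integers (assumed byte order).
pads = struct.unpack("<8q", raw)

# ONNX Pad layout: all begin values first, then all end values.
rank = len(pads) // 2
begins, ends = pads[:rank], pads[rank:]
for axis, (b, e) in enumerate(zip(begins, ends)):
    print(f"axis {axis}: begin={b}, end={e}")
```

Under those assumptions the values come out as (0, 0, 8, 8, 0, 0, 8, 8). The padding therefore travels as a second input tensor rather than as a "pads" attribute, which is consistent with an importer that only looks for the attribute failing.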

Unfortunately I need to use opset 11, as I use an op that needs at least opset 10, and my network is buggy with opset 10 (no idea whether the cause is the tensorflow conversion or tensorrt). Opset 11 without the padding is ok.

zidanehuang001 commented 4 years ago

met same issue...

kaishian commented 4 years ago

We encountered the same problem and hope the following information is helpful.

error message and back-trace stack image

corresponding model file model

mrlzla commented 4 years ago

the same problem

amrit110 commented 4 years ago

any update to this?

rmccorm4 commented 4 years ago

Hi,

Can someone share (1) an ONNX model with the opset 11 zero padding that's not working and (2) the exact commands used to produce the ONNX model so I can reproduce?

amrit110 commented 4 years ago

@rmccorm4, so i was trying to convert a mobilenetv2 to onnx and then to trt engine. with opset 11, i get the above error where onnx2trt fails with what(): Attribute not found: pads (ONNX model link).

I used the Python API to generate the ONNX file.

import tensorflow as tf
import onnx
import tf2onnx

_INPUT_NAME = 'ImageTensor'
_RAW_OUTPUT_NAME = 'RawSemanticPredictions'

tf.reset_default_graph()
with tf.Graph().as_default() as tf_graph:
    # frozen_graph_def: a tf.GraphDef loaded beforehand from the frozen graph file
    tf.import_graph_def(frozen_graph_def, name='')
    onnx_graph = tf2onnx.tfonnx.process_tf_graph(tf_graph,
                                                 input_names=[_INPUT_NAME + ':0'],
                                                 output_names=[_RAW_OUTPUT_NAME + ':0'],
                                                 opset=11)
    model_proto = onnx_graph.make_model("onnx")
    onnx.checker.check_model(model_proto)
    with open("model.onnx", "wb") as f:
        f.write(model_proto.SerializeToString())

Weirdly, with opset 10 i run into the same assert as in (https://github.com/onnx/onnx-tensorrt/issues/79), which is strange; somehow the onnx conversion messes up the padding (https://github.com/onnx/tensorflow-onnx/issues/201). you can load the frozen_graph_def from file link.

rmccorm4 commented 4 years ago

@amrit110,

I'm getting this error from your ONNX model with TRT7 (with or without OSS build):

tf2onnx opset 11

ERROR: ModelImporter.cpp:92 In function parseGraph:
[8] Assertion failed: convertOnnxWeights(initializer, &weights, ctx)
...
&&&& FAILED TensorRT.trtexec # trtexec --explicitBatch --onnx=model.onnx

tf2onnx opset 10

While parsing node number 133 [SpaceToDepth]:
ERROR: builtin_op_importers.cpp:2946 In function importSpaceToDepth:
[6] Assertion failed: dims.d[1 + i] % block_size == 0
...
&&&& FAILED TensorRT.trtexec # trtexec --explicitBatch --onnx=model.opset10.onnx

amrit110 commented 4 years ago

yes i get that too while using trtexec, which doesn't say much to help debug. Then i tried the onnx2trt executable which gave me

terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: pads

Edit: yeah i get the same Assertion failed: dims.d[1 + i] % block_size == 0, for opset 10. for 11 i get the above.

rmccorm4 commented 4 years ago

Hi @amrit110 ,

So I played around with the ONNX parser source code for a bit to try to fix this error:

terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: pads

This is what I came up with naively, though I haven't really played around with this code before:

DEFINE_BUILTIN_OP_IMPORTER(Pad)
    ...
    std::vector<int> onnx_padding;
    if (ctx->getOpsetVersion() >= 11) {
        int pad;
        auto pads_tensor = inputs.at(1).weights();
        for (int i = 0; i < pads_tensor.count(); i++) {
            pad = (static_cast<int const*>(pads_tensor.values))[i];
            onnx_padding.push_back(pad);
            std::cout << "ONNX PADDING:" << pad << std::endl;
        }
    } else {
        onnx_padding = attrs.get<std::vector<int>>("pads");
    }
    ...

After that I get past the error, and the pad values are read correctly, matching what I saw in Netron:

ONNX PADDING:0
ONNX PADDING:2
ONNX PADDING:2
ONNX PADDING:0
ONNX PADDING:0
ONNX PADDING:2
ONNX PADDING:2
ONNX PADDING:0

But I got a new error which is more standard:

While parsing node number 424 [Pad]:
ERROR: /mnt/TensorRT/parsers/onnx/builtin_op_importers.cpp:2114 In function importPad:
[8] Assertion failed: onnx_padding.size() == 8 && onnx_padding[0] == 0 && onnx_padding[1] == 0 && onnx_padding[4] == 0 && onnx_padding[5] == 0 && "This version of TensorRT only supports padding on the outer two dimensions on 4D tensors!"

I think this is now more in the territory of model architecture, and I'm assuming you just have to abide by that error for now (change your padding in your model to meet those constraints). Though I'm not 100% sure.
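The assertion text can be restated as a small check. A minimal Python sketch that mirrors the quoted condition only (the function name is hypothetical and this is not the parser's code): the pads list must have 8 entries, and the begin and end values for axes 0 and 1 must be zero.

```python
def satisfies_pad_constraint(pads):
    """Mirror of the quoted assertion: 8 values, no padding on axes 0 and 1."""
    return (
        len(pads) == 8
        and pads[0] == 0 and pads[1] == 0   # begin padding, axes 0 and 1
        and pads[4] == 0 and pads[5] == 0   # end padding, axes 0 and 1
    )


printed_pads = [0, 2, 2, 0, 0, 2, 2, 0]   # the values printed above
print(satisfies_pad_constraint(printed_pads))               # False: axis 1 is padded
print(satisfies_pad_constraint([0, 0, 2, 2, 0, 0, 2, 2]))   # True: only axes 2 and 3
```

The printed values pad axes 1 and 2, which is what one would expect if the tensor were laid out as (batch, rows, columns, channels); the check only passes when the padding sits on the last two axes.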

amrit110 commented 4 years ago

@rmccorm4, this is the padding issue i mentioned in my earlier comment, which i also get from using opset 10. but you are right, the padding op is tied to the model, this seems related to https://github.com/onnx/tensorflow-onnx/issues/201. in that issue they say its some order of padding which doesn't get translated correctly. i dont understand how they got around it using their fix, ill look into that though.

copaah commented 4 years ago

Any updates on this issue?

rmccorm4 commented 4 years ago

Looks like this padding opset11 support was just added to master here: https://github.com/onnx/onnx-tensorrt/pull/408

@amrit110 can you see if this works for your model?

basaltzhang commented 4 years ago

After updating the code, I get another error using onnx opset 11

 UNSUPPORTED_NODE: Assertion failed: inputs.at(1).is_weights()

It seems the pads input is a tensor here, not a weight

copaah commented 4 years ago

@basaltzhang Did you resolve this issue? I encounter the same problem.

basaltzhang commented 4 years ago

@basaltzhang Did you resolve this issue? I encounter the same problem.

I changed to nn.ZeroPad2d with opset_version=10, that's fine for me

daixiangzi commented 4 years ago

I hit the same problem when I run trtexec --onnx=xxx.onnx

terminate called after throwing an instance of 'std::out_of_range' what(): Attribute not found: pads

amrit110 commented 4 years ago

Looks like this padding opset11 support was just added to master here: #408

@amrit110 can you see if this works for your model?

@rmccorm4, checked and it works fine! thanks.

rmccorm4 commented 4 years ago

Thanks @amrit110, seems this can be closed then.

ray-lee-94 commented 4 years ago

I can successfully convert pytorch-efficient-b2 model to onnx model and the outputs match with Pytorch model. But when I load the onnx model with TensorRT7.0, it throws some pad warnings and the outputs are not the same.

Beam-wi commented 4 years ago

@VCBE123 have you solved this problem? i encountered the same problem. It showed me 'IndexError: Attribute not found: pads' when i convert pytorch-efficient-d0 to onnx.

wxthss82 commented 4 years ago

@VCBE123 @Beam-wi I have the same issue... No idea how to solve it.

vandesa003 commented 4 years ago

@VCBE123 @Beam-wi I have the same issue... No idea how to solve it.

Same issue here, tried to convert efficientnet-b5 but failed because of the padding op.

huangzuo commented 4 years ago

same here, any solution?

ibrahimsoliman97 commented 4 years ago

same problem here..

silaopi commented 4 years ago

I tried tensorrt 7.1 and 7.0, still the same problem here.

davideboschetto commented 4 years ago

Are there any plans to officially support ZeroPadding layers (for example, multidimensional tf.keras.layers.ZeroPadding ) in opset 13?

cognitiveRobot commented 4 years ago

Having the same issue. opset_version=12 . Any update?

dinhphuong98 commented 4 years ago

I'm facing the same problem. Is there any solution yet?

dinhphuong98 commented 4 years ago

I'm facing the same problem. Is there any solution yet?

thanks for responding. I still meet the same error :(((

haimh100 commented 4 years ago

same problem here, Is there any solution?

nlunscher-cpr commented 3 years ago

any solutions found that can be shared or released in a new version?

simutisernestas commented 3 years ago

@rmccorm4 problem is still relevant

kevinch-nv commented 3 years ago

What version of TRT are you guys using? We should handle this case correctly in both TRT 7.1 and TRT 7.2.

simutisernestas commented 3 years ago

@kevinch-nv 7.2.1.6

nlunscher-cpr commented 3 years ago

@kevinch-nv 7.1.3.4

kevinch-nv commented 3 years ago

Opset 11 pad should work for both those versions. I've uploaded an ONNX model here that performs an opset 11 pad. This should work with both 7.2 and 7.1.

nlunscher-cpr commented 3 years ago

@kevinch-nv Thanks for the test model. I can confirm that your model does work. You pad a 1x3x3x3 tensor to 1x3x4x4, which corresponds to the columns and channels in a (batch, rows, columns, channel) tensor format.

I tested this model that pads the rows and columns, which is how I ran into this issue. Nx3x3x3 pad to Nx4x4x3 didn't work with opset 11.

ERROR: builtin_op_importers.cpp:2194 In function importPad: [8] Assertion failed: convertOnnxPadding(onnxPadding, &begPadding, &endPadding) && "This version of TensorRT only supports padding on the outer two dimensions!"

kevinch-nv commented 3 years ago

Typically ONNX models are in CHW format, and TRT parses the model assuming CHW order. It sounds like transposing the input and the corresponding pad operator to CHW will solve your issue.
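To make the layout point concrete, here is a minimal Python sketch of how a pads list changes when the tensor is transposed from (batch, rows, columns, channels) to (batch, channels, rows, columns). The helper name and the example pads are hypothetical, and the graph would also need Transpose nodes around the Pad, which this sketch does not add.

```python
def permute_pads(pads, perm):
    """Reorder ONNX-style pads (all begins, then all ends) for a transposed tensor.

    Output axis i of the transpose is input axis perm[i], so it inherits
    that axis's begin and end padding.
    """
    rank = len(pads) // 2
    begins, ends = pads[:rank], pads[rank:]
    return [begins[p] for p in perm] + [ends[p] for p in perm]


NHWC_TO_NCHW = (0, 3, 1, 2)

# Hypothetical pads for an Nx3x3x3 -> Nx4x4x3 case: one extra row and column at the end.
pads_nhwc = [0, 0, 0, 0, 0, 1, 1, 0]
pads_nchw = permute_pads(pads_nhwc, NHWC_TO_NCHW)
print(pads_nchw)  # [0, 0, 0, 0, 0, 0, 1, 1]
```

After the permutation the non-zero entries sit on the last two axes, which is the only arrangement the "outer two dimensions" assertion quoted in this thread accepts.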

nlunscher-cpr commented 3 years ago

@kevinch-nv HWC is the default format used in tensorflow/keras. Is that input format not supported here? I have not had this issue using any other layers in this format.

jiarenyf commented 3 years ago

Having the same issue. opset_version=12 .

After update to opset_version=13, new problem occurs: IndexError: Attribute not found: axes

Any help !!!

kevinch-nv commented 3 years ago

@nlunscher-cpr HWC can be supported in the parser if we add transposes. I can work on the change adding this.

@jiarenyf the Attribute not found: axes error doesn't sound like it's coming from the Pad op - since there's no axes input / attribute for that operator.

axeldavy commented 3 years ago

I can confirm that tensorflow's ZeroPadding2D converted to onnx works with the latest TensorRT release.

axeldavy commented 3 years ago

Actually, tensorflow seems to have some randomness in its generated onnx... Depending on my network weights, the onnx generated is slightly different.

And sometimes TensorRT manages to read and execute it. And sometimes it says [8] Assertion failed: convertOnnxPadding(onnxPadding, &begPadding, &endPadding) && "This version of TensorRT only supports padding on the outer two dimensions!"

So it'd be great if all Padding parameters were supported, or if TensorRT was doing the correct transpose operations to make it work.

The working onnx contains:

node {
    input: "model_3/conv2d_8/Conv2D:0"
    input: "model_3/zero_padding2d/Pad__252:0"
    output: "model_3/zero_padding2d/Pad:0"
    name: "model_3/zero_padding2d/Pad"
    op_type: "Pad"
  }
  node {
    input: "model_3/zero_padding2d/Pad:0"
    input: "new_shape__555"
    output: "model_3/conv2d_8/Conv2D__248:0"
    name: "model_3/conv2d_8/Conv2D__248"
    op_type: "Reshape"
    domain: ""
  }
  node {
    input: "model_3/conv2d_8/Conv2D__248:0"
    input: "new_shape__533"
    output: "model_3/model_1/conv1_1/Conv2D_1__253:0"
    name: "model_3/model_1/conv1_1/Conv2D_1__253"
    op_type: "Reshape"
    domain: ""
  }

And the non-working one has

node {
    input: "model_3/conv2d_8/Conv2D:0"
    input: "new_shape__533"
    output: "model_3/conv2d_8/Conv2D__248:0"
    name: "model_3/conv2d_8/Conv2D__248"
    op_type: "Reshape"
    domain: ""
  }
  node {
    input: "model_3/conv2d_8/Conv2D__248:0"
    input: "model_3/zero_padding2d/Pad__252:0"
    output: "model_3/zero_padding2d/Pad:0"
    name: "model_3/zero_padding2d/Pad"
    op_type: "Pad"
  }
  node {
    input: "model_3/zero_padding2d/Pad:0"
    input: "new_shape__534"
    output: "model_3/model_1/conv1_1/Conv2D_1__253:0"
    name: "model_3/model_1/conv1_1/Conv2D_1__253"
    op_type: "Reshape"
    domain: ""
  }

Frankly I have no clue why these Reshape operations are introduced in the onnx conversion, and why consecutive ones are not merged.

Mirocle007 commented 3 years ago

In node -1 (importPad): UNSUPPORTED_NODE: Assertion failed: inputs.at(1).is_weights()

col-in-coding commented 3 years ago

After one year, this issue is still there :(

col-in-coding commented 3 years ago

I got this resolved by using onnx-simplifier

whcjb commented 3 years ago

Still have the problem, can someone help?

[TensorRT] VERBOSE: ModelImporter.cpp:107: Parsing node: Pad_88 [Pad]
[TensorRT] VERBOSE: ModelImporter.cpp:123: Searching for input: 104
[TensorRT] VERBOSE: ModelImporter.cpp:123: Searching for input: 177
[TensorRT] VERBOSE: ModelImporter.cpp:123: Searching for input: 178
[TensorRT] VERBOSE: ModelImporter.cpp:129: Pad_88 [Pad] inputs: [104 -> (1, -1, 224, 224)], [177 -> (-1)], [178 -> ()],
Traceback (most recent call last):
  File "examples/inpainting/inpaintint_onnx2trt.py", line 22, in <module>
    if not parser.parse(model.read()):
IndexError: Attribute not found: pads

quancq commented 3 years ago

Having the same issue. opset_version=12 .

After update to opset_version=13, new problem occurs: IndexError: Attribute not found: axes

Any help !!!

I have the same problem when building a TensorRT engine from an ONNX model. My model contains a RoBERTa model. I used opset_version=13 when converting the PyTorch model to ONNX. If I use opset_version=12, the error is:

[TensorRT] ERROR: INVALID_ARGUMENT: getPluginCreator could not find plugin CumSum version 1
ERROR: Failed to parse the ONNX file.

Is there any solution yet? Thanks.

p890040 commented 3 years ago

Hi guys. I solved this problem by reshaping inputs to 4 dimensions. For example, the input dimension is [123, 4], and I want to pad to [1000, 4].

Replace output = F.pad(input, (0,0,0,1000-123)) with

input = input[None,None,...] # [1, 1 , 123, 4]
output = F.pad(input, (0,0,0,1000-123))
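One plausible reading of why this helps: the importer assertion quoted earlier in the thread wants exactly 8 pad values with padding only on the last two axes. The Python sketch below converts an F.pad-style tuple (last dimension first, as left/right pairs) into an ONNX-style pads list (all begins, then all ends) for a given rank. The helper name is hypothetical, and the exact list the PyTorch exporter emits for this model was not checked.

```python
def torch_pad_to_onnx(pad, rank):
    """Convert an F.pad-style tuple to ONNX pads for a tensor of the given rank."""
    begins, ends = [0] * rank, [0] * rank
    for i in range(len(pad) // 2):
        axis = rank - 1 - i          # F.pad lists the last dimension first
        begins[axis] = pad[2 * i]
        ends[axis] = pad[2 * i + 1]
    return begins + ends


pad = (0, 0, 0, 1000 - 123)
print(torch_pad_to_onnx(pad, rank=2))  # [0, 0, 877, 0]
print(torch_pad_to_onnx(pad, rank=4))  # [0, 0, 0, 0, 0, 0, 877, 0]
```

With the 2-D input the list has only 4 entries, while the 4-D version has 8 entries with the two leading axes unpadded.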


hoangmt commented 3 years ago

Seems like the problem is still there. Very simple code (modified from test) to reproduce:

import onnxruntime as rt
import numpy as np
import onnx
from onnx import version_converter
from onnx import AttributeProto, TensorProto, GraphProto, helper

node = onnx.helper.make_node('Pad', inputs=['x', 'pads'], outputs=['y'], mode='constant')

x = helper.make_tensor_value_info('x', TensorProto.FLOAT, [1, 3, 4, 5])
pads = helper.make_tensor_value_info('pads', TensorProto.INT64, [8])
y = helper.make_tensor_value_info('y', TensorProto.FLOAT, [1, 3, 7, 12])
graph_def = helper.make_graph([node], "pad-model", [x, pads], [y])

model_def = helper.make_model(graph_def, producer_name='pad-model')
model_def.opset_import[0].version = 11

onnx.save(model_def,'pad_model.onnx')

Check that the model runs well

import numpy as np
import onnxruntime as rt

sess = rt.InferenceSession('pad_model.onnx')
np_x = np.random.randn(1, 3, 4, 5).astype(np.float32)
np_pads = np.array([0, 0, 1, 3, 0, 0, 2, 4]).astype(np.int64)
# NumPy reference in place of the undefined pad_impl test helper; the node
# has no constant_value input, so the fill value is the default 0.
np_y = np.pad(np_x, [(0, 0), (0, 0), (1, 2), (3, 4)], mode='constant')
y = sess.run(['y'], {"x": np_x, 'pads': np_pads})

Converting to tensorrt using

trtexec --onnx=pad_model.onnx --saveEngine=nms.trt

generates

builtin_op_importers.cpp:2220 In function importPad: [8] Assertion failed: inputs.at(1).is_weights()

Also tried onnx-simplifier

python3 -m onnxsim pad_model.onnx pad_model.onnx --input-shape X:1,3,4,5 pads:8

but still get the above error message afterward. This is on 7.1.3.4 tensorRT