grimoire / torch2trt_dynamic

A PyTorch to TensorRT converter with dynamic shape support
MIT License

Error Code 9: Internal Error ((Unnamed Layer* 3886) [ElementWise]: broadcast dimensions must be conformable ) #21

Open Tailwhip opened 3 years ago

Tailwhip commented 3 years ago

Hi,

I'm trying to build an engine from a PyTorch model (ResNeSt-200 backbone) with float16 conversion. Unfortunately I got an error:

[TensorRT] ERROR: 9: [graphShapeAnalyzer.cpp::throwIfError::1306] Error Code 9: Internal Error ((Unnamed Layer* 3886) [ElementWise]: broadcast dimensions must be conformable )

and the traceback:

Traceback (most recent call last):
  File "optimize_model.py", line 110, in <module>
    main(args)
  File "optimize_model.py", line 86, in main
    trt_model = torch2trt_dynamic(model, [x], fp16_mode=True,
  File "/home/venv_opt/lib/python3.8/site-packages/torch2trt_dynamic/torch2trt_dynamic.py", line 534, in torch2trt_dynamic
    outputs = module(*inputs)
  File "/home/venv_opt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/model.py", line 87, in forward
    d3 = self.decoder3(d4) + e2
  File "/home/venv_opt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/model.py", line 162, in forward
    x = self.conv1(x)
  File "/home/venv_opt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/venv_opt/lib/python3.8/site-packages/torch2trt_dynamic/torch2trt_dynamic.py", line 326, in wrapper
    converter['converter'](ctx)
  File "/home/venv_opt/lib/python3.8/site-packages/torch2trt_dynamic/converters/Conv2d.py", line 12, in convert_Conv2d
    input_trt = trt_(ctx.network, input)
  File "/home/venv_opt/lib/python3.8/site-packages/torch2trt_dynamic/torch2trt_dynamic.py", line 148, in trt_
    num_dim = len(t._trt.shape)
ValueError: __len__() should return >= 0

The shape I give as an input is: opt_shape_param = [[ [1, 3, 448, 448], [4, 3, 448, 448], [4, 3, 448, 448] ]]
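For reference, a rough sketch of the failing call, reconstructed from the traceback above; the build_model() stand-in and the opt_shape_param keyword argument are illustrative assumptions, not the exact contents of optimize_model.py:

import torch
from torch2trt_dynamic import torch2trt_dynamic

model = build_model().cuda().eval()     # hypothetical stand-in for the ResNeSt-200-backboned model
x = torch.rand(1, 3, 448, 448).cuda()   # dummy input at the minimum shape

# one entry per input: [min shape, opt shape, max shape]
opt_shape_param = [[[1, 3, 448, 448], [4, 3, 448, 448], [4, 3, 448, 448]]]

# fp16_mode=True matches the call shown in the traceback; passing the shape
# ranges via the opt_shape_param keyword is an assumption about the script.
trt_model = torch2trt_dynamic(model, [x], fp16_mode=True,
                              opt_shape_param=opt_shape_param)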

I just restored conversion via ONNX in the torch2trt_dynamic.py module and added some code to convert opt_shape_param into a form usable by the torch.onnx.export function, and now optimization works perfectly fine (also with int8 quantization), so I haven't done much further investigation. I also had this problem with TensorRT 7.2.3.4. I'm writing mostly because I don't want to maintain "my own" version of torch2trt_dynamic but rather use the official one.
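(A rough sketch of what such a mapping might look like; the helper name and the dynamic_axes choice below are illustrative assumptions, not the exact contents of the attached zip:)

import torch

def export_onnx_with_dynamic_batch(model, opt_shape_param, onnx_path="model.onnx"):
    # Hypothetical helper: derive a torch.onnx.export call from opt_shape_param.
    # opt_shape_param[0] holds [min_shape, opt_shape, max_shape] for the first input.
    min_shape, opt_shape, max_shape = opt_shape_param[0]
    dummy = torch.rand(*opt_shape).cuda()
    # Mark the batch dimension as dynamic so a TensorRT optimization profile
    # covering the min/opt/max batch sizes can be built from the exported ONNX.
    torch.onnx.export(
        model, dummy, onnx_path,
        input_names=["input"], output_names=["output"],
        dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
        opset_version=11)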

I have two questions:

  1. What's the reason you dropped conversion via ONNX? Or, if there's no particular reason, could it possibly be restored? (I'm sharing my modified module, maybe it'll be helpful: torch2trt_dynamic.zip)
  2. Could you please help me find the reason why building the engine fails with the original torch2trt_dynamic?

Thank you in advance!

clw5180 commented 1 year ago


Same error here, have you solved it? Thanks!

Tailwhip commented 1 year ago

Unfortunately, as you can see, no one answered, so I stayed with the modification that I shared in the zip file. If I remember correctly, it was working there.

grimoire commented 1 year ago

Sorry for the late reply. It is caused by the split op: https://github.com/zhanghang1989/ResNeSt/blob/1dfb3e8867e2ece1c28a65c9db1cded2818a2031/resnest/torch/models/splat.py#L62 . Split is not dynamic-shape friendly: different input shapes can produce a different number of nodes in the compute graph.

import torch

# split(split_size): the number of output chunks depends on the input length
assert len(torch.rand(10).split(5)) == 2
assert len(torch.rand(15).split(5)) == 3

So I made it static when generating the graph. Try using the chunk op instead, which is much more dynamic-shape friendly.
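To illustrate the difference (a minimal sketch, not part of the original reply): chunk fixes the number of outputs through its argument, so the graph topology no longer depends on the input length:

import torch

# chunk(n): the number of output chunks is fixed by the argument,
# so the generated graph has the same topology for any input length.
assert len(torch.rand(10).chunk(2)) == 2
assert len(torch.rand(15).chunk(2)) == 2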