NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter
MIT License

Error during model conversion on jetson xavier nx #824

Open andreazuna89 opened 1 year ago

andreazuna89 commented 1 year ago

Hi, I am using a Jetson Xavier NX and we are running into problems during model conversion. To check that torch2trt is installed correctly, we also ran python3 -m torch2trt.test --name=interpolate, and several of the tests fail with the following errors:

python3 -m torch2trt.test -o test_output.md --name interpolate

| torch2trt.converters.interpolate.test_nearest_mode | float32 | [(1, 2, 12, 12)] | {} | 0.00E+00 | nan | 0.00E+00 | 1.07e+04 | 2.78e+03 | 0.163 | 0.408 |
| torch2trt.converters.interpolate.test_bilinear_mode | float32 | [(1, 4, 12, 12)] | {} | 1.94E-07 | 159.40 | 9.84E-16 | 1.15e+04 | 2.55e+03 | 0.168 | 0.408 |
| torch2trt.converters.interpolate.test_align_corner | float32 | [(1, 3, 12, 12)] | {} | 1.67E+00 | 15.38 | 1.64E-01 | 1.13e+04 | 2.75e+03 | 0.16 | 0.427 |
| torch2trt.converters.interpolate.test_align_corner_functional | float32 | [(1, 3, 12, 12)] | {} | 2.47E+00 | 16.24 | 2.25E-01 | 6.48e+03 | 2.86e+03 | 0.161 | 0.406 |
| torch2trt.converters.interpolate.test_bilinear_mode_odd_input_shape | float32 | [(1, 5, 13, 13)] | {} | 2.38E-07 | 156.73 | 1.35E-15 | 1.07e+04 | 2.76e+03 | 0.162 | 0.414 |
| torch2trt.converters.interpolate.test_size_parameter | float32 | [(1, 4, 12, 12)] | {} | N/A | N/A | N/A | N/A | N/A |
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 168, in <module>
    max_error,psnr_db,mse, fps, fps_trt, ms, ms_trt = run(test, serialize=args.serialize)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 42, in run
    module_trt = torch2trt(module, inputs_conversion, max_workspace_size=1 << 20, **self.torch2trt_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 778, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1120, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/upsampling.py", line 141, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 310, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/converters/interpolate.py", line 77, in convert_interpolate_trt7
    layer.set_input(1, shape._trt)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 1019, in _trt
    self._raw_trt = ctx.network._network.add_concatenation([d._trt for d in self]).get_output(0)
RuntimeError: std::exception

| torch2trt.converters.interpolate.test_size_parameter_odd_input | float32 | [(1, 3, 1, 1)] | {} | N/A | N/A | N/A | N/A | N/A |
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 168, in <module>
    max_error,psnr_db,mse, fps, fps_trt, ms, ms_trt = run(test, serialize=args.serialize)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 42, in run
    module_trt = torch2trt(module, inputs_conversion, max_workspace_size=1 << 20, **self.torch2trt_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 778, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1120, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/upsampling.py", line 141, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 310, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/converters/interpolate.py", line 77, in convert_interpolate_trt7
    layer.set_input(1, shape._trt)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 1019, in _trt
    self._raw_trt = ctx.network._network.add_concatenation([d._trt for d in self]).get_output(0)
RuntimeError: std::exception

| torch2trt.converters.interpolate.test_size_parameter_odd_input | float32 | [(1, 3, 13, 13)] | {} | N/A | N/A | N/A | N/A | N/A |
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 168, in <module>
    max_error,psnr_db,mse, fps, fps_trt, ms, ms_trt = run(test, serialize=args.serialize)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 42, in run
    module_trt = torch2trt(module, inputs_conversion, max_workspace_size=1 << 20, **self.torch2trt_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 778, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1120, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/upsampling.py", line 141, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 310, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/converters/interpolate.py", line 77, in convert_interpolate_trt7
    layer.set_input(1, shape._trt)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 1019, in _trt
    self._raw_trt = ctx.network._network.add_concatenation([d._trt for d in self]).get_output(0)
RuntimeError: std::exception

| torch2trt.converters.interpolate.test_nearest_mode_3d | float32 | [(1, 4, 6, 6, 6)] | {} | 0.00E+00 | nan | 0.00E+00 | 1.12e+04 | 2.62e+03 | 0.167 | 0.427 |
| torch2trt.converters.interpolate.test_bilinear_mode_3d | float32 | [(1, 3, 5, 5, 5)] | {} | 2.38E-07 | 159.00 | 1.02E-15 | 1.02e+04 | 2.7e+03 | 0.164 | 0.436 |
| torch2trt.converters.interpolate.test_align_corner_3d | float32 | [(1, 4, 8, 8, 8)] | {} | 3.82E+00 | 14.57 | 3.13E-01 | 1.06e+04 | 2.86e+03 | 0.164 | 0.42 |
| torch2trt.converters.interpolate.test_bilinear_mode_odd_input_shape_3d | float32 | [(1, 3, 1, 1, 1)] | {} | 0.00E+00 | nan | 0.00E+00 | 1.16e+04 | 2.85e+03 | 0.161 | 0.415 |
| torch2trt.converters.interpolate.test_bilinear_mode_odd_input_shape_3d | float32 | [(1, 3, 2, 4, 4)] | {} | 1.79E-07 | 155.05 | 1.44E-15 | 5.03e+03 | 2.29e+03 | 0.231 | 0.474 |
| torch2trt.converters.interpolate.test_bilinear_mode_odd_input_shape_3d | float32 | [(1, 6, 7, 7, 7)] | {} | 3.58E-07 | 157.70 | 1.45E-15 | 1.13e+04 | 2.65e+03 | 0.163 | 0.438 |
| torch2trt.converters.interpolate.test_size_parameter_3d | float32 | [(1, 1, 12, 12, 12)] | {} | N/A | N/A | N/A | N/A | N/A |
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 168, in <module>
    max_error,psnr_db,mse, fps, fps_trt, ms, ms_trt = run(test, serialize=args.serialize)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 42, in run
    module_trt = torch2trt(module, inputs_conversion, max_workspace_size=1 << 20, **self.torch2trt_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 778, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1120, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/upsampling.py", line 141, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 310, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/converters/interpolate.py", line 77, in convert_interpolate_trt7
    layer.set_input(1, shape._trt)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 1019, in _trt
    self._raw_trt = ctx.network._network.add_concatenation([d._trt for d in self]).get_output(0)
RuntimeError: std::exception

| torch2trt.converters.interpolate.test_size_parameter_odd_input_3d | float32 | [(1, 4, 3, 5, 1)] | {} | N/A | N/A | N/A | N/A | N/A |
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 168, in <module>
    max_error,psnr_db,mse, fps, fps_trt, ms, ms_trt = run(test, serialize=args.serialize)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 42, in run
    module_trt = torch2trt(module, inputs_conversion, max_workspace_size=1 << 20, **self.torch2trt_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 778, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1120, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/upsampling.py", line 141, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 310, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/converters/interpolate.py", line 77, in convert_interpolate_trt7
    layer.set_input(1, shape._trt)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 1019, in _trt
    self._raw_trt = ctx.network._network.add_concatenation([d._trt for d in self]).get_output(0)
RuntimeError: std::exception

| torch2trt.converters.interpolate.test_size_parameter_odd_input_3d | float32 | [(1, 3, 7, 9, 5)] | {} | N/A | N/A | N/A | N/A | N/A |
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 168, in <module>
    max_error,psnr_db,mse, fps, fps_trt, ms, ms_trt = run(test, serialize=args.serialize)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/test.py", line 42, in run
    module_trt = torch2trt(module, inputs_conversion, max_workspace_size=1 << 20, **self.torch2trt_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 778, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1120, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/upsampling.py", line 141, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 310, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/converters/interpolate.py", line 77, in convert_interpolate_trt7
    layer.set_input(1, shape._trt)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 1019, in _trt
    self._raw_trt = ctx.network._network.add_concatenation([d._trt for d in self]).get_output(0)
RuntimeError: std::exception

NUM_TESTS: 17
NUM_SUCCESSFUL_CONVERSION: 11
NUM_FAILED_CONVERSION: 6
NUM_ABOVE_TOLERANCE: 0
NUM_pSNR_TOLERANCE: 3
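
For anyone triaging a long test run like this one, a small helper can separate the passing rows from the failing ones. This is only a sketch: the function name summarize_rows is hypothetical, and it assumes each result row starts with "|", has the test name in the first cell, and marks a failed conversion with N/A in the metric cells, as in the output above.

```python
def summarize_rows(text):
    """Split pasted torch2trt.test result rows into passed and failed test names.

    Assumption: a result row starts with '|', the first cell is the test name,
    the next three cells are dtype, input shapes and kwargs, and a failed
    conversion shows 'N/A' in the metric cells that follow.
    """
    passed, failed = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # traceback lines, blank lines, summary counters
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) < 5:
            continue  # not a full result row
        name, metrics = cells[0], cells[4:]
        (failed if "N/A" in metrics else passed).append(name)
    return passed, failed


# Two rows copied from the output above, plus some traceback noise.
sample = """
| torch2trt.converters.interpolate.test_nearest_mode | float32 | [(1, 2, 12, 12)] | {} | 0.00E+00 | nan | 0.00E+00 | 1.07e+04 | 2.78e+03 | 0.163 | 0.408 |
| torch2trt.converters.interpolate.test_size_parameter | float32 | [(1, 4, 12, 12)] | {} | N/A | N/A | N/A | N/A | N/A |
Traceback (most recent call last):
RuntimeError: std::exception
"""

passed, failed = summarize_rows(sample)
print("passed:", passed)
print("failed:", failed)
```

In the run above, every failing row is a test_size_parameter* test, which is the case where interpolate is called with an explicit size.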

We have JetPack 4.5 (L4T R32.5.0) with PyTorch 1.6 and torch2trt 0.4.0. Can someone help? Could this be a conflict with the PyTorch version?
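
When asking about a possible version conflict, it helps to report the versions the interpreter actually imports rather than the ones you believe are installed. A minimal sketch, assuming the packages expose a __version__ attribute (torch and tensorrt do; torch2trt may not, in which case "unknown" is printed). The helper name report_versions is hypothetical.

```python
import importlib


def report_versions(names=("torch", "tensorrt", "torch2trt")):
    """Return a {package: version} mapping for the packages that matter here."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        except Exception as exc:  # e.g. a missing shared library at import time
            versions[name] = "import failed ({})".format(type(exc).__name__)
        else:
            # Not every package defines __version__, so fall back gracefully.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print("{}: {}".format(name, version))
```

Pasting this output next to the JetPack release makes it easier for others to compare setups.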

Thanks a lot, Andrea

dchssk commented 1 year ago

I'm having the same problem on a Jetson Xavier NX.

[12/19 09:41:23 yolact.eval]: Loading model...
[12/19 09:41:31 yolact.eval]: Model loaded.
[12/19 09:41:31 yolact.eval]: Converting to TensorRT...
[12/19 09:41:31 yolact.eval]: Converting backbone to TensorRT...
[12/19 09:41:34 yolact.eval]: Converting protonet to TensorRT...
[12/19 09:41:34 yolact.eval]: Converting FPN to TensorRT...
Warning: Encountered known unsupported method torch.zeros
Traceback (most recent call last):
  File "eval.py", line 1275, in <module>
    convert_to_tensorrt(net, cfg, args, transform=BaseTransform())
  File "/share/yolact_edge/yolact_edge/utils/tensorrt.py", line 164, in convert_to_tensorrt
    net.to_tensorrt_fpn(cfg.torch2trt_fpn_int8, batch_size=args.trt_batch_size)
  File "/share/yolact_edge/yolact_edge/yolact.py", line 1539, in to_tensorrt_fpn
    self.trt_load_if("fpn_phase_1", trt_fn, x, int8_mode, batch_size=batch_size)
  File "/share/yolact_edge/yolact_edge/yolact.py", line 1481, in trt_load_if
    module = trt_fn(module, trt_fn_params)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 778, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/share/yolact_edge/yolact_edge/yolact.py", line 907, in forward
    x = F.interpolate(x, size=(h, w), mode=self.interpolation_mode, align_corners=False)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 310, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/converters/interpolate.py", line 77, in convert_interpolate_trt7
    layer.set_input(1, shape._trt)
  File "/usr/local/lib/python3.6/dist-packages/torch2trt-0.4.0-py3.6-linux-aarch64.egg/torch2trt/torch2trt.py", line 1019, in _trt
    self._raw_trt = ctx.network._network.add_concatenation([d._trt for d in self]).get_output(0)
RuntimeError: std::exception

andreazuna89 commented 1 year ago

Hi, I think I have solved it by downgrading the installed torch2trt version. Can you try this:

cd torch2trt
git checkout 98f6ac21a9124302d92bfd9c018238004f165310

The commit points to the 0.4.0 version tag of torch2trt.
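
To confirm that a local clone really sits on that commit before rebuilding, something like the following can be used. It is a sketch under assumptions: git is on the PATH, the clone lives in a directory you pass in, and the helper names current_commit and is_pinned are made up for illustration.

```python
import subprocess

# Commit hash quoted in the comment above.
EXPECTED = "98f6ac21a9124302d92bfd9c018238004f165310"


def current_commit(repo_dir):
    """Return the full hash of HEAD in repo_dir (requires git on the PATH)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # Python 3.6 compatible spelling of text=True
        check=True,
    )
    return result.stdout.strip()


def is_pinned(head, expected=EXPECTED):
    """True if the given HEAD hash is exactly the expected commit."""
    return head.strip().lower() == expected.lower()


# Usage (hypothetical path): is_pinned(current_commit("torch2trt"))
```

The tracebacks above load torch2trt from an installed .egg under dist-packages, so checking out the commit alone is probably not enough: the package most likely has to be reinstalled from this checkout (for example with python3 setup.py install, assuming that is how it was installed originally).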

Best, Andrea

dchssk commented 1 year ago

Thanks for the lightning-fast response!

This problem was resolved thanks to your support. I can’t thank you enough.