DerryHub / BEVFormer_tensorrt

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
Apache License 2.0

onnx2trt failed with UNSUPPORTED_NODE #27

Closed WYYAHYT closed 1 year ago

WYYAHYT commented 1 year ago

Command: python tools/bevformer/onnx2trt.py configs/bevformer/bevformer_base_trt.py checkpoints/onnx/bevformer_r101_dcn_24ep.onnx

Error:

[02/17/2023-15:44:16] [TRT] [V] Parsing node: node_of_onnx::Cast_3779 [nan_to_num]
[02/17/2023-15:44:16] [TRT] [V] Searching for input: aten::nan_to_num_3775
[02/17/2023-15:44:16] [TRT] [V] node_of_onnx::Cast_3779 [nan_to_num] inputs: [aten::nan_to_num_3775 -> (4, 1, 6, 40000, 1)[BOOL]], [optional input, not set], [optional input, not set], [optional input, not set], 
[02/17/2023-15:44:16] [TRT] [I] No importer registered for op: nan_to_num. Attempting to import as plugin.
[02/17/2023-15:44:16] [TRT] [I] Searching for plugin: nan_to_num, plugin_version: 1, plugin_namespace: 
ERROR: Failed to parse the ONNX file.
In node 2599 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

Hi, thanks for your great work! Unfortunately, I got the above error. Could it be due to an installation problem?
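The log shows the parser hitting an op (`nan_to_num`) with neither a built-in importer nor a registered plugin creator, which is what triggers `importFallbackPluginImporter` and the UNSUPPORTED_NODE assertion. One way to spot such ops before building the engine is to diff the model's op types against the set you know is covered. A minimal sketch, with a made-up registry set for illustration (with a real model you would collect op types via `onnx.load`):

```python
def find_unregistered_ops(op_types, registered):
    """Return the unique op types not covered by the parser/plugin registry."""
    return sorted(set(op_types) - set(registered))

# With a real model, collect op types like this (path from the issue):
# import onnx
# model = onnx.load("checkpoints/onnx/bevformer_r101_dcn_24ep.onnx")
# op_types = [node.op_type for node in model.graph.node]

# Toy stand-in for the real graph and registry:
op_types = ["Cast", "nan_to_num", "GridSampler2DTRT", "Add"]
registered = {"Cast", "Add", "GridSampler2DTRT"}  # hypothetical coverage set
print(find_unregistered_ops(op_types, registered))  # ['nan_to_num']
```

Every op this returns will make the ONNX parser fall back to a plugin lookup, and fail exactly as above if no creator with a matching name/version/namespace is registered.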

WYYAHYT commented 1 year ago

I found that sh samples/test_trt_ops.sh always fails with errors like this:

Loaded tensorrt plugins from /data/projects/bevformer_tensorrt/BEVFormer_tensorrt/TensorRT/lib/libtensorrt_ops.so

#################### Running GridSampler2DTestCase ####################
test_fp16_bicubic_border_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... /data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/torch/onnx/utils.py:284: UserWarning: `add_node_names' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `add_node_names` argument will be ignored.
  warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
/data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/torch/onnx/utils.py:284: UserWarning: `do_constant_folding' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `do_constant_folding` argument will be ignored.
  warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
/data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/tensorrt/__init__.py:74: DeprecationWarning: Context managers for TensorRT types are deprecated. Memory will be freed automatically when the reference count reaches 0.
  warnings.warn(
ERROR
test_fp16_bicubic_border_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
ERROR
test_fp16_bicubic_reflection_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
ERROR
test_fp16_bicubic_reflection_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.

So was the UNSUPPORTED_NODE error caused by this installation problem?

WYYAHYT commented 1 year ago

I also set TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE) as described in issue #16; here is the output.

Running on the GPU: 0
Loaded tensorrt plugins from /data/projects/bevformer_tensorrt/BEVFormer_tensorrt/TensorRT/lib/libtensorrt_ops.so

#################### Running GridSampler2DTestCase ####################
test_fp16_bicubic_border_NoAlignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... /data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/torch/onnx/utils.py:284: UserWarning: `add_node_names' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `add_node_names` argument will be ignored.
  warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
/data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/torch/onnx/utils.py:284: UserWarning: `do_constant_folding' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `do_constant_folding` argument will be ignored.
  warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
[02/17/2023-16:54:24] [TRT] [I] [MemUsageChange] Init CUDA: CPU +297, GPU +0, now: CPU 927, GPU 12223 (MiB)
[02/17/2023-16:54:26] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +402, GPU +430, now: CPU 1348, GPU 12653 (MiB)
/data/anaconda3/envs/bevformer_tensorrt/lib/python3.8/site-packages/tensorrt/__init__.py:74: DeprecationWarning: Context managers for TensorRT types are deprecated. Memory will be freed automatically when the reference count reaches 0.
  warnings.warn(
[02/17/2023-16:54:26] [TRT] [I] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 1348, GPU 12653 (MiB)
ERROR
test_fp16_bicubic_border_alignCorners (det2trt.models.utils.test_trt_ops.test_grid_sampler.GridSampler2DTestCase) ... Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
Warning: Unsupported operator GridSampler2DTRT. No schema registered for this operator.
[02/17/2023-16:54:26] [TRT] [I] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 1384, GPU 13148 (MiB)
[02/17/2023-16:54:26] [TRT] [I] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 1384, GPU 13148 (MiB)
ERROR

This seems to be the only additional information:

[02/17/2023-16:54:24] [TRT] [I] [MemUsageChange] Init CUDA: CPU +297, GPU +0, now: CPU 927, GPU 12223 (MiB)
[02/17/2023-16:54:26] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +402, GPU +430, now: CPU 1348, GPU 12653 (MiB)
DerryHub commented 1 year ago

It seems to be an ONNX version issue. Please provide your PyTorch and ONNX versions.

WYYAHYT commented 1 year ago

> It seems to be an ONNX version issue. Please provide your PyTorch and ONNX versions.

onnx==1.12.0, torch==1.11.0. Thanks for the reply!

DerryHub commented 1 year ago

Maybe you can try upgrading your PyTorch to v1.12.

WYYAHYT commented 1 year ago

> Maybe you can try upgrading your PyTorch to v1.12.

I tried, but it didn't help... You may be thinking this is the same problem as the nan_to_num-to-ONNX export issue; upgrading PyTorch to v1.12 does fix exporting the nan_to_num operator to ONNX, but it doesn't fix this error.

DerryHub commented 1 year ago

It seems that your TensorRT doesn't support this op. What are the versions of your TensorRT and CUDA?

DerryHub commented 1 year ago

By the way, what's the difference between your environment and mine?

WYYAHYT commented 1 year ago

> It seems that your TensorRT doesn't support this op. What are the versions of your TensorRT and CUDA?

I found the key point!!

Although no nan_to_num error was reported when converting the .pth checkpoint to ONNX with torch==1.11.0 (I don't know why it passed silently), the resulting model still breaks onnx2trt, and I overlooked that. After upgrading torch to v1.12.0, I was still feeding the ONNX model generated with torch v1.11.0 into the TRT conversion, so the error persisted.

Once I realized this and regenerated the ONNX model with torch v1.12.0, running onnx2trt parsed the nan_to_num node successfully!
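For reference, nan_to_num simply replaces NaNs and infinities with finite values; newer torch.onnx exporters lower it into ops the TensorRT parser understands, which is why the re-export fixes the parse. A minimal pure-Python sketch of its semantics (the substitutes for infinities are illustrative defaults here; torch actually uses the dtype's max/min when posinf/neginf are not given):

```python
import math

def nan_to_num(x, nan=0.0, posinf=1e38, neginf=-1e38):
    """Replace NaN with `nan`, +inf with `posinf`, -inf with `neginf`."""
    out = []
    for v in x:
        if math.isnan(v):
            out.append(nan)
        elif math.isinf(v):
            out.append(posinf if v > 0 else neginf)
        else:
            out.append(v)
    return out

print(nan_to_num([1.0, float("nan"), float("inf"), float("-inf")]))
# [1.0, 0.0, 1e+38, -1e+38]
```

Because the op is just elementwise select-and-replace, an exporter can express it with Cast/Where/IsNaN-style primitives instead of a single custom node, avoiding the plugin lookup entirely.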

Thanks for your great work!

DerryHub commented 1 year ago

Glad to hear that!!!

sainttelant commented 1 year ago

@WYYAHYT After you regenerated the ONNX model with compatible versions and successfully converted it to a TRT model, did you try running inference with the custom plugins? And what was the result of running test_trt_ops? Thanks a lot! @DerryHub