NVIDIA-AI-IOT / tf_trt_models

TensorFlow models accelerated with NVIDIA TensorRT
BSD 3-Clause "New" or "Revised" License

subgraph conversion error for subgraph_index:0 #9

Open geonseoks opened 5 years ago

geonseoks commented 5 years ago

Hello. I converted the detection.ipynb notebook to detection.py and ran it, but a subgraph conversion error occurs and there are no detection results.

nvidia@tegra-ubuntu:~/tf_trt_models/examples/detection$ python3 detection.py
2018-08-03 12:56:54.283943: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] ARM64 does not support NUMA - returning NUMA node zero
2018-08-03 12:56:54.284075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.66GiB freeMemory: 1.60GiB
2018-08-03 12:56:54.284128: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-08-03 12:56:55.970822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-03 12:56:55.970892: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-08-03 12:56:55.970919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-08-03 12:56:55.971081: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 908 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
Converted 199 variables to const ops.
['scores', 'classes', 'boxes']
2018-08-03 12:57:59.031208: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 0
2018-08-03 12:58:04.831767: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:383] MULTIPLE tensorrt candidate conversion: 2
2018-08-03 12:58:04.844875: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] subgraph conversion error for subgraph_index:0 due to: "Unimplemented: Require 4 dimensional input. Got 1 Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/zeros_like_30" SKIPPING......( 181 nodes)
2018-08-03 12:58:05.072039: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2660] Max batch size= 1 max workspace size= 23679062
2018-08-03 12:58:05.072123: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2664] Using FP16 precision mode
2018-08-03 12:58:05.072153: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2666] starting build engine
2018-08-03 12:58:48.920384: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2671] Built network
2018-08-03 12:58:49.088932: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2676] Serialized engine
2018-08-03 12:58:49.100270: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2684] finished engine my_trt_op1 containing 434 nodes
2018-08-03 12:58:49.100414: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2704] Finished op preparation
2018-08-03 12:58:49.120219: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2712] OK finished op building
2018-08-03 12:58:55.775380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-08-03 12:58:55.775482: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-03 12:58:55.775510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-08-03 12:58:55.775535: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-08-03 12:58:55.775620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 908 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)

I am using JetPack 3.2.1 and TensorFlow 1.8.0 (Python 3). I don't know what is wrong. Can you help me with this error?

ghost commented 5 years ago

The reported error is actually a warning. In this instance, engine creation failed for one of the discovered subgraphs (subgraph_index:0, 181 nodes), so the TensorRT integration fell back to the original TensorFlow subgraph for it. For the other subgraph, engine creation succeeded: the resulting engine, my_trt_op1, contains 434 nodes.
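If it helps to see this at a glance, here is a rough sketch that scans a captured conversion log and lists which subgraphs were skipped and which engines were built. The function name `summarize_trt_log` and both regular expressions are my own, modeled only on the log lines quoted in this thread; other TensorFlow versions may word these messages differently.

```python
import re

# Patterns modeled on the TF 1.8 contrib TensorRT log lines quoted in this
# thread. They are assumptions, not an official log format.
SKIPPED = re.compile(
    r"subgraph conversion error for subgraph_index:(\d+)"
    r".*?SKIPPING\.*\(\s*(\d+) nodes\)",
    re.DOTALL,
)
BUILT = re.compile(r"finished engine (\S+) containing (\d+) nodes")


def summarize_trt_log(log_text):
    """Return skipped subgraphs and built engines found in a conversion log."""
    skipped = [(int(idx), int(n)) for idx, n in SKIPPED.findall(log_text)]
    built = [(name, int(n)) for name, n in BUILT.findall(log_text)]
    return {"skipped": skipped, "built": built}


# Two lines copied from the log in the original post.
sample = (
    'W tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] subgraph '
    'conversion error for subgraph_index:0 due to: "Unimplemented: Require 4 '
    'dimensional input. Got 1 Postprocessor/BatchMultiClassNonMaxSuppression/'
    'MultiClassNonMaxSuppression/zeros_like_30" SKIPPING......( 181 nodes)\n'
    'I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2684] finished '
    'engine my_trt_op1 containing 434 nodes\n'
)

summary = summarize_trt_log(sample)
print("skipped (index, nodes):", summary["skipped"])
print("built (engine, nodes):", summary["built"])
```

A non-empty "built" list means at least one TensorRT engine was created, even if other subgraphs were skipped and left to run in TensorFlow.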

lmmhh commented 5 years ago

Hello. I have the same problem. When I ran the tests for all of the SSD models, I got this error:

subgraph conversion error for subgraph_index:0 due to: "Unimplemented: Require 4 dimensional input. Got 1 Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/zeros_like_30" SKIPPING......( 181 nodes) subgraph conversion error for subgraph_index:0

I can't get the TRT model. I don't know what is wrong. Can you help me with this error? Thank you very much.