SteveMacenski / jetson_nano_detection_and_tracking

Jetson Nano ML install scripts, automated optimization of robotics detection models, and filter-based tracking of detections
GNU Lesser General Public License v2.1

Error running the TensorRT optimization #19

Open omartin2010 opened 4 years ago

omartin2010 commented 4 years ago

Has anyone run into this kind of error? I've only kept the last part of the output since it's pretty long:

2019-12-27 23:28:09.856472: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:460] There are 1849 ops of 28 different types in the graph that are not converted to TensorRT: Fill, Merge, Switch, Range, ConcatV2, ZerosLike, Identity, NonMaxSuppressionV3, Squeeze, Mul, ExpandDims, Unpack, TopKV2, Cast, Transpose, Placeholder, Sub, Const, Greater, Shape, Where, Reshape, NoOp, GatherV2, AddV2, Pack, Minimum, StridedSlice, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2019-12-27 23:28:11.128378: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:633] Number of TensorRT candidate segments: 2
2019-12-27 23:28:11.468622: F tensorflow/core/util/device_name_utils.cc:92] Check failed: IsJobName(job) 
Aborted (core dumped)

I'm running this on a Jetson TX2 with the latest JetPack (4.3 as of now) and tensorflow-gpu 1.15.0+nv19.12.tf1. Would that cause an issue with this?

Any clue what might be causing this?
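
As a first check, the snippet below (not part of the repo, just an illustration) prints the installed TF version and whether each of the two TF-TRT import paths is exposed by the wheel:

import tensorflow as tf

print("TensorFlow:", tf.__version__)

# Old import path used by tf_download_and_trt_model.py (tf.contrib was dropped after TF 1.x)
try:
    import tensorflow.contrib.tensorrt as contrib_trt  # noqa: F401
    print("tensorflow.contrib.tensorrt: available")
except ImportError:
    print("tensorflow.contrib.tensorrt: not available")

# Newer import path shipped with the TF 1.15 NVIDIA wheels
try:
    from tensorflow.python.compiler.tensorrt import trt_convert  # noqa: F401
    print("tensorflow.python.compiler.tensorrt: available")
except ImportError:
    print("tensorflow.python.compiler.tensorrt: not available")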

omartin2010 commented 4 years ago

I've checked, and it happens specifically when running this call:

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)

in the Python script, if that helps. I found this, which might help: https://jkjung-avt.github.io/jetpack-4.3/

It looks like a lot of effort to get this running properly for me. Given that I'm running TF 1.15 (the wheel provided by NVIDIA), it seems like too much work for my requirements (a personal project... no need to be ultra fast). I'll see if I can instead use another installation of TF 1.13 provided in the 2-day example...
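
For reference, the TF 1.15 wheels also ship the newer TF-TRT entry point under tensorflow.python.compiler.tensorrt. Below is a minimal sketch of the same conversion through that API; it assumes frozen_graph (a GraphDef) and output_names (a list of output node names) were already produced by build_detection_graph, and it uses stock TF 1.15's TrtGraphConverter rather than anything from this repo.

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Same conversion settings as the contrib call above, expressed with the
# TF 1.15 TrtGraphConverter class.
converter = trt.TrtGraphConverter(
    input_graph_def=frozen_graph,       # frozen detection graph from build_detection_graph
    nodes_blacklist=output_names,       # output nodes are kept out of TensorRT segments
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50)
trt_graph = converter.convert()         # returns the TF-TRT optimized GraphDef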

tekeburak commented 3 years ago

Hi @omartin2010, did you solve the problem? I'm facing the same issue. Your help would be very much appreciated, @SteveMacenski.

System info:

Jetpack 4.4.1 [L4T 32.4.4]
tensorflow-1.15.4+nv20.12-cp36-cp36m-linux_aarch64.whl

Error:

Creating Jetson optimized graph...
2020-12-23 14:23:39.825571: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
2020-12-23 14:24:06.363306: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.363477: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2020-12-23 14:24:06.363784: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-12-23 14:24:06.364686: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.364814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties:
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-12-23 14:24:06.364889: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2020-12-23 14:24:06.364978: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2020-12-23 14:24:06.365038: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2020-12-23 14:24:06.365090: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2020-12-23 14:24:06.365139: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2020-12-23 14:24:06.365186: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2020-12-23 14:24:06.365229: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2020-12-23 14:24:06.365360: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.365513: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.365579: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Adding visible gpu devices: 0
2020-12-23 14:24:06.365647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-23 14:24:06.365681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0
2020-12-23 14:24:06.365707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N
2020-12-23 14:24:06.365854: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.366024: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.366123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1284 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-12-23 14:24:24.278440: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:486] There are 1850 ops of 29 different types in the graph that are not converted to TensorRT: Fill, Merge, Switch, Range, ConcatV2, ZerosLike, Identity, NonMaxSuppressionV3, Minimum, StridedSlice, ExpandDims, Unpack, TopKV2, Cast, Transpose, Placeholder, ResizeBilinear, Squeeze, Mul, Sub, Const, Greater, Shape, Where, Reshape, NoOp, GatherV2, AddV2, Pack, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2020-12-23 14:24:25.465888: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 2
2020-12-23 14:24:25.630790: F tensorflow/core/util/device_name_utils.cc:92] Check failed: IsJobName(job)
Aborted (core dumped)

tekeburak commented 3 years ago

I solved the problem. The following changes need to be applied to tf_download_and_trt_model.py:

diff --git a/tf_download_and_trt_model.py b/tf_download_and_trt_model.py
index c5e608c..083f746 100644
--- a/tf_download_and_trt_model.py
+++ b/tf_download_and_trt_model.py
@@ -1,4 +1,4 @@
-import tensorflow.contrib.tensorrt as trt
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
 import sys
 import os
 try:
@@ -19,6 +19,7 @@ print ("Building detection graph from model " + MODEL + "...")
 frozen_graph, input_names, output_names = build_detection_graph(
     config=config_path,
     checkpoint=checkpoint_path,
+    force_nms_cpu=False,
     score_threshold=0.3,
     #iou_threshold=0.5,
     batch_size=1

Please refer to these issues for more details: https://github.com/tensorflow/tensorrt/issues/197 and https://github.com/tensorflow/tensorrt/issues/107
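
For completeness, here is a minimal sketch of the patched conversion flow after applying the diff above, assuming build_detection_graph is provided by NVIDIA's tf_trt_models package; the config/checkpoint paths are placeholders for illustration, not values from the repo.

from tensorflow.python.compiler.tensorrt import trt_convert as trt  # new import path from the patch
from tf_trt_models.detection import build_detection_graph           # assumed source of the helper

config_path = 'data/ssd_mobilenet_v2_coco/pipeline.config'   # placeholder path
checkpoint_path = 'data/ssd_mobilenet_v2_coco/model.ckpt'    # placeholder path

# Build the frozen detection graph; force_nms_cpu=False is the key change
# from the diff (per the patch, this avoids the IsJobName(job) crash).
frozen_graph, input_names, output_names = build_detection_graph(
    config=config_path,
    checkpoint=checkpoint_path,
    force_nms_cpu=False,
    score_threshold=0.3,
    batch_size=1)

# Same TF-TRT conversion call as before, now resolved via the new import.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50)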