SteveMacenski / jetson_nano_detection_and_tracking

Jetson Nano ML install scripts, automated optimization of robotics detection models, and filter-based tracking of detections
GNU Lesser General Public License v2.1

core dumped when running tf_download_and_trt_model.py after the model has been downloaded and is in process of building trt #22

Open nitindantu opened 4 years ago

nitindantu commented 4 years ago

image

tekeburak commented 3 years ago

Hi @nitindantu, did you solve the problem? I'm facing the same issue. Your help would be much appreciated, @SteveMacenski.

System info:

Jetpack 4.4.1 [L4T 32.4.4]
tensorflow-1.15.4+nv20.12-cp36-cp36m-linux_aarch64.whl

Error:

Creating Jetson optimized graph...
2020-12-23 14:23:39.825571: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
2020-12-23 14:24:06.363306: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.363477: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2020-12-23 14:24:06.363784: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-12-23 14:24:06.364686: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.364814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties:
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-12-23 14:24:06.364889: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2020-12-23 14:24:06.364978: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2020-12-23 14:24:06.365038: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2020-12-23 14:24:06.365090: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2020-12-23 14:24:06.365139: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2020-12-23 14:24:06.365186: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2020-12-23 14:24:06.365229: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2020-12-23 14:24:06.365360: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.365513: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.365579: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Adding visible gpu devices: 0
2020-12-23 14:24:06.365647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-23 14:24:06.365681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0
2020-12-23 14:24:06.365707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N
2020-12-23 14:24:06.365854: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.366024: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2020-12-23 14:24:06.366123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1284 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-12-23 14:24:24.278440: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:486] There are 1850 ops of 29 different types in the graph that are not converted to TensorRT: Fill, Merge, Switch, Range, ConcatV2, ZerosLike, Identity, NonMaxSuppressionV3, Minimum, StridedSlice, ExpandDims, Unpack, TopKV2, Cast, Transpose, Placeholder, ResizeBilinear, Squeeze, Mul, Sub, Const, Greater, Shape, Where, Reshape, NoOp, GatherV2, AddV2, Pack, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2020-12-23 14:24:25.465888: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 2
2020-12-23 14:24:25.630790: F tensorflow/core/util/device_name_utils.cc:92] Check failed: IsJobName(job)
Aborted (core dumped)

tekeburak commented 3 years ago

I solved the problem. The following changes need to be applied to tf_download_and_trt_model.py:

diff --git a/tf_download_and_trt_model.py b/tf_download_and_trt_model.py
index c5e608c..083f746 100644
--- a/tf_download_and_trt_model.py
+++ b/tf_download_and_trt_model.py
@@ -1,4 +1,4 @@
-import tensorflow.contrib.tensorrt as trt
+from tensorflow.python.compiler.tensorrt import trt_convert as trt
 import sys
 import os
 try:
@@ -19,6 +19,7 @@ print ("Building detection graph from model " + MODEL + "...")
 frozen_graph, input_names, output_names = build_detection_graph(
     config=config_path,
     checkpoint=checkpoint_path,
+    force_nms_cpu=False,
     score_threshold=0.3,
     #iou_threshold=0.5,
     batch_size=1

Please refer to these issues for more details:
https://github.com/tensorflow/tensorrt/issues/197
https://github.com/tensorflow/tensorrt/issues/107
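
If the script has to run on more than one TensorFlow build, the import change in the diff above could be written as a fallback. This is only a sketch, not something in the repository: `import_trt` and `TRT_MODULE_CANDIDATES` are hypothetical names, and the two module paths are the ones shown on the `+` and `-` lines of the diff.

```python
import importlib

# Module paths to try, in order. The first is the import used in the fix,
# the second is the legacy contrib location the script used originally.
TRT_MODULE_CANDIDATES = (
    "tensorflow.python.compiler.tensorrt.trt_convert",
    "tensorflow.contrib.tensorrt",
)


def import_trt(candidates=TRT_MODULE_CANDIDATES):
    """Return (module, module_path) for the first candidate that imports.

    Raises ImportError listing every failed attempt if none can be imported.
    """
    errors = []
    for path in candidates:
        try:
            return importlib.import_module(path), path
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append("{}: {}".format(path, exc))
    raise ImportError("No TF-TRT module could be imported:\n" + "\n".join(errors))
```

In the script, `trt, trt_path = import_trt()` would then replace the single import line, and printing `trt_path` shows which location was actually used.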