1) Ubuntu 16.04
2) Docker tensorflow/tensorflow 1.13.1 and tensorflow/serving:latest-gpu
3) NVIDIA TensorRT 5.0.2 (https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html)
4) TensorFlow Object Detection Faster R-CNN ResNet101, trained successfully with only 2 classes
5) Model is converted to FP32 (also tried FP16, with the same issue described below)
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# graph, gdef (the frozen GraphDef) and outputs (list of output node names)
# are built earlier in the script.
with graph.as_default():
    with tf.Session() as sess:
        # Convert the frozen graph into a TF-TRT graph.
        trt_graph = trt.create_inference_graph(
            input_graph_def=gdef,
            outputs=outputs,
            max_batch_size=1,
            max_workspace_size_bytes=4000000000,
            is_dynamic_op=True,
            precision_mode='FP32')  # also tried 'FP16' (same issue); 'INT8' untested
        output_node = tf.import_graph_def(trt_graph, return_elements=outputs)
        # sess.run(output_node)
        # Export as a SavedModel so it can be served by TF Serving.
        tf.saved_model.simple_save(
            sess,
            rt_output_file_name_32,
            inputs={'input_image': graph.get_tensor_by_name('{}:0'.format(node.name))
                    for node in graph.as_graph_def().node if node.op == 'Placeholder'},
            outputs={t: graph.get_tensor_by_name('import/' + t) for t in outputs})
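As a side note, a quick way to sanity-check how much of the graph TF-TRT actually converted is to count the TRTEngineOp nodes in the returned GraphDef (a minimal sketch of my own, using the trt_graph variable from above):

# How many TensorRT segments did the conversion produce?
trt_engine_names = [n.name for n in trt_graph.node if n.op == 'TRTEngineOp']
print('TRTEngineOp segments: {}'.format(len(trt_engine_names)))
for name in trt_engine_names:
    print('  {}'.format(name))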
RUN:

docker kill food_non_food
docker run --runtime=nvidia -p 8501:8501 \
    --mount type=bind,source=/mnt/hatto/food_non_food,target=/models/food_non_food \
    -e MODEL_NAME=food_non_food -t tensorflow/serving:latest-gpu
CLIENT:

import time
import numpy as np
import PIL.Image
import requests

image = PIL.Image.open(IMAGE_PATH)
image_np = np.array(image)
payload = {"instances": [image_np.tolist()]}
SERVING_URL = 'http://localhost:8501/v1/models/food_non_food:predict'
start = time.time()
t = requests.post(SERVING_URL, json=payload)
end = time.time()
print('Took ', end - start)
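One thing I am not sure about: the JSON body for a single 1024x1024 RGB image is millions of numbers, so part of the per-request time measured above may be serialization rather than inference. A rough size estimate (my own back-of-the-envelope, not a measurement of the server):

import json
import numpy as np

# 1024 * 1024 * 3 = 3,145,728 pixel values; serialized as decimal text with
# commas this comes to several MB of JSON per request.
image_np = np.random.randint(0, 255, (1024, 1024, 3), dtype=np.uint8)
body = json.dumps({"instances": [image_np.tolist()]})
print('payload size: %.1f MB' % (len(body) / 1e6))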
Consistently received this WARNING:
2019-07-10 06:52:26.523782: W external/org_tensorflow/tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:264] Engine buffer is full. buffer limit=1, current entries=1, requested batch=153216
2019-07-10 06:52:26.523827: W external/org_tensorflow/tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:281] Failed to get engine batch, running native segment for import/ClipToWindow/Area/TRTEngineOp_0
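My understanding (I may be wrong) is that the TRTEngineOp treats the first dimension of its input tensor as the batch size, and for an internal segment like import/ClipToWindow/Area that dimension is the number of box elements, not the number of images, which would explain a "requested batch" of 153216 for a single image. One thing I plan to try is re-converting so that tiny segments stay in native TensorFlow and more than one engine can be cached (a sketch against the TF 1.13 contrib API; the two added values are guesses, not tested):

# Re-conversion attempt: skip small subgraphs (like ClipToWindow/Area) and
# allow more than one cached engine per TRTEngineOp.
trt_graph = trt.create_inference_graph(
    input_graph_def=gdef,
    outputs=outputs,
    max_batch_size=1,
    max_workspace_size_bytes=4000000000,
    is_dynamic_op=True,
    precision_mode='FP32',
    minimum_segment_size=50,     # guess: leave segments under 50 nodes in TF
    maximum_cached_engines=4)    # guess: raise the "buffer limit=1" cache size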
This is it. It runs, but it always takes 1.8 seconds/image (size 1024x), which is terrible! The warning above keeps popping up claiming a batch size of 153216, while I submit only ONE SINGLE image!

I do not think TensorRT is at PRODUCTION level just yet!

I also could not run inference after putting the network on the Jetson Xavier. Could you show code for how to make the inference run? Thanks.
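For context, this is the kind of loading/inference code I would expect to work with the SavedModel exported above (a minimal TF 1.x sketch; saved_model_dir is a placeholder and outputs is the same list of output names used during conversion):

import numpy as np
import tensorflow as tf

saved_model_dir = '/path/to/exported/model'  # placeholder path

with tf.Session(graph=tf.Graph()) as sess:
    # Load the SavedModel written by tf.saved_model.simple_save above.
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], saved_model_dir)
    graph = sess.graph
    # Same lookup as at export time: the lone Placeholder is the image input.
    input_tensor = next(graph.get_tensor_by_name('{}:0'.format(n.name))
                        for n in graph.as_graph_def().node
                        if n.op == 'Placeholder')
    output_tensors = [graph.get_tensor_by_name('import/' + t) for t in outputs]
    dummy_image = np.zeros((1, 1024, 1024, 3), dtype=np.uint8)  # dummy input
    results = sess.run(output_tensors, feed_dict={input_tensor: dummy_image})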