1) Ubuntu 16.04
2) Docker tensorflow/tensorflow 1.13.1 and tensorflow/serving:latest-gpu
3) NVIDIA TensorRT 5.0.2 (https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html)
4) TensorFlow Object Detection Faster R-CNN ResNet101, trained successfully with only 2 classes
5) Model is converted to FP32 (also tried FP16, with the same issue described below)
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# graph, gdef (the frozen GraphDef) and outputs (list of output node names)
# are built earlier in the script.
with graph.as_default():
    with tf.Session() as sess:
        # Convert the frozen graph into a TF-TRT graph.
        trt_graph = trt.create_inference_graph(
            input_graph_def=gdef,
            outputs=outputs,
            max_batch_size=1,
            max_workspace_size_bytes=4000000000,
            is_dynamic_op=True,
            precision_mode='FP32')  # also tried 'FP16' (same issue); 'INT8' untested
        output_node = tf.import_graph_def(trt_graph, return_elements=outputs)
        # sess.run(output_node)
        # Export as a SavedModel so it can be served by TF Serving.
        tf.saved_model.simple_save(
            sess,
            rt_output_file_name_32,
            inputs={'input_image': graph.get_tensor_by_name('{}:0'.format(node.name))
                    for node in graph.as_graph_def().node if node.op == 'Placeholder'},
            outputs={t: graph.get_tensor_by_name('import/' + t) for t in outputs})
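As a side note, a quick way to sanity-check how much of the graph TF-TRT actually converted is to count the TRTEngineOp nodes in the returned GraphDef (a minimal sketch of my own, using the trt_graph variable from above):

# How many TensorRT segments did the conversion produce?
trt_engine_names = [n.name for n in trt_graph.node if n.op == 'TRTEngineOp']
print('TRTEngineOp segments: {}'.format(len(trt_engine_names)))
for name in trt_engine_names:
    print('  {}'.format(name))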
RUN:

docker kill food_non_food
docker run --runtime=nvidia -p 8501:8501 \
    --mount type=bind,source=/mnt/hatto/food_non_food,target=/models/food_non_food \
    -e MODEL_NAME=food_non_food -t tensorflow/serving:latest-gpu
CLIENT:

import time
import numpy as np
import PIL.Image
import requests

image = PIL.Image.open(IMAGE_PATH)
image_np = np.array(image)
payload = {"instances": [image_np.tolist()]}
SERVING_URL = 'http://localhost:8501/v1/models/food_non_food:predict'
start = time.time()
t = requests.post(SERVING_URL, json=payload)
end = time.time()
print('Took ', end - start)
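One thing I am not sure about: the JSON body for a single 1024x1024 RGB image is millions of numbers, so part of the per-request time measured above may be serialization rather than inference. A rough size estimate (my own back-of-the-envelope, not a measurement of the server):

import json
import numpy as np

# 1024 * 1024 * 3 = 3,145,728 pixel values; serialized as decimal text with
# commas this comes to several MB of JSON per request.
image_np = np.random.randint(0, 255, (1024, 1024, 3), dtype=np.uint8)
body = json.dumps({"instances": [image_np.tolist()]})
print('payload size: %.1f MB' % (len(body) / 1e6))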
Consistently received this WARNING:
2019-07-10 06:52:26.523782: W external/org_tensorflow/tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:264] Engine buffer is full. buffer limit=1, current entries=1, requested batch=153216
2019-07-10 06:52:26.523827: W external/org_tensorflow/tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:281] Failed to get engine batch, running native segment for import/ClipToWindow/Area/TRTEngineOp_0
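My understanding (I may be wrong) is that the TRTEngineOp treats the first dimension of its input tensor as the batch size, and for an internal segment like import/ClipToWindow/Area that dimension is the number of box elements, not the number of images, which would explain a "requested batch" of 153216 for a single image. One thing I plan to try is re-converting so that tiny segments stay in native TensorFlow and more than one engine can be cached (a sketch against the TF 1.13 contrib API; the two added values are guesses, not tested):

# Re-conversion attempt: skip small subgraphs (like ClipToWindow/Area) and
# allow more than one cached engine per TRTEngineOp.
trt_graph = trt.create_inference_graph(
    input_graph_def=gdef,
    outputs=outputs,
    max_batch_size=1,
    max_workspace_size_bytes=4000000000,
    is_dynamic_op=True,
    precision_mode='FP32',
    minimum_segment_size=50,     # guess: leave segments under 50 nodes in TF
    maximum_cached_engines=4)    # guess: raise the "buffer limit=1" cache size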
This is it. It runs, but it always takes 1.8 seconds/image (size 1024x), which is terrible! The warning above keeps popping up claiming a batch size of 153216, while I submit only ONE SINGLE image!

I do not think TensorRT is at PRODUCTION level just yet!

I also could not run inference after putting the network on the Jetson Xavier. Could you show code for how to make the inference run? Thanks.
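For context, this is the kind of loading/inference code I would expect to work with the SavedModel exported above (a minimal TF 1.x sketch; saved_model_dir is a placeholder and outputs is the same list of output names used during conversion):

import numpy as np
import tensorflow as tf

saved_model_dir = '/path/to/exported/model'  # placeholder path

with tf.Session(graph=tf.Graph()) as sess:
    # Load the SavedModel written by tf.saved_model.simple_save above.
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], saved_model_dir)
    graph = sess.graph
    # Same lookup as at export time: the lone Placeholder is the image input.
    input_tensor = next(graph.get_tensor_by_name('{}:0'.format(n.name))
                        for n in graph.as_graph_def().node
                        if n.op == 'Placeholder')
    output_tensors = [graph.get_tensor_by_name('import/' + t) for t in outputs]
    dummy_image = np.zeros((1, 1024, 1024, 3), dtype=np.uint8)  # dummy input
    results = sess.run(output_tensors, feed_dict={input_tensor: dummy_image})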