Closed kusashim closed 3 years ago
@kusashim, to expedite the troubleshooting process, could you please provide the complete code to reproduce the issue reported here, along with the dataset you are using?
Also, please update TensorFlow to v2.3 (i.e. the stable release) from TensorFlow v2.3.0rc0 and check if you are facing the same issue. Thanks!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
System information
Describe the current behavior
Method 1) inference by defining the model and loading the weights:

```python
model = tf.keras.Model(input_layer, bbox_tensors)
utils.load_weights(model, FLAGS.weights)
pred_bbox = model.predict(image_data)
```

Method 2) inference with the SavedModel (.pb):

```python
saved_model_loaded = tf.saved_model.load(path_to_weight, tags=[tag_constants.SERVING])
infer = saved_model_loaded.signatures['serving_default']
batch_data = tf.constant(image_data)
pred_bbox = infer(batch_data)
```
Inference with method 2 is much faster than with method 1: per-image inference time dropped from 100 ms to 50 ms, and the GPU-Util metric increased from 35% to 47%.
In both inference methods, all settings, including batch size, are identical except for the model-loading part. Images are analyzed one by one, and the model is loaded or defined only once at the beginning, before any images are read.
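For what it's worth, a minimal, framework-agnostic sketch of how the two per-image latencies could be compared fairly (the `time_inference` helper and warm-up count are my own, not from the code above; `dummy_infer` is a stand-in for `model.predict(...)` or `infer(...)`):

```python
import time

def time_inference(infer_fn, inputs, warmup=5, runs=50):
    """Average per-call latency in milliseconds, after warm-up calls.

    Warm-up matters here: the first call to a Keras model or a
    SavedModel signature triggers tracing/graph building and is
    much slower than steady-state calls.
    """
    for _ in range(warmup):
        infer_fn(inputs)
    start = time.perf_counter()
    for _ in range(runs):
        infer_fn(inputs)
    return (time.perf_counter() - start) / runs * 1000.0

# Stand-in for the real inference call; replace with the real model.
dummy_infer = lambda x: [v * 2 for v in x]
latency_ms = time_inference(dummy_infer, [1, 2, 3])
```

Timing each method this way (same input, same warm-up, same run count) rules out one-time tracing cost as the source of the difference.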
Describe the expected behavior
I want to know why inference with the SavedModel (.pb) is faster than inference with the Keras-defined model.
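Not an authoritative answer, but one commonly cited factor: `tf.keras.Model.predict` adds per-call overhead on every invocation (building a `tf.data` pipeline, running callbacks, converting outputs to NumPy), which dominates when images are fed one at a time, whereas a SavedModel signature is a single traced `tf.function` graph call. Wrapping the model's forward pass in a `tf.function` often closes much of the gap. A minimal sketch with a toy model (the layer sizes are arbitrary placeholders, not the model from this issue):

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the detection model; layer sizes are arbitrary.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, input_shape=(4,)),
    tf.keras.layers.Dense(2),
])

# Compile the forward pass into one graph, similar to what a
# SavedModel 'serving_default' signature executes.
@tf.function
def infer(batch):
    return model(batch, training=False)

x = np.random.rand(1, 4).astype(np.float32)
y_predict = model.predict(x)     # per-call Keras overhead each time
y_graph = infer(tf.constant(x))  # single traced graph call
```

Both paths compute the same forward pass, so the outputs match; only the per-call dispatch overhead differs.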