Closed kusashim closed 3 years ago
@kusashim, to expedite the troubleshooting process, could you please provide the complete code to reproduce the issue reported here, along with the dataset you are using?
Also, please update TensorFlow to v2.3 (i.e. the stable release) from TensorFlow v2.3.0rc0 and check if you are facing the same issue. Thanks!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
System information
Describe the current behavior
Method 1) inference by defining the model and loading the weights:

```python
model = tf.keras.Model(input_layer, bbox_tensors)
utils.load_weights(model, FLAGS.weights)
pred_bbox = model.predict(image_data)
```

Method 2) inference with the SavedModel (.pb):

```python
saved_model_loaded = tf.saved_model.load(path_to_weight, tags=[tag_constants.SERVING])
infer = saved_model_loaded.signatures['serving_default']
batch_data = tf.constant(image_data)
pred_bbox = infer(batch_data)
```
Inference with method 2 is much faster than with method 1: per-image inference time dropped from 100 ms to 50 ms, and the GPU-Util metric increased from 35% to 47%.
In both inference methods, all settings, including batch size, are identical except for the model-loading part. Images are analyzed one by one, and the model is loaded or defined only once at the beginning, before any images are read.
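For what it's worth, a minimal, framework-agnostic sketch of how the two per-image latencies could be compared fairly (the `time_inference` helper and warm-up count are my own, not from the code above; `dummy_infer` is a stand-in for `model.predict(...)` or `infer(...)`):

```python
import time

def time_inference(infer_fn, inputs, warmup=5, runs=50):
    """Average per-call latency in milliseconds, after warm-up calls.

    Warm-up matters here: the first call to a Keras model or a
    SavedModel signature triggers tracing/graph building and is
    much slower than steady-state calls.
    """
    for _ in range(warmup):
        infer_fn(inputs)
    start = time.perf_counter()
    for _ in range(runs):
        infer_fn(inputs)
    return (time.perf_counter() - start) / runs * 1000.0

# Stand-in for the real inference call; replace with the real model.
dummy_infer = lambda x: [v * 2 for v in x]
latency_ms = time_inference(dummy_infer, [1, 2, 3])
```

Timing each method this way (same input, same warm-up, same run count) rules out one-time tracing cost as the source of the difference.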
Describe the expected behavior
I want to know why inference with the SavedModel (.pb) is faster than inference with the Keras-defined model.
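Not an authoritative answer, but one commonly cited factor: `tf.keras.Model.predict` adds per-call overhead on every invocation (building a `tf.data` pipeline, running callbacks, converting outputs to NumPy), which dominates when images are fed one at a time, whereas a SavedModel signature is a single traced `tf.function` graph call. Wrapping the model's forward pass in a `tf.function` often closes much of the gap. A minimal sketch with a toy model (the layer sizes are arbitrary placeholders, not the model from this issue):

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the detection model; layer sizes are arbitrary.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, input_shape=(4,)),
    tf.keras.layers.Dense(2),
])

# Compile the forward pass into one graph, similar to what a
# SavedModel 'serving_default' signature executes.
@tf.function
def infer(batch):
    return model(batch, training=False)

x = np.random.rand(1, 4).astype(np.float32)
y_predict = model.predict(x)     # per-call Keras overhead each time
y_graph = infer(tf.constant(x))  # single traced graph call
```

Both paths compute the same forward pass, so the outputs match; only the per-call dispatch overhead differs.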