StanislasBertrand / RetinaFace-tf2

RetinaFace (RetinaFace: Single-stage Dense Face Localisation in the Wild, published in 2019) reimplemented in TensorFlow 2.0, with pretrained weights available!
MIT License

Improve inference time leveraging `tf.function` #7

Closed AlexDut closed 4 years ago

AlexDut commented 4 years ago

Hi,

I made some improvements to the inference time of the `RetinaFace.detect` method. There are two significant changes in this PR:

  1. The `RetinaFace.model` attribute was decorated with `tf.function`, using `(None, None, None, 3)` as the input signature. Thanks to this, TensorFlow traces a graph, and calling the model on inputs is much faster. The input signature shape prevents TensorFlow from retracing a new graph (a costly operation) whenever the function is called with a new input shape. This is very useful for RetinaFace, as input images can have arbitrary shapes.
  2. I split the pre-processing code into private methods that can be reused if we want to implement batch prediction. I also removed transpose operations that were unnecessary.
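The first change can be sketched as follows. This is a minimal, hypothetical stand-in for the RetinaFace model (the real architecture and weights are not reproduced here); the point is the `tf.function` decoration with a dynamic-shape input signature, so one traced graph is reused for images of any height and width:

```python
import tensorflow as tf

# Hypothetical tiny model standing in for the RetinaFace backbone.
model = tf.keras.Sequential([tf.keras.layers.Conv2D(4, 3, padding="same")])

# Fixed rank, dynamic spatial dims: TensorFlow traces a single graph
# and reuses it for any HxW input, instead of retracing (a costly
# operation) for every new image shape.
@tf.function(
    input_signature=[tf.TensorSpec(shape=(None, None, None, 3), dtype=tf.float32)]
)
def predict(images):
    return model(images)

# Two different spatial sizes reuse the same traced function.
a = predict(tf.zeros((1, 32, 32, 3)))
b = predict(tf.zeros((1, 48, 64, 3)))
```

Without the `input_signature`, each new image shape would trigger a fresh trace, which is where much of the original per-image overhead comes from.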

With the model instantiated with default parameters and processing images one by one, the inference time on 475 images was reduced roughly threefold (from 342s to 115s, tested on a 1080Ti).

I am also considering developing a `detect_batch` method that pads images with `RetinaFace.pixel_means`. This might have a small negative impact on accuracy, since the padding adds noise, but it could provide another boost in inference time. What do you think?
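The padding idea could look something like the sketch below (the `detect_batch` method itself is not implemented here, and `PIXEL_MEANS` is a hypothetical stand-in for `RetinaFace.pixel_means`): each image in the batch is padded to the batch's maximum height and width with the per-channel mean values, then stacked into one tensor.

```python
import numpy as np

# Hypothetical per-channel means standing in for RetinaFace.pixel_means.
PIXEL_MEANS = np.array([103.94, 116.78, 123.68], dtype=np.float32)

def pad_to_batch(images):
    """Pad a list of HxWx3 images to a common size and stack them.

    Filling with the pixel means (rather than zeros) keeps the padded
    region close to the network's expected input statistics, but it is
    still synthetic content, hence the possible small accuracy cost.
    """
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    # Start from a batch filled entirely with the mean color...
    batch = np.tile(PIXEL_MEANS, (len(images), max_h, max_w, 1)).astype(np.float32)
    # ...then paste each image into the top-left corner of its slot.
    for i, img in enumerate(images):
        batch[i, : img.shape[0], : img.shape[1], :] = img
    return batch
```

Keeping each image in the top-left corner means detected boxes need no coordinate remapping, only clipping to the original image size.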

StanislasBertrand commented 4 years ago

Hi, thanks for your work, this sounds like a nice speed-up!

Regarding batch inference, I'm not sure why there would be a negative impact on accuracy? Just pad the smaller pictures to the largest picture's size with black (or anything else), or resize all images to the same size.

StanislasBertrand commented 4 years ago

You can open a new PR for the discussion on batch inference