google-ai-edge / LiteRT

LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-device AI, now with an expanded vision.
https://ai.google.dev/edge/litert

Slower inference since TensorFlow Lite 2.12 #91

Open gaikwadrahul8 opened 1 day ago

gaikwadrahul8 commented 1 day ago

System information

Device: Samsung Galaxy S23 Ultra
TensorFlow Lite versions tested: 2.11.0 through 2.14.0 (bundled) and 2.15.0 (Google Play Services)

Standalone code to reproduce the issue

I used the TensorFlow Lite Pose Estimation Android Demo and only changed the bundled TensorFlow Lite version, as sketched below.
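For reference, the only change between runs is the version of the bundled TFLite dependency in the demo's app-level Gradle file. A minimal sketch in Gradle Kotlin DSL, assuming the demo's usual dependency coordinates (the real demo may use the Groovy DSL and additional artifacts):

```kotlin
// app/build.gradle.kts – minimal sketch; only the version string changes per run.
dependencies {
    // Bundled TFLite runtime; swap 2.11.0 / 2.12.0 / 2.13.0 / 2.14.0 here.
    implementation("org.tensorflow:tensorflow-lite:2.14.0")
    // GPU delegate kept on the same version as the core runtime.
    implementation("org.tensorflow:tensorflow-lite-gpu:2.14.0")
}
```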

Any other info / logs

With any bundled version of TFLite >= 2.12.0, the inference time of some models is roughly twice what it is with TFLite v2.11.0. The issue does not reproduce with the TFLite runtime from Google Play Services.

In particular, last week I compared the bundled 2.13.0 against 2.13.0 from Play Services: the bundled version's inference times were twice those of the Play Services version. (I can no longer test this, since the Play Services runtime has been updated to 2.15.0, and that version is not available on Maven.) The sketch below shows how the two runtimes are instantiated.
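To make the bundled vs. Play Services distinction concrete, here is a minimal Kotlin sketch of the two ways an interpreter can be created; the model buffer and thread count are placeholders, not the demo's actual values:

```kotlin
import android.content.Context
import com.google.android.gms.tasks.Tasks
import com.google.android.gms.tflite.java.TfLite
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.InterpreterApi
import java.nio.MappedByteBuffer

// Bundled runtime: the TFLite version is fixed at build time by the
// org.tensorflow:tensorflow-lite dependency compiled into the APK.
fun createBundledInterpreter(model: MappedByteBuffer): Interpreter {
    val options = Interpreter.Options()
    options.setNumThreads(4) // placeholder thread count
    return Interpreter(model, options)
}

// Play Services runtime: the interpreter is provided by Google Play services,
// so the device uses whatever version is currently rolled out (e.g. 2.15.0).
fun createPlayServicesInterpreter(context: Context, model: MappedByteBuffer): InterpreterApi {
    // TfLite.initialize is asynchronous; blocking here only keeps the sketch
    // short and must not be done on the main thread.
    Tasks.await(TfLite.initialize(context))
    return InterpreterApi.create(
        model,
        InterpreterApi.Options()
            .setRuntime(InterpreterApi.Options.TfLiteRuntime.FROM_SYSTEM_ONLY)
            .setNumThreads(4) // placeholder thread count
    )
}
```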

For example, on a Samsung Galaxy S23 Ultra, with the PoseNet model from the example app:

| TFLite version | Runtime | Average inference time |
|----------------|----------------------|-------------------------|
| 2.11.0 | bundled | 12.28 ms |
| 2.12.0 | bundled | 25.93 ms |
| 2.13.0 | bundled | 26.03 ms |
| 2.14.0 | bundled | 25.86 ms |
| 2.15.0 | Google Play Services | 12.12 ms |
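The averages above presumably come from timing repeated runs of the model. A minimal Kotlin sketch of an equivalent measurement, where `interpreter`, `input`, and `output` are hypothetical placeholders for the PoseNet interpreter and its pre-allocated buffers:

```kotlin
import android.os.SystemClock
import org.tensorflow.lite.InterpreterApi

// Average wall-clock inference time in milliseconds over `runs` invocations.
fun averageInferenceMs(interpreter: InterpreterApi, input: Any, output: Any, runs: Int = 100): Double {
    // Warm-up run so one-time initialization cost is not counted.
    interpreter.run(input, output)
    var totalNanos = 0L
    repeat(runs) {
        val start = SystemClock.elapsedRealtimeNanos()
        interpreter.run(input, output)
        totalNanos += SystemClock.elapsedRealtimeNanos() - start
    }
    return totalNanos / runs.toDouble() / 1_000_000.0
}
```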

Any idea where this could come from?

gaikwadrahul8 commented 9 hours ago

This issue, originally reported by @Qheb, has been moved to this dedicated LiteRT repository to improve issue tracking and prioritization. To ensure continuity, we have created this new issue on your behalf.

We appreciate your understanding and look forward to your continued involvement.