quic / ai-hub-models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
https://aihub.qualcomm.com
BSD 3-Clause "New" or "Revised" License

Using NPU in ImageClassification demo app #51

Open ciaranbor opened 1 month ago

ciaranbor commented 1 month ago

I am attempting to use the image classification Android app to verify NPU usage with the tflite runtime. I am using a phone with the Snapdragon 8 Gen 3 chipset.

When running the app, the execution times for CPU and NPU (after a few warm-up runs of each) settle at roughly the same value, ~14 ms, compared to the NPU execution time of <1 ms reported by AI Hub. Is there any way to debug this, or to verify whether the tflite runtime is actually using the NPU?

I have tried both downloading the MobileNet-v3-Small model from AI Hub and using this export script. I have also tried installing the app with the provided build_apk.py script and via Android Studio, after manually copying the tflite model into assets. In each case the result is the same.

When I try the app on a phone without an NPU, it crashes while trying to register the HTP backend. Since the Snapdragon 8 Gen 3 phone does not crash at that step, this leads me to believe the HTP backend is being registered correctly there.
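One way to sanity-check latency numbers independently of the app's on-screen timer is a warm-up-aware benchmark loop: delegate initialization and cache priming inflate the first few invocations, so they should be excluded before comparing CPU and NPU runs. A minimal stdlib-only sketch (here `invoke_fn` is a hypothetical stand-in for a call like `interpreter.invoke()` on the actual model):

```python
import time
import statistics

def benchmark(invoke_fn, warmup=10, iters=50):
    """Time a callable, discarding warm-up runs.

    The first invocations typically include one-off costs
    (delegate init, graph preparation), so they are run but
    not included in the reported statistics.
    """
    for _ in range(warmup):
        invoke_fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        invoke_fn()
        samples.append((time.perf_counter() - start) * 1000.0)  # milliseconds
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
    }

# Dummy workload standing in for a real interpreter invocation:
stats = benchmark(lambda: sum(range(10_000)), warmup=5, iters=20)
print(stats)
```

If the post-warm-up medians for the CPU and NPU paths are still nearly identical, that is a reasonable hint the delegate is silently falling back to CPU rather than the timer being skewed by start-up costs.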

mestrona-3 commented 1 month ago

Hi @ciaranbor, thank you for reporting this! This is a known issue: the app's performance doesn't reproduce the latency expected from AI Hub. We're actively working on a fix and hope to have an update in our next release.

MrRace commented 2 weeks ago

mark~