espressif / esp-tflite-micro

TensorFlow Lite Micro for Espressif Chipsets
Apache License 2.0

Significant Increase in Detection Time During Keyword Spotting Using ESP-TFLite-Micro Compared to TFLite Micro (TFMIC-33) #90

Open pravee625 opened 1 month ago

pravee625 commented 1 month ago

When I was using TFLite Micro from TensorFlow for keyword spotting, I observed an average detection time of approximately 226 ms. However, after switching to the latest version of ESP-TFLite-Micro with ESP-NN, the detection time increased significantly to 926 ms. I am using the same model.cc file for both implementations.

[screenshot: tflite-micro timing]

[screenshot: esp-tflite-micro with esp-nn timing]

Is esp-tflite-micro slower than TensorFlow's tflite-micro?

vikramdattu commented 1 month ago

Hello @pravee625, this is not at all expected. esp-nn, if anything, should reduce the detection time, not increase it. Is it possible for you to share a simple example with which I can reproduce the issue? You can experiment in esp-tflite-micro by enabling and disabling the esp-nn optimisations for profiling.
Additionally, removing this flag here will let you run esp-tflite-micro without esp-nn altogether, which should give timings identical to tflite-micro.

pravee625 commented 1 month ago

Thanks for the fast reply @vikramdattu. I've uploaded two of my project folders: CNN_test_with_new_tflite_ESP-nn.zip, which uses the new ESP-TFLite-Micro library, and chhecking mic for esp32-wrover.zip, which uses the older version of TFLite-Micro that I forked from https://github.com/atomic14/voice-controlled-robot/tree/main/firmware/lib/tfmicro.

I've attached both project folders at the link below: https://github.com/pravee625/KWS-using-tflite-micro

You can use VS Code and PlatformIO to reproduce my issue exactly. I am currently using the ESP32-DevKitC_V4 board, which carries the ESP32-WROVER-IB module.

The main difference in my implementation of both libraries can be seen in lib\neural_network\src\NeuralNetwork.cpp of both project folders.

pravee625 commented 1 month ago

I suspect that esp-nn is not being used when my project is compiled. I discovered this recently, but the model still runs significantly slower than TFLite Micro even without esp-nn. I am unsure where I might be going wrong in my implementation of the esp-tflite-micro library.

Could you please take a look at my code, @vikramdattu? I've attached the link to the file CNN_test_with_new_tflite_ESP-nn.zip in the comment above. Specifically, please review the NeuralNetwork.cpp file located at CNN_test_with_new_tflite_ESP-nn\lib\neural_network\src\NeuralNetwork.cpp, where I've implemented the esp-tflite-micro library.

Thank you!


vikramdattu commented 1 month ago

Although your comment https://github.com/espressif/esp-tflite-micro/issues/91#issuecomment-2275019833 says the time is now lower, i.e., 200 ms, that is not much faster than 226 ms!

I have a question:

  • What method do you use to measure the time? Ideally, you should measure raw times across single/multiple invoke calls, not base it on average detections. Only the time taken by the invoke calls themselves gives us an idea of the CPU consumption.

pravee625 commented 1 month ago

@vikramdattu I am measuring the average time over multiple invoke calls.

[screenshot: averaged timing]

A lower time means faster invoke operations, reducing the delay between two invokes and thus increasing the detection rate.

pravee625 commented 1 month ago

> Although your comment #91 (comment) says the time is now lower, i.e., 200 ms, that is not much faster than 226 ms!
>
> I have a question:
>
> • What method do you use to measure the time? Ideally, you should measure raw times across single/multiple invoke calls, not base it on average detections. Only the time taken by the invoke calls themselves gives us an idea of the CPU consumption.

@vikramdattu the raw time for a single invoke call is 174 ms.

[screenshot: single-invoke timing]