mocleiri closed this issue 3 years ago
I have all of the different parts implemented for the ESP32 with an INMP441 MEMS microphone, but it is not working yet.
I was able to use the latest audio segmentation code in the unix port by switching the model.tflite file used. The one I picked up from the training page in the example had the wrong quantization: uint8 instead of int8.
I filed https://github.com/tensorflow/tensorflow/issues/48752
I used the Google Colab script to regenerate it from the TensorFlow checkpoint, and then it worked.
I'm still trying to get the ESP32 to work. I still suspect I'm not reading the I2S DMA buffers fast enough. I will probably need to rework the code so the ESP32 can run on a file loaded from flash, to confirm that baseline inference works before exploring ways to sample the audio on the second (pro) core while inferencing on the app core.
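To decouple sampling from inference across the two cores, my rough idea is a ring buffer: the capture side appends bytes as they arrive from I2S, and the inference side drains them at its own pace. Below is a minimal plain-Python sketch of that buffer (all names are illustrative; on the ESP32 the writer would be fed by the I2S reader on the pro core and the reader would run on the app core):

```python
# Sketch: decouple audio capture from inference with a ring buffer.
# On ESP32/MicroPython the writer would be fed by the I2S peripheral on
# one core while the reader runs inference on the other; here plain
# Python stands in for both sides. All names are hypothetical.

class RingBuffer:
    def __init__(self, size):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0   # next write index
        self.tail = 0   # next read index
        self.count = 0  # bytes currently stored

    def write(self, data):
        """Append bytes, overwriting the oldest data on overflow."""
        for b in data:
            self.buf[self.head] = b
            self.head = (self.head + 1) % self.size
            if self.count == self.size:
                self.tail = (self.tail + 1) % self.size  # drop oldest
            else:
                self.count += 1

    def read(self, n):
        """Pop up to n of the oldest bytes."""
        n = min(n, self.count)
        out = bytearray(n)
        for i in range(n):
            out[i] = self.buf[self.tail]
            self.tail = (self.tail + 1) % self.size
        self.count -= n
        return bytes(out)
```

Overwriting the oldest data on overflow (rather than blocking the writer) matches what the I2S DMA effectively does anyway: if the consumer falls behind, old audio is lost rather than stalling capture.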
Running on the latest MicroPython and the latest I2S module seems to have solved the audio gap issues. I am now able to almost always get the inference to work, whereas before it almost never worked.
I need to clean up the file structure and improve the example documentation.
On the ESP32 we need to use the I2S peripheral to sample audio and then convert the samples into spectrograms to feed the example's input tensor.
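The first stage of that conversion is framing: at 16 kHz, the micro_speech preprocessing slices a 1-second buffer into 30 ms windows with a 20 ms stride, giving 49 time slices. A small sketch of just that framing stage is below; the per-window feature here is a placeholder log-energy, not the real 40-bin filterbank the training pipeline uses:

```python
# Sketch of the framing stage that precedes the spectrogram: split a
# 1-second, 16 kHz sample buffer into 30 ms windows with a 20 ms
# stride, which yields the 49 time slices micro_speech expects.
# log_energy is a stand-in feature, NOT the real 40-bin filterbank.
import math

SAMPLE_RATE = 16000
WINDOW = 30 * SAMPLE_RATE // 1000   # 480 samples per 30 ms window
STRIDE = 20 * SAMPLE_RATE // 1000   # 320-sample (20 ms) hop

def frame(samples):
    """Yield successive overlapping windows of the sample buffer."""
    for start in range(0, len(samples) - WINDOW + 1, STRIDE):
        yield samples[start:start + WINDOW]

def log_energy(window):
    """Placeholder feature: log of the window's total energy."""
    return math.log(1 + sum(s * s for s in window))

samples = [0] * SAMPLE_RATE  # stand-in for 1 s of int16 audio
features = [log_energy(w) for w in frame(samples)]
```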
The unix port works on the fixed 1-second "yes"/"no" samples, so we need to adapt it to run on a continuously sampling basis.
At the moment my plan is to redo the sliding-window sampling approach in MicroPython, and also to redo the inference-averaging output processor in MicroPython.
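The averaging output processor I plan to port keeps the last few inference results and only reports a label when its averaged score clears a threshold, in the spirit of the RecognizeCommands logic from the C++ micro_speech example. A minimal sketch (label set, history length, and threshold are illustrative):

```python
# Sketch of the inference-averaging output processor: accumulate the
# last `history` score vectors and only emit a label once its averaged
# score clears `threshold`. Labels and parameters are illustrative.

class ScoreAverager:
    def __init__(self, labels, history=3, threshold=0.8):
        self.labels = labels
        self.history = history
        self.threshold = threshold
        self.recent = []  # last `history` score vectors

    def process(self, scores):
        """Add one inference result; return a label, or None."""
        self.recent.append(scores)
        if len(self.recent) > self.history:
            self.recent.pop(0)
        if len(self.recent) < self.history:
            return None  # not enough evidence yet
        # Average each class's score across the retained results.
        avg = [sum(col) / self.history for col in zip(*self.recent)]
        best = max(range(len(avg)), key=lambda i: avg[i])
        if avg[best] >= self.threshold:
            return self.labels[best]
        return None
```

Averaging over several windows is what makes continuous sampling tolerable: a single noisy inference on a partial word won't trigger a detection, but a word that stays in the sliding window for a few consecutive inferences will.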