Clarification of code implementation of audio_provider.cc in the micro_speech example (TFMIC-27)

mingyr commented 3 months ago

I wonder if anyone can elaborate on the piece of code below, which relates to audio capture via the I2S protocol on the ESP32-S3 platform. It is excerpted from the CaptureSamples function. My question is: why is the right shift by 14 bits rather than 16? I tried to find a reference but so far without success.

Any explanation on this matter would be highly appreciated.

#if CONFIG_IDF_TARGET_ESP32S3
      // rescale the data
      for (int i = 0; i < bytes_read / 4; ++i) {
        ((int16_t *) g_i2s_read_buffer)[i] = ((int32_t *) g_i2s_read_buffer)[i] >> 14;
      }
      bytes_read = bytes_read / 2;
#endif

vikramdattu commented 3 months ago

Hi @mingyr. The capture is happening with 30 bits of data, and hence I drop the 14 LSBs to get 16-bit samples.

You can also set the sample size to 16 bits and use the 16-bit data directly.
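
For readers following the arithmetic, here is a minimal sketch (not taken from the repository) of why dropping 14 bits fills the int16_t range. It assumes, as stated in the reply above, that each 32-bit word carries 30 significant signed bits. The names RescaleSample, RescaleBuffer, kMax30 and kMin30 are hypothetical and exist only for this illustration. The sketch writes to a separate output buffer rather than converting in place as the original loop does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: keep the top 16 of `significant_bits` bits by
// discarding the low (significant_bits - 16) bits.
// Right-shifting a negative value is sign-preserving on GCC and Clang, and
// is guaranteed to be so from C++20 onward.
constexpr int16_t RescaleSample(int32_t raw, int significant_bits) {
  return static_cast<int16_t>(raw >> (significant_bits - 16));
}

// Out-of-place variant of the rescale loop, assuming 30 significant bits.
inline void RescaleBuffer(const int32_t* in, int16_t* out, size_t count) {
  assert(in != nullptr && out != nullptr);
  for (size_t i = 0; i < count; ++i) {
    out[i] = RescaleSample(in[i], 30);
  }
}

// Extremes of a 30-bit signed value (assumed sample width).
constexpr int32_t kMax30 = (int32_t{1} << 29) - 1;
constexpr int32_t kMin30 = -(int32_t{1} << 29);

// Shifting by 14 maps the 30-bit extremes onto the full int16_t range.
static_assert(RescaleSample(kMax30, 30) == 32767, "max maps to INT16_MAX");
static_assert(RescaleSample(kMin30, 30) == -32768, "min maps to INT16_MIN");

// Shifting by 16 instead would use only a quarter of the int16_t range.
static_assert(RescaleSample(kMax30, 32) == 8191, "two bits of amplitude lost");
static_assert(RescaleSample(kMin30, 32) == -8192, "two bits of amplitude lost");
```

Under that assumption, a shift of 16 would still produce valid samples, only four times quieter, which is why the shift amount follows the number of significant bits rather than the 32-bit container size.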

mingyr commented 3 months ago

Dear @vikramdattu, I appreciate your explanation. I would think there is some reference that covers these technical details. It would be immensely helpful if you could point me to it (if such a reference exists). Thanks in advance.

espressif / esp-tflite-micro, issue #84