alrostami opened 1 year ago
OK, digging more, I found out that when the AFE config is set to use 1 mic and 1 ref, i.e.:

```c
afe_config.pcm_config.total_ch_num = 2;
afe_config.pcm_config.mic_num = 1;
afe_config.pcm_config.ref_num = 1;
```

calling `afe_handle->get_feed_chunksize(afe_data)` returns an audio chunk size of 160.
On the other hand, when changing the AFE config to use 2 mics and 0 ref (or 2 mics and 1 ref):

```c
afe_config.pcm_config.total_ch_num = 2;
afe_config.pcm_config.mic_num = 2;
afe_config.pcm_config.ref_num = 0;
```

and

```c
afe_config.pcm_config.total_ch_num = 3;
afe_config.pcm_config.mic_num = 2;
afe_config.pcm_config.ref_num = 1;
```

the audio chunk size becomes 1024. I believe this is a bug.
Regardless of the microphone array settings, in `detect_Task` the `afe_chunksize` used for fetching processed audio is always 512, which matches `mu_chunksize` (512).
Hi @alrostami, it is not a bug. We use different algorithms for 1 mic and 2 mics, and those algorithms need different context.
Thanks for your reply @feizi. I am slightly lost in understanding what AFE's feed function expects. I already know that bits per sample must be 16 (or downsampled to 16 if I2S is set to more than 16). Does it expect left and right channels regardless of whether `afe_config.pcm_config.mic_num` is 1 or 2? The only comment I found on this is in `esp-sr/include/esp32s3/esp_afe_sr_iface.h`, which says:
```c
/**
 * @brief Feed samples of an audio stream to the AFE_SR
 *
 * @Warning The input data should be arranged in the format of channel interleaving.
 *          The last channel is reference signal if it has reference data.
 *
 * @param afe The AFE_SR object to query
 * @param in  The input microphone signal, only support signed 16-bit @ 16 KHZ. The frame size can be queried by the
 *            `get_feed_chunksize`.
 * @return The size of input
 */
```
Essentially "The input data should be arranged in the format of channel interleaving. The last channel is reference signal if it has reference data."
Could you point out documentation on this or tell me how I can find more information?
The AFE documentation is here.
I was trying to get the speech recognition examples to work with two MEMS microphones in `I2S_SLOT_MODE_STEREO` mode, but all of the available examples are set up to use a single mic in `I2S_SLOT_MODE_MONO` mode. I tried it with two INMP441s, wiring one as L (L/R pin grounded) and the other as R (L/R pin connected to VDD). After a few tries, I noticed that the `i2s_new_channel()` arguments are hard-coded to mono and 16 bits per channel, so I changed them to stereo, kept 16 bits per channel, and set the sample rate to 16000. I also set the total channels to 3, the number of microphones to 2, and the number of refs to 1. When I flashed it, the wake word stopped working. I am wondering if this is a bug or whether I am missing something here.