ShawnHymel / ei-keyword-spotting

Some issues related to mfcc feature extraction #5

Open elimsjxr opened 2 years ago

elimsjxr commented 2 years ago

Hello, I'm not very familiar with the MFCC feature-extraction part of the code. If I want to inspect the 650 feature values produced by the MFCC step, where are they stored? Are they in the out_buffer in Processing.hpp?

ShawnHymel commented 2 years ago

@elimsjxr This is a really good question, and it's not something I have dug into. As the feature extraction (MFCC) and inference code is written by Edge Impulse, you might want to ask on their forums. The dev team is usually super responsive there.

I believe the MFCCs for each time slice (a few milliseconds?) are stored in the output_matrix array in the ei_run_dsp.h file:

int ret = speechpy::feature::mfcc(output_matrix, &preemphasized_audio_signal,
        frequency, config.frame_length, config.frame_stride, config.num_cepstral, config.num_filters, config.fft_length,
        config.low_frequency, config.high_frequency);

This function is called in the run_classifier_continuous() function in ei_run_classifier.h. I think that's where the slices are stacked together to form a full set of features (i.e. MFCC array for a 1-second sample).
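
As a rough illustration of how the stacked features could be inspected once you have located the buffer, here is a minimal sketch. It assumes the 650 values are a flattened, row-major matrix of 50 time frames by 13 cepstral coefficients (50 x 13 = 650). The frame count, the coefficient count, and the helper names feature_index() and print_features() are assumptions made for this example and are not taken from the Edge Impulse SDK, so check them against your own DSP block settings.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed layout (not confirmed by the SDK source): 50 frames x 13 MFCCs,
 * stored row-major, one frame after another. */
#define NUM_FRAMES   50
#define NUM_CEPSTRAL 13
#define NUM_FEATURES (NUM_FRAMES * NUM_CEPSTRAL)

/* Hypothetical helper: index of coefficient `coeff` in time frame `frame`. */
static size_t feature_index(size_t frame, size_t coeff)
{
    return frame * NUM_CEPSTRAL + coeff;
}

/* Hypothetical helper: print the flat feature buffer, one frame per line.
 * Returns the number of values printed. */
static size_t print_features(const float *features, size_t n_features)
{
    size_t printed = 0;
    for (size_t i = 0; i < n_features; i++) {
        printf("%8.4f", features[i]);
        printed++;
        /* End the line after the last coefficient of each frame. */
        putchar((i % NUM_CEPSTRAL == NUM_CEPSTRAL - 1) ? '\n' : ' ');
    }
    return printed;
}
```

On the target, something like print_features() could be called on the feature buffer right after the DSP step, provided printf is routed to a UART or debugger console.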

elimsjxr commented 2 years ago

I think you are right. I also have some questions about the SAI. The SAI receives audio data continuously. How can I control its interrupt priority? For example, if I add a new interrupt in the "while" loop, the SAI always seems to interfere with it. Alternatively, can I simply stop the SAI?

ShawnHymel commented 2 years ago

I think you can adjust the SAI (or associated DMA) priority in the NVIC settings. Here it is in CubeMX (I'm not sure where the code for it is, though--I just haven't dug through it):

[image: NVIC settings in CubeMX]

I don't think it's a good idea to lower the priority of the SAI (or DMA) interrupt, as missing samples would probably be really bad (and you'd have to modify how the buffering system works in the main program to feed the Edge Impulse library).

I have not tried closing the SAI and restarting it, so I'm not exactly sure how this would work. I'm guessing you'd have to call HAL_SAI_Abort() and then restart it with HAL_SAI_Receive_DMA() again. Looking through the HAL functions for SAI is probably a good place to start: https://www.st.com/resource/en/user_manual/dm00173145-description-of-stm32l4l4-hal-and-lowlayer-drivers-stmicroelectronics.pdf.

elimsjxr commented 2 years ago

Hi, I have a question that may seem stupid, but I am really confused. After I added a timer to the inference process, the SAI stopped working normally: at runtime the program gets stuck in the ei_microphone_inference_record() function, specifically at the line while (inference.buf_ready == 0). Does this mean that inference.buf_ready is always zero, or that the callback function is not running? Do you know why this happens? Thank you so much!

ShawnHymel commented 2 years ago

Are you talking about this while loop?

  // %%%TODO: make this non-blocking
  while (inference.buf_ready == 0)
  {
    continue;
  }

This causes the program to wait (do nothing) until the buffer is filled. If you're finding that the code is not exiting from that loop, then it means that the buffer is not filling up for some reason. Specifically, it means that either this callback is not being called or, for whatever reason, the buf_count variable is not incrementing (so the if block that sets buf_ready to 1 is never reached):

static void audio_buffer_inference_callback(uint32_t n_bytes, uint32_t offset)
{
  // Copy samples from I2S buffer to inference buffer. Convert 24-bit, 32kHz
  // samples to 16-bit, 16kHz
  for (uint32_t i = 0; i < (n_bytes >> 1); i++) {
    inference.buffers[inference.buf_select][inference.buf_count++] =
        (int16_t)(i2s_buf[offset + (I2S_BUF_SKIP * i)] >> 8);

    if (inference.buf_count >= inference.n_samples) {
      inference.buf_select ^= 1;
      inference.buf_count = 0;
      inference.buf_ready = 1;
    }
  }
}

It's been a while, but the last time I checked, my code should work as-is. Make sure that you're enabling the DMA interrupt (or SAI interrupt if you're not using DMA) in CubeMX. I recommend toggling an LED in the callback to make sure that it's executing.
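
If no LED is handy, a call counter plus a bounded wait can serve the same purpose: it shows whether the callback ever ran, and it keeps the main loop from hanging forever if it did not. The following is only a sketch under stated assumptions. The names wait_for_buffer(), audio_callback_stub(), and callback_count are hypothetical and not part of this repository or the Edge Impulse library, and the stub treats every call as one filled buffer instead of copying samples.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flags shared between the audio callback (interrupt context) and the main
 * loop. `volatile` forces the compiler to re-read them on every loop pass. */
static volatile uint8_t  buf_ready = 0;
static volatile uint32_t callback_count = 0;  /* debug aid, like the LED */

/* Stand-in for the real audio callback: count the call and flag a full
 * buffer. Here every call is treated as filling one buffer. */
static void audio_callback_stub(void)
{
    callback_count++;
    buf_ready = 1;
}

/* Hypothetical bounded wait: poll buf_ready at most max_polls times.
 * Returns true if a buffer became ready, false on timeout. `poll_hook`
 * runs once per pass (may be NULL); on a host it can simulate the
 * interrupt, on a target it could feed a watchdog. */
static bool wait_for_buffer(uint32_t max_polls, void (*poll_hook)(void))
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (buf_ready) {
            buf_ready = 0;      /* consume the flag */
            return true;
        }
        if (poll_hook) {
            poll_hook();
        }
    }
    return false;               /* callback never fired: report, don't hang */
}
```

If wait_for_buffer() times out and callback_count is still zero, the interrupt is probably not enabled or is being starved by another one. If the counter is climbing but the flag is never set, the problem is more likely in the buffer-fill logic.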

elimsjxr commented 2 years ago

Okay, I learned a lot, thank you very much!