Realtime frequency peak detection

Clon1998 commented 9 months ago

Hello,

I would like some guidance on how to run a real-time FFT on microphone data to detect the peak frequency. Most of the examples in the repo use WAV files, and I am unsure how to get this working with microphone data directly on the device.

Currently, I am using the mic_stream lib to obtain real-time microphone readings as 32-bit float Pulse Code Modulation (PCM) values. My current approach is to buffer X samples from mic_stream, run an STFT on the buffered input, and then do a max search over the result of the magnitude method to get the index of the peak frequency. Finally, I use the stft.frequency method to convert that index into the actual frequency.
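For illustration, the pipeline described above (buffer, transform, magnitudes, max search, index-to-frequency conversion) can be sketched in plain Python. This is not Dart and not the fftea API: `dft_magnitudes` and `peak_frequency` are hypothetical helper names, the transform is a naive DFT chosen for readability rather than speed, and skipping bin 0 (the DC offset) is an assumption about what a peak search usually wants. The conversion used is frequency = bin index * sample rate / buffer length.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..N/2 of a real signal (naive O(N^2) DFT)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def peak_frequency(samples, sample_rate):
    """Frequency in Hz of the strongest bin, ignoring bin 0 (DC)."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    # Bin k corresponds to k * sample_rate / N hertz.
    return peak_bin * sample_rate / len(samples)


# Synthetic test input: a 200 Hz sine sampled at 8000 Hz, 400 samples.
# With these numbers each bin is 20 Hz wide, so 200 Hz falls exactly on bin 10.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(400)]
detected = peak_frequency(tone, sample_rate)
print(detected)  # 200.0
```

Running a known synthetic tone through the same code path as the microphone data is a cheap way to separate transform problems from input problems.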

I'm having trouble selecting the buffer size X. My current approach is to collect values from the mic_stream lib for 250 ms, then process them via STFT. Unfortunately, in 90% of cases the STFT function returns a list of NaN values, and in the cases where it does not, the detected peak is not even close to the frequency I am playing into the input (a 200 Hz wave).
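As a rough guide to the buffer size question, the spacing between frequency bins is the sample rate divided by the number of samples transformed, which equals 1 / duration when the whole buffer is used. The sketch below works that out for a 250 ms buffer. The 44100 Hz sample rate is an assumption (the thread does not state it), `buffer_plan` is a hypothetical helper, and rounding down to a power of two is optional: many FFT implementations are fastest at those sizes, but it is not a requirement stated anywhere in this thread.

```python
def buffer_plan(sample_rate, duration_s):
    """Return (samples collected, power-of-two chunk size, bin width in Hz)."""
    wanted = int(sample_rate * duration_s)
    if wanted < 1:
        raise ValueError("duration too short for this sample rate")
    # Largest power of two that fits in the collected samples.
    chunk = 1 << (wanted.bit_length() - 1)
    bin_width = sample_rate / chunk
    return wanted, chunk, bin_width


wanted, chunk, bin_width = buffer_plan(44100, 0.25)
print(wanted)     # 11025 samples collected in 250 ms
print(chunk)      # 8192 samples actually transformed
print(bin_width)  # about 5.38 Hz between neighbouring bins
```

Under these assumptions a 200 Hz tone sits far above the bin spacing, so a 250 ms buffer is more than long enough, and a peak that lands nowhere near 200 Hz points at the input data or the sample rate rather than at the buffer size.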

Now I am unsure whether the data I feed into the STFT is simply wrong (each sample is a 32-bit float in [-1.0, 1.0)) or whether the samplingFrequency I am using for the index-to-frequency conversion is wrong. Currently, I am using the sampling frequency of the mic that the mic_stream lib reports.

liamappelbe commented 9 months ago

Your overall approach sounds fine. Step one is definitely to verify the data you're getting from mic_stream. You definitely shouldn't be seeing NaNs, so I suspect the mic data is bad.
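A quick sanity check along these lines can be run on every chunk before it reaches the transform. The sketch below is plain Python for illustration (not Dart), `describe_samples` is a hypothetical helper, and the [-1.0, 1.0] range is taken from the sample format described in the question.

```python
import math


def describe_samples(samples):
    """Summarise a chunk of float samples so bad input is easy to spot."""
    finite = [x for x in samples if math.isfinite(x)]
    return {
        "count": len(samples),
        "nan": sum(1 for x in samples if math.isnan(x)),
        "inf": sum(1 for x in samples if math.isinf(x)),
        # Finite values outside the expected normalised PCM range.
        "out_of_range": sum(1 for x in finite if not -1.0 <= x <= 1.0),
        "min": min(finite) if finite else None,
        "max": max(finite) if finite else None,
    }


report = describe_samples([0.0, 0.5, -0.25, float("nan"), 3.0])
print(report)
```

If the report already shows NaNs or values far outside the expected range, the problem is in the captured data and the transform never saw valid input.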

Coincidentally, I recently worked on a project that streamed microphone data through STFT, and I ran into similar issues. To verify the microphone data you'll need to find some way of playing the audio. In my case I saved the audio chunks to WAV files on my phone, then copied them to my computer so I could look at the sample data directly.
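The idea of dumping chunks to WAV for inspection can be sketched as follows. This is Python's standard `wave` module rather than anything that runs on the phone, `write_wav` is a hypothetical helper, and mono 16-bit output is an assumption made to keep the file playable everywhere. Non-finite samples raise an error instead of being silently written, since finding them is the point of the exercise.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sample_rate):
    """Write float samples in [-1.0, 1.0] as a mono 16-bit PCM WAV file."""
    frames = bytearray()
    for i, x in enumerate(samples):
        if not math.isfinite(x):
            raise ValueError("non-finite sample at index %d" % i)
        clipped = max(-1.0, min(1.0, x))
        # WAV stores 16-bit samples as little-endian signed integers.
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Example: one second of a 200 Hz tone at 8000 Hz, written to the temp directory.
path = os.path.join(tempfile.gettempdir(), "mic_chunk_check.wav")
tone = [0.5 * math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
write_wav(path, tone, 8000)
```

Opening the resulting file in any audio editor shows both the waveform and whether it sounds like what was played into the microphone.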

I also started out my project using mic_stream, but I ran into all sorts of issues. For example, I was using 16-bit PCM and I found that the 2 bytes in each sample were the wrong way around, so I had to manually decode it. After a lot of troubleshooting I eventually got the audio to sound ok, but there was a weird clicking sound in the chunks. Turned out that the first few samples in each chunk I received from the microphone were 0. At that point I gave up and switched to flutter_audio_capture.
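To illustrate the byte-order problem, here is a minimal Python sketch of decoding 16-bit PCM with an explicit endianness. It is not the code from that project: `decode_pcm16` is a hypothetical helper, and the three sample values are made up for the demonstration.

```python
import struct


def decode_pcm16(raw, byte_order="little"):
    """Decode raw 16-bit signed PCM bytes into floats in [-1.0, 1.0)."""
    if len(raw) % 2 != 0:
        raise ValueError("16-bit PCM needs an even number of bytes")
    prefix = "<" if byte_order == "little" else ">"
    values = struct.unpack("%s%dh" % (prefix, len(raw) // 2), raw)
    return [v / 32768.0 for v in values]


# Three little-endian samples: 0, half scale, negative half scale.
raw = struct.pack("<3h", 0, 16384, -16384)
print(decode_pcm16(raw, "little"))  # [0.0, 0.5, -0.5]
print(decode_pcm16(raw, "big"))     # [0.0, 0.001953125, 0.005859375]
```

Decoding with the wrong byte order does not fail; it quietly yields plausible-looking numbers that bear no relation to the audio, which is why listening to the data or plotting it is a more reliable check than looking at a few values.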

Clon1998 commented 9 months ago

Hey, thank you so much for your helpful advice! I was able to get non-NaN values by using FFT instead of STFT; since I only need to detect peaks, a plain FFT is sufficient. Your comments about mic_stream were also very useful. I switched to flutter_audio_capture and it's now working perfectly. Thanks again!

liamappelbe commented 9 months ago

@Clon1998 glad you got it working. I'm a bit concerned that you were getting NaNs with STFT but not FFT though. Sounds like a bug in my code. Are you able to figure out the input data that does this and post it here?

Clon1998 commented 9 months ago

> @Clon1998 glad you got it working. I'm a bit concerned that you were getting NaNs with STFT but not FFT though. Sounds like a bug in my code. Are you able to figure out the input data that does this and post it here?

I need to double check, but I think I also got NaN values while using the 32-bit float PCM encoding. After switching to 32-bit integer PCM I got normal values.
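One possible explanation, not confirmed anywhere in this thread, is a mismatch between the encoding the stream actually delivers and the encoding used to decode it. The Python sketch below shows that 32-bit integer samples reinterpreted as 32-bit floats turn into NaN for any small negative value, whereas scaling the integers gives sensible results. Both helper names are hypothetical and the sample values are made up.

```python
import math
import struct


def reinterpret_int32_as_float32(values):
    """Read the raw bytes of int32 samples as if they were float32 samples."""
    raw = struct.pack("<%di" % len(values), *values)
    return list(struct.unpack("<%df" % len(values), raw))


def int32_pcm_to_float(values):
    """Scale int32 PCM samples into the range [-1.0, 1.0)."""
    return [v / 2147483648.0 for v in values]


quiet = [-1, -1000, 1000, 0]  # made-up low-amplitude int32 samples
misread = reinterpret_int32_as_float32(quiet)
scaled = int32_pcm_to_float(quiet)

# Small negative integers have all exponent bits set, which float32 reads as NaN.
print([math.isnan(x) for x in misread])  # [True, True, False, False]
print(scaled[1])                         # about -4.66e-07
```

If something like this is happening, the NaNs originate in the decoded input rather than in the FFT or STFT, which would match the observation that changing the PCM encoding made them disappear.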

liamappelbe / fftea

Realtime frequency peak detection #45