Closed Clon1998 closed 9 months ago
Your overall approach sounds fine. Step one is definitely to verify the data you're getting from mic_stream. You definitely shouldn't be seeing NaNs, so I suspect the mic data is bad.
Coincidentally, I recently worked on a project that streamed microphone data through STFT, and I ran into similar issues. To verify the microphone data you'll need to find some way of playing the audio. In my case I saved the audio chunks to WAV files on my phone, then copied them to my computer so I could look at the sample data directly.
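As a rough illustration of the "dump the chunks to a WAV file" check, here is a minimal sketch in Python using only the standard library. It is not the code used in the project described above, and it is not Dart or any plugin's API. The function name `save_chunks_as_wav`, the default sample rate, and the assumption that samples arrive as floats in [-1.0, 1.0] are all hypothetical.

```python
import math
import struct
import wave


def save_chunks_as_wav(path, chunks, sample_rate=44100):
    """Write chunks of float samples to a mono 16-bit WAV file.

    Assumes each sample is a float nominally in [-1.0, 1.0].
    Raises ValueError on NaN so that bad data is not silently hidden.
    """
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        for chunk in chunks:
            ints = []
            for s in chunk:
                if math.isnan(s):
                    raise ValueError("NaN sample in microphone data")
                clipped = max(-1.0, min(1.0, s))
                ints.append(int(clipped * 32767))
            # "<" = little-endian, "h" = signed 16-bit integer
            wav.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

Opening the resulting file in an audio editor makes dropouts, clicks, or pure noise easy to spot before any FFT code is involved.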
I also started out my project using mic_stream, but I ran into all sorts of issues. For example, I was using 16-bit PCM and I found that the 2 bytes in each sample were the wrong way around, so I had to manually decode it. After a lot of troubleshooting I eventually got the audio to sound ok, but there was a weird clicking sound in the chunks. Turned out that the first few samples in each chunk I received from the microphone were 0. At that point I gave up and switched to flutter_audio_capture.
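The two symptoms described above (swapped bytes in 16-bit samples, and zeros at the start of each chunk) can be checked with something like the following Python sketch. It is only an illustration under assumptions: the helper names `decode_pcm16` and `count_leading_zeros` are made up here, and the raw chunk is assumed to be a plain byte string of signed 16-bit mono samples.

```python
import struct


def decode_pcm16(raw, byte_order="little"):
    """Decode raw bytes as signed 16-bit PCM with an explicit byte order."""
    n = len(raw) // 2                      # ignore a trailing odd byte
    prefix = "<" if byte_order == "little" else ">"
    return list(struct.unpack("%s%dh" % (prefix, n), raw[: n * 2]))


def count_leading_zeros(samples):
    """Count how many samples at the start of a chunk are exactly 0."""
    count = 0
    for s in samples:
        if s != 0:
            break
        count += 1
    return count
```

Decoding the same chunk with both byte orders and listening to (or plotting) each result shows which one is right: the wrong order typically sounds like loud noise. A consistently non-zero `count_leading_zeros` across chunks would match the clicking described above.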
Hey, thank you so much for your helpful advice!
I was able to get non-NaN values by using FFT instead of STFT. Since I only need to detect peaks, a plain FFT is sufficient. Your advice about mic_stream was also very useful: I switched to flutter_audio_capture, and it's now working perfectly. Thanks again!
@Clon1998 glad you got it working. I'm a bit concerned that you were getting NaNs with STFT but not FFT though. Sounds like a bug in my code. Are you able to figure out the input data that does this and post it here?
I need to double check, but I think I also got NaN values while using the 32-bit float PCM encoding. After switching to 32-bit integer PCM I got normal values.
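One way to double check this is to decode a captured chunk as 32-bit floats and count how many values are NaN, infinite, or outside the expected range. The Python sketch below is only a diagnostic aid. It does not explain why the NaNs appeared in this thread. The names `decode_float32` and `summarize` are hypothetical, and little-endian byte order is an assumption.

```python
import math
import struct


def decode_float32(raw):
    """Decode raw bytes as little-endian 32-bit floats (assumed layout)."""
    n = len(raw) // 4
    return list(struct.unpack("<%df" % n, raw[: n * 4]))


def summarize(samples):
    """Count non-finite samples and finite samples outside [-1.0, 1.0]."""
    non_finite = sum(1 for s in samples if not math.isfinite(s))
    out_of_range = sum(
        1 for s in samples if math.isfinite(s) and not -1.0 <= s <= 1.0
    )
    return {
        "total": len(samples),
        "non_finite": non_finite,
        "out_of_range": out_of_range,
    }
```

If many samples are non-finite or far outside [-1.0, 1.0], the bytes are probably not 32-bit floats in that layout at all (for example, integer PCM being reinterpreted as float), which would point to the input rather than the transform.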
Hello,
I would like some guidance on how to implement a real-time FFT on microphone data to detect the peak frequency. Most of the examples in the repo use WAV files, and I am unsure how to get this working with microphone data directly on the device.
Currently, I am using the mic_stream lib to obtain real-time microphone readings as 32-bit float Pulse Code Modulation (PCM) values. My current approach is to buffer X samples from mic_stream, run an STFT on the buffered input, and then use the magnitude method to perform a max search for the index of the peak frequency. Finally, I use the stft.frequency method to obtain the actual frequency.
I'm having trouble selecting the buffer sample size X. My current approach is to collect values from mic_stream for 250 ms, then process them via STFT. Unfortunately, in 90% of cases, the STFT function returns a list of NaN values, and in the cases where it does not, the detected peak is not even close to the frequency I am playing into the input (a 200 Hz wave).
Now I am unsure whether the data I feed into the STFT is simply wrong (each sample: [-1.0; 1.0), 32-bit precision) or whether the samplingFrequency I am using for the index-to-frequency conversion is wrong. Currently, I am using the sampling frequency of the mic that the mic_stream lib reports.
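For reference, the pipeline described in this post (buffer N samples, transform, take magnitudes, find the largest bin, convert the bin index to Hz) can be sketched in plain Python as below. This is an illustration of the general technique only, not this library's API and not Dart. The function names, the Hann window, and the power-of-two length requirement are choices made for this sketch. The key relation is that bin k corresponds to k * sampleRate / N Hz, so the buffer size N sets the frequency resolution to sampleRate / N.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def peak_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the largest non-DC bin in the buffer."""
    n = len(samples)
    if n < 4 or n & (n - 1):
        raise ValueError("buffer length must be a power of two >= 4")
    # Hann window to reduce spectral leakage from the buffer edges.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    spectrum = fft(windowed)
    # Real input gives a mirrored spectrum, so search bins 1 .. n/2 - 1.
    mags = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = 1 + max(range(len(mags)), key=mags.__getitem__)
    return peak_bin * sample_rate / n
```

A usage sketch with a synthetic tone, which is also a handy way to test the conversion before feeding in real microphone data:

```python
rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(2048)]
estimate = peak_frequency(tone, rate)  # within one bin (rate / 2048 Hz) of 200
```

If the estimate for a known tone is off by a constant factor, the sample rate passed to the conversion is the likely culprit. If it is random, the sample data itself is suspect.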