spatialaudio / python-sounddevice

Play and Record Sound with Python
https://python-sounddevice.readthedocs.io/
MIT License

Is it possible to get a small piece of audio every few seconds while keeping the stream open? #195

Closed Laope94 closed 5 years ago

Laope94 commented 5 years ago

Hi, I am working on a project whose ultimate goal is emotion classification from speech in real time (actually close to real time, with a short delay). I can handle feature extraction and classification, but I need a good method for getting input from the microphone. Basically, I need to open a stream and extract features from a piece of audio about two or three seconds long, while keeping the stream open so I won't lose any data in the next piece to process. Put more clearly: open a stream and get a small piece of audio from it as output every few seconds. It doesn't matter whether the output is a raw NumPy array or a .wav file, I can handle both. I've experimented with calling rec() in a loop, but that of course led to short dropouts between the .wav files, as the stream was opened and closed every time. Is there a way to do this with sounddevice, or should I be looking for something else?

mgeier commented 5 years ago

Did you have a look at the examples?

Laope94 commented 5 years ago

I was eventually able to modify the example for unlimited recording to save a new .wav file every few seconds. I get chunks from the queue and append them to an array. Once the array reaches a certain number of chunks (I am using 100 right now, which is about 4.3 seconds of audio at a 48 kHz sample rate), I run a function I wrote on a new thread. I pass the array of chunks to this function, and inside it I convert it to a NumPy array. Then I need to reshape the NumPy array, because it's [100, 2048, 1] and I need [204800, 1], and only then can I save the .wav file with soundfile. While this thread is running, I can empty the array where I collect chunks from the queue, so it is ready for a completely new piece of audio. I am not sure if this is the optimal way, but it works without losing any data. I wanted to keep this issue open until I see whether I can come up with something better.
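
For readers who want to try this, here is a minimal sketch of the approach described above. It is not the poster's actual code. The names `segments`, `handle_segment` and `main` are hypothetical, the 48 kHz rate, 2048-frame block size and 100 chunks come from the comment, and a single input channel is assumed. It uses `np.concatenate` in place of converting to an array and reshaping, which gives the same [204800, 1] result.

```python
import queue
import threading

SAMPLERATE = 48000        # Hz, as mentioned in the comment
BLOCKSIZE = 2048          # frames per chunk, as mentioned in the comment
CHUNKS_PER_SEGMENT = 100  # 100 * 2048 / 48000 is roughly 4.27 seconds

audio_queue = queue.Queue()


def callback(indata, frames, time, status):
    """Called by sounddevice for every block; it should return quickly."""
    if status:
        print(status)
    # The stream reuses the indata buffer, so a copy goes into the queue.
    audio_queue.put(indata.copy())


def segments(source, chunks_per_segment):
    """Yield lists of consecutive chunks taken from the queue `source`."""
    chunks = []
    while True:
        chunks.append(source.get())  # blocks until the next chunk arrives
        if len(chunks) == chunks_per_segment:
            yield chunks
            chunks = []  # start a fresh list; the consumer keeps the old one


def handle_segment(chunks):
    """Hypothetical worker: join the chunks, then save or process them."""
    import numpy as np
    segment = np.concatenate(chunks)  # 100 x (2048, 1) -> (204800, 1)
    print("got segment with shape", segment.shape)


def main():
    """Keep one stream open and hand every full segment to a new thread."""
    import sounddevice as sd
    with sd.InputStream(samplerate=SAMPLERATE, blocksize=BLOCKSIZE,
                        channels=1, callback=callback):
        for chunks in segments(audio_queue, CHUNKS_PER_SEGMENT):
            threading.Thread(target=handle_segment, args=(chunks,)).start()
```

Calling `main()` starts recording and runs until interrupted (for example with Ctrl+C). Because the stream stays open and the callback only queues copies, the worker threads can take as long as they need without dropping samples, as long as they keep up on average.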

mgeier commented 5 years ago

This sounds like a reasonable way to do it.

However, if you want to process the data immediately, it probably doesn't make sense to write it into WAV files. You can just keep the NumPy arrays in memory and do the processing on them, without having to write files.
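
As a rough illustration of processing in memory, the sketch below joins the queued chunks and computes a value directly from the array, with no file in between. The function name `segment_rms` is hypothetical, and the RMS level is only a stand-in for whatever feature extraction is actually needed.

```python
import numpy as np


def segment_rms(chunks):
    """Join chunks of shape (frames, channels) and return the RMS per channel."""
    segment = np.concatenate(chunks)       # (total_frames, channels)
    segment = segment.astype(np.float64)   # avoid overflow for integer input
    return np.sqrt(np.mean(segment ** 2, axis=0))
```

A function like this could be passed to the worker thread in place of the code that writes the WAV file.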

Laope94 commented 5 years ago

I guess I can close this now, since I am sticking with this method. Of course you are right, it makes sense to process just the NumPy arrays. For now I needed .wav files because I ran a few experiments with external software, but in the future there won't be a need for that, because everything will be handled in my own application. Anyway, it's definitely possible to save or process a piece of audio every few seconds without data loss. Maybe someone else will find this helpful too.