Fixed blocksize support in _Recorder.record((with pulseaudio)

mpariente commented 6 years ago

Here are the proposed changes suggested in #10

Added the optional argument fixed_blocksize (default False) in the _Recorder class
Added _Rercorder.flush method to empty and return the last unused chunk

Here is a working example

import soundcard as sc
import numpy as np

FS = 16000
CHUNK = 128
RECORD_SECONDS = 20
RECORD_STEP = int(float(FS * RECORD_SECONDS) / CHUNK)

default_speaker = sc.default_speaker()
default_mic = sc.default_microphone()

with default_mic.recorder(FS, blocksize=CHUNK, fix_blocksize=True) as mic,\
     default_speaker.player(FS, blocksize=CHUNK) as sp:
    for _ in range(RECORD_STEP):
        data = mic.record(CHUNK)
        if data.shape[0] != CHUNK:
            print('Not matching')
        sp.play(data)
last_chunk = mic.flush()

bastibe commented 6 years ago

Thank you for your pull request.

There is currently already an open pull request for this issue, #9, and @stvol will fix this issue very soon.

mpariente commented 6 years ago

Thank you for your answer.

I saw the pull request but nothing happened for 3 weeks so I thought that I could take over. Moreover, this PR enables the user to choose whether to use to this feature or not, and is tested. Is there any problem with the code of this PR? Thanks,

bastibe commented 6 years ago

I saw the pull request but nothing happened for 3 weeks so I thought that I could take over.

You couldn't know this, but @stvol is a colleague of mine, and we had a few discussions about this topic already. He has been away for a few days, that's why progress on that pull request stalled.

However, your pull request has made me think about this problem in more depth. My original issue with this kind of scheme was that a user could, hypothetically, open a Recorder, then immediately record a short bit of audio, then wait, then record again. In that case, I feared, we would keep a partial block from the first recording, and then append it to the later recording; If a buffer overflow happened in the meantime, this would concatenate two discontinuous pieces of audio and produce an artifact. I didn't actually want to go forward with this plan before finding a solution to this continuity problem.

It seemed like a difficult problem, since pulseaudio does not seem to expose any API for checking for buffer overflows while recording. Today I learned why (by trying it out): Pulse just buffers everything, and will never overrun as long as your memory doesn't fill up. I tried as much as several minutes without issues. This is incredibly convenient, and means that we can go ahead and fix this problem.

That said, let's talk API:

this PR enables the user to choose whether to use to this feature or not, and is tested.

I think I don't like how we have to set both blocksize and fix_blocksize. This is confusing.
I see two solutions to this:

Have record and record_chunk, where the former does all the buffering, while the latter just returns whatever pulse is returning (no buffering, no additional latency). The record method would then call record_chunk to get data from pulse.
allow numframes to be None in record, in which case it returns whatever pulse is returning (plus any buffered frames, if available). If numframes is set, do the buffering. (And probably factor out a hidden _record_chunk anyway to make the code cleaner).

What do you think about these options?

mpariente commented 6 years ago

My original issue with this kind of scheme was that a user could, hypothetically, open a Recorder, then immediately record a short bit of audio, then wait, then record again.

The solution I found to this problem was to implement the flush method. But if we can use deeper knowledge of pulseaudio to write cleaner code, it's obviously better. Just to be sure that I understand what you explained on how pulseaudio works : If I record something, stop the recording, and (with the same object) start the recording a minute later, record will return the audio that happened a minute ago, right? In this case there is no need to flush the remaining unreturned buffer.

So on the first option, the current record method would become record_chunk and the record method would call it and perform the buffering? I like that !

For the second option, it's possible that a user would want to request numframes samples and prefers to have variable size chunks rather than a little additional delay. And I think it wouldn't be possible with this option, or am I missing something?

Another question, what should be the value of blocksize for realtime recording/playback loop? Should it be similar to numframes or much smaller?

Also, what drives the choice of the 1ms sleep here in the record code? Maybe the samples will be available before with a small blocksize.

Tell me if I can help or if I let @stvol and you to do what you will decide to do. Anyway, thank you for keeping me in the loop!

bastibe commented 6 years ago

So on the first option, the current record method would become record_chunk and the record method would call it and perform the buffering? I like that !

exactly!

For the second option, it's possible that a user would want to request numframes samples and prefers to have variable size chunks rather than a little additional delay. And I think it wouldn't be possible with this option, or am I missing something?

Sorry for being unclear: I meant, if the user says numframes=1024 we will return exactly 1024 samples, and buffer in the background. If the user does not give numframes, we set a default value of numframes=None, and return whatever pulse returns.

Another question, what should be the value of blocksize for realtime recording/playback loop? Should it be similar to numframes or much smaller?

I was thinking about this, too. It should probably be much smaller than your expected numframes. We'll have to experiment with the finished code to find a reasonable value.

Also, what drives the choice of the 1ms sleep here in the record code? Maybe the samples will be available before with a small blocksize.

In coreaudio, the only way you can access recorded data is in a callback. This loop just polls a queue that is filled by the callback. It's not a great solution. I am currently working on a better version that uses signals instead of polling, which should be much cleaner.

Tell me if I can help or if I let @stvol and you to do what you will decide to do.

If you want to, you can go ahead and implement either of the above solutions (numframes=None or record_chunk). @stvol said he'd be grateful if someone else took over.

mpariente commented 6 years ago

In coreaudio, the only way you can access recorded data is in a callback.

I was talking about pulseaudio, and I guess it's not the same. I'm no expert in any of these three libraries, that's why I'm asking.

If you want to, you can go ahead and implement either of the above solutions (numframes=None or record_chunk). @stvol said he'd be grateful if someone else took over.

I'm going to (try to) implement the record_chunk option for pulseaudio, I'll let you know when I went forward.

bastibe commented 6 years ago

I was talking about pulseaudio, and I guess it's not the same. I'm no expert in any of these three libraries, that's why I'm asking.

Sorry about that. There are two places where a sleep-wait like this happens, and the more egregious place is in the coreaudio code (which I had open in a separate tab, and confused it with your link). The reason is much the same, though: waiting for new data.

I don't want to busy-wait, as this would saturate the CPU unnecessarily and steal resources that other threads might want to use. I guess the correct solution in this case would be to register a callback using pa_stream_set_read_callback, which triggers an Event, and wait for that event trigger instead of sleeping.

(This is untested, but something I intend to try out next week or so)

I'm going to (try to) implement the record_chunk option for pulseaudio, I'll let you know when I went forward.

Cool! I'll look forward to it!

bastibe / SoundCard

Fixed blocksize support in _Recorder.record((with pulseaudio) #11