mozilla / cubeb-coreaudio-rs

The audio backend of Firefox on Mac OS X.
ISC License

Best strategy for pre-allocating buffers used in I/O callback #155

Closed eyeplum closed 2 years ago

eyeplum commented 2 years ago

Hi there,

I'm using cubeb-rs to build an audio application (for macOS only at the moment) capable of some level of audio processing. As a result, I'm writing my own audio processing graph, which involves chaining a set of audio processing nodes together and running the audio input/output through them on each I/O callback. Some of the nodes require pre-allocated buffers in order to perform their tasks.

At the moment, I'm using a naive approach: pre-allocating the same amount of space as the latency specified when creating the I/O stream (i.e. the value I passed to StreamBuilder::latency()).
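To illustrate, each node pre-allocates its scratch space from that requested latency, roughly like this (a simplified sketch with a made-up GainNode, not my actual code; max_frames is assumed to be the value passed to StreamBuilder::latency()):

```rust
// Minimal sketch: a processing node that pre-allocates its scratch buffer up front,
// sized from the latency requested at stream creation, and never allocates in process().
const CHANNELS: usize = 2;

trait Node {
    /// Process one block in place; `block.len()` must not exceed the pre-allocated size.
    fn process(&mut self, block: &mut [f32]);
}

struct GainNode {
    gain: f32,
    scratch: Vec<f32>, // pre-allocated, never grown in the I/O callback
}

impl GainNode {
    fn new(max_frames: usize, gain: f32) -> Self {
        Self {
            gain,
            scratch: vec![0.0; max_frames * CHANNELS],
        }
    }
}

impl Node for GainNode {
    fn process(&mut self, block: &mut [f32]) {
        debug_assert!(block.len() <= self.scratch.len());
        // Copy into scratch, apply the gain, copy back (stand-in for real DSP work).
        let scratch = &mut self.scratch[..block.len()];
        scratch.copy_from_slice(block);
        for s in scratch.iter_mut() {
            *s *= self.gain;
        }
        block.copy_from_slice(scratch);
    }
}

fn main() {
    // Assume a requested latency of 256 frames, as in the example below.
    let mut node = GainNode::new(256, 0.5);
    let mut block = vec![1.0_f32; 256 * CHANNELS];
    node.process(&mut block);
    assert!(block.iter().all(|&s| (s - 0.5).abs() < 1e-6));
}
```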

I realize there are a few possible issues with this approach:

  1. It's possible that the requested buffer size cannot be fulfilled by the audio device (e.g. because it is too small or too large).
  2. It's also possible that the buffer size works initially, but if the underlying device of the stream changes (e.g. because the initial device is disconnected), the new device could have a different sample rate. That causes resampling to happen internally in cubeb, which in turn causes the I/O callback to use a different buffer size. For example, if the initially connected device has a sample rate of 48000 and a buffer size of 256, and the stream falls back to a device with a sample rate of 44100 when it disconnects, the audio I/O will start to tick with an input/output buffer size of 279, at which point the pre-allocated buffers in the audio processing nodes are no longer sufficient.

For 1), I tried to find an API for figuring out the actual buffer size (or latency) of the audio stream after it has been created. However, neither Stream::latency() nor Stream::input_latency() seems to return the actual buffer size.

For 2), the only way I can think of right now is to stop the audio stream once a device change is detected (e.g. via the Stream's device-changed callback), re-allocate buffers if needed, and then try to restart the stream with the new device.
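Roughly, I imagine the device-changed callback would only set a flag, with the stop/re-allocate/restart happening on a control thread rather than in any real-time callback (a sketch of the idea only; the cubeb-specific calls are left as comments since I haven't pinned down the exact API yet):

```rust
// Sketch: device-changed callback sets an atomic flag; a control thread notices it,
// tears the stream down, re-allocates, and rebuilds. The cubeb calls are only hinted
// at in comments.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn main() {
    let device_changed = Arc::new(AtomicBool::new(false));

    // This closure stands in for the Stream's device-changed callback.
    let flag = Arc::clone(&device_changed);
    let on_device_changed = move || {
        flag.store(true, Ordering::Release);
    };

    // Simulate the callback firing at some point.
    on_device_changed();

    // Control thread: poll the flag and rebuild when a change is observed.
    let flag = Arc::clone(&device_changed);
    let control = thread::spawn(move || loop {
        if flag.swap(false, Ordering::AcqRel) {
            // 1. Stop the stream (e.g. Stream::stop()).
            // 2. Query the new device's sample rate / latency.
            // 3. Re-allocate the processing nodes' buffers accordingly.
            // 4. Re-create and start a new stream.
            println!("device changed: stop stream, re-allocate buffers, restart");
            break;
        }
        thread::sleep(Duration::from_millis(50));
    });

    control.join().unwrap();
}
```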

Would be keen to hear your thoughts.

Thanks.

padenot commented 2 years ago

For 1), it's generally the latency figure you pass at initialization (it's not a very good name). You can find the minimum latency via the call of the same name in the public API.

For 2), yes, that's the best approach, but it's also rather complex in practice.

In general, you'll find that the guarantee of a fixed block size doesn't hold across environments (it's not a guarantee on Linux or Windows, for example). Either your system is simple enough that rebuilding the graph from scratch when the device changes is feasible (changing the sample rate often isn't possible because of the state of the processing nodes, e.g. the coefficients and memory of an IIR filter), or you'll have to somehow mimic the characteristics of the old device, in terms of buffer size, for the new device (i.e. if it goes from 256 to 279, you still process in 256-frame chunks and take the latency/CPU hit).
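Sketched out, that second option amounts to putting a block-size adapter in front of the graph (a rough illustration only, not cubeb code; VecDeque is used just to keep it short, a real-time callback would want a pre-allocated lock-free FIFO):

```rust
// Sketch: keep processing in fixed 256-frame blocks even when the callback delivers
// some other count (e.g. 279), at the cost of one extra block of latency.
use std::collections::VecDeque;

const BLOCK: usize = 256; // the block size the processing graph was built for

struct FixedBlockAdapter {
    pending_in: VecDeque<f32>,
    pending_out: VecDeque<f32>,
    block: Vec<f32>, // pre-allocated scratch block
}

impl FixedBlockAdapter {
    fn new() -> Self {
        let mut pending_out = VecDeque::new();
        // Prime the output with one block of silence to cover the block-boundary
        // mismatch between device callbacks and graph blocks (the latency hit).
        pending_out.extend(std::iter::repeat(0.0_f32).take(BLOCK));
        Self {
            pending_in: VecDeque::new(),
            pending_out,
            block: vec![0.0; BLOCK],
        }
    }

    /// Called from the I/O callback with whatever frame count the device uses.
    fn process_callback(
        &mut self,
        input: &[f32],
        output: &mut [f32],
        mut dsp: impl FnMut(&mut [f32]),
    ) {
        self.pending_in.extend(input.iter().copied());

        // Run the fixed-size graph once per full block of buffered input.
        while self.pending_in.len() >= BLOCK {
            for s in self.block.iter_mut() {
                *s = self.pending_in.pop_front().unwrap();
            }
            dsp(&mut self.block);
            self.pending_out.extend(self.block.iter().copied());
        }

        // Hand back exactly as many frames as the callback asked for.
        for o in output.iter_mut() {
            *o = self.pending_out.pop_front().unwrap_or(0.0);
        }
    }
}

fn main() {
    let mut adapter = FixedBlockAdapter::new();
    let input = vec![1.0_f32; 279]; // callback size after falling back to a 44100 device
    let mut output = vec![0.0_f32; 279];
    adapter.process_callback(&input, &mut output, |block| {
        for s in block.iter_mut() {
            *s *= 0.5;
        }
    });
    // The first BLOCK frames are the priming silence; later frames are processed audio.
    println!("first frame: {}, last frame: {}", output[0], output[278]);
}
```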

eyeplum commented 2 years ago

Thanks @padenot, that's indeed very useful info!

You find the minimum latency by the call of the same name in the public API

Ah, I see, so instead of fetching the "actual latency" after the stream is created, I could simply use DeviceInfo::latency_lo(), correct?

For 2), yes, that's the best approach, but it's also rather complex in practice

Yes, I found it quite complex when I tried to implement it this way too.

or you'll have somehow mimick the characteristics of the old device, in terms of buffer size, for the new device

I made more progress trying out this approach: I process the audio I/O callback in small chunks (e.g. 32 frames), and if the I/O block isn't evenly divided by the chunk size, I just process whatever is left (e.g. a callback with 70 frames is processed as 3 chunks: [0..32, 32..64, 64..70]).

(Now that I think about it, the chunk size could potentially be aligned with the expected buffer size, i.e. it doesn't have to be as small as 32 - I think the key aspect here is being able to break the audio I/O block into chunks that fit in the pre-allocated buffers.)
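For reference, the chunking itself is roughly this (a simplified mono sketch, not the actual code):

```rust
// Sketch: split whatever block size the callback delivers into chunks no larger than
// the pre-allocated size, processing the (possibly shorter) remainder last.
const MAX_CHUNK: usize = 32; // could equally be the expected buffer size, e.g. 256

fn process_in_chunks(block: &mut [f32], mut dsp: impl FnMut(&mut [f32])) {
    // chunks_mut yields full MAX_CHUNK slices plus a final shorter remainder,
    // e.g. a 70-frame block becomes [0..32, 32..64, 64..70].
    for chunk in block.chunks_mut(MAX_CHUNK) {
        dsp(chunk);
    }
}

fn main() {
    let mut block = vec![1.0_f32; 70];
    process_in_chunks(&mut block, |chunk| {
        for s in chunk.iter_mut() {
            *s *= 0.5;
        }
    });
    assert!(block.iter().all(|&s| (s - 0.5).abs() < 1e-6));
}
```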

This seems to work well so far.

I will close this issue for now as I think I have a way forward. Thanks for the help.