RustAudio / cpal

Cross-platform audio I/O library in pure Rust
Apache License 2.0

Plan for implementing an ALSA duplex API #628

Open nyanpasu64 opened 2 years ago

nyanpasu64 commented 2 years ago

This is a draft.

Overview and speculation

I'm told that JACK clients are fed input and output buffers synchronously by jackd (the audio server), and that JACK's application-facing API abstracts buffer size management away from the app, leaving routing and hardware I/O and buffering to jackd. ALSA clients are not like that. You open independent input and output streams, and you have to align their block sizes and sampling rates, open them at the same time, read and write the same amount from both streams, etc. I hear that Apple's Core Audio exposes a JACK-like synchronous duplex API that communicates with physical hardware. (On Linux, you can have JACK's synchronous buffers, ALSA's direct hardware access, or neither when an ALSA app talks to pulseaudio-alsa or pipewire-alsa.)
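To make the "align their block sizes, sampling rates" requirement concrete, here is a minimal sketch of a pre-flight check an app could run on the parameters it negotiated for each stream. `StreamParams`, `DuplexMismatch` and `check_duplex` are hypothetical names made up for illustration; they are not part of ALSA, jack2, or cpal.

```rust
/// Hypothetical record of the parameters negotiated for one stream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StreamParams {
    pub sample_rate_hz: u32,
    pub period_frames: u32,
}

/// Why a capture/playback pair cannot be driven in lockstep.
#[derive(Debug, PartialEq, Eq)]
pub enum DuplexMismatch {
    SampleRate { capture: u32, playback: u32 },
    PeriodSize { capture: u32, playback: u32 },
}

/// Checks that both streams agree on sample rate and period size, so that
/// each cycle can read and write the same number of frames.
pub fn check_duplex(
    capture: &StreamParams,
    playback: &StreamParams,
) -> Result<(), DuplexMismatch> {
    if capture.sample_rate_hz != playback.sample_rate_hz {
        return Err(DuplexMismatch::SampleRate {
            capture: capture.sample_rate_hz,
            playback: playback.sample_rate_hz,
        });
    }
    if capture.period_frames != playback.period_frames {
        return Err(DuplexMismatch::PeriodSize {
            capture: capture.period_frames,
            playback: playback.period_frames,
        });
    }
    Ok(())
}
```

A real backend would also have to cope with the device silently adjusting the requested values, so the check belongs after negotiation, not before.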

On Linux, I get the impression that the only apps designed to be routable in a graph are JACK apps. PipeWire lets you route the inputs and outputs of Pulse/ALSA apps arbitrarily as well (in a patchbay app), but the apps were not written with this in mind. Worse, in ALSA's case, the application-facing API was written around timing being determined by hardware in real time, and the app managing data buffering itself. As a result, I think ALSA duplex can achieve the same round-trip latency on physical hardware (from hardware line in to speaker out) as a JACK client, but I'd be surprised if you can chain 3 ALSA duplex apps in a PipeWire graph and not get 1-2 periods of added latency per app, whereas 3 JACK duplex apps on pipewire-jack (see below) add zero latency compared to 1 app.
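As a rough worked example of that guess, the added latency of a chain is just apps × periods per app × period size. The function name and the 256-frame period used in the note below are assumptions for illustration, and the "1-2 periods per app" figure is my estimate, not a measurement.

```rust
/// Extra latency, in frames, contributed by `apps` clients chained in series
/// when each client adds `periods_per_app` periods of buffering.
/// Use 0 periods per app to model the JACK case described above.
pub fn added_chain_latency_frames(apps: u32, periods_per_app: u32, period_frames: u32) -> u32 {
    apps * periods_per_app * period_frames
}
```

With three apps and an assumed 256-frame period, 1 to 2 periods per app works out to 768 to 1536 extra frames, versus 0 for the JACK chain.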

Helvum showing three cpal duplex clients chained in a row, in the PipeWire graph

jackd never changes buffer/period sizes. PipeWire changes buffer/period sizes when you open and close apps. I'm not sure if/how it changes the period size of an ALSA device, but it seems buggy: Canberra notification sounds set the period to 8192 samples (absurdly high latency) after they start playing, there are/were audio glitches when periods get longer (https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/1436 ?), and the round-trip latency of jack_iodelay fed through speaker output, a physical aux cable, and line-in input can change (when input/output period sizes diverge? upon xrun?).
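For a sense of scale, a period size converts to time as frames × 1000 / sample rate. The helper below is a small sketch; the sample rates are assumptions, since I didn't note which rate the device was running at.

```rust
/// Duration of one period in milliseconds.
pub fn period_duration_ms(period_frames: u32, sample_rate_hz: u32) -> f64 {
    f64::from(period_frames) * 1000.0 / f64::from(sample_rate_hz)
}
```

An 8192-frame period is about 170.7 ms at an assumed 48 kHz, or about 185.8 ms at 44.1 kHz, and that is for a single period of buffering.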

jackd (JACK server) and rtaudio (an audio library for apps, with Linux/Windows/Mac backends) are ooold, and both use threads but predate C++11 atomics. jackd uses volatile variables, rtaudio just uses data races. RtAudio has real-world race conditions as well (which I could trigger if I wanted with a crafted test app), and incomprehensible data ownership/sharing that I'd have to rewrite to fix.
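For contrast, this is roughly what the race-free version of a shared "keep running" flag looks like with atomics. It is a generic Rust sketch, not code from jackd, RtAudio, or cpal; `SharedState` and `run_worker_until_stopped` are made-up names, and the worker only counts iterations where a real audio thread would process a period.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// State shared between a control thread and a worker thread.
pub struct SharedState {
    running: AtomicBool,
    periods_processed: AtomicU64,
}

/// Spawns a worker, waits until it has counted at least `min_periods`
/// iterations, then asks it to stop and returns the final count.
pub fn run_worker_until_stopped(min_periods: u64) -> u64 {
    let state = Arc::new(SharedState {
        running: AtomicBool::new(true),
        periods_processed: AtomicU64::new(0),
    });

    let worker_state = Arc::clone(&state);
    let worker = thread::spawn(move || {
        // Atomic load instead of reading a plain or volatile variable.
        while worker_state.running.load(Ordering::Acquire) {
            worker_state.periods_processed.fetch_add(1, Ordering::Relaxed);
            thread::yield_now();
        }
    });

    // Control side: poll progress, then request shutdown.
    while state.periods_processed.load(Ordering::Relaxed) < min_periods {
        thread::yield_now();
    }
    state.running.store(false, Ordering::Release);
    worker.join().expect("worker thread panicked");

    state.periods_processed.load(Ordering::Relaxed)
}
```

The polling loop is only there to keep the sketch short; a real control thread would not spin like this.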

This was a learning experience. But I'm really not the most qualified person to talk about ALSA. Sadly I don't know who else understands ALSA vs. JACK well, and is willing to share their insights publicly.

jackd notes

This is a summary (given my current understanding) of how jack2 handles ALSA duplex (I didn't look into jack1), and how I'd implement it in cpal #553, improve RtAudio's duplex, etc.:

Threads:

setup (alsa_driver_new() -> alsa_driver_set_parameters()):

beginning playback (alsa_driver_start()):

in the main loop (JackAudioDriver::Process()):

In jack2 synchronous mode (JackAudioDriver::ProcessSync()), the main loop waits for both input and output to be ready, then reads input from hardware, computes output, and writes output to hardware.
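The shape of that synchronous cycle can be sketched as below. This is my own illustration of the wait / read / process / write order, not jack2's code: the `DuplexDevice` trait, `MockDevice` and `run_until_stopped` are hypothetical, and the mock replaces real hardware with an in-memory queue so the loop can be exercised.

```rust
use std::collections::VecDeque;

/// Hypothetical abstraction over a linked capture/playback pair.
pub trait DuplexDevice {
    /// Blocks until both directions have one period ready.
    /// Returns false once the device has stopped.
    fn wait_both_ready(&mut self) -> bool;
    fn read_capture(&mut self, buf: &mut [f32]);
    fn write_playback(&mut self, buf: &[f32]);
}

/// Runs the synchronous loop: wait, read input, compute output, write output.
/// Returns the number of completed cycles.
pub fn run_until_stopped<D: DuplexDevice>(
    dev: &mut D,
    period_frames: usize,
    mut process: impl FnMut(&[f32], &mut [f32]),
) -> usize {
    let mut input = vec![0.0f32; period_frames];
    let mut output = vec![0.0f32; period_frames];
    let mut cycles = 0;
    while dev.wait_both_ready() {
        dev.read_capture(&mut input);
        process(&input, &mut output);
        dev.write_playback(&output);
        cycles += 1;
    }
    cycles
}

/// In-memory stand-in for hardware: queued capture periods in, played periods out.
pub struct MockDevice {
    capture_queue: VecDeque<Vec<f32>>,
    pub played: Vec<Vec<f32>>,
}

impl MockDevice {
    pub fn new(capture_periods: Vec<Vec<f32>>) -> Self {
        Self { capture_queue: capture_periods.into(), played: Vec::new() }
    }
}

impl DuplexDevice for MockDevice {
    fn wait_both_ready(&mut self) -> bool {
        // The mock "stops" when it runs out of queued capture data.
        !self.capture_queue.is_empty()
    }
    fn read_capture(&mut self, buf: &mut [f32]) {
        let period = self.capture_queue.pop_front().expect("no capture period queued");
        // Panics if the queued period length differs from the buffer length.
        buf.copy_from_slice(&period);
    }
    fn write_playback(&mut self, buf: &[f32]) {
        self.played.push(buf.to_vec());
    }
}
```

Because input and output are handled in the same cycle with the same period size, the process callback always sees exactly one period of input per period of output, which is the property a duplex API needs to guarantee.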

Details:

upon xrun:

I took a quick glance at what happens during xrun, and I think this is what happens: Stop and start both capture and playback streams, regardless of which one hit xrun. Don't close or recreate streams or any other state, though.
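If that reading is right, the recovery policy looks something like the sketch below. Everything here is a stand-in: `MockStream`, `StreamState` and `recover_from_xrun` are hypothetical, and the real driver would call into ALSA where the mock just flips a state field.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamState {
    Stopped,
    Running,
    Xrun,
}

/// Stand-in for an opened PCM stream; counts opens and restarts for inspection.
#[derive(Debug)]
pub struct MockStream {
    pub state: StreamState,
    pub opens: u32,
    pub restarts: u32,
}

impl MockStream {
    pub fn open() -> Self {
        Self { state: StreamState::Stopped, opens: 1, restarts: 0 }
    }
    pub fn start(&mut self) {
        self.state = StreamState::Running;
    }
    pub fn stop(&mut self) {
        self.state = StreamState::Stopped;
    }
}

/// If either stream hit an xrun, stop and restart BOTH streams in place,
/// without closing or reopening them. Returns true if a restart happened.
pub fn recover_from_xrun(capture: &mut MockStream, playback: &mut MockStream) -> bool {
    let xrun = capture.state == StreamState::Xrun || playback.state == StreamState::Xrun;
    if !xrun {
        return false;
    }
    capture.stop();
    playback.stop();
    capture.start();
    playback.start();
    capture.restarts += 1;
    playback.restarts += 1;
    true
}
```

Restarting both sides together keeps the two streams aligned after recovery, which would not be the case if only the stream that hit the xrun were restarted.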

How does cpal handle xruns? Does it handle them at all? (TODO look into it)

nyanpasu64 commented 2 years ago

I'm probably not implementing ALSA duplex until cpal can actually properly detect and open my hw devices. pipewire-alsa (and likely pulseaudio-alsa) are terrible APIs for low-latency output and duplex, because both the app and the audio server buffer audio. There are possible workarounds, which pipewire-alsa doesn't do; handling duplex correctly is especially tricky and situational, and it's difficult to get a general solution.

Right now cpal doesn't detect hw out, and crashes trying to read from hw in (#630). Does cpal seek to target professional DAW use (jackd or alsa hw devices), or mainstream users without a spare audio interface (pulseaudio/pipewire servers, any protocol they support)? If the latter, I think adding a pulse backend is more important, at least until pipewire becomes mainstream (at which point cpal can use the jack backend or add a pipewire one).

bartkrak commented 1 year ago

What's the current state of cpal support for duplex streams? I'm working on an advanced audio app using cpal where I need to process audio from the input and send it back to the same hardware device for output. I have input and output as separate streams with a ring buffer in between. It kinda works, but sometimes (totally at random) I get "backend specific error: broken pipe" and the input stream starts giving me silence, and I have to restart my app to make it work again. Sometimes it works for a few hours, sometimes only a few seconds. Any ideas how to handle this problem?