RustAudio / cpal

Cross-platform audio I/O library in pure Rust
Apache License 2.0
2.72k stars 361 forks source link

ALSA: invalid handling of capture overruns #730

Open codepainters opened 1 year ago

codepainters commented 1 year ago

I'm working on an application using cpal over ALSA to continuously capture audio stream on Raspberry Pi. I use both RPI and PC for testing. Note: I'm using a version from master branch, as I need both i16 and i32 sample formats, with i32 not being supported in stable version.

Unfortunately I'm experiencing odd behavior upon capture overruns (which I trigger setting buffer size too low).

  1. The most severe case is when poll() return EPOLERR which is not handled properly.
    476794 1669022562.665530 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN}])
    476794 1669022562.666440 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT,      0x7f6b10495f88) = 0
    476794 1669022562.666532 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) =      0
    476794 1669022562.666577 ioctl(5, SNDRV_PCM_IOCTL_READI_FRAMES,      0x7f6b10496760) = 0
    476794 1669022562.666707 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
    476794 1669022562.669590 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
    476794 1669022562.669765 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])

    And then I no longer get any more callbacks from cpal, but my application is eating up 100% of CPU core, as poll() is executed in a tight loop.

The problem is here:

https://github.com/RustAudio/cpal/blob/7776c6634a8e21e2a585358545141fe091c38320/src/host/alsa/mod.rs#L746-L753

This match doesn't recognize POLLIN|POLLERR and returns PollDescriptorsFlow::Continue, causing the worker thread to poll() again immediately, leading to a tight loop.

The problem seems obvious, I'll try to come with a pull request.

  1. Another scenario is when the second ioctl call fails.
476936 1669022662.246764 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN}])
476936 1669022662.247721 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT, 0x7f70e2810f88) = 0
476936 1669022662.247884 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) = -1 EPIPE (Broken pipe)
476936 1669022662.248019 ioctl(5, SNDRV_PCM_IOCTL_PREPARE, 0) = 0
476936 1669022662.248112 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.448722 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79
476936 1669022662.449074 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.649849 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79
476936 1669022662.650342 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.851166 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79

In this scenario I no longer get any data callbacks, instead I get error callback periodically (poll timeout).

I believe the problem is here:

https://github.com/RustAudio/cpal/blob/7776c6634a8e21e2a585358545141fe091c38320/src/host/alsa/mod.rs#L604-L609

The call to prepare() passes OK, but the stream is stopped - according to ALSA, this call changes the state to SND_PCM_STATE_PREPARED, there also need to be a call to start() to transition to SND_PCM_STATE_RUNNING and re-enable data flow.

I'm a bit confused here about what is the intended cpal behavior here:

  1. The third scenario is when ioctl(SNDRV_PCM_IOCTL_READI_FRAMES) fails with EPIPE:
476728 1669022521.635527 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN}])
476728 1669022521.636362 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT, 0x7f8b2368af88) = 0
476728 1669022521.636521 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) = 0
476728 1669022521.636681 ioctl(5, SNDRV_PCM_IOCTL_READI_FRAMES, 0x7f8b2368b760) = -1 EPIPE (Broken pipe)
476728 1669022521.636777 write(1, "Die! A backend-specific error has occurred: ALSA function 'snd_pcm_readi' failed with error 'EPIPE: Broken pipe'\n", 113) = 113
476728 1669022521.636823 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN|POLLERR}])
476728 1669022521.636860 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN|POLLERR}])
476728 1669022521.636892 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN|POLLERR}])

Here, the error from process_input() is properly delivered to my error callback, but then we enter the tight loop scenario, due to POLLERR not being recognized.

ashleysmithgpu commented 9 months ago

Just a quick note that I ran in to this issue as-well. I was getting constant ALSA lib pcm.c:8545:(snd_pcm_recover) "overrun" occurred errors. I changed the code to add the suggested:

            PollDescriptorsFlow::XRun => {
                if let Err(err) = stream.channel.prepare() {
                    error_callback(err.into());
                }
                if let Err(err) = stream.channel.start() {
                    error_callback(err.into());
                }
                continue;
            }

And it seems to work fine now. I still get the overruns, but it gracefully continues. I am running on a raspberry PI zero w with the respeaker 2mic board attached.

codepainters commented 9 months ago

I have moved away from CPAL altogether. My application only needs to work on Linux (Windows was a nice-to-have feature), and I only need a simple record capability (and I also use Respeaker!).

I decided to switch to raw alsa, which solved all my problems. I was initially afraid of the low-level interface, but it turned out surprisingly easy.