mikebrady / shairport-sync

AirPlay and AirPlay 2 audio player
Other
7.2k stars 571 forks source link

[Problem]: Handling of USB disconnects in docker image #1576

Closed ntompson closed 1 year ago

ntompson commented 1 year ago

What happened?

First let me say how wrapt I am with shairport-sync. I installed the vanilla docker image, minimal config and I'm up and running with almost no effort. It's brilliant. I'm running it on a Debian (bullseye) host on an Intel NUC, connected to an SMSL SU-10 DAC.

The problem I'm having, though, is handling of USB disconnects. The reason this matters is that the DAC can take many inputs (couple of USB ports, coax, SPDIF etc). When a different input is selected on the DAC, the USB connection to the NUC is severed.

It depends a bit what's going on at the time as to what happens next, but I've noticed:

To work around this, I've built a simple supervisor tool which detects USB connection / disconnections and stops / starts the container. I did this because I couldn't figure out an easy way to determine whether SPS is healthy or broken. This mostly works, but it occurs to me that there may be other reasons SPS could become unhealthy.

So it would be good if either:

Relevant log output

Log when USB is disconnected during active play (AirPlay 2 buffered)

This log covers a sequence of events in which:

sps  |          8.003026473 "player.c:2572"          1.63          -39.7           39.7            7472            46             3025k         44100.14         44099.20
sps  |          8.011803182 "player.c:2572"          1.62          -28.3           28.3            7443            46             2980k         44100.12         44099.17
sps  |          8.001118536 "player.c:2572"          1.65          -25.5           25.5            7462            46             2994k         44100.11         44099.17
sps  |          8.019167236 "player.c:2572"          1.57          -14.2           14.2            7457            46             3006k         44100.08         44099.13
sps  |          0.646371759 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000148254 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 353 samples to alsa device.
sps  |          0.000110086 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000033231 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 353 samples to alsa device.
sps  |          0.000059339 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000029362 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 353 samples to alsa device.
sps  |          0.000029714 "player.c:2803" Large negative (i.e. early) sync error of -8446 frames (-0.191519 seconds), at frame: 3142222388.
sps  |          0.000323023 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000031907 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 8446 samples to alsa device.
sps  |          0.000036135 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000017292 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.009563403 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000036021 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.032097806 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000040981 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.000039027 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000019432 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.000113932 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000039678 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.000060339 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000033290 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.007833289 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000050439 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.032396072 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000046428 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.000029451 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000014618 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 352 samples to alsa device.
sps  |          0.000044127 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000021743 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 353 samples to alsa device.
sps  |          0.000098050 "audio_alsa.c:1747" alsa: DAC in odd SND_PCM_STATE_* 8 prior to writing.
sps  |          0.000028347 "audio_alsa.c:1798" alsa: error -19 ("No such device") writing 353 samples to alsa device.

Log when USB disconnected during no active play and play is subsequently attempted

This log covers a sequence of events in which:

sps  |          0.631337739 "player.c:2738" Connection 1: Playback Started -- AirPlay 2 Buffered.
sps  |          7.940045937 "player.c:2572" Sync Error ms | Net Sync PPM | All Sync PPM | Min DAC Queue | Min Buffers | Min Buffer Size | Output FPS (r) | Output FPS (c)
sps  |          0.000056503 "player.c:2572"          0.39            0.0            0.0            7394            46              102k              N/A              N/A
sps  |          8.730823468 "player.c:1728" Connection 1: Playback Stopped. Total playing time 00:00:16.
sps  |          0.012603395 "rtsp.c:1346" Connection 1: Closed by client: from fe80::1017:342b:b3ca:d2ed:55669 to self at fe80::a911:b69b:1287:132c:7000.
sps  |          0.163480109 "rtsp.c:2907" Connection 2: Remote Control connection from fe80::1017:342b:b3ca:d2ed:55678 ("Nick’s iPhone") to self at fe80::a911:b69b:1287:132c:7000.
sps  |         12.436966894 "rtsp.c:2877" Connection 3: AP2 PTP connection from fe80::1017:342b:b3ca:d2ed:55681 ("Nick’s iPhone") to self at fe80::a911:b69b:1287:132c:7000.
sps  |          0.001510416 "rtsp.c:5004" Connection 2 Closed by self: from fe80::1017:342b:b3ca:d2ed:55678 to self at fe80::a911:b69b:1287:132c:7000.
sps  |          0.053376863 "rtsp.c:4318" Connection 3: Current volume (-9.000000) requested
sps  |          4.695907903 "rtsp.c:3284" Connection 3. AP2 Buffered Audio Stream.
sps  | ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
sps  |          0.000412445 "audio_alsa.c:450" *fatal error: alsa: error -19 ("No such device") opening alsa device "hw:AUDIO".

Configuration Information.

         0.001772939 "shairport.c:2057" >> Display Config Start.
         0.001479463 "shairport.c:2057" 
         0.000030409 "shairport.c:2057" From "uname -a":
         0.000094630 "shairport.c:2057"  Linux mars 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 Linux
         0.003655010 "shairport.c:2057" 
         0.000028221 "shairport.c:2057" From /etc/os-release:
         0.000018814 "shairport.c:2057"  Alpine Linux v3.12
         0.001387988 "shairport.c:2057" 
         0.000020741 "shairport.c:2057" Shairport Sync Version String:
         0.000012568 "shairport.c:2057"  4.1-dirty-AirPlay2-alac-OpenSSL-Avahi-ALSA-dummy-stdout-pipe-soxr-convolution-metadata-mqtt-dbus-mpris-sysconfdir:/etc
         0.000071628 "shairport.c:2057" 
         0.000060275 "shairport.c:2057" Command Line:
         0.000202103 "shairport.c:2057"  shairport-sync --displayConfig
         0.000049980 "shairport.c:2057" 
         0.000014739 "shairport.c:2057" Configuration File:
         0.000045094 "shairport.c:2057"  /etc/shairport-sync.conf
         0.000013404 "shairport.c:2057" 
         0.000128117 "shairport.c:2057" Configuration File Settings:
         0.000010996 "shairport.c:2057"  general : 
         0.000014161 "shairport.c:2057"  {
         0.000010529 "shairport.c:2057"    name = "Lounge Hifi";
         0.000013765 "shairport.c:2057"  };
         0.000011785 "shairport.c:2057"  alsa : 
         0.000011830 "shairport.c:2057"  {
         0.000022704 "shairport.c:2057"    output_device = "hw:AUDIO";
         0.000013148 "shairport.c:2057"  };
         0.000011279 "shairport.c:2057"  mqtt : 
         0.000009172 "shairport.c:2057"  {
         0.000011487 "shairport.c:2057"    enabled = "yes";
         0.000010231 "shairport.c:2057"    hostname = "<redacted>";
         0.000011871 "shairport.c:2057"    port = 1883;
         0.000012563 "shairport.c:2057"    username = "sps";
         0.000021385 "shairport.c:2057"    password = "<redacted>";
         0.000014379 "shairport.c:2057"    topic = "sps/lounge";
         0.000026550 "shairport.c:2057"    publish_raw = "yes";
         0.000026953 "shairport.c:2057"    publish_parsed = "yes";
         0.000011817 "shairport.c:2057"    publish_cover = "yes";
         0.000012771 "shairport.c:2057"    enable_remote = "yes";
         0.000012028 "shairport.c:2057"  };
         0.000009832 "shairport.c:2057"  diagnostics : 
         0.000008952 "shairport.c:2057"  {
         0.000018831 "shairport.c:2057"    statistics = "yes";
         0.000010258 "shairport.c:2057"    log_verbosity = 1;
         0.000007656 "shairport.c:2057"  };
         0.000014436 "shairport.c:2057" 
         0.000010945 "shairport.c:2057" >> Display Config End.
         0.000019670 "shairport.c:2059" >> Goodbye!
         0.000050351 "shairport.c:1670" normal exit

How did you install Shairport Sync?

Docker

Check previous issues

mikebrady commented 1 year ago

Thanks for the kind words, and it's great to hear that you are enjoying using Shairport Sync ("SPS")!

To be honest, handling USB disconnects and reconnects, hot plugging, etc., would be outside the scope of SPS itself, but perhaps a solution would be this: if SPS is playing and a device suddenly becomes unavailable, SPS could be (optionally) made to exit immediately. If I understand things correctly (?), this would cause the container to exit. Does that sound plausible/useful?

If it was possible to see some of the logs of the various situations, that would be very useful.

ntompson commented 1 year ago

Thanks for the quick response Mike. I've added logs for the two failure cases above.

It makes perfect sense that you'd rather not handle USB connections within SPS. So this becomes more of a container lifecycle management question.

As I think more about it, I'm thinking the best thing would be some kind of health reporting to assist with container management. Some things are outside the control of SPS (if I disconnect the DAC, that's on me, and there's no point SPS providing an active AirPlay service while the DAC is disconnected).

So I think it makes sense for me to start / stop the SPS container as the DAC is connected / disconnected by way of an external supervisor.

But what would be nice is if there is something I could hook into to tell me whether SPS is healthy or not. I considered running a systemctl status on the SPS service inside the container, but the current image doesn't include systemctl. From my supervisor tool, I could periodically poll SPS health (or subscribe to an event) and kick the container if I need to.

mikebrady commented 1 year ago

Thanks for all that. It would be a straightforward matter to (optionally) exit Shairport Sync on receipt of an error -19 ("No such device") at those two points, the first when a play session is running, and the second when a play session is started.

The shairport-sync app is controlled by the s6-overlay system. My understanding is that when the app exits, the container also exits, so there wouldn't be any need for systemctl anyway. (?)

ntompson commented 1 year ago

Sorry for the slow reply. I think it's probably a good idea to exit SPS when a fatal error of some kind happens (it's the Docker Way... right?) - so yes, that makes sense. I think I need to externally check for USB connects / disconnects anyway to avoid the container not starting when it should (or restarting endlessly if the restart: unless-stopped policy kicks in while USB is still disconnected).

I'm still interested, though, in some kind of health check. Here, we're talking about USB disconnects, but I'm mindful that there are other ways a service can die. The other option I have been musing (but not attempted yet) is whether one of the dbus commands you already support might suffice.

ekozan commented 1 year ago

this probem is also present without docker

for me SPS need to handle error and recover

mikebrady commented 1 year ago

I've pushed a development version which should now call the run_this_if_an_unfixable_error_is_detected hook if the output device can't be found when it is to be opened or if errors occur while outputting to it. If no hook is provided, Shairport Sync should exit. Your feedback would be appreciated! See also #1609.

bverkron commented 1 year ago

@mikebrady I'd like to use the run_this_if_an_unfixable_error_is_detected feature to restart shairport when I get the error fatal error: the alsa output_device "default" can not be found. What is the best way to restart it when using the shairport docker container? From the host I can manually run docker restart shairport (I've named the container shairport) but of course that does not work from within the container because the docker CLI is not installed there.

I started poking around inside the docker image to see if there is a clean way to restart it from within but this does not seem like the right approach.

Seems like I may be able to setup a docker compose file with restart: unless-stopped set but that might cause endless restarts if the USB DAC is disconnected?

mikebrady commented 1 year ago

Thanks for the post. This looks like a Docker question more than a Shairport Sync one, TBH, and I'm not great on Docker.

The images use the s6 overlay system -- there might be something useful in there for your situation.

bverkron commented 1 year ago

Should s6 be stopping the SPS container when error like the one above are thrown? That seems to be one of the intents of s6 based on "Containers should stop when that thing stops" in their read me but it seems like that's not happening. I have --restart unless-stopped set for my container and it does not seem to be restarting. Maybe I'm misunderstanding the setup.

mikebrady commented 1 year ago

Yeah, that's my understanding of what should happen...

bverkron commented 1 year ago

Hmmm doesn't seem to be. This happens periodically, likely due to disconnecting and reconnecting my Qudelix 5K, and it does not seem to stop the container.

pi@OfficePi:~ $ docker logs shairport
...
warning: failed to find mixer control "PCM",0.
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
fatal error: alsa: error -524 ("No error information") opening alsa device "default".
pi@OfficePi:~ $ docker ps -a
CONTAINER ID   IMAGE                      COMMAND                  CREATED      STATUS        PORTS     NAMES
383b18477f9d   mikebrady/shairport-sync   "/init /usr/local/bi…"   2 days ago   Up 22 hours             shairport

I have a fair bit of docker experience but am new to s6 so not sure how to debug this.

Have you had to specifically code things to leverage s6 and could it be that it's not passings / handling this kind of failure correctly?

bverkron commented 1 year ago

As a side note the shairport seems to not complain if it's restarted manually while the DAC is unplugged.

Server startup complete. Host name is OfficePi.local. Local service cookie is 2933198431.
Service "OfficePi" (/etc/avahi/services/ssh.service) successfully established.
Service "OfficePi" (/etc/avahi/services/sftp-ssh.service) successfully established.

So if we could get s6 to stop the container upon failure and if docker can restart it then I think that will provide a clean solution, i.e. I don't think (at least in my case) that it would get stuck in an endless restart loop. The preferred resatrt option would probably be 'always' based on the doc. https://docs.docker.com/config/containers/start-containers-automatically/#restart-policy-details

...

always Always restart the container if it stops. If it is manually stopped, it is restarted only when Docker daemon restarts or the container itself is manually restarted. (See the second bullet listed in restart policy details)

...

A restart policy only takes effect after a container starts successfully. In this case, starting successfully means that the container is up for at least 10 seconds and Docker has started monitoring it. This prevents a container which does not start at all from going into a restart loop.

If you manually stop a container, its restart policy is ignored until the Docker daemon restarts or the container is manually restarted. This is another attempt to prevent a restart loop.

mikebrady commented 1 year ago

Thanks. I'll have to take a close look at it -- it might take a week or so...

github-actions[bot] commented 1 year ago

This issue has been inactive for 45 days so will be closed 7 days from now. To prevent this, please remove the "stale" label or post a comment.

bverkron commented 1 year ago

@mikebrady did you ever get a change to look at this? I get errors periodically (due to disconnecting / reconnecting my USB DAC) and SPS will error out but not exit the container as discussed earlier. I have to either power cycle the Pi or SSH in and fiddle around. Would be absolutely stellar if it existed the container.

mikebrady commented 1 year ago

Thanks. AFAIK, the Docker image of SPS now exits if the DAC is/becomes unavailable. Worth checking though!

bverkron commented 1 year ago

I've updated to the latest image (didn't realize there was a new one out) and I'm now getting a different error than what I used to. Not sure if it's related to the updated image or the specifics on the scenario changing on my side. Though I haven't changed how I connect / disconnect or use the DAC with the Pi. Either way the container does not seem to exit when it encounters the current error.

docker logs shairport shows a long flood of this error but the container is still running and although I can connect to it via AirPlay no audio is output to my DAC (same symptom as the error before).

ALSA lib confmisc.c:165:(snd_config_get_card) Cannot get card index for 2
ALSA lib confmisc.c:165:(snd_config_get_card) Cannot get card index for 2
ALSA lib confmisc.c:165:(snd_config_get_card) Cannot get card index for 2

This is my setup for context.

pi@OfficePi:~ $ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: Headphones [bcm2835 Headphones], device 0: bcm2835 Headphones [bcm2835 Headphones]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7
card 1: vc4hdmi [vc4-hdmi], device 0: MAI PCM i2s-hifi-0 [MAI PCM i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 2: DAC [Qudelix-5K USB DAC], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #1247

pi@OfficePi:~ $ cat /etc/asound.conf
pcm.!default {
    type plug
    slave.pcm {
        type dmix
        ipc_key 1024
        slave {
            pcm "hw:2"
            period_time 0
            period_size 1920
            buffer_size 19200
        }
    }
}
ctl.!default {
    type hw
    card 2
}
docker run -d --restart always -v /etc/asound.conf:/etc/asound.conf -v /home/pi/shairport-sync.conf:/etc/shairport-sync.conf --name shairport --net host --device /dev/snd mikebrady/shairport-sync -a OfficePiDocker -- -c PCM

If I run docker restart shairport it starts working.

bverkron commented 1 year ago

Never mind! I still had the run_this_if_an_unfixable_error_is_detected line active in the config which was preventing the exit behaviour from executing. Once I commented that out it works as expected. Thank you @mikebrady !!

For others wondering the behaviour is now as follows...

Similarly if I disconnect my DAC when things are working, regardless if I'm playing music or not, Shairport container restarts. This puts things in basically the same state as the start of the previous bullet points and that process would repeat then re-connecting the DAC. So, it seems the DAC needs to be already connected when the shairport container starts in order for it to work. Otherwise we get into a restart cycle as described above.

Slightly annoying to have the shairport container shutdown every time the DAC is disconnected or reconnected (at the point when I try to play music anyway) but at least things are self healing and I don't need to manually reset things by power cycle the Pi or SSHing into it to restart the container.