Closed Philip-A closed 3 years ago
@Philip-A
That's an interesting guide - some of the services in both projects have changed, however, which might explain why it's not working. I may attempt this if I get some time next week, but let me try to answer some questions to see if I can get you a bit further.
I am new to Docker, but in my opinion network_mode: host means that the definition ports: - 80:80 is ignored, because the host's network interface is used directly without any isolation.
This is correct.
03.06.21 10:02:44 (+0200) sound-supervisor Sound supervisor listening on port 80
This is a brand new change introduced with #453, so yeah you are on the right track by changing the host port since it would definitely conflict with pihole.
The - 80:80 added for that change didn't do anything though, because as you've discovered, that container is running in host mode. The problem we're going to have here is that the pihole service is also running in host mode.
You can read this mega-thread where I kind of discover some of these things and it may give you some ideas: https://github.com/klutchell/balena-pihole/issues/66
I end up narrowing it down to a timing issue because I had spanning-tree running on my switch, but there are some pretty neat findings in there if you're the network type at all.
There appear to be a number of other ports that sound-supervisor is actually listening on, however. If I stop all services and compare the listening TCP/UDP ports before and after starting sound-supervisor, this is what I get:
Before starting the sound-supervisor container:
root@balena:~# lsof -iTCP -sTCP:LISTEN
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 53u IPv6 17021 0t0 TCP *:22222 (LISTEN)
dnsmasq 1501 nobody 5u IPv4 20815 0t0 TCP balena:domain (LISTEN)
dnsmasq 1501 nobody 7u IPv4 20817 0t0 TCP balena:domain (LISTEN)
node 3152 root 25u IPv6 29714 0t0 TCP *:48484 (LISTEN)
root@balena:~# lsof -iUDP -P -n | egrep -v '(127|::1)'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 1403 root 5u IPv4 20594 0t0 UDP *:1234
chronyd 1403 root 6u IPv6 20595 0t0 UDP *:1234
NetworkMa 1492 root 23u IPv4 25413 0t0 UDP 10.9.1.62:68->10.9.1.1:67
dnsmasq 1501 nobody 4u IPv4 20814 0t0 UDP 10.114.102.1:53
avahi-dae 26920 avahi 11u IPv4 234858 0t0 UDP *:5353
avahi-dae 26920 avahi 12u IPv6 234859 0t0 UDP *:5353
avahi-dae 26920 avahi 13u IPv4 234860 0t0 UDP *:59008
avahi-dae 26920 avahi 14u IPv6 234861 0t0 UDP *:59252
After starting the sound-supervisor container:
root@balena:~# lsof -iTCP -sTCP:LISTEN
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 53u IPv6 17021 0t0 TCP *:22222 (LISTEN)
dnsmasq 1501 nobody 5u IPv4 20815 0t0 TCP balena:domain (LISTEN)
dnsmasq 1501 nobody 7u IPv4 20817 0t0 TCP balena:domain (LISTEN)
node 3152 root 25u IPv6 29714 0t0 TCP *:48484 (LISTEN)
node 31752 root 20u IPv6 276413 0t0 TCP *:http (LISTEN)
node 31752 root 21u IPv4 276432 0t0 TCP *:8000 (LISTEN)
root@balena:~# lsof -iUDP -P -n | egrep -v '(127|::1)'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 1403 root 5u IPv4 20594 0t0 UDP *:1234
chronyd 1403 root 6u IPv6 20595 0t0 UDP *:1234
NetworkMa 1492 root 23u IPv4 25413 0t0 UDP 10.9.1.62:68->10.9.1.1:67
dnsmasq 1501 nobody 4u IPv4 20814 0t0 UDP 10.114.102.1:53
avahi-dae 26920 avahi 11u IPv4 234858 0t0 UDP *:5353
avahi-dae 26920 avahi 12u IPv6 234859 0t0 UDP *:5353
avahi-dae 26920 avahi 13u IPv4 234860 0t0 UDP *:59008
avahi-dae 26920 avahi 14u IPv6 234861 0t0 UDP *:59252
node 31752 root 22u IPv4 276415 0t0 UDP *:12345
node 31752 root 23u IPv4 276433 0t0 UDP *:12345
So, at a bare minimum you need to forward TCP 8000 and 80, and UDP 12345. The only port conflict that I can see with the pihole project is 80, so you are on the right track. Now, in theory none of those ports actually need to be reachable from outside of the balena network, but I'm not 100% sure about that.
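The before/after comparison above can also be done mechanically. Below is a minimal TypeScript sketch that pulls the protocol and NAME column out of two lsof listings and reports what is new. The function names `endpoints` and `newEndpoints` are made up for illustration, and the parsing assumes the column layout shown in the output above.

```typescript
// Collect "PROTO NAME" pairs (for example "TCP *:8000") from `lsof -i` output.
// Assumes the column layout shown above, where the protocol column
// (TCP or UDP) is directly followed by the NAME column.
function endpoints(lsofOutput: string): string[] {
  const result: string[] = [];
  for (const line of lsofOutput.split("\n")) {
    const fields = line.trim().split(/\s+/);
    let protoIndex = fields.indexOf("TCP");
    if (protoIndex === -1) protoIndex = fields.indexOf("UDP");
    // Skip the header row, shell prompts, blank lines and truncated rows.
    if (protoIndex === -1 || protoIndex + 1 >= fields.length) continue;
    const endpoint = fields[protoIndex] + " " + fields[protoIndex + 1];
    // The same endpoint can be listed once per file descriptor; keep one copy.
    if (result.indexOf(endpoint) === -1) result.push(endpoint);
  }
  return result;
}

// Endpoints that appear in the "after" listing but not in the "before" one.
function newEndpoints(before: string, after: string): string[] {
  const seen = endpoints(before);
  return endpoints(after).filter((e) => seen.indexOf(e) === -1);
}
```

Fed the two TCP listings above, this would report `TCP *:http` and `TCP *:8000`. For the UDP listings it would report `UDP *:12345` once, even though lsof prints it twice.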
All of that said, changing from host to bridge mode should not prevent sound-supervisor from starting. I tried to test that here, but I can't get a successful push right now (Issue #462).
Once the device is loaded, what do you see in the logs when you manually try to start the service?
Hello @eiddor, thank you for your quick response. I am not sure what I did wrong last time, but today the container starts with bridged network mode -- just as you said.
I added the three port forwardings you mentioned, but unfortunately some container communication is failing. Several containers are waiting for supervisor to start, but supervisor is not able to get a connection to audio block.
At least the balenaSound UI is reachable at the expected port 8080.
Here are snippets of my compose file and a selection of log statements. I think that should be reproducible.
services:
  audio:
    build: ./balena-sound/core/audio
    privileged: true
    labels:
      io.balena.features.dbus: 1
      io.balena.features.supervisor-api: 1
    ports:
      - 4317:4317
  sound-supervisor:
    build: ./balena-sound/core/sound-supervisor
    #network_mode: host
    ports:
      - "8080:80/tcp"
      - "8000:8000/tcp"
      - "12345:12345/udp"
    labels:
      io.balena.features.balena-api: 1
      io.balena.features.supervisor-api: 1
...
05.06.21 14:55:15 (+0200) Starting service 'audio sha256:c5374aa6a370c5dd53749748d8922fe1e9729174a81376e3854dbb350908d5af'
05.06.21 14:55:17 (+0200) <audio> --- Audio ---
05.06.21 14:55:17 (+0200) <audio> Starting audio service with settings:
05.06.21 14:55:17 (+0200) <audio> - pulseaudio 13.0
05.06.21 14:55:17 (+0200) <audio> - Pulse log level: NOTICE
05.06.21 14:55:17 (+0200) <audio> - Default output: AUTO
05.06.21 14:55:17 (+0200) <audio> Detected audio cards:
05.06.21 14:55:17 (+0200) <audio> 0 bcm2835-jack bcm2835_headphonbcm2835Headphones-bcm2835Headphones
05.06.21 14:55:32 (+0200) <audio> Waiting for sound supervisor to start
...
05.06.21 14:55:39 (+0200) Started service 'sound-supervisor sha256:14150a3797b915d654c9ceaf06bd7bce8d464b1b5be9066eb80a9dcb7c937c8e'
05.06.21 14:55:40 (+0200) <sound-supervisor> > sound-supervisor@1.0.0 start /usr/src
05.06.21 14:55:40 (+0200) <sound-supervisor> > node build/index.js
05.06.21 14:55:40 (+0200) <sound-supervisor>
05.06.21 14:55:41 (+0200) <sound-supervisor> Sound supervisor listening on port 80
05.06.21 14:55:41 (+0200) <sound-supervisor> Error connecting to audio block - Retry failed: connect ECONNREFUSED 172.17.0.7:4317
05.06.21 14:55:42 (+0200) <audio> Waiting for sound supervisor to start
05.06.21 14:55:42 (+0200) <sound-supervisor> Error connecting to audio block - Retry failed: connect ECONNREFUSED 172.17.0.7:4317
05.06.21 14:55:44 (+0200) <sound-supervisor> Error connecting to audio block - Retry failed: connect ECONNREFUSED 172.17.0.7:4317
....
05.06.21 15:05:41 (+0200) <sound-supervisor> (node:35) UnhandledPromiseRejectionWarning: Error: Timeout after 600000ms
05.06.21 15:05:41 (+0200) <sound-supervisor> at Timeout._onTimeout (/usr/src/node_modules/ts-retry-promise/dist/timeout.js:10:20)
05.06.21 15:05:41 (+0200) <sound-supervisor> at listOnTimeout (internal/timers.js:554:17)
05.06.21 15:05:41 (+0200) <sound-supervisor> at processTimers (internal/timers.js:497:7)
05.06.21 15:05:41 (+0200) <sound-supervisor> (node:35) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
05.06.21 15:05:41 (+0200) <sound-supervisor> (node:35) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
The host's local IP is in the 192.168.178.0/24 range.
I assume 172.17.0.7 to be the IP of the supervisor container (Docker network).
I don't know how to tell the supervisor where it should look for the audio block. Obviously the IP of the supervisor itself is wrong. It should be either the host IP or the audio container's IP.
@Philip-A Yeah, I finally got a successful push and I'm seeing the same messages - I'm not even forwarding the ports.
I don't know how to tell the supervisor where it should look for the audio block. Obviously the IP of the supervisor itself is wrong. It should be either the host IP or the audio container's IP
So, with "traditional" Docker, containers on the same bridged network can reach each other simply by service name, so you never have to worry about the container IP (since it can change easily). I'm not 100% sure if this works differently with balenaSound.
Actually, I suspect that it does - I just noticed that by default the sound-supervisor container connects to host_ip:4317, which of course will forward to the audio container in host mode. When we put sound-supervisor into bridge mode, it will try to connect to itself and things won't work.
Let me keep digging (I'm not a coder, so I'm learning on the fly).
So, yeah... I figured it out and I'm kind of proud of myself considering I have no idea what I'm doing in TypeScript, but I think I found the "offending" code.
In core/sound-supervisor/src/index.ts, line 9:
Change:
const audioBlock: BalenaAudio = new BalenaAudio(`tcp:${config.device.ip}:4317`)
to:
const audioBlock: BalenaAudio = new BalenaAudio(`tcp:audio:4317`)
This will force the audio block connection to the audio container by service name instead of by local container IP (which changes when we change network modes).
I just tested with sound-supervisor in bridge mode and ports forwarded, and everything seems to work.
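Instead of hard-coding either value, the host part of the address could be made configurable. The sketch below is only an illustration of that idea and is not balenaSound code: `audioBlockAddress` and the `AUDIO_BLOCK_HOST` variable are hypothetical names. The only things taken from the thread are the `tcp:<host>:4317` address format and the two host values discussed (the device IP and the audio service name).

```typescript
// Environment variables as a plain map, so the function stays easy to test.
type Env = { [key: string]: string | undefined };

// Build the PulseAudio address that is passed to BalenaAudio in the thread.
// AUDIO_BLOCK_HOST is a hypothetical variable used for illustration only;
// when it is unset, fall back to the device IP, as the original line does.
function audioBlockAddress(env: Env, deviceIp: string): string {
  const host = env["AUDIO_BLOCK_HOST"] || deviceIp;
  return "tcp:" + host + ":4317";
}
```

With nothing set, this keeps the existing host-mode behaviour. Setting the hypothetical variable to `audio` would give `tcp:audio:4317`, the same address as the one-line change above.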
Again, I'm no coder and I fully admit to not knowing exactly how this architecture is supposed to work, but I am usually quite wary of running any Docker container in host mode if I can help it.
I'll leave it to @tmigone and others to see if this is something worth making permanent. I don't see a downside to it, but I don't know the original intent.
Hi, I thought that I had found two solutions, but neither of them worked. So I am happy that you have another idea, which I will try.
1) I found the same line in the code as you did and tried to tell the supervisor to use another IP to look for the audio block by hard-coding my host's IP there. https://github.com/balenalabs/balena-sound/blob/07677c6667411f95599fa5c0ad9f279946771640/core/sound-supervisor/src/index.ts#L9
And I was very happy to see the log line:
05.06.21 16:08:06 (+0200) Connected to PulseAudio at 192.168.178.3:4317
But surprisingly (at least for me) I still got many messages like the first, as well as other error messages:
05.06.21 16:16:57 (+0200) Waiting for audioblock to start...
05.06.21 16:08:11 (+0200) Joining the fleet, requesting master info with fleet-sync...
05.06.21 16:20:30 (+0200) Malformed packet!
So my change had consequences I was not aware of. I rolled back. Now, I read you had success with that solution, so I should try again.
2) My second finding should have solved my original goal of using a port other than 80, too (spoiler: it did not). I have seen an undocumented environment variable in the code that can be used to change the default web port: https://github.com/balenalabs/balena-sound/blob/07677c6667411f95599fa5c0ad9f279946771640/core/sound-supervisor/src/constants.ts#L15
This should allow me to keep using network_mode: host and define a custom port on the host to provide UI and API.
But I still got the message "Waiting for sound supervisor to start...", because the shell scripts that check the availability expect the standard port 80 :-(
But surprisingly (at least for me) I still got many messages like the first and other error messages:
Ahh, yeah - that makes sense. There's really no reason to send that traffic outside of the balena/Docker network, since you may have issues with return traffic not getting back into sound-supervisor.
Second finding should solve my original goal to use other than port 80, too. (spoiler: it did not) I have seen an undocumented environment variable in the code that can be used to change the default web port:
You should not have to do this - sound-supervisor can still listen on port 80, and we use the - 8080:80 line in docker-compose.yml to avoid the conflict on the host. This is the "right" way to do it with Docker in general.
No, I do not get a running setup with your suggestion of using
const audioBlock: BalenaAudio = new BalenaAudio('tcp:audio:4317')
The log messages are similar to the state I documented above (four hours ago):
05.06.21 19:37:10 (+0200) multiroom-server Waiting for sound supervisor to start
05.06.21 19:37:12 (+0200) Started service 'sound-supervisor sha256:8980f16efbb534b0e4e2bda828cf25e5d4cef0d8569d1f6220f6687225e3bd3e'
05.06.21 19:37:13 (+0200) multiroom-client Waiting for sound supervisor to start
05.06.21 19:37:13 (+0200) airplay Waiting for audioblock to start...
05.06.21 19:37:13 (+0200) sound-supervisor
05.06.21 19:37:13 (+0200) sound-supervisor > sound-supervisor@1.0.0 start /usr/src
05.06.21 19:37:13 (+0200) sound-supervisor > node build/index.js
05.06.21 19:37:13 (+0200) sound-supervisor
05.06.21 19:37:15 (+0200) sound-supervisor Sound supervisor listening on port 80
05.06.21 19:37:15 (+0200) sound-supervisor Error connecting to audio block - Retry failed: connect ECONNREFUSED 172.17.0.5:4317
05.06.21 19:37:15 (+0200) multiroom-server Waiting for sound supervisor to start
05.06.21 19:37:16 (+0200) sound-supervisor Error connecting to audio block - Retry failed: connect ECONNREFUSED 172.17.0.5:4317
Can you confirm that 172.17.0.5 is the IP of your audio container? You can connect to it via the balena console and just type ip addr.
Are you working with your custom docker-compose.yml file, or are you starting fresh until we can get things to start properly?
Yes, 172.17.0.5 is one IP of my audio container. And I used my custom docker-compose.yml.
bash-5.0# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
152: eth0@if153: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
link/ether 02:42:ac:11:00:05 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.5/16 brd 172.17.255.255 scope global eth0
valid_lft forever preferred_lft forever
154: eth1@if155: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
link/ether 02:42:0a:72:68:02 brd ff:ff:ff:ff:ff:ff
inet 10.114.104.2/25 brd 10.114.104.127 scope global eth1
valid_lft forever preferred_lft forever
Today I removed all pihole services from the compose file and checked that the remaining content is analogous to the original compose file, except for the sound-supervisor service (and the code change const audioBlock: BalenaAudio = new BalenaAudio('tcp:audio:4317')):
sound-supervisor:
  build: ./balena-sound/core/sound-supervisor
  #network_mode: host
  ports:
    - 80:80
    - 8000:8000
    - 12345:12345
  labels:
    io.balena.features.balena-api: '1'
    io.balena.features.supervisor-api: 1
This is pretty much the same setup as yesterday, but today the multiroom services are having trouble connecting to the audio service. Don't ask me why... Maybe I should not have purged data.
06.06.21 18:18:24 (+0200) multiroom-server 2021-06-06 16-18-24.670 [Notice] (init) Settings file: "/var/cache/snapcast/server.json"
06.06.21 18:18:24 (+0200) multiroom-server 2021-06-06 16-18-24.671 [Error] (Avahi) Failed to create client: Daemon not running
06.06.21 18:18:24 (+0200) multiroom-server ALSA lib pulse.c:242:(pulse_connect) PulseAudio: Unable to connect: Connection refused
06.06.21 18:18:24 (+0200) multiroom-server
06.06.21 18:18:24 (+0200) multiroom-server 2021-06-06 16-18-24.691 [Notice] (Server) Server::start: Can't open device 'pulse', error: Connection refused
06.06.21 18:18:24 (+0200) multiroom-server 2021-06-06 16-18-24.692 [Error] (main) Exception: Can't open device 'pulse', error: Connection refused
06.06.21 18:18:24 (+0200) multiroom-server 2021-06-06 16-18-24.692 [Notice] (main) Snapserver terminated.
06.06.21 18:20:47 (+0200) multiroom-client 2021-06-06 16-20-47.883 [Error] (Controller) Error: Connection refused
06.06.21 18:20:47 (+0200) multiroom-client 2021-06-06 16-20-47.883 [Error] (Connection) Error in socket shutdown: Transport endpoint is not connected
06.06.21 18:20:48 (+0200) multiroom-client 2021-06-06 16-20-48.884 [Error] (Connection) Failed to connect to host '172.17.0.5', error: Connection refused
What did you do to get your setup working in bridged mode? Did you add these three port forwardings, too?
What did you do to get your setup working in bridged mode? Did you add these three port forwardings, too?
Honestly, all I did was start with a fresh copy of the repo, make that line change, and then comment out network_mode: host.
I did add the port forwards, but they weren't necessary for at least this part of testing.
sound-supervisor:
  build: ./core/sound-supervisor
  # network_mode: host
  ports:
    - 80:80
    - 8000:8000
    - 12345:12345
  labels:
    io.balena.features.balena-api: '1'
    io.balena.features.supervisor-api: 1
This is pretty much the same setup as yesterday, but today the multiroom services are having trouble connecting to the audio service. Don't ask me why... Maybe I should not have purged data.
Oh, I have multiroom disabled - let me enable that and see what happens.
I'm getting the same messages (repeating like crazy):
06.06.21 11:37:16 (-0500) multiroom-client 2021-06-06 16-37-16.071 [Error] (Connection) Failed to connect to host '172.17.0.4', error: Connection refused
06.06.21 11:37:16 (-0500) multiroom-client 2021-06-06 16-37-16.071 [Error] (Controller) Error: Connection refused
06.06.21 11:37:16 (-0500) multiroom-client 2021-06-06 16-37-16.071 [Error] (Connection) Error in socket shutdown: Transport endpoint is not connected
Let me keep digging.
I did notice that the multiroom Dockerfiles are using the same convention that we used for sound-supervisor:
ENV PULSE_SERVER=tcp:audio:4317
So, I just tested with a fresh copy of the repo with no changes, and I still get those multiroom-client messages. We might be chasing something unrelated.
Purge data fixed that.
Trying again with the "patch" and disabling host-mode.
Ok, so after purging data and downloading/pushing a fresh repo with only the "patch" and host mode disabled, everything seems to load fine the first time, but I'm getting the same messages again after rebooting.
In theory, a change to sound-supervisor shouldn't trigger an error between audio and multiroom-client, but as I said earlier, I'm not 100% familiar with how the architecture is supposed to work.
It's documented here, and I will try to take a look at it this week: https://github.com/balenalabs/balena-sound/blob/master/ARCHITECTURE.md
Hey guys @Philip-A @eiddor! Catching up with this thread.
First, I'd like to say pretty much all of the stuff you guys tried was spot on and you were super close to getting it to work (both with sound-supervisor in host and bridge mode). The problem really lies in the complexity of balenaSound, so let me elaborate a bit.
Theoretically you should be able to use both host and bridge modes for any service. I generally dislike using host mode since the whole idea of Docker is to isolate stuff. So the services you see in balenaSound running with network_mode: host are doing that because I found no alternative: some need to access hardware stuff that's 100x easier to do with host mode (bluetooth), some rely on third party binaries (airplay, spotify) where I had little control over the code, and some make life easier (sound-supervisor).
Focusing on sound-supervisor, first you are completely right with:
ports:
  - 80:80
Since the container is in host mode, that's not doing anything. That's definitely an oversight on my part; I'll remove those lines to avoid further confusion (https://github.com/balenalabs/balena-sound/pull/467).
The reason sound-supervisor uses host mode is that it needs to know the actual device IP address for multiroom: the sound supervisor basically needs to tell other devices what its IP address is for them to connect. At the time I developed this I couldn't find a way of doing this reliably with the container in bridged mode. We could also get the IP address by asking balena-supervisor, but that would mean we now need to wait for it to start, and we want to start as fast as possible, so it's not an option. That leaves us with host mode as the only option (I've picked up some new tricks and with recent balenaOS changes it might be possible to do with bridged, but that's for another day lol. Also, I'm open to suggestions if you have!).
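To illustrate why the network mode matters for discovering the device IP, here is a minimal TypeScript sketch of one common way to pick an address from the interface list. It is not taken from balenaSound: `firstExternalIPv4` and its types are hypothetical names, and the input is assumed to have the shape that Node's `os.networkInterfaces()` returns.

```typescript
// Minimal shape of one address entry, modelled on what Node's
// os.networkInterfaces() returns (family is "IPv4" in most Node versions).
interface AddressInfo {
  address: string;
  family: string | number;
  internal: boolean;
}
type Interfaces = { [name: string]: AddressInfo[] | undefined };

// Return the first non-loopback IPv4 address, or undefined if there is none.
function firstExternalIPv4(interfaces: Interfaces): string | undefined {
  for (const name of Object.keys(interfaces)) {
    for (const info of interfaces[name] || []) {
      const isV4 = info.family === "IPv4" || info.family === 4;
      if (isV4 && !info.internal) return info.address;
    }
  }
  return undefined;
}
```

In host mode a lookup like this sees the device's own interfaces. In bridge mode it only sees the container's interfaces, such as the 172.17.0.5 address in the ip addr output earlier in the thread, and other devices on the LAN cannot reach that address.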
Question 2: How can I make this service use another host port?
Now going back to your original question... For the reasons stated above, not using network_mode: host is a no-go unless we undergo major surgery. The quickest way to use another port is to modify the constants file as you found out (pro tip: you can set the environment variable SOUND_SUPERVISOR_PORT instead of changing the code) BUT you also need to update all instances where this port was hardcoded.
As you noted @Philip-A, this includes the audio block start script but also the start scripts for the multiroom services. You can check the PR that made that change to see where you need to go back to: https://github.com/balenalabs/balena-sound/commit/a0d8c874f8efbc308d2098e834404a495fef0b44. For example, SOUND_SUPERVISOR="$(ip route | awk '/default / { print $3 }')" in the audio service should be SOUND_SUPERVISOR="$(ip route | awk '/default / { print $3 }'):3000", assuming you want to use port 3000.
So to recap... you need to use another port either by changing the constants.ts file or by setting SOUND_SUPERVISOR_PORT, and you need to edit the start.sh scripts for the audio, multiroom-client and multiroom-server services. That should do the trick as I don't think there is any other port overlap with the PiHole project.
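The recap comes down to two things: read the port from SOUND_SUPERVISOR_PORT with 80 as the fallback, and append the port wherever the supervisor address is built. A minimal TypeScript sketch of that logic follows. It is not the actual contents of constants.ts: the function names and the range check are assumptions made for illustration.

```typescript
// Environment variables as a plain map, so the functions stay easy to test.
type EnvVars = { [key: string]: string | undefined };

// Port for the sound-supervisor web server: SOUND_SUPERVISOR_PORT if it
// holds a valid port number, otherwise 80 (the default seen in the logs).
function supervisorPort(env: EnvVars): number {
  const parsed = parseInt(env["SOUND_SUPERVISOR_PORT"] || "", 10);
  return parsed >= 1 && parsed <= 65535 ? parsed : 80;
}

// Address that clients should use, mirroring the start.sh change above:
// a bare host for port 80, host:port for anything else.
function supervisorAddress(host: string, env: EnvVars): string {
  const port = supervisorPort(env);
  return port === 80 ? host : host + ":" + port;
}
```

For example, with SOUND_SUPERVISOR_PORT set to 3000 and a gateway of 172.17.0.1, the address becomes 172.17.0.1:3000, which matches the edited SOUND_SUPERVISOR line.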
Hi @tmigone, thank you for your advice. Even with your help it took me some time to solve it, but I think all the obvious things work now. I have created a pull request, because I think my changes are potentially an improvement to the code base that is interesting for others as well.
So please check my changes and merge if you want. Thank you both for all the help and discussion. I very much appreciate your support.
Thanks for the explanation, @tmigone - Very good info!!
That leaves us with host mode as the only option (I've picked up some new tricks and with recent balenaOS changes it might be possible to do with bridged, but that's for another day lol. Also, I'm open to suggestions if you have!).
Can we just declare SOUND_MULTIROOM_MASTER as mandatory for this purpose (assuming it's the same thing)? I know we try to make things as easy as possible for users, but this one doesn't seem like it would be a heavy lift and would free us to use bridge mode for the container.
I think no further work will be done here. A solution for my request has been found.
Hello, I need your support for a proper docker compose config, please. I am trying to merge multiple apps to run them on one device with balenaOS. Currently these are balena-sound and balena-pihole. I started with this article: https://www.balena.io/blog/two-projects-one-device-turn-your-raspberry-pi-into-a-multitool/
My goal is that balena-sound does not use host port 80. How can I achieve that?
Question 1: Which service is using port 80? The following log message makes me think that sound-supervisor is currently using port 80 (on the host).
Question 2: How can I make this service use another host port? Your checked-in docker-compose.yml uses the following definition:
I am new to Docker, but in my opinion network_mode: host means that the definition ports: - 80:80 is ignored, because the host's network interface is used directly without any isolation.
Therefore changing the port forwarding to ports: - 8080:80 did not do the trick for me (in combination with network_mode: host). sound-supervisor is still using port 80.
Next, I tried to remove network_mode: host with the goal that the container uses bridged network mode. But unfortunately, the service does not start in this mode. At least it looks like that to me, because of log messages like:
But I don't get any log message from sound-supervisor itself, so I don't know why the container is not coming up, yet.
Then I searched the documentation for config parameters to change the port that the sound-supervisor web server binds to, or to disable it completely, but I did not find any.
I am surprised that nobody else seems to have had this problem yet. Did I miss something obvious? I appreciate your help.