Open Vorror opened 1 year ago
For some more info,
In addition to using Speed Test Tracker, I also confirmed my internet speed using Speedtest's cli tool and can confirm the internet speed reduction. The internet speed could sometimes stay permanently below 9 Mbits/s!
In fact, the only reason I noticed this issue was occurring was that when trying to play Navidrome music using bonob on my Sonos speakers, there would be "buffering"(delay before playing and frequent pausing mid-song). My music server, Navidrome, and Sonos speaker are all on the same Lan, so there should never be buffering. So, the issue affects all TCP connections on the server rather than just ones to the internet.
The issue takes several hours to appear, and Ubuntu will never reclaim either the system memory being used or the TCP memory, even after not playing music for several hours/days. You must either stop/restart the bonob container or restart your computer.
Thank you for the great analysis, that's an interesting issue.
How much memory do you give the bonob container?
With your listening pattern, are you simply listening to an album/playlist track by track or are you frequently changing tracks manually?
Unfortunately I cannot test this right now as my system is in a box whilst I move house.
@simojenki I use Portainer to managed my docker containers. I don't explicitly set a maximum size.
I have about 3000 songs in a queue and rarely manually change tracks. It simply plays for hours until I get bored.
I don't mean to be adding fuel to the fire here, but I can confirm this. I have a playlist with 5400 tracks. I left it playing overnight, and when I woke up in the AM, the TCP memory use was at 150MB.
Here is my setup: Debian 12 bookworm 16GB RAM Intel(R) Core(TM) i5-4570 Running Navidrome and Bonob using dockerr-compose.
Then I started doing my rating-my-songs behavior.
I left the playlist running, and just started rating the songs that are playing. Once I rated them, I would fast forward to the next song. After doing this for a few minutes, the TCP memory use is now 500MB.
At this time, the docker container is using 350MB, and TCP is using 500MB. So the rating and fast forwarding may cause memory to balloon. I have not done any more extensive testing here.
So that I've said it, I have created a playlist with about 3 hours of music in it. I start this in the morning and then start it again around lunch time. This seems to keep things behaving.
One thing I'd love to do is be able to attach a debugger to the containerized app, to see if I can identify where this problem is happening. But I've never debugged a container before, much less an app like this. So I'd need some help doing that.
Interesting...
Why do you say it is TCP memory usage? What command did you run to see that?
Are you transcoding your tracks? One thing I have noticed is that sometimes a transcoding (ffmpeg) process in the ND container will be still running well after the track has completed, I think some processes are being left behind, perhaps on skip track?
No, I'm not transcoding anything.
In the OP, @Vorror mentions the use of the /proc/net/sockstat to analyze TCP stack kernel memory use.
I've been watching the value that comes back from that, I have a oneliner:
awk '/TCP/ {print $NF}' /proc/net/sockstat
Evaluate that and do the math:
(sockstat reports memory in kernel pages.)
echo $(($(awk '/TCP/ {print $NF}' /proc/net/sockstat)* 4096 / 1024 ))"KBytes"
and you get back a value of the kbytes used by TCP to keep sockets working.
At this point, I'm not sure whether this is a bonob or a navidrome issue. I'm trying to do some packet capturing. I'm completely clueless about monitoring networking in docker containers.
Thoughts:
Perhaps the docker-compose.yaml that I'm suing needs to make reference to docker internal IP's of things, instead of external? Is it possible that there's a leak in the networking stack in docker that causes streams to consume more memory if you use external references?, if for example I set the BNB_SUBSONIC_URL to the internal IP of the navidrome server, instead of its external hostname?
I'm just throwing darts at this point, but anything that will help us eliminate possible causes can help use identify the actual cause here.
David
What happens if you restart one of either bonob or navidrome, does the memory get released? That may help determine which container is leaking.
re: docker-compose.yml, you should be able to just refer to the name of the container, rather than either internel or external IP.
Given the wide spread use of docker (and friends) I'd be surprised if the issue is caused by docker/Linux kernel, I'm guessing it's going to be an app issue somewhere.
I'll do some more testing this evening. When I restart the continers, the tcp memory is released. Tonight I will experiment with recreating the high memory condition, and I'll restart one each time, to see if I can discover which service is holding the memory.
Thanks! David
Here's what I can see:
dbuttric@jack2:/var/local/docker/navidrome$ /usr/local/sbin/gettcpmem
80784KBytes
dbuttric@jack2:/var/local/docker/navidrome$ docker restart 29b00e1ac55f_navidrome_navidrome_1
29b00e1ac55f_navidrome_navidrome_1
dbuttric@jack2:/var/local/docker/navidrome$ /usr/local/sbin/gettcpmem
44996KBytes
It looks like if I restart navidrome, all the memory does not get cleared, only about half.
Where as if restart bonob, all the memory gets cleared out...
dbuttric@jack2:/var/local/docker/navidrome$ /usr/local/sbin/gettcpmem
122340KBytes
dbuttric@jack2:/var/local/docker/navidrome$ docker restart navidrome_bonob_1
navidrome_bonob_1
dbuttric@jack2:/var/local/docker/navidrome$ /usr/local/sbin/gettcpmem
1392KBytes
Thoughts? Anything else you need me to do to help get to the cause of this, I'm here and available. Thanks alot! David
I guess for some reason bonob is keeping open a connection to navidrome, perhaps because it is unaware that the sonos device has moved onto the next track.
I've tidied up the way streams between bonob -> navidrome are handled, could you please try again with the 'master' build/release of bonob and see what behaviour you get?
This worked! I'm now seeing TCP memory utilization that looks like the other applications on my media server. Thank you!
I'm running different playlists to two separate speakers. So far, memory utilization is great! I see some times where memory use goes from 156K to 0K 10 seconds later. Definitely better behavior than before. I will let this run for a few more hours and report back.
I've been at this for a few hours now. I'm not seeing any of the same behaviors.
I think this is fixed!
Bear with me here since narrowing down what was happening took two weeks. Anyway, so some background:
I'm using bonob in a docker container on the following system(this system is my media server that's only used to run docker containers):
System:
So, I noticed my internet connection on my Ubuntu media server would become really slow, from 1 Gbs to ~20Mbs. I use an app called Speedtest Tracker to monitor this.
Screenshot from Speedtest Tracker:
I've been noticing the above behavior for weeks. My media server's download/internet speed would drop for the whole system and stay low; only a restart would fix it.
Cause
I figured out that the issue was bonob after killing apps one by one and waiting for the issue to reoccur; I figured out what was happening. There seems to be some TCP memory leak with bonob.
When bonob is first launched, my system looks like this(image from docker stats):
and my TCP Memory looks like this:
After listening to music for a few hours, my stats become this:
And inside my
/var/log/syslog
I'm seeing this message repeated:bonob is using 460MB of TCP Memory! This causes the network speed of my server to drop until a) Killing the bonob container or b) restarting the computer.
Killing bonob's container immediately restores my TCP memory and fixes my speed issue.