jurney closed 3 years ago
A snip from top on the server with no users connected
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27813 root 20 0 10.2g 3.0g 59784 S 27.6 4.9 2639:10 /opt/valheim/plus/valheim_server.x86_64
This is on an old 2014 Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz.
Maybe share your config and we'll be able to dig into this.
Here's my docker-compose.yaml section... I think fairly vanilla:
valheim:
  container_name: valheim
  image: lloesche/valheim-server
  volumes:
Here's my inxi output... it's a NUC8 with an i7.
System: Host: spire Kernel: 4.15.0-137-generic x86_64 bits: 64 gcc: 7.5.0 Console: tty 0
Distro: Ubuntu 18.04.5 LTS
Machine: Device: un-determined System: Intel Client Systems product: NUC8i7BEH v: J72992-306 serial:
Thanks for taking a look, appreciate the work on the container.
Just checked Passmark and your quad core CPU has a total benchmark score of 8921 with a single thread score of 2599, whereas my 6 core Xeon has a total of 7971 and a single thread score of 1700.
So I'd expect you to see maybe around 20% CPU load.
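One plausible way to arrive at that estimate is to scale the idle load seen on the Xeon by the ratio of the two single thread scores. This is only a rough sketch: the function name is made up for illustration, and it assumes the idle load is dominated by a single thread and scales inversely with the benchmark score, which the thread does not confirm.

```shell
# Hypothetical helper: scale an observed %CPU by the ratio of single thread scores.
# Assumption: load scales inversely with the single thread benchmark score.
estimate_cpu_percent() {
    # $1 = %CPU observed on the reference machine
    # $2 = single thread score of the reference machine
    # $3 = single thread score of the target machine
    awk -v pct="$1" -v ref="$2" -v tgt="$3" \
        'BEGIN { printf "%.1f\n", pct * ref / tgt }'
}

# Numbers from this thread: 27.6 %CPU on the Xeon (score 1700), i7 score 2599.
estimate_cpu_percent 27.6 1700 2599   # prints 18.1
```

That lands close to the "around 20%" figure, but it is a back-of-the-envelope number, not a measurement.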
Your config looks fine, nothing out of the ordinary. Maybe try setting the server public and see if that makes any difference. The reason why it might is because private servers seem to use a different networking library (Iron Gate's own) than public servers (Steam's networking library). Although I would be very surprised. I just tested and it made zero difference for me.
The Valheim server is a black box, so it's hard to debug. But we can ask the kernel to tell us what the server is doing while idle.
One thing that's fairly low effort is to use strace, either inside the container or on the host.
If inside the container modify your compose file like so:
cap_add:
- sys_nice
- sys_ptrace
The sys_nice capability should already be there, just add sys_ptrace as a capability.
Then run inside the container:
apt update && apt -y install strace
strace -f -p $(< /var/run/valheim-server.pid) -c
Let it run for, say, 2 minutes and then abort it. This will produce statistics that should look something like this:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- -----------------------
 78.85  291.469476         367    792429    366669 futex
 10.37   38.342466        2055     18652           nanosleep
  5.59   20.664232        5799      3563           poll
  3.50   12.920542       10940      1181           epoll_wait
  1.03    3.810021      200527        19         8 restart_syscall
  0.65    2.391130          43     55045           sched_yield
  0.02    0.055841          22      2525      2352 read
  0.00    0.001167          19        60           prctl
  0.00    0.000563          43        13           sendto
  0.00    0.000174          19         9           write
  0.00    0.000171          14        12           stat
  0.00    0.000167          12        13           recvfrom
  0.00    0.000138          11        12           clock_gettime
  0.00    0.000132          44         3           sigaltstack
  0.00    0.000113          56         2           munmap
  0.00    0.000098          49         2           madvise
  0.00    0.000092          30         3           sendmsg
  0.00    0.000087          29         3           mprotect
  0.00    0.000086           7        12           lstat
  0.00    0.000074          74         1           clone
  0.00    0.000069           5        12           geteuid
  0.00    0.000064           3        18           gettid
  0.00    0.000056          56         1           getpid
  0.00    0.000052          17         3           sched_getaffinity
  0.00    0.000035          35         1           getrusage
  0.00    0.000022          22         1           set_robust_list
  0.00    0.000017          17         1           mmap
  0.00    0.000011          11         1           sched_setscheduler
  0.00    0.000010          10         1           sched_get_priority_min
  0.00    0.000009           9         1           sched_get_priority_max
  0.00    0.000000           0         1           lseek
------ ----------- ----------- --------- --------- -----------------------
100.00  369.657115         423    873600    369029 total
Let's see if yours looks much different from what my server produces.
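To make two such summaries easier to compare, something like the following could reduce each one to its dominant syscalls. It is a minimal sketch: the function name and the 1% default threshold are arbitrary choices for illustration, and it assumes the summary text has been saved to a file or is piped in on stdin.

```shell
# Hypothetical helper: list syscalls whose "% time" share is at least a threshold.
# Reads an `strace -c` summary on stdin. The syscall name is always the last
# field, so rows with an empty "errors" column are handled the same way.
top_syscalls() {
    # $1 = minimum "% time" value to report (defaults to 1)
    awk -v min="${1:-1}" '
        # Skip the header, the dashed separators and the "total" row.
        $1 ~ /^[0-9.]+$/ && $NF != "total" && $1 + 0 >= min + 0 {
            printf "%s %s\n", $NF, $1
        }
    '
}
```

With the summary saved to a file (the name strace-summary.txt is just a placeholder), `top_syscalls 5 < strace-summary.txt` would print only the rows at or above 5%, which for the table above are futex, nanosleep and poll.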
Another thing you could try is to run top on the host instead of inside the container. My thought here is, did you maybe unintentionally restrict the container's CPU usage? If you were to only give the container access to e.g. 25% of a core then inside the container it would look like the server is using more CPU of that core than it really is.
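A quick way to check for such a restriction is to look at the container's HostConfig. The sketch below only covers limits set with `--cpus`, which Docker stores as NanoCpus (billionths of a CPU, 0 meaning no limit); limits set through `--cpu-quota` and `--cpu-period` show up in the CpuQuota and CpuPeriod fields instead. The helper name is made up for illustration.

```shell
# Hypothetical helper: turn a NanoCpus value into a readable CPU limit.
describe_cpu_limit() {
    # $1 = NanoCpus value as reported by docker inspect (0 = no limit set)
    awk -v n="$1" 'BEGIN {
        if (n + 0 == 0) print "unlimited"
        else printf "%.2f CPUs\n", n / 1000000000
    }'
}

# On a Docker host, using the container name from the compose file above:
#   describe_cpu_limit "$(docker inspect --format '{{.HostConfig.NanoCpus}}' valheim)"
describe_cpu_limit 250000000   # prints 0.25 CPUs
describe_cpu_limit 0           # prints unlimited
```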
And lastly see if your CPU is actually running at full speed (I bet it is, but you never know). Your sensor output shows the CPU at 46C which seems pretty chill for a CPU under load. I'd expect more like 70-90C. My thought here is similar to the one where the container doesn't have full CPU access. If the CPU is running in some sort of power saving / eco mode at like half clock rate then the server process would again look as if it is using more of that core than it really is.
You can check like so:
[lukas@bigmac ~]$ lscpu | grep MHz
CPU MHz: 1200.133
CPU max MHz: 3200.0000
CPU min MHz: 1200.0000
The CPU MHz line shows the current clock speed.
Or depending on your system also:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
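The values in scaling_cur_freq are in kHz, one file per core, so the raw cat output is a little hard to read. A small sketch that labels each core and converts to MHz could look like this; the function name and the optional base directory argument are illustrative additions, not something from the container or its tooling.

```shell
# Hypothetical helper: print the current frequency of every core in MHz.
print_cpu_mhz() {
    # $1 = sysfs CPU directory (defaults to the standard location)
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu*/cpufreq/scaling_cur_freq; do
        # Skip quietly if cpufreq is not exposed on this system.
        [ -r "$f" ] || continue
        cpu=${f#"$base"/}   # strip the base directory
        cpu=${cpu%%/*}      # keep only the cpuN component
        # scaling_cur_freq holds the frequency in kHz.
        awk -v cpu="$cpu" '{ printf "%s %.0f MHz\n", cpu, $1 / 1000 }' "$f"
    done
}
```

Calling `print_cpu_mhz` with no arguments reads the live values; on systems without cpufreq support it prints nothing.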
If these steps don't produce any results we can dig deeper using bpftrace. It'll get a bit more complex than strace and requires us to have debugfs mounted.
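Whether debugfs is already mounted can be checked from the mount table before going further. This is a minimal sketch with a made-up function name; it assumes the usual /proc/mounts layout where the third field is the filesystem type.

```shell
# Hypothetical helper: succeed if a debugfs filesystem appears in the mount table.
debugfs_mounted() {
    # $1 = mount table to inspect (defaults to /proc/mounts)
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

# Typical use on the host, mounting it at the conventional location if missing
# (requires root):
#   debugfs_mounted || mount -t debugfs none /sys/kernel/debug
```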
That top was from the host, not the container.
Changing to public didn't have an effect.
lscpu explains it... The chip is running at 500 MHz, so the 40-60% CPU I'm seeing is not that much in the grand scheme. Seems silly it's even doing that much work, but I expect that's on the game, not this container.
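To put a rough number on that, a %CPU reading taken at a reduced clock can be scaled by the ratio of current to maximum frequency. This is only a sketch: the function name is made up, it assumes the work scales linearly with clock speed, and the 3200 MHz maximum below is a placeholder, since the real maximum of this CPU is not stated in the thread.

```shell
# Hypothetical helper: express a %CPU reading as its rough equivalent at full clock.
# Assumption: the same work takes proportionally less time at a higher clock.
full_speed_equivalent() {
    # $1 = %CPU as shown by top
    # $2 = current clock in MHz
    # $3 = maximum clock in MHz
    awk -v pct="$1" -v cur="$2" -v max="$3" \
        'BEGIN { printf "%.1f\n", pct * cur / max }'
}

# 50 %CPU (middle of the 40-60% range) at 500 MHz, with a placeholder
# maximum of 3200 MHz:
full_speed_equivalent 50 500 3200   # prints 7.8
```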
Thanks for all the explanation. Very helpful. I saw the godot reference in another issue, but I'll second the request for a tip jar.
> That top was from the host, not the container.
Oh, then depending on your environment you might want to look into https://docs.docker.com/engine/security/userns-remap/. I guess with a private server it's not really that important, but on a public one it makes sense to use subordinate UIDs/GIDs to map UID 0 of the container to some other UID on the host. I should add something to the README about that.
I'll close the issue in a day or two if nothing else comes up.
Oh, good catch... I didn't notice it was running as root. I'm no deep docker expert, so I expected PUID= and PGID= to set the user and group IDs for the container as that's how it's been configured for all my other containers. I'll check out those docs, thanks.
valheim_server is consuming most of a recent i7 core when idle with no players. I guessed this was a game issue, but the game forums imply this issue was fixed in early February. Maybe some issue with the config in the docker setup?