Closed jaytark closed 3 years ago
What update specifically are you referring to? The base functionality of the container hasn't changed since Jun 10th. Dockerhub stopped automatic builds for non-paying customers a while back. That should be fixed with yesterday's commit 2d2a9ad8e516a8451b5febdee62e9ab1986ea770, which builds on Github and then pushes to Dockerhub. But maybe there's a difference between the images Github and Dockerhub build? I don't have a Synology so I can't test it myself.
Whichever is the latest image on Dockerhub. I pull "lloesche/valheim-server:latest". I had pulled it late yesterday evening after everyone got off the server from playing to update it. I only pulled it because your page said "updated xx hours ago", so I assumed you updated something to the image based on the new Valheim update. I turned off the active container, pulled lloesche/valheim-server:latest, reset the container, then started it back up. Then the error is as posted above.
I got the same problem, it sometimes freezes the whole Synology NAS.
The latest Valheim patch today changed something. IMHO it has nothing to do with the image.
I had htop running in an SSH window when my Synology became unresponsive, and I could see that a steamcmd process was taking all available CPU resources.
I couldn't create another SSH session, and all other containers and apps were unresponsive.
It happened after the Valheim server was updated to the latest patch today.
@slindebe Last night, my server was running it fine with my friends playing on it... I hadn't updated the docker image in some time (probably a month or two). As soon as they got off last night, I updated the docker image and haven't been able to run the game since... and I tested it immediately after I refreshed the container with the new image... not saying it's the image, but not saying it isn't either... maybe just coincidence... Not sure what else to do at this point.
I haven't even fired up the Valheim client on PC because, based on the logs, I know my server isn't running it correctly and is inaccessible... The RAM utilization for the container only stays at around 56MB, whereas before it was well over 1 GB... The CPU utilization stays at around 40-56%, whereas before it was around 15-20% when playing the game and running correctly. It's as though it's caught in a loop, and it freezes my server at times, possibly by hammering the CPU.
All of my other container apps run fine.
I haven't updated the container for a long time, but the way it's set up it updates the Valheim server from Steam regularly. So my container hasn't changed but the Valheim patch through Steam changed something.
For the heck of it, I tried a different Valheim Docker Image through the Dockerhub. (It was the second most popular one. Lloesche's being the first most popular one based on star-ratings). After trying a different one, it works fine. I can get in and play. It's just Lloesche's image that's not working for me. Which I hate, because Lloesche's has better options with Discord Webhooks and such. If Lloesche's works for me at some point, I'll switch back to his.
Note: This isn't to disrespect Lloesche at all, I'm just desperately trying to troubleshoot this the best way I know how. I appreciate Lloesche's work and efforts.
Same here, also on Synology. "Failed to download Valheim server from Steam - retrying later - check your networking and volume access permissions" in the log. This is with all permissions set on the folders and with entries in the Syno firewall. I've even opened a terminal with the container running and basic stuff like ping works fine, so I have no idea how it could be a network issue.
Left it running for a while and it was a mistake, 2 hard reboots today. The mbround18/valheim image seems to download the server fine, shame about the features I've grown to rely on.
Experiencing the same issue here on Synology DS918+. I tried to start a fresh new container today using the included stock docker-compose, changing only my custom server name and password environment vars. Both the Docker daemon and Synology DSM itself hung after doing a docker-compose up -d, ultimately forcing a hard reboot of the device.
The mbround18/valheim image previously mentioned does work for the time being.
I even did a manual pull of lloesche's github image version through ssh and manually re-entered the environment variables with a fresh install and it still hangs.
Can anyone post their entire container logs please? From startup to when the error occurs? I don't have access to a Synology but I noticed that on one of our servers valheim-updater fails with the following message
Sep 19 07:35:08 supervisord: valheim-updater Connecting anonymously to Steam Public...
Sep 19 07:35:10 supervisord: valheim-updater OK
Sep 19 07:35:10 supervisord: valheim-updater Waiting for client config...OK
Sep 19 07:35:10 supervisord: valheim-updater Waiting for user info...
Sep 19 07:35:16 supervisord: valheim-updater /opt/steamcmd/steamcmd.sh: line 38: 889 Killed $DEBUGGER "$STEAMEXE" "$@"
Sep 19 07:35:16 supervisord: valheim-updater ERROR - Failed to update Valheim server from Steam - however an existing version was found locally - using it
And on another machine the step Waiting for user info... takes incredibly long but then succeeds.
When I manually run Steam to update valheim I get this:
root@18229210b028:/opt/steamcmd# /opt/steamcmd/steamcmd.sh +login anonymous +force_install_dir /opt/valheim/dl/server +app_update 896660 validate +quit
Redirecting stderr to '/root/Steam/logs/stderr.txt'
[ 0%] Checking for available updates...
[----] Verifying installation...
Steam Console Client (c) Valve Corporation
-- type 'quit' to exit --
Loading Steam API...OK
Connecting anonymously to Steam Public...OK
Waiting for client config...OK
Waiting for user info...OK
Update state (0x3) reconfiguring, progress: 0.00 (0 / 0)
Update state (0x5) verifying install, progress: 16.17 (171258717 / 1059006974)
Update state (0x5) verifying install, progress: 56.73 (600796844 / 1059006974)
Update state (0x5) verifying install, progress: 97.98 (1037628119 / 1059006974)
Success! App '896660' fully installed.
Terminating m_ThreadClient, likely to crash down the line... but avoiding hang on exit
/opt/steamcmd/steamcmd.sh: line 38: 967 Killed $DEBUGGER "$STEAMEXE" "$@"
Anyone else seeing this too?
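For anyone comparing their own logs, one way to check for the same symptom is to search for the shell's "Killed" notice from steamcmd.sh. This is only a minimal sketch: the function name is made up for illustration, and the container name valheim-server in the usage note is an assumption.

```sh
# Succeeds if the log text on stdin contains the shell's "Killed" notice
# for steamcmd.sh, as in the excerpts above (hypothetical helper name).
steamcmd_killed() {
    grep -q 'steamcmd\.sh: line [0-9][0-9]*: *[0-9][0-9]* *Killed'
}
```

Possible usage, assuming the container is named valheim-server: `docker logs valheim-server 2>&1 | steamcmd_killed && echo "steamcmd was killed"`.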
On the affected system I'm able to reproduce the issue just by entering and exiting steam, not even attempting to download any update:
root@18229210b028:/opt/steamcmd# ./steamcmd.sh
Redirecting stderr to '/root/Steam/logs/stderr.txt'
Looks like steam didn't shutdown cleanly, scheduling immediate update check
[ 0%] Checking for available updates...
[----] Verifying installation...
Steam Console Client (c) Valve Corporation
-- type 'quit' to exit --
Loading Steam API...OK
Steam>quit
Work thread 'CJobMgr::m_WorkThreadPool:1' is marked exited, but we could not immediately join prior to deleting -- proceeding without join
Terminating m_ThreadClient, likely to crash down the line... but avoiding hang on exit
./steamcmd.sh: line 38: 1712 Killed $DEBUGGER "$STEAMEXE" "$@"
Doing the same on another system works just fine:
root@513e0c7d8b04:/opt/steamcmd# ./steamcmd.sh
Redirecting stderr to '/root/Steam/logs/stderr.txt'
[ 0%] Checking for available updates...
[----] Verifying installation...
Steam Console Client (c) Valve Corporation
-- type 'quit' to exit --
Loading Steam API...OK
Steam>quit
root@513e0c7d8b04:/opt/steamcmd#
This leads me to believe that there is currently a bug in Steam that depends on the kernel version it is running on. I would assume that Synology NAS appliances are rather conservative with their kernel updates and don't update too often, which might result in the same behavior? This is all just speculation, however. I would need several people to post their container startup logs, and their kernel versions would be interesting too. I'll actually add some debug output to the container so we'll have that automatically in the future.
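To make the requested details easier to collect, here is a small sketch of a report script to run on the host over SSH. The function names are made up for illustration; /etc/VERSION is the Synology file shown elsewhere in this thread and is skipped when absent.

```sh
# Reduce a kernel release string such as "4.4.180+" to "major.minor".
kernel_series() {
    printf '%s\n' "$1" | sed 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/'
}

# Print kernel, DSM and Docker versions. Each step is optional, so a
# missing file or binary does not abort the report.
report_host() {
    echo "kernel release: $(uname -r)"
    echo "kernel series:  $(kernel_series "$(uname -r)")"
    [ -r /etc/VERSION ] && grep -E '^(productversion|buildnumber)=' /etc/VERSION
    command -v docker >/dev/null 2>&1 && docker -v
    return 0
}
```

Calling `report_host` and pasting its output alongside the container startup log should cover what is being asked for here.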
I also opened https://github.com/ValveSoftware/steam-for-linux/issues/8083 to see if maybe this is a known issue.
Run from scratch (deleted the container, image, pulled fresh from dockerhub. I'd need to ssh over to pull from ghcr)
On start CPU usage briefly jumped to 22%, now hovers under 1%. 81 MB RAM. From what I remember, on my 1 year old config it was ~20%+ CPU and 1-2 GB on idle.
Log: lloesche-valheim-server.csv
I let it run for few min and it became quite unresponsive (CPU and RAM didn't jump, the Syno UI still worked but I was unable to stop the container normally, start a bash terminal inside the container... SSH asked for password and also hung). Force kill resulted in msg similar to "Docker's API interface crashed".
Does this post help? https://old.reddit.com/r/synology/comments/cn9qnd/what_distribution_of_linux_is_synology_using/
From SSH (I'm an "admin" according to Synology UI but I'm not sitting on root account) I don't see mention of Debian anywhere. "toster" is the NAS's name so "Linux toster" won't mean anything to anybody.
eyescream@toster:~$ uname -a
Linux toster 3.10.105 #25556 SMP Sat Aug 28 02:13:34 CST 2021 x86_64 GNU/Linux synology_braswell_916+
eyescream@toster:~$ cat /etc/VERSION
majorversion="6"
minorversion="2"
major="6"
minor="2"
micro="4"
productversion="6.2.4"
buildphase="GM"
buildnumber="25556"
smallfixnumber="2"
nano="0"
base="25556"
builddate="2021/08/28"
buildtime="14:40:29"
I started the container for few min, took a snapshot and stopped it. I can then explore the snapshot's filesystem without impacting the container / letting it freeze the system again.
docker ps
docker commit d395d3d6293f mysnapshot
docker run -t -i mysnapshot /bin/bash
The Steam/logs/stderr.txt:
root@d716c7c91549:/# cat /home/valheim/Steam/logs/stderr.txt
src/tier0/threadtools.cpp (4071) : Assertion Failed: Probably deadlock or failure waiting for thread to initialize.
crash_20210919074534_5.dmp[78]: Uploading dump (out-of-process)
/tmp/dumps/crash_20210919074534_5.dmp
CWorkThreadPool::StartWorkThread: Thread creation failed.
crash_20210919074534_5.dmp[78]: Finished uploading minidump (out-of-process): success = yes
crash_20210919074534_5.dmp[78]: response: Discarded=1
crash_20210919074534_5.dmp[78]: file ''/tmp/dumps/crash_20210919074534_5.dmp'', upload yes: ''Discarded=1''
The dump file is a weird, partly binary thing and my Linux skills end here, sorry. How do I get the file from Windows 10 cmd -> ssh -> sudo su -> container snapshot exploration back to Windows... I'll paste what I can here; if you actually need the binary form I'll need some tips :)
Message me if you need more info. I am willing to help but I'd rather not crash the whole Synology unit again. I can even let you debug it, since my set-up has been redone since yesterday and the only thing I started running was your Valheim server before I stopped.
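One possible way to get the binary dump out without starting the image again is `docker create` followed by `docker cp`, which copies files from a container that was created but never run. A sketch only: the function name and the temporary container name dump-extract are invented for illustration, and it assumes the dumps were captured under /tmp/dumps in the mysnapshot image as the stderr log suggests.

```sh
# DOCKER can be overridden for a dry run, e.g. DOCKER=echo.
DOCKER="${DOCKER:-docker}"

# Copy /tmp/dumps out of an image (e.g. the "mysnapshot" commit above)
# without starting it, so steamcmd never runs.
extract_dumps() {
    image="$1"
    dest="$2"
    "$DOCKER" create --name dump-extract "$image" >/dev/null || return 1
    "$DOCKER" cp dump-extract:/tmp/dumps "$dest"
    status=$?
    # Remove the temporary container whether or not the copy worked.
    "$DOCKER" rm dump-extract >/dev/null
    return "$status"
}
```

For example, `extract_dumps mysnapshot ./dumps` on the NAS. From there the file could be fetched from the Windows side with scp, which ships with recent Windows 10 builds; the user, host and path depend on the setup.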
$ uname -a
Linux DSM01 4.4.180+ #41890 SMP Thu Jul 15 03:43:42 CST 2021 x86_64 GNU/Linux synology_apollolake_218+
$ cat /etc/VERSION
majorversion="7"
minorversion="0"
major="7"
minor="0"
micro="0"
productversion="7.0"
buildphase="GM"
buildnumber="41890"
smallfixnumber="0"
nano="0"
base="41890"
$ docker -v
Docker version 20.10.3, build b455053
@tomekduda @kraaijmakers thank you, I have updated the issue at Valve Software with the information you provided. The issue is very likely a combination of Kernel version plus libc or sdl libraries used in the Docker container with a recent change in steamcmd.
I just had a thought related to this. This Docker container uses debian:stable-slim as its base image. Debian released version 11 (Bullseye), which supersedes 10 (Buster), in August. Even though there were no changes to the container itself other than README updates in the past couple of months, the debian:stable-slim tag means that even a README update would have caused a rebuild against the new version of Debian. Maybe steamcmd is not compatible with the latest version of Debian stable, or rather with the libraries used in it. Maybe it's even a combination of new Debian and old kernel, or something like that, causing the issue.
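A quick way to see which Debian release a given image was actually built on is to read /etc/debian_version from it. The sketch below uses hypothetical function names; overriding the entrypoint with cat means the image's normal startup (and therefore steamcmd) is not run.

```sh
# Reduce the contents of /etc/debian_version (e.g. "11.0") to the major number.
debian_major() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Print the Debian major version an image was built on.
# --entrypoint cat replaces the image's own startup command.
image_debian_major() {
    debian_major "$(docker run --rm --entrypoint cat "$1" /etc/debian_version)"
}
```

For example, `image_debian_major lloesche/valheim-server:latest` would be expected to print 11 for a Bullseye-based build and 10 for a Buster-based one.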
I will see if I can downgrade to an older version of Debian and if that helps with the issue.
Ok, I have downgraded the Docker container from Debian stable 11.0 to 10.0. Can you see if that makes any difference?
FIXED! Great work time to get hunting again! For the record I nuked the container and image effectively starting from scratch but pointing at my old game data dirs.
Looks like it's solved. I also had to nuke, download, configure from scratch. Upgrade in place (as described in readme) didn't help, but that might be Synology caching playing tricks.
The CPU and memory usage are "more like it", 25-30% and 1.7-2.0 GB on idle.
Log (don't ask me why it's sorted by time descending): lloesche-valheim-server1.csv
I'll leave it running for few min, see if it freezes the system.
Sounds good! Would be great if @jaytark who originally reported the bug could validate if it is fixed in their system as well before we close the issue.
I also updated the issue at Valve with our findings. Should be easy for them to reproduce.
Are you guys running it through ghcr.io or Docker Hub? The ghcr.io image still freezes my Synology NAS. Gonna try the Docker Hub one now.
I'll be able to try the new fix later this evening and will report the results on this thread.
I only created an account to say thank you for the fast fix. My Synology DS920+ froze before and now works as it used to. I recreated the container from scratch, too. But the saved world works, so nothing is missing. I used the image from ghcr.io.
Server running in idle now for 9 hours. Steam Update every hour.
Hmm seems to work for now, when I did the ghcr.io image from the command line it somehow didn't redownload it I think. Via the UI of Synology it did. Thanks guys for the fix. and @lloesche thanks for the support on the container.
@tomekduda I was asked in the Valve issue if it's possible for you to provide the dmp binary file. They would need that for further debugging.
Bugger. I'm happy to help but I deleted the container's snapshot in the meantime and now of course I'm on your latest version. Can you make a broken build with different tag, do-not-use, debug or similar name? I could try forking your repo but I've never built anything on Docker
@tomekduda oh sure, the GitHub Container Registry keeps all versions:
docker pull ghcr.io/lloesche/valheim-server@sha256:2cc61cb267192d34c73526e7377a6cc7c3eefd843916af81047305ada14721d2
this is the build from 3 days ago. It will have the broken behaviour.
I'm pleased to confirm the current change fixed the issue for me. I'm able to enter the server's instance now. Will try and report of any issues with saves, etc as friends join and play, but I suspect there won't be any. Thanks @lloesche for the support and fine work on this matter and others who supplied supporting information.
This wasn't just docker images I don't believe. I was trying this image on Saturday (for the first time ever) running on a CentOS7 machine and I had the same issue. I manually downloaded steamcmd in the container and it would result in an "Assertion Failed: Probably deadlock or failure waiting for thread to initialize" error when I tried to run steamcmd alone. I could run the same outside the container on centos without issue.
I just tried the new container and it starts up with no issue.
I can confirm the issue before this fix on kubernetes on GKE. Using the latest docker image did not fix the valheim-updater issue. Will try to wipe the config and try again.
Edit: wiping out the persistent volume claim for /opt/valheim fixed the issue for me.
I can confirm that the image with downgraded Debian works fine for me in a Synology environment.
Thanks @lloesche!
Had a friend try last night, and it still works well. So far, it's doing good.
Thanks everyone for testing and confirming the downgrade to Debian 10 helped. I'm keeping an eye on the Valve thread but seems this is a general issue with Debian 11 libc and older Linux kernels.
https://github.com/lloesche/valheim-server-docker/pull/508 broke the image for Synology.
Had to use valheim-server@sha256:17ba123cddda6af6408e407d24c9e0150db82534feae9fea3720bfe7e3aeff5a
Good afternoon. Trying to do a fresh install and getting full system freeze on Synology DSM 7.1 similar to what was mentioned last year.
Use lloesche/valheim-server:debian10 instead, kernel/docker version problem as far as I understand.
After making a valheim container using debian10, it runs, CPU hovers around 12%, and RAM at ~3.5GB, so I assume it's running as expected. However, when I try connecting from the Valheim Client, it spins at "Connecting" for a while, before throwing the error "Failed to connect".
I don't think it's a networking issue, because if I run a Valheim server on my PC itself (forwarding to my PC instead of the Synology NAS hosting the container) I'm able to get into my Valheim world.
Would anyone know what the cause could be?
@Pizzaboy140 try getting there from Steam's list of servers. It didn't want to work for me from the in-game connector or a desktop shortcut ("C:\Program Files (x86)\Steam\steam.exe" -applaunch 892970 +connect IP.GOES.HE.RE:2456 +password hunter2) but after I tried from the Steam client's View -> Servers, something clicked.
Hmm, when I try that, it claims "Server is not responding". I am a docker newbie, but it seems odd that it seems like the world keeps getting rebuilt.
In case it means anything to anyone, please see the following for html of my container's logs: valheim-server.txt *html converted to .txt
It looks like the Server does "connect" in the logs, but there's probably something else going wrong that I can't figure out...
welp, figured it out. I had a firewall up on my NAS itself. I was letting in TCP ports 2456-2458.
However, Valheim uses UDP.
Once I made a UDP port rule, I could connect. Hence why the game attempted to connect, but got stopped...
FYI in case this saves others hours of troubleshooting hehe.
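A related check on the Docker side is whether the container's ports are published as UDP at all. This is a rough sketch with a made-up function name; it only inspects the output of `docker port` and says nothing about the NAS firewall, which has to allow UDP separately, as described above.

```sh
# Succeeds if `docker port` output (read from stdin) shows the given
# container port published over UDP, e.g. "2456/udp -> 0.0.0.0:2456".
publishes_udp() {
    grep -q "^$1/udp -> "
}
```

Possible usage, assuming a container named valheim-server: `docker port valheim-server | publishes_udp 2456 || echo "2456/udp is not published"`.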
I just tried to set up a new Valheim server on my Synology NAS (218+) and had some problems downloading Valheim to the NAS. It started and after a few MB download it stopped.
Found this thread here and tried the debian10 image, unfortunately I still have the same problem, just a little bit later:
Anything I am doing wrong here? ;)
sudo docker run -d --name valheim-server --cap-add=sys_nice --stop-timeout 120 -p 2456-2457:2456-2457/udp -v /volume1/docker/valheim/config:/config -v /volume1/docker/valheim/data:/opt/valheim -e TZ="Europe/Berlin" -e SERVER_NAME="" -e WORLD_NAME="" -e SERVER_PASS="" -e VALHEIM_PLUS=true --restart=always lloesche/valheim-server:debian10
Ok, just restarted the container and this time it worked.. Nevermind :D
Hi,
Since the new updated container was released, I can no longer run the container. In fact, it sometimes hoses my server and I have to restart the server. I had to disable the ability to auto restart due to this. It was working beautifully before the recent container update, even on Hearth & Home client. I backed up the db and world files to another drive, and just deleted everything related to the docker image and reinstalled - same results.
I've even uninstalled Docker, deleted all associated files and directories, reinstalled docker, and reinstalled/configured the latest Valheim container... still no success and same error. Again, everything worked before the new container.
More Info: On Synology, I'm pulling from Docker Hub (as I always have) and pulling 'latest'. I'm not sure of a way to pull the old image back so we can continue to play as before. It doesn't give an option to pull a previous version.
**Update: Thumbing through real-time logs on Synology Docker, I came across this line: "valheim-updater ERROR - Failed to download Valheim server from Steam - retrying later - check your networking and volume access permissions"**
I've added a firewall rule to Synology for ports 2456 and 2457, although I've never needed to do that before... Problem still persists. Any help or advice would be greatly appreciated. Thanks in advance.