Open ckupe opened 2 months ago
Gonna start by saying my issue is similar but maybe not the same: it is also with Enshrouded, but in an Ubuntu-based container (the host is Manjaro, which is Arch based).
My setup was working fine a couple of days ago, and updating to the newest available version got me here. I'm using mornedhels/enshrouded-server(:stable-proton) and I spent pretty much the past 2 days debugging everything else (server config, firewall, etc.). I use podlet to generate systemd services. I did switch to the new Enshrouded role system, but reverting it doesn't seem to fix the problem either.
podlet --description enshrouded --file enshrouded.container --install --wanted-by multi-user.target --wanted-by default.target podman run --name=enshrouded \
--secret enshrouded-boot-sh,type=mount,target=/scripts/boot.sh,uid=1051,mode=0700 \
--secret enshrouded-post-update-sh,type=mount,target=/scripts/post-update.sh,uid=1051,mode=0700 \
--secret enshrouded-server-password,type=env,target=SERVER_PASSWORD \
--secret enshrouded-server-role0-password,type=env,target=SERVER_ROLE_0_PASSWORD \
--secret enshrouded-server-role1-password,type=env,target=SERVER_ROLE_1_PASSWORD \
--secret enshrouded-server-role2-password,type=env,target=SERVER_ROLE_2_PASSWORD \
-e PUID=1051 \
-e PGID=65537 \
-e SERVER_NAME="/redacted/" \
-e SERVER_SLOT_COUNT=5 \
-e SERVER_QUERYPORT=15637 \
-e SERVER_IP="0.0.0.0" \
-e UPDATE_CRON="31 */2 * * *" \
-e UPDATE_CHECK_PLAYERS=true \
-e BACKUP_CRON="0 */2 * * *" \
-e BACKUP_MAX_COUNT=24 \
-e GAME_BRANCH="public" \
-e STEAMCMD_ARGS="validate" \
-e SERVER_SAVE_DIR="/workdir/savegame" \
-e SERVER_LOG_DIR="/workdir/logs" \
-e BACKUP_DIR="/workdir/backups" \
-e BOOTSTRAP_HOOK=/scripts/boot.sh \
-e UPDATE_POST_HOOK=/scripts/post-update.sh \
-e SERVER_ROLE_0_NAME=Admin \
-e SERVER_ROLE_0_CAN_KICK_BAN=true \
-e SERVER_ROLE_0_CAN_ACCESS_INVENTORIES=true \
-e SERVER_ROLE_0_CAN_EDIT_BASE=true \
-e SERVER_ROLE_0_CAN_EXTEND_BASE=true \
-e SERVER_ROLE_0_RESERVED_SLOTS=1 \
-e SERVER_ROLE_1_NAME=Friend \
-e SERVER_ROLE_1_CAN_KICK_BAN=false \
-e SERVER_ROLE_1_CAN_ACCESS_INVENTORIES=true \
-e SERVER_ROLE_1_CAN_EDIT_BASE=true \
-e SERVER_ROLE_1_CAN_EXTEND_BASE=true \
-e SERVER_ROLE_1_RESERVED_SLOTS=3 \
-e SERVER_ROLE_2_NAME=Guest \
-e SERVER_ROLE_2_CAN_KICK_BAN=false \
-e SERVER_ROLE_2_CAN_ACCESS_INVENTORIES=false \
-e SERVER_ROLE_2_CAN_EDIT_BASE=false \
-e SERVER_ROLE_2_CAN_EXTEND_BASE=false \
-e SERVER_ROLE_2_RESERVED_SLOTS=0 \
-p 15637:15637/udp \
-v /poddata/enshrouded/workdir:/workdir \
-v /poddata/enshrouded/game:/opt/enshrouded \
--label "io.containers.autoupdate=registry" \
--restart=always \
docker.io/mornedhels/enshrouded-server:stable-proton
Server says UP at some point but is unreachable and everything just hangs. This is htop:
The CPU is an AMD Ryzen 9 7950X and everything just flies to max.
My htop looks identical. My CPUs are Intel based (12th gen and 13th gen).
I did not test Manjaro/Arch-based distros; I wonder what similarities exist between Arch/Manjaro and Fedora that would cause this bug to occur.
The underlying distro should not be breaking or changing how the container functions.
Host Distribution: Fedora 40 Workstation and Server
Linux: 6.10.4-200.fc40.x86_64
Podman: 5.2.0 (podman-info.txt)
Proton Version:
Game used: Enshrouded Dedicated Server (version fd563a0389c99a6ba9ec59b8a233fe9df17e892d (master))
Container Dockerfile: https://github.com/steamutils/runner
Environment Config: https://github.com/steamutils/apps/tree/main/enshrouded
Issue:
Running Proton natively on the host for this game's dedicated server works fine, but in a container it hangs completely and uses 100% CPU, spawning tons of child processes. Troubleshooting has narrowed the root cause to Fedora Workstation/Server 40 and isolated it to Proton, not Wine.
On a fresh install of Fedora 40 from ISO (which ships with Linux 6.8.5) the container works perfectly fine. Certain updates to the distro and kernel break containerized Proton functionality. The same Linux kernel versions on different distros do not exhibit this behavior.
How to recreate:
Install Fedora 40 Workstation and apply all the latest updates.
Run the standardized container which has steamcmd and proton installed.
podman logs -f enshrouded
will show steamcmd updating and downloading the dedicated server files, but when it comes to running the dedicated server through Proton in a container specifically, the process hangs and tons of threads/child processes spawn. CPU utilization sits at 100%.
This shows Proton running natively on this host (left) as well as in a container (right) and compares the Proton logs of the same binaries side by side at the point where they break.
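To put a number on the thread and child-process explosion described above, something like the following could be run on the host while the container is hanging. This is only a sketch: the function name `sum_threads_by_comm` is hypothetical, and it assumes a procps-style `ps` that supports the `nlwp` (thread count) and `comm` columns. Container processes are visible to `ps` on the host, so no `podman` call is needed.

```shell
# Hypothetical helper: reads "NLWP COMMAND" lines on stdin and prints
# "total_threads<TAB>process_count<TAB>command", busiest command first.
sum_threads_by_comm() {
    awk '
        {
            n = $1
            $1 = ""                 # drop the thread count, keep the command name
            sub(/^ +/, "")
            threads[$0] += n
            procs[$0]   += 1
        }
        END {
            for (c in threads) printf "%d\t%d\t%s\n", threads[c], procs[c], c
        }
    ' | sort -rn
}

# Usage on the affected host (the "=" suffixes suppress the ps header line):
#   ps -eo nlwp=,comm= | sum_threads_by_comm | head
```

Running it once on the native setup and once with the container gives two lists that can be compared directly, instead of eyeballing htop.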
Troubleshooting attempted:
wine and WINEPREFIX=... using the same wine binaries in the container run perfectly fine and do not hang or cause these issues. This suggests an issue with Proton.
Notes: Fedora, being an Enterprise Linux upstream, has opinionated configurations and hardening that eventually impact downstream derivatives such as CentOS Stream and Red Hat Enterprise Linux. I suspect a distro-specific hardening configuration or subpackage changed which capabilities are exposed to containers, and that this somehow breaks what Proton is doing.
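One way to check the suspicion about container capabilities would be to compare the capability and seccomp fields the kernel reports for a process inside the container on a working host and on an affected host. The sketch below is an assumption-laden starting point, not a confirmed diagnosis: `sec_fields` is a hypothetical helper name, the container name `enshrouded` is taken from the `podman logs` command above, and it assumes the image ships `cat`.

```shell
# Hypothetical helper: filter /proc/<pid>/status-formatted text on stdin
# down to the capability sets, the no_new_privs flag and the seccomp mode.
sec_fields() {
    grep -E '^(Cap(Inh|Prm|Eff|Bnd|Amb)|NoNewPrivs|Seccomp):'
}

# Usage, run on both the working and the affected host:
#   podman exec enshrouded cat /proc/self/status | sec_fields > container-sec.txt
# Then diff the two container-sec.txt files.
```

If the two files are identical, the difference is more likely in kernel behavior or sysctl settings than in what capabilities the container is given.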
Attached is the output of sysctl -a > log, showing all sysctl variables configured on this host system: sysctl.variables.txt
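A dump like the attached one is most useful when compared against the same dump from a host where the container works, such as the fresh Fedora 40 install mentioned earlier. A minimal sketch of that comparison follows. The function name `sysctl_diff` and the file names are hypothetical, and it assumes one `key = value` pair per line, which holds for most but not all sysctl entries.

```shell
# Hypothetical helper: compare two `sysctl -a` dumps and list keys that
# changed or exist in only one of them. Assumes a non-empty first file.
sysctl_diff() {
    awk -F' = ' '
        NR == FNR { first[$1] = $2; next }      # first file: remember every key
        {
            seen[$1] = 1
            if (!($1 in first)) {
                printf "only-in-second\t%s = %s\n", $1, $2
            } else if ((first[$1] "") != ($2 "")) {   # force string comparison
                printf "changed\t%s: %s -> %s\n", $1, first[$1], $2
            }
        }
        END {
            for (k in first)
                if (!(k in seen)) printf "only-in-first\t%s = %s\n", k, first[k]
        }
    ' "$1" "$2" | sort
}

# Usage:
#   sysctl -a > sysctl.working.txt     # on the freshly installed host
#   sysctl -a > sysctl.broken.txt      # on the fully updated host
#   sysctl_diff sysctl.working.txt sysctl.broken.txt
```

Many differences will be noise (counters, random seeds, hostnames), so the interesting lines are likely to be the `kernel.*`, `user.*` and `vm.*` keys that changed between the two hosts.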