Closed bhechinger closed 1 year ago
Are you missing a d here? mountPath: /arkserver/ShooterGame/Save
I know this name is awkward, but that's how it is named for ARK.
Ah, no, sorry. Copy-pasta error. The d is there.
edit: Plus, that would be a "not found" error if that were the case I believe.
Found a bug... will fix it shortly.
https://github.com/SickHub/arkserver/blob/7e0d3ed5e94db95c5627ce64db4c58ace3a328f7/run.sh#L34
sudo is missing.
Ah ha! I look forward to the fixed version.
While I've got your attention I had a quick question. What is the reasoning behind setting the replicas to 0 on chart deployment? Does it harm things to have it set that to 1?
Done. Please try the new image, and if it works, you can close the issue.
Thanks for raising this issue! This was also part of #22 I believe.
It's currently downloading the bits from steam, so looking good. Thanks!
One thing I see in the logs that isn't a show stopper but may be an issue is this:
touch: cannot touch '/arkserver/.startAfterUpdate-main': Permission denied
Not sure how important that is.
New issue. This happens after it's fetched everything.
2023-01-29 15:54:06: start
2023-01-29 15:54:06: Running /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer TheIsland\?ServerPassword=password\?GameModIds\?MaxPlayers=10\?RCONEnabled=True\?ServerAdminPassword=password\?AltSaveDirectoryName=SavedArks\?SessionName=TheIsland\?QueryPort=27015\?Port=7777\?RCONPort=32330\?GameModIds\?listen -clusterid=arkcluster -log
/usr/local/bin/arkmanager: line 1333: /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer: No such file or directory
2023-01-29 15:54:06: Server PID: 212
2023-01-29 15:54:11: Bad PID ''; expected '212'
2023-01-29 15:54:11: exited with status 0
The log lines from the end of fetching (the container restarted) are:
Update state (0x61) downloading, progress: 89.92 (16928167224 / 18825081206)
Update state (0x61) downloading, progress: 90.27 (16994272563 / 18825081206)
Update state (0x61) downloading, progRunning command 'broadcast' for instance 'main'
[ WARN ] Your ARK server exec could not be found.
Error connecting to server: Connection refused at -e line 33.
This is how it should look:
[...]
Update state (0x61) downloading, progress: 33.22 (1456428304 / 4383962465)
Update state (0x61) downloading, progress: 35.93 (1575021540 / 4383962465)
Update state (0x41) staging, progress: 38.70 (1696640487 / 4383962465)
Update state (0x41) staging, progress: 41.74 (1829909093 / 4383962465)
Update state (0x41) staging, progress: 44.27 (1940625932 / 4383962465)
Update state (0x41) staging, progress: 49.48 (2169270810 / 4383962465)
Update state (0x41) staging, progress: 55.61 (2437708203 / 4383962465)
Update state (0x41) staging, progress: 63.57 (2786884011 / 4383962465)
Update state (0x41) staging, progress: 68.66 (3010230699 / 4383962465)
Update state (0x41) staging, progress: 77.20 (3384572331 / 4383962465)
Update state (0x41) staging, progress: 88.16 (3864998351 / 4383962465)
Update state (0x41) staging, progress: 98.29 (4309092820 / 4383962465)
Update state (0x81) verifying update, progress: 0.30 (13181278 / 4383962465)
Update state (0x81) verifying update, progress: 8.09 (354592187 / 4383962465)
Update state (0x81) verifying update, progress: 14.63 (641493812 / 4383962465)
Update state (0x81) verifying update, progress: 21.70 (951378973 / 4383962465)
Update state (0x81) verifying update, progress: 27.89 (1222749036 / 4383962465)
Update state (0x81) verifying update, progress: 34.68 (1520347766 / 4383962465)
Update state (0x81) verifying update, progress: 40.59 (1779305051 / 4383962465)
Update state (0x81) verifying update, progress: 47.20 (2069445133 / 4383962465)
Update state (0x81) verifying update, progress: 49.23 (2158197401 / 4383962465)
Update state (0x81) verifying update, progress: 54.10 (2371842395 / 4383962465)
Update state (0x81) verifying update, progress: 63.62 (2789095178 / 4383962465)
Update state (0x81) verifying update, progress: 77.40 (3393074954 / 4383962465)
Update state (0x81) verifying update, progress: 84.67 (3711842058 / 4383962465)
Update state (0x81) verifying update, progress: 84.79 (3717084938 / 4383962465)
Update state (0x81) verifying update, progress: 93.69 (4107520274 / 4383962465)
Update state (0x101) committing, progress: 0.00 (0 / 4383962465)
Update state (0x101) committing, progress: 0.00 (0 / 4383962465)
Update state (0x101) committing, progress: 100.00 (4383962465 / 4383962465)
Success! App '376030' fully installed.
Update to 10405504 complete
The server is starting...
Hmm, I wonder what's gone wrong then. I'll try re-deploying it.
Nope, same behavior. It's stuck in an infinite loop of running the downloader/installer which fails with:
Update state (0x61) downloading, progress: 88.65 (16688350988 / 18825081206)
Update state (0x61) downloading, progress: 89.09 (16771282187 / Running command 'broadcast' for instance 'main'
[ WARN ] Your ARK server exec could not be found.
Error connecting to server: Connection refused at -e line 33.
The pod restarts and it fails with:
➜ kubectl -n ark logs -f ark-test-ark-cluster-theisland-577947495b-zmdg8
###########################################################################
# Ark Server - Sun Jan 29 16:54:50 UTC 2023
###########################################################################
Ensuring correct permissions...
Shared server files in /arkserver...
Shared clusters files in /arkserver/ShooterGame/Saved/clusters...
Cleaning up any leftover arkmanager files...
Creating arkmanager.cfg from environment variables...
Creating crontab...
Starting cron service...
* Starting periodic command scheduler cron
...done.
Loading crontab...
Save file validation is not enabled.
Backup on start is not enabled.
Running command 'start' for instance 'main'
[ WARN ] Your ARK server exec could not be found.
touch: cannot touch '/arkserver/.startAfterUpdate-main': Permission denied
Checking for updates before starting
Checking for update; PID: 47
sed: can't read /arkserver/steamapps/appmanifest_376030.acf: No such file or directory
The server is already stopped
Performing ARK updateExecuting /usr/games/steamcmd +@NoPromptForPassword 1 +force_install_dir /arkserver +login anonymous +app_update 376030 +quit
Redirecting stderr to '/home/steam/.local/share/Steam/logs/stderr.txt'
[ 0%] Checking for available updates...
[----] Verifying installation...
Steam Console Client (c) Valve Corporation - version 1669935972
-- type 'quit' to exit --
Loading Steam API...OK
"@NoPromptForPassword" = "1"
Connecting anonymously to Steam Public...OK
Waiting for client config...OK
Waiting for user info...OK
Update state (0x3) reconfiguring, progress: 0.00 (0 / 0)
Update state (0x3) reconfiguring, progress: 0.00 (0 / 0)
Update state (0x3) reconfiguring, progress: 0.00 (0 / 0)
Update state (0x5) verifying install, progress: 29.24 (5503812107 / 18825081206)
Error! App '376030' state is 0x202 after update job.
Update to complete
The server is starting...
2023-01-29 16:55:28: start
2023-01-29 16:55:28: Running /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer TheIsland\?ServerPassword=password\?GameModIds\?MaxPlayers=10\?RCONEnabled=True\?ServerAdminPassword=password\?AltSaveDirectoryName=SavedArks\?SessionName=TheIsland\?QueryPort=27015\?Port=7777\?RCONPort=32330\?GameModIds\?listen -clusterid=arkcluster -log
/usr/local/bin/arkmanager: line 1333: /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer: No such file or directory
2023-01-29 16:55:28: Server PID: 211
2023-01-29 16:55:33: Bad PID ''; expected '211'
2023-01-29 16:55:33: exited with status 0
Restarts and goes back to the installer which fails, etc.
This is my values file: https://github.com/bhechinger/argo-helm-wrappers/blob/1a84b97fd25c5135d4be22fdd0cb8dbfe5c4fb75/charts/ark/values-raw.yaml
It never finishes downloading. Could a probe be restarting it because it's taking too long?
Yes, that could be, depending on the download speed. Try increasing the startupProbe initialDelaySeconds. kubectl get events -w should also show whether that is the reason it's being aborted.
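As a rough way to judge whether the probe is the culprit, the download time can be compared with the time the startup probe tolerates. The sketch below is only an illustration: the 10 MB/s rate and the probe settings (600 s delay, 10 s period, 3 failures) are assumptions, not values from the chart. Only the total size comes from the steamcmd log in this thread. A startup probe gives the container roughly initialDelaySeconds + periodSeconds * failureThreshold before it is restarted.

```shell
# Seconds needed to download size_bytes at rate_bytes_per_sec, rounded up.
estimate_download_seconds() {
  size_bytes=$1
  rate_bytes_per_sec=$2
  echo $(( (size_bytes + rate_bytes_per_sec - 1) / rate_bytes_per_sec ))
}

# Approximate time a startupProbe allows before the container is restarted.
probe_budget_seconds() {
  initial_delay=$1
  period=$2
  failure_threshold=$3
  echo $(( initial_delay + period * failure_threshold ))
}

# 18825081206 is the total size from the log above; 10 MB/s is an assumed rate.
needed=$(estimate_download_seconds 18825081206 10000000)
# Hypothetical probe settings, not taken from the chart defaults.
budget=$(probe_budget_seconds 600 10 3)
echo "download needs ~${needed}s, probe allows ~${budget}s"
```

If the first number is larger than the second, the pod will be restarted mid-download, which would match the truncated log lines above.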
Well, that definitely helped with it not completing the download, however, it's still not quite working.
Update state (0x101) committing, progress: 99.34 (18700301218 / 18825081206)
Success! App '376030' fully installed.
Update to complete
The server is starting...
2023-01-29 20:44:54: start
2023-01-29 20:44:54: Running /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer TheIsland\?ServerPassword=password\?GameModIds\?MaxPlayers=10\?RCONEnabled=True\?ServerAdminPassword=password\?AltSaveDirectoryName=SavedArks\?SessionName=TheIsland\?QueryPort=27015\?Port=7777\?RCONPort=32330\?GameModIds\?listen -clusterid=arkcluster -log
/usr/local/bin/arkmanager: line 1333: /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer: No such file or directory
2023-01-29 20:44:54: Server PID: 266
2023-01-29 20:44:59: Bad PID ''; expected '266'
2023-01-29 20:44:59: exited with status 0
Then we're back to the same infinite loop as before.
While the server is starting up/updating, can you open a shell in the container and check the filesystem? The file /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer must be there after the update is complete.
Also check if the permissions of /arkserver are correct (steam:steam).
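The two checks above could be run from a shell inside the container roughly like this. It is only a sketch: check_owner is a hypothetical helper, stat -c is the GNU coreutils form (assumed to be available in the image), and the paths are the ones quoted in this thread.

```shell
# Hypothetical helper: compare a path's owner:group with an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the path cannot be inspected.
check_owner() {
  path=$1
  expected=$2
  actual=$(stat -c '%U:%G' "$path") || return 2
  if [ "$actual" = "$expected" ]; then
    echo "ok: $path is $actual"
  else
    echo "mismatch: $path is $actual, expected $expected"
    return 1
  fi
}

# Example use inside the container (paths taken from this thread):
#   check_owner /arkserver steam:steam
#   ls -l /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer
```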
/arkserver is root:root
/arkserver/ShooterGame is steam:steam as is everything under it.
It restarted as I was waiting to see what would happen. 10 minutes wasn't enough to finish downloading. :(
I've set it to 30 minutes but I won't get back to this until the morning.
Thanks for all your help!
Actually got a chance to watch this before tomorrow. It's never creating /arkserver/ShooterGame/Binaries/Linux/ShooterGameServer
What I can offer is that we look at it together. Half an hour spent in a meeting might lead to a faster solution than chatting over days. Poke me on slack -at- drsick.net to get an invite to my Slack space or to set up a meeting directly.
We found that the root folder of the mounted PVC (ceph) needs to be explicitly mounted as the steam user:
securityContext:
  fsGroup: 1000
I actually cheated on that when I was using ceph and used a pod to mount as root and then explicitly chown to 1000:1000. It's odd that fsGroup didn't work for me, but I suspect that was down to the deployment stack in my case.
Trying to deploy this into my kubernetes cluster I get this error at pod startup:
I have this in values.yaml:
edit: added the d which is actually there and got lost in the copy/paste.
I have been completely unable to figure out why this is happening or how to fix it. :(