Acekorneya / Ark-Survival-Ascended-Server

Ark Survival Ascended Server Docker Image for running a server on Linux
MIT License

Restart cycle freezes debian #65

Open locint opened 2 months ago

locint commented 2 months ago

I ran into something today, and not for the first time, but this time I decided to look into the system logs.

The restart cycle somehow manages to cause my debian home server to freeze. I've been running a cluster of three servers without any issues.

Memory usage is around 30GB/64GB but apparently I ran out of memory.

CPU: 5600g

So far I can tell that there was also a lot of restart spam in the logs for several hours.

Sep 18 14:18:58 debian-homeserver kernel: [351120.866164] [1717124] 1000 1717124 1062 635 53248 0 0 restart_server.

Snapshot of the logs moments before the system freezes:

Sep 18 15:41:15 debian-homeserver kernel: [356057.516154] [2170526] 1000 2170526 1062 146 45056 1 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516158] [2170527] 1000 2170527 1 1 12288 0 0 restart_server.
Sep 18 15:41:15 debian-homeserver kernel: [356057.516162] [2170528] 1000 2170528 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516167] [2170529] 1000 2170529 1062 146 49152 1 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516171] [2170530] 1000 2170530 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516175] [2170531] 1000 2170531 1062 146 53248 1 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516179] [2170532] 1000 2170532 1062 147 45056 1 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516183] [2170533] 1000 2170533 1062 147 45056 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516187] [2170534] 1000 2170534 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516191] [2170535] 1000 2170535 1062 148 45056 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516195] [2170536] 1000 2170536 1 1 12288 0 0 monitor_ark_ser
Sep 18 15:41:15 debian-homeserver kernel: [356057.516197] [2170537] 1000 2170537 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516200] [2170538] 1000 2170538 1062 148 49152 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516202] [2170539] 1000 2170539 1062 147 49152 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516204] [2170540] 1000 2170540 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516207] [2170541] 1000 2170541 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516209] [2170542] 1000 2170542 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516211] [2170543] 1000 2170543 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516213] [2170544] 1000 2170544 1062 148 49152 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516215] [2170545] 1000 2170545 1062 145 45056 2 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516218] [2170546] 1000 2170546 1062 146 45056 1 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516220] [2170547] 1000 2170547 1062 148 53248 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516222] [2170548] 1000 2170548 1062 147 40960 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516225] [2170549] 1000 2170549 1062 145 45056 2 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516229] [2170550] 1000 2170550 1062 148 45056 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516233] [2170551] 1000 2170551 1062 148 40960 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516237] [2170552] 1000 2170552 1062 147 45056 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516242] [2170553] 1000 2170553 1062 147 45056 1 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516246] [2170554] 1000 2170554 1 1 12288 0 0 update_server.s
Sep 18 15:41:15 debian-homeserver kernel: [356057.516250] [2170555] 1000 2170555 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516254] [2170556] 1000 2170556 1062 148 45056 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516258] [2170557] 1000 2170557 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516263] [2170558] 1000 2170558 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516267] [2170559] 1000 2170559 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516272] [2170560] 1000 2170560 1 1 12288 0 0 monitor_ark_ser
Sep 18 15:41:15 debian-homeserver kernel: [356057.516275] [2170561] 1000 2170561 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516280] [2170562] 1000 2170562 1062 147 45056 1 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516283] [2170563] 1000 2170563 1062 120 40960 0 0 restart_server.
Sep 18 15:41:15 debian-homeserver kernel: [356057.516287] [2170564] 1000 2170564 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516291] [2170565] 1000 2170565 1062 148 45056 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516295] [2170566] 1000 2170566 1062 150 45056 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516299] [2170567] 1000 2170567 1062 121 45056 0 0 restart_server.
Sep 18 15:41:15 debian-homeserver kernel: [356057.516303] [2170568] 1000 2170568 996 78 45056 0 0 monitor_ark_ser
Sep 18 15:41:15 debian-homeserver kernel: [356057.516307] [2170569] 1000 2170569 996 74 40960 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516311] [2170570] 1000 2170570 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516316] [2170571] 1000 2170571 996 77 49152 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516320] [2170572] 1000 2170572 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516324] [2170573] 1000 2170573 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516328] [2170574] 1000 2170574 996 73 40960 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516331] [2170575] 1000 2170575 1029 111 45056 0 0 monitor_ark_ser
Sep 18 15:41:15 debian-homeserver kernel: [356057.516333] [2170576] 1000 2170576 996 73 40960 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516336] [2170577] 1000 2170577 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516338] [2170578] 1000 2170578 1062 150 40960 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516340] [2170579] 1000 2170579 1 1 12288 0 0 launch_ASA.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516342] [2170580] 1000 2170580 1029 112 45056 0 0 monitor_ark_ser
Sep 18 15:41:15 debian-homeserver kernel: [356057.516345] [2170581] 1000 2170581 996 77 45056 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516347] [2170582] 1000 2170582 1 1 12288 0 0 restart_server.
Sep 18 15:41:15 debian-homeserver kernel: [356057.516349] [2170583] 1000 2170583 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516351] [2170584] 1000 2170584 1 1 12288 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516354] [2170585] 1000 2170585 1 1 12288 0 0 restart_server.
Sep 18 15:41:15 debian-homeserver kernel: [356057.516356] [2170586] 1000 2170586 1 1 12288 0 0 monitor_ark_ser
Sep 18 15:41:15 debian-homeserver kernel: [356057.516359] [2170587] 1000 2170587 996 78 45056 0 0 monitor_ark_ser
Sep 18 15:41:15 debian-homeserver kernel: [356057.516364] [2170588] 1000 2170588 996 76 40960 0 0 init.sh
Sep 18 15:41:15 debian-homeserver kernel: [356057.516368] [2170589] 1000 2170589 1 1 12288 0 0 launch_ASA.sh
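
To see at a glance which scripts are piling up, the task rows in a kernel dump like the one above can be tallied by process name. This is only a rough sketch: it assumes each row ends with `[pid]`, seven numeric columns, and a (possibly truncated) process name, as in the lines above. The names `ROW` and `tally_tasks` are made up for this example and are not part of the project.

```python
import re
from collections import Counter

# One task row, as it appears in the excerpt above:
#   ... kernel: [timestamp] [pid] <7 numeric columns> process_name
# The timestamp bracket contains a dot, so it never matches the pid group.
ROW = re.compile(r"\[\s*(\d+)\](?:\s+-?\d+){7}\s+(\S+)\s*$")


def tally_tasks(lines):
    """Count how many task rows belong to each process name."""
    counts = Counter()
    for line in lines:
        match = ROW.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: journalctl -k | python3 tally_tasks.py
    for name, count in tally_tasks(sys.stdin).most_common():
        print(f"{count:6d}  {name}")
```

Many short-lived copies of the same script showing up in one dump would suggest that something is spawning processes in a loop, though the tally alone cannot say which script started it.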

Acekorneya commented 2 months ago

Have you looked at the Docker logs to see what it is doing? Could it be caused by the auto-updater? In the Docker Compose file, where it says UPDATE_SERVER=TRUE, change it to FALSE and see if that helps with the issue. The only drawback is that the server will then update only when the container is restarted.

locint commented 2 months ago

Good idea. The Docker logs reveal that the connection to Steam and Epic was lost, which could be the root of this issue. I'm turning off UPDATE_SERVER; let's see if that helps. Perhaps the script could check for a connection before updating?
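
A connection check of that kind could look roughly like the sketch below. It is not taken from the project's scripts; `can_reach` and `should_attempt_update` are hypothetical names, and the endpoints to probe are left to the caller because the thread does not say which hosts the updater talks to.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def should_attempt_update(endpoints, timeout=5.0):
    """Allow an update only if every (host, port) pair is reachable.

    An empty list means there is nothing to verify, so the result is True.
    """
    return all(can_reach(host, port, timeout) for host, port in endpoints)
```

The idea is that the update step is skipped, and retried on the next interval, whenever the check fails, instead of restarting the server into a failed download.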

Acekorneya commented 1 month ago

Did you ever figure out what happened, or did it continue to happen? I made a few updates to the container to stop it from causing issues when you have multiple of them running at the same time.

locint commented 1 month ago

I've been updating manually and there have been no issues so far. How does the auto-update process work? Does it close all the containers, ask steamcmd to update, and then start all containers at once? Given how straightforward this is, it's really hard to pinpoint how it could corrupt the server files. However, it could corrupt mods if one container is updating a mod while another container is trying to do the same. Perhaps I could try the fix you've provided. :)

Acekorneya commented 1 month ago

Thank you for your feedback and for sharing your experience. I've made some improvements to address the issues you've described, particularly regarding multiple containers trying to update simultaneously.

To clarify the auto-update process in this project:

  1. The script periodically checks for updates based on the CHECK_FOR_UPDATE_INTERVAL set in the Docker Compose file.

  2. When an update is found, the script does the following:

    • Sends an RCON message to notify players of the impending restart
    • Initiates a restart command with a countdown based on the RESTART_NOTICE_MINUTES set in the Docker Compose file

  3. Once the countdown reaches zero:

    • The PID file is deleted
    • This triggers the monitor script to detect that the server is not running
    • The monitor then initiates a server restart

  4. During the restart process, the init script checks for updates:

    • If an update is needed, it creates a lock file (updating.flag)
    • It then updates the server files using SteamCMD
    • The lock file is removed once the update is complete

  5. The server then starts up with the updated files
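
Step 3 depends on the monitor noticing that the PID file is gone. As a rough illustration only (the container's real monitor is a shell script and may work differently), a check of that kind could be sketched as follows. `server_is_running` is a hypothetical name, and the signal-0 probe assumes a POSIX system.

```python
import os


def server_is_running(pid_file):
    """Return True only if pid_file exists and names a live process."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        # Missing or unreadable PID file: treat the server as stopped.
        return False
    if pid <= 0:
        # Non-positive values address process groups, not one process.
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A monitor loop built on a check like this would start the server only when it returns False, so deleting the PID file is enough to trigger the restart described above.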

The lock file system (updating.flag) in a shared directory prevents race conditions when multiple instances try to update simultaneously. Other instances will wait if they detect the lock file, preventing simultaneous updates to the shared server files.
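
For readers who want to see the idea in code, here is a minimal sketch of a lock-file guard. It is not the container's actual implementation, which lives in shell scripts. The function names and the polling and timeout values are assumptions made for illustration. The one important detail is that the lock is created with an exclusive-create flag, so checking for the file and creating it happen in a single atomic step.

```python
import os
import time


def try_acquire(lock_path):
    """Atomically create the lock file; return False if it already exists."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release(lock_path):
    """Remove the lock file, ignoring the case where it is already gone."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass


def run_update_exclusively(lock_path, update, poll_seconds=5.0,
                           max_wait_seconds=3600.0):
    """Wait for the lock, run update(), and always release the lock.

    Returns False if the lock could not be obtained before the deadline.
    """
    deadline = time.monotonic() + max_wait_seconds
    while not try_acquire(lock_path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    try:
        update()
    finally:
        release(lock_path)
    return True
```

An instance that had to wait would normally re-check whether an update is still needed inside `update()`, since another instance may already have applied it. The deadline is there so that a stale lock left behind by a crashed container does not block the others forever.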

These changes should significantly reduce the risk of file corruption or boot loops caused by simultaneous updates.