Closed games647 closed 5 months ago
I haven't looked into it, but freezing the server process did cross my mind.
I'm not familiar with checkpointing, but I assume it would have similar effects (on the server).
I don't know how Minecraft would react to multiple hours of missing CPU time (e.g. disabling the watchdog is a must here).
Right! The watchdog kills the server when a tick takes more than a minute. This is probably the only issue, missed ticks are just delayed. Is there a way to disable the watchdog?
There is currently no way to configure other 'methods' for managing the server; only a 'start command' can be configured. @NotJustPizza suggested implementing Terraform to spawn a server instance in the cloud, which I would consider a different 'method' for server management as well.
I'd have to figure out a nice way to configure such 'methods'.
Anyway, if something like this (freezing, checkpointing) works, that would be brilliant!
BTW: Great project idea.
:smile:
Is there a way to disable the watchdog?
Not that I'm aware of. At least Paper uses an environment variable (ref). However, the JVM might cache the variables at startup, so the only solution without any code changes is disabling it in the config.
Apparently there's `max-tick-time`, which you can set to `-1` to disable the watchdog. This is also supported on Vanilla servers.
Rewriting the `server.properties` file on start to set this property is fine, because `lazymc` already does that for other settings as well.
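For anyone who wants to try this by hand, here is a minimal sketch of rewriting that property. It is not how `lazymc` does it internally; the helper name `set_property` is made up for illustration, and it assumes the file only uses plain `key=value` lines and `#` comments.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return server.properties content with `key` set to `value`.

    Hypothetical helper: assumes plain `key=value` lines and `#` comments.
    """
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and blank lines untouched.
        if not stripped or stripped.startswith("#"):
            continue
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            found = True
    # Append the property if the file did not contain it yet.
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

Calling `set_property(content, "max-tick-time", "-1")` on the file content before the server starts should be enough to turn the watchdog off.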
Yeah, I was considering only temporarily disabling it, so you wouldn't lose the feature. However, your approach would be more compatible.
I see! Something to try out.
Just did a quick test with the watchdog disabled. Sending `SIGSTOP` and `SIGCONT` seems to work fine. Only froze it a few minutes, but everything still works as expected afterwards.
It obviously prints a tick warning, but that should be fine.
[18:17:09] [Server thread/WARN]: Can't keep up! Is the server overloaded? Running 160462ms or 3209 ticks behind
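A rough sketch of that kind of test, for reference. The function names `freeze` and `unfreeze` are made up here, the PID is assumed to be the server's Java process, and this only works on Unix-like systems where `SIGSTOP` and `SIGCONT` exist.

```python
import os
import signal


def freeze(pid: int) -> None:
    """Suspend the process: it keeps its memory but gets no CPU time."""
    os.kill(pid, signal.SIGSTOP)


def unfreeze(pid: int) -> None:
    """Resume a process that was suspended with SIGSTOP."""
    os.kill(pid, signal.SIGCONT)
```

`SIGSTOP` cannot be caught or ignored by the target process, so the server has no chance to react to it; it just sees a large jump in time once it is resumed, hence the tick warning.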
@timvisee @games647 I think it would be more reliable to use Linux 5.10+ (see: this path) with hibernation to a swap file instead of an experimental checkpoint of a single process. You could easily set it up and manage it using one of the provisioners (e.g. bash scripts, Ansible, Chef) under a Terraform stack.
However, please keep in mind that probably 99% of servers won't gain much from this feature, because the potential implementation and maintenance effort won't justify a small reduction in startup time.
It is also worth considering that while your instance/pod is starting, your automation may be busy doing other things in parallel, like changing DNS and configuring firewall rules. This makes the speed gain from hibernation even more questionable, and it may even turn out to be slower because of the added complexity.
Last but not least, Minecraft servers, especially modded ones, need regular restarts due to frequent CPU and memory leaks, so hibernating just to restart later may actually be worse for user experience than waiting a few seconds more at the beginning.
Freezing the process looks like a good feature to have in the config to me. I might try making a scuffed PR to make it work.
@games647 if you still need this you could compile my fork from source or wait until I add a config option and it gets merged after the holidays
I've released `v0.2.8`, which includes #37 to freeze a server on Unix platforms rather than shutting it down. This implements freezing as described above. Thanks @obj-obj!
@timvisee Should this be closed? My PR solves the issue, but maybe you should leave it open because there isn't support for snapshotting the process to disk yet. If on-disk snapshotting is needed, CRIU could be used (but it's also Linux-only, just like the current implementation).
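To give an idea of what the CRIU route might look like, here is an untested sketch that only builds the command lines. The function names are made up, `--shell-job` is an assumption that only applies when the server was started from a terminal, and a real server with open network connections would likely need further options. CRIU also generally needs root privileges.

```python
from typing import List


def criu_dump_cmd(pid: int, images_dir: str) -> List[str]:
    """Build a `criu dump` command that checkpoints `pid` into `images_dir`."""
    # -t selects the process tree by PID, -D sets the image directory.
    return ["criu", "dump", "-t", str(pid), "-D", images_dir, "--shell-job"]


def criu_restore_cmd(images_dir: str) -> List[str]:
    """Build a `criu restore` command that restores from `images_dir`."""
    return ["criu", "restore", "-D", images_dir, "--shell-job"]
```

The resulting lists could be passed to `subprocess.run(cmd, check=True)`. Unlike freezing, the dumped process no longer holds memory afterwards, at the cost of disk space and restore time.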
Have you also looked into freezing or checkpointing the server process? If we consider larger server setups, running a full server startup could take a while. Alternatives are freezing the server process (`SIGSTOP` and `SIGCONT`) and checkpoints (Docker/Linux Generic with Java). The first approach frees up the CPU but keeps the data in memory, while the latter approach dumps the current state to disk.
In both cases this could improve the startup time by having an already initialized state, assuming fast disk I/O for checkpoints. However, obvious downsides are the wasted disk or memory space and further compatibility issues. I don't know how Minecraft would react to multiple hours of missing CPU time (e.g. disabling the watchdog is a must here).
This, of course, depends on the scope of this project, as it adds a massive amount of complexity. This was just an idea (analogous to serverless) that came up for big modpack servers or small-scale public servers, where a long join time could be annoying.
BTW: Great project idea.