OpenFactorioServerManager / factorio-server-manager

A tool to help manage Factorio multiplayer servers including mods and save games.
MIT License

Autorestart factorio on crash #318

Open ikiris opened 2 years ago

ikiris commented 2 years ago

Need an option to automatically restart the Factorio child service after a crash.

ikiris commented 2 years ago

or at least panic quit from the manager so autostart can work :)

mroote commented 2 years ago

Hi @ikiris,

Could you send some sample output from the console when you're seeing the Factorio process crash? That would be helpful in implementing this feature request.

Thanks!

ikiris commented 2 years ago

console log from the time period: https://pastebin.com/uWrGfCRd

knoxfighter commented 2 years ago

Since the server-manager now correctly sets the status to stopped once the server has actually stopped (and not too early), we can just restart it at that point. We should also add a "stop trying to restart" button, so it doesn't end up in an endless restart loop.

ikiris commented 2 years ago

Ideally you'd try X times to restart within a given time window. Human action shouldn't ever be necessary to handle events.

Mattie112 commented 2 years ago

Or use something that increases the time between restarts.

Docker also uses this: https://docs.docker.com/engine/reference/run/#restart-policies---restart

An increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart to prevent flooding the server. This means the daemon will wait for 100 ms, then 200 ms, 400, 800, 1600, and so on until either the on-failure limit, the maximum delay of 1 minute is hit, or when you docker stop or docker rm -f the container.

If a container is successfully restarted (the container is started and runs for at least 10 seconds), the delay is reset to its default value of 100 ms.

Perhaps: https://github.com/avast/retry-go

ikiris commented 2 years ago

That is the gold standard: exponential backoff. Go should have a library for it, since that's the expected wait handling for RPC errors at Google.

Edit: maybe https://pkg.go.dev/github.com/cenkalti/backoff/v4

ikiris commented 11 months ago

Just wanted to wish this bug a belated happy birthday