Bisa / factorio-init

Factorio init script
MIT License

Started server after update while using systemd does not get discovered by systemd. #145

Closed EraYaN closed 4 years ago

EraYaN commented 5 years ago

So after an update, the server is restarted, and this causes systemd to "lose" sight of the server.

It would be great if the script would call systemd/systemctl for starting and stopping when systemd is in use.

tmzasz commented 5 years ago

sounds like you did not follow the setup instructions?

Remember to enable the service at startup if you want that: $ systemctl enable factorio

EraYaN commented 5 years ago

My apologies for the long delay in responding.

@tmzasz This has nothing to do with the issue at ALL.

This is purely that systemd cannot track processes that it did not start itself. So the status reporting breaks when the script runs the update and restarts the server, and systemd marks the unit as failed.

Bisa commented 5 years ago

If there's a way around this, please let me know; otherwise I'm inclined to simply state it as a known issue and advise people to use systemctl restart after an update or something.

EraYaN commented 5 years ago

To solve this issue, I started to look into redoing this script in python3 and supporting sdnotify properly. This way it would be systemd native and the update task could be a systemd timer based task as well.
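For readers unfamiliar with the notification protocol mentioned here, the following is a minimal standard-library sketch of what "supporting sdnotify" could involve. It is not the code from the rewrite described above, and the function name `sd_notify` is only illustrative. systemd passes the socket address in the `NOTIFY_SOCKET` environment variable, and the service sends short datagrams such as `READY=1` to it.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string such as "READY=1" to systemd's notification socket.

    Returns False when NOTIFY_SOCKET is unset (the process was not started
    by systemd with notifications enabled), True once the datagram is sent.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket, which is
    # addressed with a leading NUL byte instead.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), path)
    return True
```

For systemd to act on such a message, the unit would need `Type=notify` rather than the `Type=forking` setup discussed later in this thread.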

I got some part of the way, but then life called. Essentially, it could also be done with this script, though I think it might complicate things a bit, since this would mean extra configuration switches and changing the start, stop and is_running logic.

I picked python since the updater is already in python and there is also a mod updater in python so that would all integrate very nicely, and also possibly be cross-platform.

tmzasz commented 5 years ago

The main issue with python3 is that CentOS/Red Hat and their derivatives ship with 2.7, so you would have to target 2.7 at minimum for CentOS/Red Hat support (safe to estimate 50% or more of Linux servers run some form of CentOS/Red Hat). Otherwise it's another side-by-side install (like the glibc issue) that Red Hat/CentOS users would have to do, which isn't something this script does or should automate, so it would have to direct users to yet another external resource just to make it work.

tmzasz commented 5 years ago

Also, systemd losing "sight" of the factorio server is not that big an issue, as it still "sees" the init script (the script run by systemd). So for the scope of the script I personally wouldn't worry about it, but it's up to Bisa to determine whether it should be addressed.

EraYaN commented 5 years ago

Well, I for one feel like python2.7 should just be left to die. I know some guys really love it and want to keep maintaining it, but python3.6 is available as default on RHEL/CentOS 8 I believe, although no python is bundled at all. But I would really only run this on Debian/Arch derivatives anyway, plus Windows (Server) and macOS for when I'm on the go. So in my start of a script I abstracted away the service manager interaction, so even Windows services can be supported, including status reports.

The problem with systemd losing sight is that I cannot use systemctl (or any of the other tooling that is available for it) to get status info or restart the server, or in my case have my mod update script restart it. Since systemd will think the unit has failed, it will not properly shut down the currently running server; it will just start a new one, and that will obviously break.

Example (when the restart is attempted and fails due to the existing pid file and fifo):

● factorio.service - Factorio Server
   Loaded: loaded (/etc/systemd/system/factorio.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Sun 2019-06-16 01:25:00 CEST; 18s ago
  Process: 80736 ExecStart=/opt/factorio-init/factorio start (code=exited, status=1/FAILURE)
 Main PID: 80613 (code=exited, status=1/FAILURE)

Or when it for some reason does not try to restart:

● factorio.service - Factorio Server
   Loaded: loaded (/etc/systemd/system/factorio.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Sun 2019-06-16 01:43:41 CEST; 13s ago
  Process: 84389 ExecStop=/opt/factorio-init/factorio stop (code=exited, status=0/SUCCESS)
  Process: 84111 ExecStart=/opt/factorio-init/factorio start (code=exited, status=0/SUCCESS)
 Main PID: 84141 (code=exited, status=0/SUCCESS)

And this is what it should look like:

● factorio.service - Factorio Server
   Loaded: loaded (/etc/systemd/system/factorio.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2019-06-16 01:39:51 CEST; 2s ago
  Process: 84111 ExecStart=/opt/factorio-init/factorio start (code=exited, status=0/SUCCESS)
 Main PID: 84141 (factorio)
    Tasks: 4 (limit: 2181)
   CGroup: /system.slice/factorio.service
           ├─84140 tail -f /opt/factorio/bin/x64/../../server.fifo
           └─84141 /opt/factorio/bin/x64/factorio --config /opt/factorio/config/config.ini --port 34197 --start-server-load-latest --server-settings /opt/factorio/config/server-settings.json --server-adminlist /opt/factorio/config/server-adminlist.json

As you can see, it monitors the PID of the factorio server directly. This is due to the Type=forking behavior of the systemd service, which seems correct since the script itself does not perform any active duties in managing the server.
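The failed restart above is attributed to a leftover pid file. As a hedged illustration only (this is not the init script's actual logic, and the names `pid_from_file` and `is_running` are hypothetical), a start routine could tell a live server apart from a stale pid file like this:

```python
import os


def pid_from_file(pid_file: str):
    """Return the PID stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file, "r", encoding="ascii") as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid_file: str) -> bool:
    """Return True if the PID recorded in pid_file belongs to a live process."""
    pid = pid_from_file(pid_file)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks the PID
    except ProcessLookupError:
        return False  # no such process: the pid file is stale
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

A check like this cannot tell whether the PID has been reused by an unrelated process, so it is a heuristic rather than a guarantee.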

tmzasz commented 5 years ago

This still seems like it is overcomplicating the issue. systemd doesn't start/stop/restart the factorio server; the init script does, and the init script is what systemd starts/stops. To have the server respond to systemctl commands directly, Wube would need to implement a systemd library for it. But the systemd integration is there to restart the server on a BOX reboot, not to control the server itself through systemctl commands. And CentOS 8 is not released yet, so CentOS 7 (the current live and long-term support version) uses 2.7.

EraYaN commented 5 years ago

As you can see in the status reports I posted, the init script is not running after the server is started; it quits after starting the server. As I mentioned, this is because of the Type=forking line in the systemd unit.

Bisa commented 4 years ago

@EraYaN - I opted to let you guys choose whether to use the automatic restarts. By default the script will just restart, but setting UPDATE_PREVENT_RESTART=1 will force you to handle the start/stop logic elsewhere.
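One way the "elsewhere" logic could look is sketched below. It combines the original request (delegate to systemctl when systemd is in use) with the `UPDATE_PREVENT_RESTART` switch. Everything here is an assumption for illustration: the function names are hypothetical, the setting is modelled as a plain mapping rather than the script's config file, and the unit name and script path are simply the ones visible in the status output above.

```python
import os


def systemd_is_running() -> bool:
    # The check documented for sd_booted(3): this directory exists only
    # when systemd is the running init system.
    return os.path.isdir("/run/systemd/system")


def restart_commands(settings, under_systemd,
                     unit="factorio",
                     script="/opt/factorio-init/factorio"):
    """Return the list of commands an updater would run to restart the server.

    settings      -- mapping of config values (assumed representation)
    under_systemd -- whether the server is managed by a systemd unit
    """
    if settings.get("UPDATE_PREVENT_RESTART") == "1":
        return []  # the caller handles start/stop itself
    if under_systemd:
        # Let systemd do the restart so it keeps tracking the main PID.
        return [["systemctl", "restart", unit]]
    # Fall back to the init script's own stop and start commands.
    return [[script, "stop"], [script, "start"]]
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)`. Keeping the decision in a pure function makes it easy to test without touching a real service.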