decred / dcrd

Decred daemon in Go (golang).
https://decred.org
ISC License
731 stars 288 forks source link

Ungraceful shutdown during SIGHUP #2644

Closed jrick closed 3 years ago

jrick commented 3 years ago

If dcrd receives SIGHUP, for example when running inside tmux and the server process receives SIGTERM during system shutdown, dcrd will die due to the unhandled signal.

I see two possibilities:

  1. Handle SIGHUP the same way we do signal.Interrupt and SIGTERM.
  2. Ignore SIGHUP, and wait for SIGTERM to be signaled later (when shutdown(8) or reboot(8) is used).

The 2nd option might be preferable, depending on how we want to handle other hangups, for example running dcrd "normally" in a ssh session (without nohup and not daemonized by tmux), where the ssh session is lost due to network connectivity. I don't see any reason why the process must terminate if the attached TTY disappears, as it is not vital to the process' functionality.

Whichever option we choose, we should add the same fix to dcrwallet which suffers from the same problem.

tylerchambers commented 3 years ago

For whatever an internet rando's opinion is worth, I prefer the 1st one.

It keeps dcrd's behavior in line with other *nix programs, and makes it work as expected for those familiar with that environment.

nohup, disowning etc can be used to manually get the desired behavior.

having a --daemon flag like bitcoind/litecoind/monerod might be a nice-to-have for new users (even though it's technically "bad practice", not sure if this has been discussed among decred devs before).

Would be interested in taking this on, btw.

jrick commented 3 years ago

The first option would be an easy one liner; all we'd need to do is add syscall.SIGHUP to https://github.com/decred/dcrd/blob/44600dd6784894c87a55831122a6f2a5e3bdd781/signalsigterm.go#L15 (and perhaps rename that file...). The second may get a little more involved since we need to not only handle the signal, but also change the logic of what we do when we receive it, and possibly use some conditional compilation tricks to ensure that we don't break the build on windows.

I am fine with the first approach, but will note that the recommended way to run dcrd in production is as a system service and depending on the particular OS and init system in use, we're already treating it as a background daemon anyways, and it should always ignore SIGHUP in this situation. I just tested the dcrd that I have running under OpenBSD's rc and it seems to survive manually sending the PID a SIGHUP (not sure how this is accomplished, though). But for other service systems (systemd on linux needs to be tested at the very least) we may want to also verify that SIGHUP there won't bring down the process. As long as that is the case, I think it makes sense to start clean shutdown when SIGHUP is seen, as that may only be the case when the program is running interactively, and I am ok with doing shutdown in that situation.

As for self daemonizing, that is not something that we can really do properly ourselves. Go software is already multithreaded before main so there is no separate fork and exec calls. The best alternative would be to spawn argv[0] with different arguments or environment variables that cause it to run the main server process (I have done this trick in one of my own tools if you want to get an idea of what it would take).

marcopeereboom commented 3 years ago
  1. nohup dcrd & is a fine solution
  2. Is a bit more UNIX daemon-y but, as you alluded, it is more involved and go is really not built that way. I too, have spawned the same same process to achieve fine grained behavior (like privdrop). This makes running the tool inherently more complicated because it adds flags and environment.

Both are ok with me but unless 'dI need something super specific I'd go with 1.