djdv / go-filesystem-utils

ISC License

daemon: interrupt handler doesn't seem to work properly #15

Closed djdv closed 1 year ago

djdv commented 2 years ago

The daemon is supposed to have a tiered interrupt handler. Sending an interrupt signal to the process should tell it to start shutting down. We initialize this starting here: https://github.com/djdv/go-filesystem-utils/blob/7fcd9da457b6419655d276c24f1d098ddfad734c/internal/commands/daemon.go#L109

Internally this spins up a watcher routine which counts and emits (over a channel) how many times the process received a signal. https://github.com/djdv/go-filesystem-utils/blob/7fcd9da457b6419655d276c24f1d098ddfad734c/internal/commands/daemon.go#L142

This count is received and handled in this function: https://github.com/djdv/go-filesystem-utils/blob/7fcd9da457b6419655d276c24f1d098ddfad734c/internal/commands/daemon.go#L285-L287

The expectation is that if the daemon isn't busy, it will just shut down when it sees an interrupt. If it is busy, the first interrupt will cause all listening sockets to close (stop accepting new connections) but leave existing connections alone. A second interrupt spawns a timer that will close connections after some duration. A third will close all connections immediately.
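The tiers above can be read as a mapping from (interrupt count, busy state) to an action. The sketch below only models that decision. The `action` type, its constants, and `actionFor` are names made up for illustration, and the actual socket handling is left out.

```go
package main

import "fmt"

// action names the tiers described in this issue; all identifiers here
// are hypothetical and do not come from the project's source.
type action int

const (
	noAction          action = iota // no interrupt seen yet
	exitNow                         // idle daemon: just shut down
	stopListening                   // close listeners, keep existing connections
	closeAfterTimeout               // start a timer that closes connections
	closeImmediately                // close all connections right away
)

func (a action) String() string {
	return [...]string{
		"no action",
		"exit now",
		"stop listening",
		"close after timeout",
		"close immediately",
	}[a]
}

// actionFor picks the tier for the given interrupt count.
// An idle daemon exits on any interrupt; a busy one escalates.
func actionFor(count int, busy bool) action {
	switch {
	case count <= 0:
		return noAction
	case !busy:
		return exitNow
	case count == 1:
		return stopListening
	case count == 2:
		return closeAfterTimeout
	default:
		return closeImmediately
	}
}

func main() {
	for count := 1; count <= 3; count++ {
		fmt.Printf("busy, interrupt %d: %v\n", count, actionFor(count, true))
	}
	fmt.Printf("idle, interrupt 1: %v\n", actionFor(1, false))
}
```

Keeping the decision in a pure function like this would make the escalation order easy to cover with table-driven tests, separate from the code that actually closes sockets.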

This worked at some point but broke later, and I'm not sure why. This needs investigating and, more importantly, automated tests to make sure it doesn't break again.

Graceful shutdown does not really matter for mounting our API, but this same socket-managing code is going to be used to manage network APIs like 9P, NFS, etc. that will be mounted by remote clients. We need to be able to handle this gracefully, or risk unflushed data being lost, the daemon hanging around forever, and other possible problems.

djdv commented 1 year ago

This was fixed somewhere in the large API refactor (https://github.com/djdv/go-filesystem-utils/pull/28). In addition to handling interrupts, we now also handle SIGTERM, and on Windows we handle window events, in particular those related to closing a console window, shutting down or restarting the system, and logging off. These help ensure that we at least attempt to exit cleanly before the host kills our process.

There's still a secondary issue with the WinFSP driver hijacking interrupts once it mounts something, but that's unrelated to this. Users on Windows who are using WinFSP should try to use the fs shutdown command instead of sending interrupts or breaks to the console.