Closed zwass closed 6 years ago
If you remove the signal handling, what happens when launchd sends a SIGTERM?
The process will terminate because the signal is unhandled. Though I probably wrote it originally, it makes me very uneasy that we have signal handling in library code.
If the process terminates before osqueryd has a chance to send the Shutdown signal, is that OK?
I think the only consequence of the process terminating before osquery sends the shutdown signal is that osquery may log that it didn't shut down properly. I think this is preferable to handling the signal and therefore interfering with something a user of this library expects.
Agree with @zwass. Ideally we'd get the shutdown command and handle it properly, but making sure the process itself is actually shut down is most important.
That said, it seems like there are two items here:
Launchd send a SIGTERM when you unload the service (which we do when adding a new extension to reload osquery)
osquery restarts itself outside of launchd (Does this actually happen? If so, does it send a SIGTERM?)
Would it be possible to both handle a signal from launchd and shut down if the osqueryd service goes away? (Or am I missing some design nuance with how the extensions work?)
@natewalck I added a continuous check for osqueryd still being up last week in https://github.com/kolide/osquery-go/pull/57. I wanted to get this PR merged and then have you test things out again.
Based on the research I did in relation to facebook/osquery#4577, I believe that in any normal shutdown circumstances, osqueryd will send the Thrift shutdown message. I don't think there are any circumstances in which osqueryd will send a signal to the extension process.
Got it. Looking at the code in that commit, it appears that if osqueryd goes away for any reason (Launchd, kill -9, whatever), it will close itself. I think that covers both cases just fine and is more aggressive, which I like.
The only edge case I can see is if osqueryd gets into some weird zombie state where it responds to a ping but is otherwise non-functional...but if this happens, we have bigger problems than the extension not shutting down :)
Update the shutdown mechanism per discussion in https://github.com/facebook/osquery/issues/4577