Monitoring for nightly update failures

@legoktm I'm happy to keep it self-contained here if you want; I'm just not sure of the most elegant way to do that. Right now, I think all the Slack notifications happen inside the VM, and in this case you probably want the check/alert to fire from the ESXi host.

Did you have any ideas around that? I think it would basically involve adding Slack API request logic to run.py around the spot where it throws the error.
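
For illustration, here is a rough sketch of what that might look like, assuming a Slack incoming webhook (which accepts a JSON POST with a `text` field). The function names and the `SLACK_WEBHOOK_URL` environment variable are hypothetical, not anything that exists in run.py today, and I haven't checked how run.py actually surfaces the error:

```python
import json
import os
import urllib.request


def build_slack_request(webhook_url, text):
    """Build a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify_slack(text, webhook_url=None, timeout=10):
    """Send `text` to Slack. Returns True on HTTP 200, False otherwise.

    SLACK_WEBHOOK_URL is a hypothetical environment variable; failures
    to notify are swallowed so they never mask the original error.
    """
    webhook_url = webhook_url or os.environ.get("SLACK_WEBHOOK_URL")
    if not webhook_url:
        return False
    try:
        request = build_slack_request(webhook_url, text)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # URLError/HTTPError are OSError subclasses; ValueError covers a malformed URL.
        return False
```

The idea would be to call something like `notify_slack("Nightly update failed: ...")` from the ESXi host just before run.py raises, so the alert doesn't depend on anything inside the VM.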

Alternatively, we could add an Elastalert check for the vim.vm.guest.ProcessManager.ProcessInfo that occurs in the log on error. It's not Icinga, but it's the shortest path to getting the failure detected and an alert firing in the appropriate Slack channel. It might also catch errors other than just nightly failures. It's also probably a better fit than Icinga in that regard, as there isn't really a 'recovery' state to return to (except on the next night); it'd just alert whenever there's a problem.

So that's a very quick solution I can put together, but it does mean it's not as 'self contained'.

Let me know your thoughts.

freedomofpress / securedrop-workstation-ci

Monitoring for nightly update failures #78