globaleaks / GlobaLeaks

GlobaLeaks is free, open source software enabling anyone to easily set up and maintain a secure whistleblowing platform.
https://www.globaleaks.org

If the GlobaLeaks process hangs, no action is taken: Software Watchdog #673

Open fpietrosanti opened 10 years ago

fpietrosanti commented 10 years ago

If the GlobaLeaks process hangs, no action is taken in response.

This ticket is about implementing a software watchdog component that monitors the GlobaLeaks process and, if it is detected to be down or not responding properly, takes an action.

The action can be to:

The implementation design for this feature is to "hook" a master cron job that is executed every minute and looks for additional crontab entries specific to GlobaLeaks, saved in /usr/share/globaleaks/crontab/.
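A minimal sketch of such a master job, in Python. It assumes the entries in /usr/share/globaleaks/crontab/ are executable scripts that are run in name order, similar to run-parts; the ticket does not specify the file format, and `find_jobs` and `run_jobs` are hypothetical names.

```python
import os
import subprocess

# Directory named in the ticket; the layout inside it is an assumption.
CRON_DIR = "/usr/share/globaleaks/crontab/"


def find_jobs(cron_dir=CRON_DIR):
    """Return the sorted paths of regular files with an execute bit set."""
    if not os.path.isdir(cron_dir):
        return []
    jobs = []
    for name in sorted(os.listdir(cron_dir)):
        path = os.path.join(cron_dir, name)
        if os.path.isfile(path) and os.stat(path).st_mode & 0o111:
            jobs.append(path)
    return jobs


def run_jobs(cron_dir=CRON_DIR, timeout=50):
    """Run each job once; map its path to the exit code, or None on failure."""
    results = {}
    for path in find_jobs(cron_dir):
        try:
            # Keep the timeout below one minute so runs do not pile up.
            results[path] = subprocess.run([path], timeout=timeout).returncode
        except (subprocess.TimeoutExpired, OSError):
            results[path] = None
    return results


if __name__ == "__main__":
    run_jobs()
```

A single system crontab line that runs this script every minute would then be the only entry outside the GlobaLeaks directory.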

A specific log entry has to be written.

An alert about the process hang/crash with the debug information has to be sent.
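The check and the log entry could look roughly like this. It is a sketch under assumptions: that GlobaLeaks answers HTTP on a known host and port (passed in as parameters, not taken from the ticket), and that "hung" means the connection is accepted but no reply arrives in time. The logger name and the functions `probe` and `check_and_log` are hypothetical.

```python
import logging
import socket

log = logging.getLogger("globaleaks.watchdog")  # hypothetical logger name


def probe(host, port, timeout=5.0):
    """Classify the service as 'ok', 'hung' or 'down'.

    'down': the connection is refused or closed without any reply.
    'hung': the connection or the reply times out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
            return "ok" if conn.recv(1) else "down"
    except socket.timeout:
        return "hung"
    except OSError:
        return "down"


def check_and_log(host, port, timeout=5.0):
    """Probe the service and write a log entry when it is not healthy."""
    state = probe(host, port, timeout)
    if state != "ok":
        log.error("watchdog: service is %s on %s:%d", state, host, port)
    return state
```

The returned state is what an alerting step would key on; sending the alert itself is left out here.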


volpino commented 10 years ago

I don't really think this should be a GlobaLeaks component. First of all, a monitor should not be installed on the same machine (if the machine goes down, the monitor goes down too), and since there are a lot of monitoring tools already available it doesn't make much sense imho to build a new one. Also, restarting the process automatically scares me a bit: the process should not hang, and if it does we should fix the issue, not just restart the process. An alert email/SMS is better imho.

For this purpose I suggest installing a monitor (like sensu, writing a sensu plugin for GL if needed) and a logger for collecting information about exceptions (like sentry).

fpietrosanti commented 10 years ago

@volpino The GlobaLeaks application design goal is to be as self-contained and self-managed as possible, so we should avoid adding external dependencies unless strictly required.

This feature is not about introducing a "network monitoring system" (like sensu, nagios, etc.) but an "application watchdog" (like the Linux kernel watchdog, which is "part of the Linux kernel") that "acts" on the application itself, trying to get it working again (for example, the Linux kernel watchdog reboots the machine by default).

I think that providing plugins for commonly used network monitoring systems would be useful, but that would be a separate issue to open.

vecna commented 10 years ago

well... we can start a process, not dependent on twisted, from bin/globaleaks, and keep this process as a monitor/checker?

fpietrosanti commented 10 years ago

@vecna a software watchdog must be a system process that is started independently from globaleaks. That is the reason for using the crontab-based design.

fpietrosanti commented 10 years ago

If the globaleaks software is "hung" (the process is up, but it's not answering), the watchdog should also collect the Python stack trace with gdb in an automated way, and send it along with the alert.

More info at: https://wiki.python.org/moin/DebuggingWithGdb
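A sketch of the automated collection, assuming gdb is installed together with the Python gdb extensions that provide the `py-bt` command (described on the wiki page above), and that the watchdog has permission to attach to the process. The function names are hypothetical.

```python
import subprocess


def gdb_command(pid):
    """Build a gdb call that attaches to pid, prints the Python stack, exits."""
    return ["gdb", "-p", str(int(pid)), "-batch", "-ex", "py-bt"]


def collect_stack(pid, timeout=30):
    """Return gdb's combined output, or a short note if gdb cannot be run."""
    try:
        done = subprocess.run(
            gdb_command(pid),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
        return done.stdout
    except FileNotFoundError:
        return "gdb not found"
    except subprocess.TimeoutExpired:
        return "gdb timed out"
```

The returned text could be attached to the alert as-is. The timeout matters because attaching to a hung process can itself block.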