wincent / masochist

Do a better job of monitoring Git daemon on git.wincent.com #79

Closed by wincent 5 years ago

wincent commented 8 years ago

Just noticed it had become unresponsive again, thanks to a user report. This sucks.

The monitoring I have via Monit is evidently not up to scratch:

if failed port 9418 then restart

What it looks like on the client side:

$ git clone git://git.wincent.com/buildtools.git
Cloning into 'buildtools'...
fatal: read error: Connection reset by peer
zsh: exit 128   git clone git://git.wincent.com/buildtools.git

On the server (/var/log/messages) we just see this:

Jun 12 08:32:00 hostname git-daemon[2699]: Too many children, dropping connection

No idea what causes these idle children to linger. I wonder if an interim kludge would be to just restart the frickin' daemon once per day...
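
A rough way to see how often this is happening on the server is to count the "Too many children" lines in the log. This is only a sketch: it assumes the daemon logs to a syslog-style file such as /var/log/messages, as in the excerpt above, and the function name `count_dropped` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count how many times git-daemon reported that it
# dropped a connection because it had too many children.
# Usage: count_dropped /var/log/messages
count_dropped() {
  # -F matches the message as a fixed string; -c prints the number of
  # matching lines. grep exits 1 when the count is 0, so mask that status.
  grep -F -c 'Too many children, dropping connection' "$1" || true
}
```

A non-zero count only says that connections were dropped, not why the children are lingering in the first place.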

wincent commented 8 years ago

Note that I am on monit 5.2.5, and the very next version (5.3) added the ability to run a check with an external program (https://bitbucket.org/tildeslash/monit/src/932e0dbd40354cc7720d3f98a8386220fface8da/CHANGES?fileviewer=file-view-default#CHANGES-1195). Will need to upgrade monit as part of all this.
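
For illustration, an external check could speak the Git protocol instead of only probing the port, since the daemon still accepts TCP connections when it is in the "Too many children" state. The sketch below is an assumption-laden example, not something from this thread: the function name `check_git_remote` and the 30-second default are invented, and it relies on the `timeout` utility from GNU coreutils being installed.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the remote answers a ref listing
# within a time limit.
# Usage: check_git_remote git://git.wincent.com/wincent.git [seconds]
check_git_remote() {
  url=$1
  limit=${2:-30}   # illustrative default, in seconds
  # `timeout` (GNU coreutils) stops the command once the limit expires and
  # returns a non-zero status; git itself exits non-zero on a reset or
  # refused connection.
  timeout "$limit" git ls-remote --heads "$url" >/dev/null 2>&1
}
```

A script wrapping this function could then be hooked up through a monit `check program ... with path ...` stanza that tests `if status != 0`, once a monit version with that feature is installed.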

wincent commented 8 years ago

Lots of good info on the old issue tracker archive here: https://wincent.com/issues/1909

wincent commented 8 years ago

As I don't really want to build monit from source, going with the crudest of local hacks for now. This crontab:

# Check four times a day for now.
0 8,12,16,20 * * *  $HOME/bin/check-git

And this script at $HOME/bin/check-git:

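#!/bin/sh

set -e

# Work in a throwaway directory and remove it when the script exits.
tmpdir=$(/usr/bin/mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir"

# Shallow bare clone as a smoke test; pop up a dialog if it fails.
/usr/local/bin/git clone \
  --bare --depth 1 --quiet --single-branch git://git.wincent.com/wincent.git || \
  "$HOME/bin/dialog" \
    -title 'git.wincent.com clone test failed' \
    -message 'Unable to complete clone of git://git.wincent.com/wincent.git'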

wincent commented 5 years ago

Non-issue now that I have timeouts correctly configured. My crappy local monitoring is good enough.
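
The thread does not say which timeouts were changed. As a hedged sketch only, `git daemon` has `--timeout`, `--init-timeout` and `--max-connections` options that bear on lingering children and on the "Too many children" message. The wrapper name, the values and the base path argument below are all illustrative assumptions, not the actual configuration.

```shell
#!/bin/sh
# Hypothetical wrapper; every value here is illustrative, not taken from
# the real git.wincent.com setup.
# Usage: start_git_daemon /path/to/repositories
start_git_daemon() {
  # --init-timeout: seconds to wait for the client's initial request
  # --timeout:      seconds allowed for specific client sub-requests
  # --max-connections: cap on simultaneous clients (beyond it, connections are dropped)
  git daemon \
    --init-timeout=10 \
    --timeout=60 \
    --max-connections=32 \
    --syslog \
    --base-path="$1"
}
```

With timeouts like these, a stalled client is disconnected instead of holding a child slot indefinitely.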