Yelp / pgctl

Manage sets of developer services -- "playground control"
http://pgctl.rtfd.org
MIT License

Services should not be "ready" if they're continually crashing #208

Open chriskuehl opened 5 years ago

A service which keeps crashing but has a ready check that's exiting 0 will show as ready:

$ cat playground/foo/run
#!/bin/bash
exec pgctl-poll-ready bash -c 'sleep 1 && exit 1'
$ cat playground/foo/ready
#!/bin/bash
exit 0
$ pgctl start
[pgctl] Starting: foo
[pgctl] Started: foo
$ pgctl status
 ● foo: ready
   └─ pid: 28346, 0 seconds

In real life, this often occurs when the service is crashing continually with an EADDRINUSE error because an orphaned process from a previous pgctl incantation is still bound to that port. The ready check succeeds anyway, because it can hit the status endpoint of the old, orphaned process.

This one is kind of tricky to solve, since pgctl really has no way to deal with crashes or restarts (s6 just restarts the service forever), so I don't know that we can actually fix this. It would be on my wish list for a pgctl replacement, though, since this causes us a lot of confusion.
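
One possible way to detect the crash loop, given here only as a rough sketch and not as anything pgctl provides: sample the service's PID a few times and treat the service as stable only if the PID never changes. The helper name `pid_is_stable` and its arguments are hypothetical, and how the supervised PID is obtained is left to the caller as an assumption.

```bash
#!/bin/bash
# Hypothetical helper, not part of pgctl or s6.
# Usage: pid_is_stable GET_PID_COMMAND [SAMPLES] [INTERVAL_SECONDS]
# GET_PID_COMMAND is any command that prints the service's current PID;
# how that PID is obtained is an assumption left to the caller.
pid_is_stable() {
    local get_pid=$1 samples=${2:-3} interval=${3:-1}
    local first current i

    # Take the first sample; fail if the PID cannot be read or is empty.
    first=$($get_pid) || return 1
    [[ -n $first ]] || return 1

    # Re-sample; a different PID means the service was restarted meanwhile.
    for ((i = 1; i < samples; i++)); do
        sleep "$interval"
        current=$($get_pid) || return 1
        [[ $current == "$first" ]] || return 1
    done
    return 0
}
```

A ready script could call this before exiting 0, so that a service restarting about once a second (like `foo` above) would fail the check as long as the sampling window is longer than the restart period. It only narrows the problem: a service that crashes less often than the window would still show as ready.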