LearnBoost / up

Zero-downtime reloads and requests load balancer based on distribute.

Discussion: should the master spawn a new worker if an exception kills one? #21

Open brianloveswords opened 12 years ago

brianloveswords commented 12 years ago

Is it/should it be one of the design goals of up to add a new worker to the pool if one dies unexpectedly?

A contrived example, but a server like the following can only survive N requests, where N is the initial number of spawned workers.

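// server.js
function hdlr (request, response) {
  response.end('' + Math.random());
  // crash is not defined, so calling it on the next tick throws an uncaught
  // ReferenceError that kills the worker after the response has been sent.
  process.nextTick(function () { crash(); });
}
module.exports = require('http').createServer(hdlr);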

I think it'd be great if up could ensure that it never gets to a point where there are zero active workers. I've been experimenting with using up in a semi-production environment and I've been burnt by this before, but I wanted to make sure this wasn't a WONTFIX type of issue before putting any time into patching it.
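
To make the request concrete, here is a minimal sketch of the behavior I mean. It is not up's code or API: `maintainPool` and `spawnWorker` are hypothetical names, and the only assumption is that whatever `spawnWorker` returns emits an `'exit'` event when the process dies, as a `ChildProcess` from `child_process.fork` does.

```javascript
// Hypothetical helper, not part of up: keep `size` workers alive by
// replacing each worker as soon as it exits.
function maintainPool(spawnWorker, size) {
  const workers = new Set();

  function add() {
    // Assumption: spawnWorker() returns an object that emits 'exit'.
    const worker = spawnWorker();
    workers.add(worker);
    worker.once('exit', function () {
      workers.delete(worker);
      add(); // immediately spawn a replacement so the pool never shrinks
    });
  }

  for (let i = 0; i < size; i++) add();
  return workers;
}
```

With real processes this could be called as `maintainPool(function () { return require('child_process').fork('worker.js'); }, 4)`, where `worker.js` is a placeholder path. Respawning immediately is the simplest choice, but it would loop quickly if a worker crashes on startup, so some kind of rate limit is probably needed on top of it.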

arohter commented 12 years ago

I'd definitely love support for dead worker respawning.

For me, a simple pair of config options would work, as per the discussion in https://github.com/LearnBoost/up/issues/2:

1) max respawn rate - don't respawn more frequently than once per XX sec

2) worker listen timeout - if a spawned worker fails to listen in XX sec, kill worker

Both of those are really just primitive safety measures, and could just be based on a single interval config value.
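
Roughly, the two safeguards could look like the sketch below, driven by a single interval value. All names here (`createSafetyPolicy`, `delayBeforeRespawn`, `watchListen`) are made up for illustration and are not up options. The sketch also assumes the worker object emits `'listening'` and `'exit'` events and has a `kill()` method, as workers from Node's `cluster` module do; with plain `child_process.fork` the worker would have to report readiness some other way, for example with a message.

```javascript
// Hypothetical sketch, not up's API: one interval drives both safeguards.
function createSafetyPolicy(intervalMs) {
  let lastSpawnAt = -Infinity;

  return {
    // Returns how many ms to wait before the next respawn so that spawns
    // are never scheduled closer together than intervalMs.
    delayBeforeRespawn: function (now) {
      const wait = Math.max(0, lastSpawnAt + intervalMs - now);
      lastSpawnAt = now + wait; // record when this spawn will happen
      return wait;
    },

    // Kills the worker if it has not started listening within intervalMs.
    // Assumption: worker emits 'listening' and 'exit' and has kill().
    watchListen: function (worker) {
      const timer = setTimeout(function () { worker.kill(); }, intervalMs);
      const cancel = function () { clearTimeout(timer); };
      worker.once('listening', cancel);
      worker.once('exit', cancel);
    }
  };
}
```

A master would call `setTimeout(spawn, policy.delayBeforeRespawn(Date.now()))` when a worker dies and `policy.watchListen(worker)` right after spawning one.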

Not looking for anything sophisticated - just a simple and reliable way to occasionally restart dead processes in an otherwise-healthy system - more permanent/disastrous problems can be detected and dealt with elsewhere.

brianloveswords commented 12 years ago

Ahh, I should have read more closely. I didn't realize issue #2 was basically talking about this. I suppose I can take that to mean that it's a goal waiting for an implementation?