doxout / recluster

Node clustering library with support for zero downtime reloading
522 stars 44 forks source link

respawn and backoff with multiple workers #36

Closed durkes closed 9 years ago

durkes commented 9 years ago

During testing, backoff seems to work properly with 1 worker, but with 4 workers, the backoff (max respawn time) is exceeding the set time.

For example: var opt = { workers: 4, /seconds/ timeout: 300, respwan: 2, backoff: 10 };

If the app is crashed repeatedly (for testing), the respawn time ends up being over 30 seconds and more, exceeding the 10 second backoff setting. I'll be happy to review the code to track down the problem, but I wanted to first make sure this was in fact a bug and not a misunderstanding on my part.

Thank you for a great module.

spion commented 9 years ago

The respawn delay is shown relative to the moment of crash, but the delay capped by backoff is the time after the latest scheduled respawn. Therefore if a respawn has already been scheduled 10s from now, the next crash will schedule a respawn 20s from now (resulting with 10s space between the two respawns). If immediately afterwards there is a 3rd crash, the next respawn will be scheduled 30s from now (10s after the latest scheduled one).

The idea is spread out respawns in order to avoid overwhelming the server if multiple workers crash simultaneously.

durkes commented 9 years ago

Got it. It is working correctly then! Thank you