cloudfoundry / gorouter

CF Router
Apache License 2.0

fix: Don't retry more often than endpoints available #397

Closed domdom82 closed 7 months ago

domdom82 commented 7 months ago

Fixes issue 385

With a change to error classifications, errors such as "IdempotentEOF" are no longer prunable. They are now only retryable, which is correct because we don't know for sure whether the endpoint is unusable and must be pruned.

As a side effect, slow or buggy apps tend to exhaust config.backends.maxAttempts. That is dangerous when maxAttempts is set to a high value or is unlimited, because Gorouter can take extremely long to respond. Before this PR, Gorouter stopped retrying only once maxAttempts had been reached, regardless of whether all of the endpoints had already failed.

This PR makes Gorouter try at most as many times as there are endpoints in the route before giving up. If there is only one endpoint and it is broken, there is no point in trying it 5 times in a row; that just slows everything down.

  1. Deploy an app with 3 endpoints, all of them unresponsive
  2. Set config.backends.maxAttempts to 5
  3. Curl the app

With this PR: Gorouter tries 3 times, until all endpoints are exhausted

Without this PR: Gorouter tries 5 times, retrying endpoints it already tried and failed

routing-release PR