Do not attempt more often than there are endpoints available on the route
Fixed the test suite to actually use the number of backends claimed in the tests (instead of trying 5 times on one endpoint, we now actually try on 5 different endpoints)
An explanation of the use cases your change solves
With a change to error classifications, errors such as "IdempotentEOF" are no longer prunable. They are now only retryable (which is correct, because we don't know for sure that the endpoint is unusable and must be pruned).
The classification change has the side effect that slow, buggy apps tend to exhaust config.backends.maxAttempts. That is dangerous if maxAttempts is set to a high value or even unlimited, because Gorouter could then take extremely long to respond. Prior to this PR, Gorouter stopped trying only once maxAttempts had been reached, regardless of whether all of the endpoints were non-working.
With this PR, Gorouter makes at most as many attempts as there are endpoints on the route before giving up. If there is only one endpoint and it is broken, there is no point in trying it 5 times in a row; doing so only slows the response down.
Instructions to functionally test the behavior change using operator interfaces (BOSH manifest, logs, curl, and metrics)
Deploy an app with 3 endpoints, all of them unresponsive
Set config.backends.maxAttempts to 5
Curl the app
Expected result after the change
Gorouter tries 3 times, once per endpoint, and then gives up because all endpoints are exhausted
Current result before the change
Gorouter tries 5 times, retrying endpoints it has already tried and that failed
Fixes issue 385
Remove unlimited retries, as they are dangerous. Companion PR for the spec on routing-release.
Minimum number of attempts is 1
routing-release PR
[x] I have viewed, signed, and submitted the Contributor License Agreement
[x] I have made this pull request to the main branch
[x] I have run all the unit tests.
[ ] (Optional) I have run Routing Acceptance Tests and Routing Smoke Tests
[ ] (Optional) I have run CF Acceptance Tests