Raynos closed this 9 years ago
supremo
:+1:
Good start, but don't we actually want this to cover the wider class of "any retryable error", not just busy?
@jcorbin
Agreed. Starting with busy; I'm not sure which other error frames to penalize, maybe Declined.
This will need to be rebased on github.com/uber/tchannel-node
This penalizes a peer that returns busy frames by adding to its pending count, so it gets selected less often.
Once that peer stops returning busy frames, or the tombstones time out, we start selecting it again.
This allows Hyperbahn to route requests around a busy worker (same with xlate) without needing retries turned on.
In the long term this will make fewer requests fail when you have 100 exit nodes and 4 of them are busy.
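
A rough sketch of the idea, to make the scoring concrete. This is not the code in this PR: `Peer`, `makePeer`, `recordBusy`, `effectivePending`, `choosePeer`, and the one-second `TOMBSTONE_TTL_MS` are all hypothetical names and values picked for illustration, and the "lowest pending count wins" selection is an assumption about how peers are compared.

```typescript
// Hypothetical sketch: names and the TTL are illustrative, not tchannel-node APIs.
interface Peer {
  name: string;
  pending: number;      // requests currently in flight to this peer
  tombstones: number[]; // expiry timestamps (ms), one per recent busy frame
}

// Assumed penalty lifetime; the real timeout may differ.
const TOMBSTONE_TTL_MS = 1000;

function makePeer(name: string): Peer {
  return { name, pending: 0, tombstones: [] };
}

// Called when a peer answers with a busy frame: leave a tombstone behind
// that counts as one extra pending request until it expires.
function recordBusy(peer: Peer, now: number): void {
  peer.tombstones.push(now + TOMBSTONE_TTL_MS);
}

// Pending count used for selection: real in-flight requests plus
// any tombstones that have not yet timed out.
function effectivePending(peer: Peer, now: number): number {
  peer.tombstones = peer.tombstones.filter((expiry) => expiry > now);
  return peer.pending + peer.tombstones.length;
}

// Pick the peer with the lowest effective pending count.
// Ties go to the earliest peer in the list.
function choosePeer(peers: Peer[], now: number): Peer {
  if (peers.length === 0) {
    throw new Error("no peers available");
  }
  let best = peers[0];
  let bestLoad = effectivePending(best, now);
  for (const peer of peers.slice(1)) {
    const load = effectivePending(peer, now);
    if (load < bestLoad) {
      best = peer;
      bestLoad = load;
    }
  }
  return best;
}
```

With two idle peers `a` and `b`, calling `recordBusy(a, 0)` makes `choosePeer([a, b], 10)` return `b`. Once the tombstone has expired, `choosePeer([a, b], 2000)` returns `a` again, with no retry logic involved.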
r: @jcorbin @kriskowal @rf
cc @anson627