uber / tchannel

network multiplexing and framing protocol for RPC
MIT License
1.15k stars 129 forks

Take busy frames into account for peer selection #1347

Closed Raynos closed 9 years ago

Raynos commented 9 years ago

This penalizes a peer that returns busy frames by adding to its pending count, so it gets selected less often.

Once that peer stops returning busy frames, or the tombstones time out, we start selecting it again.
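
A minimal sketch of the idea, not the actual tchannel-node code. It assumes each busy frame leaves a tombstone that counts as one extra pending request until it expires. The names `PeerScore`, `choosePeer`, and `tombstoneTtlMs`, and the 500 ms default, are all hypothetical.

```javascript
// Hypothetical sketch: class, function, and parameter names are illustrative
// and are not part of the tchannel-node API.
class PeerScore {
  constructor(tombstoneTtlMs = 500) {
    this.tombstoneTtlMs = tombstoneTtlMs; // assumed lifetime of a busy tombstone
    this.pending = 0;                     // requests currently in flight
    this.busyTombstones = [];             // expiry timestamps, one per busy frame
  }

  onRequestStart() {
    this.pending += 1;
  }

  onRequestDone() {
    this.pending = Math.max(0, this.pending - 1);
  }

  // Record a busy frame received from this peer at time `now` (ms).
  onBusyFrame(now) {
    this.busyTombstones.push(now + this.tombstoneTtlMs);
  }

  // Pending count used for selection: real in-flight requests plus
  // one penalty per busy tombstone that has not yet timed out.
  effectivePending(now) {
    this.busyTombstones = this.busyTombstones.filter((expiry) => expiry > now);
    return this.pending + this.busyTombstones.length;
  }
}

// Pick the peer with the lowest effective pending count.
// `peers` is a Map of peer name to PeerScore; ties go to the first entry.
function choosePeer(peers, now) {
  let best = null;
  let bestLoad = Infinity;
  for (const [name, score] of peers) {
    const load = score.effectivePending(now);
    if (load < bestLoad) {
      best = name;
      bestLoad = load;
    }
  }
  return best;
}
```

With two otherwise idle peers, a busy frame from one of them steers selection to the other until the tombstone expires, after which both are equally eligible again.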

This allows Hyperbahn to route requests around a busy worker (same with xlate) without needing retries enabled.

In the long term this will make fewer requests fail when you have 100 exit nodes and 4 of them are busy.

r: @jcorbin @kriskowal @rf

cc @anson627

kriskowal commented 9 years ago

supremo

anson627 commented 9 years ago

:+1:

jcorbin commented 9 years ago

Good start... but don't we actually want to have this be a wider class of "any retryable error" and not just busy?

Raynos commented 9 years ago

@jcorbin

Agreed. Starting with busy; not sure which other error frames to penalize, maybe Declined.

Raynos commented 9 years ago

This will need to be rebased onto github.com/uber/tchannel-node