Open conradludgate opened 1 year ago
We don't need to use tonic's Channel abstraction since Grpc client makes use of tower services directly. That let's us implement this logic ourselves.
Now for how we communicate load. An idea I had is to have a well-known header in responses for how much load the server is currently experiencing.
This can be cached by the client. In high load situations this will be accurate. In low load situations it will be less accurate but arguably less important.
We would need to decide on a good header for this.
An alternative is to have an alternative well known load look up RPC. We can call that on a regular basis to update load values
If the load endpoint or header doesn't exist, we can stop checking for it and fallback on the original strategy of random or round robin. This can be configured dynamically
Have you seen this which seems to be what Linkerd use?
Using latency is clever, but I'm not sure exactly how the pending calculation works. I'm probably missing something.
Ahh I see. It's not 'pending', it's 'in flight requests'.
So load=latency*in_flight
.
I have no objections to this, since it makes it only a client side approach.
It would be good to mention this in the docs. Currently ginepro doc says it's using using The Power of Two Choices, which is misleading.
Bug description
ginepro indirectly uses
tower::Balance
which makes use a best-of-two random load balancing strategy.Unfortunately, tonic has no built in mechanism for determining load, so this is hard coded to be 0 always.
https://docs.rs/tonic/0.8.3/src/tonic/transport/service/connection.rs.html#114-120
This means we have randomise distribution. This is still a balancing technique but is known to not be very good