tower-rs / tower

async fn(Request) -> Result<Response, Error>
https://docs.rs/tower
MIT License
3.53k stars 282 forks source link

Weighted balance::p2c #696

Open samvrlewis opened 2 years ago

samvrlewis commented 2 years ago

First of all, thanks for the great library!

I'm looking into using the tower::balance::p2c load balancer with weighted services. This is primarily to allow for a use case of canary deployments, where one service that the load balancer serves should receive (a configurable amount of) less traffic than the other services.

This particular use case of a single, less weighted, service mostly works through implementing a weight tower::Load implementation that can scale down a particular service's load. I've put up an initial attempt of doing this in this PR which is heavily inspired by the implementation that used to exist in tower.

However, because of the way p2c balancing works, this implementation can have unexpected results in the general case.

For example, if there are 3 services: [A, B, C] with weights [0.99, 0.005, 0.005] then a user would likely expect that (assuming everything else is equal):

What will actually occur with a p2c load balancer, because of the "randomly pick any two" heuristic, is that:

This is because the permutations AB, AC, BC are all equally likely - so BC will account for a third of requests.

The only nice way around this that I can think of is to also allow the p2c load balancer's random picking to be weighted, probably by using weighted random sampling.

The way this could work is that each service would implement weight, which would be used in the weighted random sampling and in the load comparisons. Note that it's still not enough to just do weighted random sampling because the load balancer will equally load the pairs of services it picks.

Anyway, I thought it'd be worth checking in with the community to see what would be a desired path forward before I start working on any implementation. Curious to hear other's thoughts, or if there's a more elegant solution that I'm missing.

I also see that there was a little bit of talk about this topic back in 2019: https://github.com/tower-rs/tower/issues/286#issue-449394052 maybe @olix0r has some thoughts here as well. 😄

hawkw commented 2 years ago

The only nice way around this that I can think of is to also allow the p2c load balancer's random picking to be weighted, probably by using weighted random sampling.

Hmmm, so, the primary benefit of P2C load balancing is that because only two endpoints are ever compared, choosing an endpoint is O(1). That is to say, the main reason this strategy is desirable is that the time taken to choose an endpoint does not increase as the size of the set of endpoints to choose from increases. I don't believe it's possible to do a weighted random sampling where all endpoints have their own weights in O(1) time — at least, the code in rand you linked to states that it requires O(n) time.

If there isn't a way to do weighted sampling in the P2C balancer without doing a linear scan over the endpoints, I'm not sure if it makes sense to add this functionality to the P2C balancer. Doing a linear scan kind of defeats the purpose of doing P2C balancing in the first place. In fact, if we added weights to the P2C balancer using a naive approach where a weighted random sample is generated twice to generate two indices and then those indices are compared, we would actually be doing two linear scans — so using P2C would actually make the balancer's time complexity worse than an approach that just always does a single linear scan over all endpoints.

This isn't to say that tower shouldn't have some form of weighted load balancing: it would be great to have that! But, I think that if we can't add weighted balancing without doing an O(n) linear scan over all available endpoints, then I don't think that the P2C load balancer is the correct place to implement weighted balancing. Instead, if weighted balancing requires an O(n) balancer, we should probably just add a separate load balancer implementation for users who can accept _O(n) balancing, rather than trying to add it to the P2C balancer. That balancer could implement weighted load balancing (and potentially, other features that don't make sense in P2C).

hawkw commented 2 years ago

Doing a bit of additional research, it seems like there might be some methods for generating weighted samplings in O(1) time by pre-computing probability tables, as described here. If we're willing to accept the space overhead of storing these tables, and additional time overhead for service discovery updates (in order to regenerate the tables), it's possible we could use one of the approaches in the page I linked to generate weighted samples without negating the benefits of P2C balancing. In particular, the "Alias method" described there seems worth investigating...

samvrlewis commented 2 years ago

Hmmm, so, the primary benefit of P2C load balancing is that because only two endpoints are ever compared, choosing an endpoint is O(1).

Interesting, that's good to know. I had assumed that, for most applications, the primary benefit of P2C would be to avoid herd behavior when there's multiple load balancers (as nicely described in this NGINX article). Certainly, in my use-case, that's the property I care most about - but I'm fortunate enough to be load balancing over a small enough amount of services that the difference between O(1) and O(n) is pretty insignificant.

Instead, if weighted balancing requires an O(n) balancer, we should probably just add a separate load balancer implementation for users who can accept _O(n) balancing, rather than trying to add it to the P2C balancer.

I think this makes sense either way - it seems like it would likely be beneficial to have a separate balancer (maybe something like tower::balance::weighted?) rather than add a bunch of different ways the P2C balancer can be configured. Even if we could get P2C to work in O(1) it seems like it might A) make the implementation messy and/or B) make users of P2C that don't care about weights have to carry the overhead of the weighted implementation. What do you think?

As a path forward, perhaps I can add work towards developing a new tower::balance:weighted module that: