akka / akka-http

The Streaming-first HTTP server/module of Akka
https://doc.akka.io/docs/akka-http

Load balancing pool for HTTP/1.1 and HTTP/2 #3505

Open jrudolph opened 3 years ago

jrudolph commented 3 years ago

Background Information

Load balancing clients are a de-facto standard in data center environments. Compared to load balancers for external traffic, which often enters the infrastructure through a central point, client-side load balancing inside the data center operates under different preconditions and offers different advantages:

One main property is that load balancing logic is distributed to the clients (with all the advantages and disadvantages it brings).

(In managed environments, the load balancing logic might be implemented inside of a service mesh in which case a client will not need its own balancing logic)

Some references:

Akka HTTP Implementation Ideas

One main question is where to put load balancing logic in the Akka HTTP client implementation stack:

I currently favor the second option mainly because of separation of concerns: the pool handles the connection lifecycle and slot management, the load balancing component would handle the distribution of work. Open questions:
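The favored shape (a routing component on top of one pool per backend) could be sketched roughly as below. This is a hypothetical sketch, not Akka HTTP API: `Function<Req, Res>` stands in for whatever per-host pool abstraction the client stack provides, and the router only decides *where* a request goes, round-robin here for simplicity.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: a router that sits on top of one connection pool
// per backend and distributes requests round-robin. The pools keep owning
// connection lifecycle and slot management; the router only distributes work.
final class RoundRobinRouter<Req, Res> {
    private final List<Function<Req, Res>> pools; // one "pool" per backend
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinRouter(List<Function<Req, Res>> pools) {
        this.pools = List.copyOf(pools);
    }

    Res dispatch(Req request) {
        // Pick the next pool in rotation and hand the request over.
        int idx = Math.floorMod(next.getAndIncrement(), pools.size());
        return pools.get(idx).apply(request);
    }
}
```

A smarter implementation could swap the rotation for least-loaded or weighted selection without touching the pools underneath, which is exactly the separation-of-concerns argument above.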

Marcus-Rosti commented 3 years ago

@jrudolph I've seen this issue on scale-ups: my service will still only request the original set. The client calls 5 backend servers, the service comes under more load and scales to 6, but the original 5 are the only ones that get traffic.

When I 'kill' one of the services, it rebalances across all the nodes.

Marcus-Rosti commented 3 years ago

in akka-grpc ^^^ re: @raboof

jrudolph commented 3 years ago

> @jrudolph I've seen this issue on scale ups that my service will still only request the original set. client calls 5 backend servers, the server has more load and scales to 6, but the original 5 are the only that get traffic.
>
> When I 'kill' one of the services it rebalances across all the nodes

Interesting. In some way the behavior makes sense: you don't want to query the set of backend servers for every request, but you need some kind of trigger to find out what the current set of backend servers is. You could do it on a regular schedule or wait for an event like a server going down.
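The trigger idea can be illustrated with a small sketch (hypothetical names, not Akka HTTP API): the backend set is cached and only re-resolved when something, a timer tick or a connection failure, calls `refresh()`, never per request.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: the current backend set is cached and only
// re-resolved on an explicit trigger, not on every request.
final class CachedBackendSet {
    private final Supplier<List<String>> resolver; // e.g. a DNS or k8s lookup
    private final AtomicReference<List<String>> current;

    CachedBackendSet(Supplier<List<String>> resolver) {
        this.resolver = resolver;
        this.current = new AtomicReference<>(resolver.get());
    }

    List<String> backends() {
        return current.get();
    }

    // Called on a schedule or on events such as a server going down.
    void refresh() {
        current.set(resolver.get());
    }
}
```

This also explains the symptom above: if `refresh()` only fires on failures, a scale-up from 5 to 6 backends goes unnoticed until a node dies.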

So far akka-grpc uses the grpc-java client, so that's something we can only fix here once we have an akka-http client backend for akka-grpc.

Marcus-Rosti commented 3 years ago

The way we've solved it is to pass in a name resolver https://grpc.github.io/grpc-java/javadoc/io/grpc/NameResolver.html that continually pings the Kubernetes API for the IPs of the pods that come up or down and replaces them reactively. But like you said, that backend service is NOT designed for that.

I was trying to figure out a way to use the akka-management Kubernetes module to do it, but it has the same problem: it queries only when a pod goes away. The other idea I had was calling https://github.com/akka/akka-grpc/blob/88252782b64809d3d44d7510f2e648c13aa5aa96/runtime/src/main/scala/akka/grpc/internal/AkkaDiscoveryNameResolver.scala#L36 but as far as I can tell this isn't used anywhere that I can interact with it.

Anyway, I don't know what solution works best in a library management sense but it's something I've been thinking about.
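The polling-resolver approach described above can be sketched as follows. This is a simplified, hypothetical shape in the spirit of grpc-java's `NameResolver`, not its actual API: the address source (e.g. the Kubernetes API) is abstracted as a `Supplier`, the scheduling (e.g. a `ScheduledExecutorService`) is left out, and `pollOnce()` represents one tick.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch of a polling resolver: repeatedly ask some API for
// the current pod addresses and notify a listener whenever the set changes.
final class PollingResolver {
    private final Supplier<List<String>> addressSource; // e.g. Kubernetes API
    private final Consumer<List<String>> listener;      // e.g. the LB layer
    private List<String> lastSeen = List.of();

    PollingResolver(Supplier<List<String>> addressSource,
                    Consumer<List<String>> listener) {
        this.addressSource = addressSource;
        this.listener = listener;
    }

    // One scheduled tick: fetch the current set, push it downstream
    // only if it actually changed.
    void pollOnce() {
        List<String> now = addressSource.get();
        if (!now.equals(lastSeen)) {
            lastSeen = now;
            listener.accept(now);
        }
    }
}
```

Deduplicating unchanged results keeps the downstream load-balancing layer from churning its pools on every poll.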

ignasi35 commented 3 years ago

> it could be a component on top of the pool (one pool for each backend with a component on top for routing requests)

+1 to using this approach.

Supporting client load balancing opens the door to a huge list of requirements and customizations. A pluggable layer on top of the pool(s) that lets users implement their own routing logic sounds like the best way forward.

Take, for example, the options introduced in gRPC, where clients may (1) consume load reports from the server (on a side-channel) or even (2) defer routing decisions to an external component (aka lookaside LB). I am not saying we should implement any of this, just that we should provide the infrastructure for people to support them.
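Option (1) can be illustrated with a minimal sketch (all names and the report format are made up): the client keeps the most recent load report per backend and routes each request to the least-loaded one.

```java
import java.util.Comparator;
import java.util.Map;

// Hypothetical sketch of load-report-based routing: the map holds the
// latest per-backend load figures received on a side-channel, and each
// pick goes to the backend with the lowest reported load.
final class LeastLoadPicker {
    private final Map<String, Double> reportedLoad; // backend -> last report

    LeastLoadPicker(Map<String, Double> reportedLoad) {
        this.reportedLoad = reportedLoad;
    }

    String pick() {
        return reportedLoad.entrySet().stream()
                .min((a, b) -> Double.compare(a.getValue(), b.getValue()))
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalStateException("no backends"));
    }
}
```

A pluggable layer would only need to expose the request-to-backend decision point; whether the decision is least-load, round-robin, or delegated to a lookaside LB stays an implementation detail of the plugged-in component.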

ignasi35 commented 3 years ago

(continuing..., should have been a single comment) Another consideration is whether such a component should bring circuit-breaking out of the box or not.

There are two options:

The options above are not either-or, though. In any case, if we were to add such a feature to the new client-side component, I think it should be part of a reference implementation and not of the pluggable layer.

Summing up, I think we should have a component that given a request passes the request to the appropriate pool following some externally-plugged logic (sticky session, load balancing, ...). Separately, we should provide a single implementation or some basic implementations for the initial use cases.
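The proposed split, a small pluggable contract plus a reference implementation, might look like the sketch below. Everything here is hypothetical naming: the interface only maps a request to a backend, while circuit breaking lives in the reference implementation, not in the contract.

```java
import java.util.List;
import java.util.Set;

// Hypothetical pluggable contract: given a request and the known
// backends, decide which backend (and thus which pool) gets it.
interface BackendSelector<Req> {
    String selectBackend(Req request, List<String> backends);
}

// Hypothetical reference implementation: round-robin that skips backends
// whose (very simplified) circuit breaker is currently open.
final class BreakerAwareRoundRobin<Req> implements BackendSelector<Req> {
    private final Set<String> open; // backends whose breaker is open
    private int next = 0;

    BreakerAwareRoundRobin(Set<String> openBreakers) {
        this.open = openBreakers;
    }

    @Override
    public synchronized String selectBackend(Req request, List<String> backends) {
        // Walk at most one full rotation looking for a healthy backend.
        for (int i = 0; i < backends.size(); i++) {
            String candidate = backends.get((next++) % backends.size());
            if (!open.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("all circuit breakers open");
    }
}
```

Users with custom needs (sticky sessions, lookaside LB) would implement `BackendSelector` themselves; everyone else gets the reference behavior out of the box.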