How to configure load balancing of HTTP2?

vasicvuk commented 2 years ago

Some details

I tried to find in the documentation how to configure round-robin load balancing of HTTP/2. We are using IHttpForwarder, but once the connection is established, all requests go to a single instance of the service, even though the service has multiple instances.

Since we use Kubernetes, we noticed that the gRPC client for .NET can be set up with a DNS-based round-robin load balancer backed by a Kubernetes headless service.

Can I somehow configure this using YARP? I searched the documentation but didn't find anything about it.

How many backends are in your application?

[X] 1-2
[ ] 3-5
[ ] 6-10
[ ] 10+

How do you host your application?

[X] Kubernetes
[ ] Azure App Service
[ ] Azure VMs
[ ] Other Cloud Provider (please include details below)
[ ] Other Hosting model (please include details below)

MihaZupan commented 2 years ago

So you have only one destination url (what you pass to the forwarder), but that destination itself has load-balancing?

This is essentially the issue discussed in #1555:

The problem is that your L4 load distributor probably isn't designed for use when you have a reverse proxy. It works when load comes from a large number of clients, but in this case the load is coming from one machine, and it's designed to be as efficient as possible with connection re-use so that it can get the best performance.

HttpClient will only open up new connections when it needs them. In case of HTTP/2, that's only when you have over 100 concurrent requests. You can set SocketsHttpHandler.EnableMultipleHttp2Connections = true, and you will see new connections being opened once you reach sufficient load.
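A minimal sketch of that setting, assuming the client is built by hand for use with IHttpForwarder. The variable names are illustrative, and the other handler properties are common choices for a proxy rather than anything this thread requires.

```csharp
using System.Net.Http;

// Create the handler once and reuse it for every forwarded request.
var handler = new SocketsHttpHandler
{
    // Let the handler open additional HTTP/2 connections once the
    // concurrent stream limit of the existing connection is reached.
    EnableMultipleHttp2Connections = true,

    // Optional settings that are typical for a reverse proxy (assumption).
    UseProxy = false,
    AllowAutoRedirect = false,
    UseCookies = false,
};

// HttpMessageInvoker is the client type that IHttpForwarder.SendAsync accepts.
var httpClient = new HttpMessageInvoker(handler);
```

Note that this only changes what happens under load: with few concurrent requests, a single connection is still used.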

Is this just an observation when testing, or do you need load balancing on a more granular level?

vasicvuk commented 2 years ago

Hi @MihaZupan,

I definitely want to reuse the HTTP/2 connection for multiple requests, but in a way that keeps a pool of open connections based on the DNS record and then always round-robins requests between those connections.

This is basically client-side load balancing.

The docs for the gRPC client's implementation of this are here:

https://docs.microsoft.com/en-us/aspnet/core/grpc/loadbalancing?view=aspnetcore-6.0#dnsresolverfactory

Tratcher commented 2 years ago

Your best option right now is to do the DNS lookup before IHttpForwarder and put the IP directly in the destination url. That way you can manually round robin between IPs.
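One way this could look is sketched below. `RoundRobinDestinations` is a hypothetical helper, not a YARP type, and the sketch assumes plain HTTP and a fixed port. How often to call `RefreshAsync` is left open, since the DNS APIs do not expose the TTL.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: resolves a host name and hands out its IPs in round-robin order.
public sealed class RoundRobinDestinations
{
    private readonly string _host;
    private readonly int _port;
    private volatile IPAddress[] _addresses = Array.Empty<IPAddress>();
    private int _next = -1;

    public RoundRobinDestinations(string host, int port)
    {
        _host = host;
        _port = port;
    }

    // Re-resolve the host name, e.g. on a timer or after a failed request.
    public async Task RefreshAsync() => Update(await Dns.GetHostAddressesAsync(_host));

    // Replace the current address list (also useful for supplying a fixed list).
    public void Update(IPAddress[] addresses) => _addresses = addresses;

    // Returns the next destination prefix, e.g. "http://10.0.0.1:8080/".
    public string NextDestinationPrefix()
    {
        var addresses = _addresses; // snapshot, so a concurrent Update cannot change the length
        if (addresses.Length == 0)
            throw new InvalidOperationException("No addresses have been resolved yet.");

        // The uint cast keeps the index valid even after the counter overflows.
        var index = (uint)Interlocked.Increment(ref _next) % (uint)addresses.Length;
        var ip = addresses[index];

        // IPv6 literals must be wrapped in brackets inside a URL.
        var host = ip.AddressFamily == AddressFamily.InterNetworkV6 ? $"[{ip}]" : ip.ToString();
        return $"http://{host}:{_port}/";
    }
}
```

The returned string would be passed as the destinationPrefix argument of IHttpForwarder.SendAsync in place of the DNS-based URL.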

vasicvuk commented 2 years ago

@Tratcher I expected this to already be implemented behind IHttpForwarder as an option. In the gRPC .NET library this is not just a few lines of code; it is complex logic with a lot of knowledge behind it. I also think YARP users would benefit from having this feature.

MihaZupan commented 2 years ago

There is no built-in DNS round-robin in .NET after .NET Framework. This is the issue discussing adding support for it: https://github.com/dotnet/runtime/issues/68967

Even with that, doing per-request round-robin over a list of HTTP/2 connections is not something HttpClient supports. If there is sufficient load to warrant having multiple H2 connections open, it will still use connections to their allowed maximum. We haven't seen this as an issue since given a sufficient amount of load, different connections will eventually end up being used at approximately the same rate.

samsp-msft commented 2 years ago

I'm not sure I'd want to build this directly into SocketsHttpHandler, but it could be created as a wrapper layer around it/HttpClient:

1. When a URL is requested, see if there is already a DNS cache entry for that hostname. If not, or if the cache entry has timed out, do a DNS lookup and cache the results by hostname.
2. Pick an IP from the cache based on an algorithm: random or, if being more sophisticated, the least loaded of 2 random results.
3. Modify the URL to be IP-based, but use a Host header to indicate the correct DNS name.
4. Make the call via SocketsHttpHandler and let it manage the connection cache.
5. If the connection fails, force a new DNS lookup and use the results of that for subsequent requests.
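The steps above can be sketched as a DelegatingHandler. This is only an illustration under stated assumptions: `DnsBalancingHandler` and `RewriteToIp` are hypothetical names, the pick is purely random, the cache lifetime is supplied by the caller, and plain HTTP is assumed (HTTPS would need extra care around certificate validation when the URL contains an IP).

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper around SocketsHttpHandler; not part of YARP or the BCL.
public sealed class DnsBalancingHandler : DelegatingHandler
{
    private sealed record CacheEntry(IPAddress[] Addresses, DateTime ExpiresUtc);

    private readonly ConcurrentDictionary<string, CacheEntry> _cache = new();
    private readonly TimeSpan _cacheLifetime;

    public DnsBalancingHandler(HttpMessageHandler inner, TimeSpan cacheLifetime) : base(inner)
        => _cacheLifetime = cacheLifetime;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var original = request.RequestUri!;
        var host = original.Host;

        // Use the cached addresses unless they are missing or have expired.
        if (!_cache.TryGetValue(host, out var entry) || entry.ExpiresUtc <= DateTime.UtcNow)
        {
            var addresses = await Dns.GetHostAddressesAsync(host, cancellationToken);
            if (addresses.Length == 0)
                throw new HttpRequestException($"No addresses found for '{host}'.");
            entry = new CacheEntry(addresses, DateTime.UtcNow + _cacheLifetime);
            _cache[host] = entry;
        }

        // Pick one address at random.
        var ip = entry.Addresses[Random.Shared.Next(entry.Addresses.Length)];

        // Point the URL at the IP, but keep the DNS name in the Host header.
        request.RequestUri = RewriteToIp(original, ip);
        request.Headers.Host = original.IsDefaultPort ? host : $"{host}:{original.Port}";

        try
        {
            // The inner handler keeps managing its own connection pool.
            return await base.SendAsync(request, cancellationToken);
        }
        catch (HttpRequestException)
        {
            // Drop the cached entry so the next request triggers a fresh lookup.
            _cache.TryRemove(host, out _);
            throw;
        }
    }

    // Replaces only the host part of the URI; scheme, port, path and query are kept.
    public static Uri RewriteToIp(Uri original, IPAddress ip)
        => new UriBuilder(original) { Host = ip.ToString() }.Uri;
}
```

It could wrap a shared handler, for example `new HttpMessageInvoker(new DnsBalancingHandler(new SocketsHttpHandler(), TimeSpan.FromSeconds(30)))`, where the 30-second lifetime is an arbitrary choice.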

MihaZupan commented 2 years ago

DNS round-robin is definitely something you can do. I have an implementation here that effectively does what you describe and plugs into ConnectCallback: https://github.com/MihaZupan/DnsRoundRobin. The only thing you mentioned that it doesn't do is picking IPs based on load; it's always just round-robin.

What I think does belong inside SocketsHttpHandler is how available HTTP/2 connections are selected for each request (at least until we let the user plug in their own connection pooling).

vasicvuk commented 2 years ago

Hi @MihaZupan, Thanks for sharing your implementation. I will give it a try. I hope we will have a built-in mechanism in the near future.

vasicvuk commented 2 years ago

@MihaZupan I tried your NuGet package, but I am getting this log:

Connected to 192.168.87.81:8080

It appears only on the first call; after that, only a single IP address from the DNS results keeps being used.

UPDATE

I understood from your example that I need to create a SocketsHttpHandler for each request? Does this mean that a new port is opened for the connection every time? I think this will lead to too many ports being opened, since the OS needs time to release a port that was used.

MihaZupan commented 2 years ago

Enabling DnsRoundRobin means that if we end up opening multiple connections, you would see that those go to different IPs[^1].

The second part of the problem is whether we will open multiple connections or not. By default, SocketsHttpHandler will only open one HTTP/2 connection per host (see https://github.com/microsoft/reverse-proxy/issues/1726#issuecomment-1129895775). If you set handler.EnableMultipleHttp2Connections = true, you will see multiple connections being opened if the number of requests is too much for the single connection to handle (over 100 concurrent requests by default).

So you would see the desired behavior if:

1. Use DnsRoundRobin
2. Set EnableMultipleHttp2Connections = true
3. Generate enough load to warrant opening multiple connections

For nr. 3, there is currently no way to force the handler to be more aggressive in opening connections if it doesn't think it needs them. As a temporary workaround, you can create multiple handlers yourself and cycle between them to get the effect of multiple connections being opened.
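A rough sketch of that workaround, assuming a fixed number of handlers chosen up front. `RotatingClients` is a hypothetical helper. Each handler has its own connection pool, so each one opens its own HTTP/2 connection; which IP each connection lands on still depends on how connections are established (for example, a round-robin ConnectCallback shared by all the handlers).

```csharp
using System;
using System.Net.Http;
using System.Threading;

// Hypothetical helper: owns N independent handlers and hands out their clients in turn.
public sealed class RotatingClients : IDisposable
{
    private readonly HttpMessageInvoker[] _clients;
    private int _next = -1;

    public RotatingClients(int count, Func<SocketsHttpHandler> createHandler)
    {
        if (count <= 0) throw new ArgumentOutOfRangeException(nameof(count));

        _clients = new HttpMessageInvoker[count];
        for (var i = 0; i < count; i++)
        {
            // One handler per client, so each client gets a separate connection pool.
            _clients[i] = new HttpMessageInvoker(createHandler());
        }
    }

    // Returns the clients in round-robin order; safe to call concurrently.
    public HttpMessageInvoker Next()
    {
        var index = (uint)Interlocked.Increment(ref _next) % (uint)_clients.Length;
        return _clients[index];
    }

    public void Dispose()
    {
        foreach (var client in _clients) client.Dispose();
    }
}
```

The set of clients is created once at startup and reused; for each proxied request, the result of `Next()` is what gets passed to the forwarder.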

> I understood from your example that I need to create a SocketsHttpHandler for each request?

No, the handler should be created once and reused.

[^1]: If connection attempts don't happen often enough, the DnsRoundRobin implementation will clean up the cached state and the next connection will start with the first IP again. By default that's 1/min.

vasicvuk commented 2 years ago

@MihaZupan I misunderstood it then. If I have to generate "enough load", then I don't need a DNS load balancer.

davidfowl commented 2 years ago

We implemented gRPC load balancing; @JamesNK might be able to share some experience here.

karelz commented 2 years ago

Triage: It is worth revisiting the BCL feature for built-in load balancing in HttpClient. This is the 3rd customer scenario asking for it: gRPC, YARP, and YARP via IHttpForwarder. Let's keep this in the YARP backlog and resurrect the idea in the Runtime repo; @samsp-msft will file a new issue.

JamesNK commented 2 years ago

gRPC load balancing resolves an address (e.g. an-example-dns-host) to a list of IPs (e.g. 80.80.80.81, 80.80.80.82, etc). gRPC calls are then load balanced across those IP addresses based on configuration, e.g. use the first healthy address, or round-robin across all healthy addresses. Because each request is sent directly to an IP address, HttpClient has a separate connection for each address.
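For reference, the gRPC client configuration described here looks roughly like the following, based on the gRPC client-side load balancing docs linked earlier in the thread. It requires the Grpc.Net.Client package; the host name is the placeholder from the paragraph above, and insecure credentials are an assumption for a plain-HTTP cluster.

```csharp
using Grpc.Core;
using Grpc.Net.Client;
using Grpc.Net.Client.Configuration;

// "dns:///" selects the DNS resolver; RoundRobinConfig spreads calls
// across all addresses that the resolver returns.
var channel = GrpcChannel.ForAddress(
    "dns:///an-example-dns-host",
    new GrpcChannelOptions
    {
        Credentials = ChannelCredentials.Insecure,
        ServiceConfig = new ServiceConfig
        {
            LoadBalancingConfigs = { new RoundRobinConfig() }
        }
    });
```

This applies to gRPC clients only; it does not change how YARP or IHttpForwarder select connections.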

Relying on EnableMultipleHttp2Connections to create multiple connections, and then have those connections be load balanced (L4 load balancing) isn't a good solution. All requests will go to one endpoint, leaving others idle, unless you have over 100 requests in flight at once.

jernejg commented 2 years ago

Hey @vasicvuk, I think the problem you are having is caused by k8s, as described in the article "gRPC Load Balancing on Kubernetes without Tears". We solved it by forwarding every outgoing gRPC call through an Envoy sidecar (the article suggests Linkerd, though).

If I understand correctly, if we want to solve this problem using YARP (HttpClient) we would need to integrate with the k8s API?

samsp-msft commented 2 years ago

> If I understand correctly, if we want to solve this problem using YARP (HttpClient) we would need to integrate with the k8s API?

To make this work efficiently, YARP needs one configured destination for each instance that exists within the k8s cluster. YARP will load balance across the destinations it is given. They can be specified by IP address, not just domain name.

The obvious way to do this is with the k8s API; this is being worked on as part of #1254. A hackier solution would be to query DNS for the cluster's hosts, which would enumerate them, but this can run into issues with DNS caching because the TTL isn't exposed via the DNS APIs. You can probably rely on the passive health checks to remove dead entries fairly quickly, and use polling to determine when new entries need to be added to the configuration.
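The polling part could be sketched as follows. `DnsDestinationPoller` and its members are hypothetical, the polling interval is a guess precisely because the TTL is unavailable, and applying the changes to the YARP configuration is deliberately left to the `onChange` callback.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: periodically re-resolves a host name and reports changes.
public static class DnsDestinationPoller
{
    // Returns the addresses that appeared and the ones that disappeared.
    public static (IPAddress[] Added, IPAddress[] Removed) Compare(
        IEnumerable<IPAddress> known, IEnumerable<IPAddress> resolved)
    {
        var knownSet = new HashSet<IPAddress>(known);
        var resolvedSet = new HashSet<IPAddress>(resolved);
        return (resolvedSet.Except(knownSet).ToArray(), knownSet.Except(resolvedSet).ToArray());
    }

    // Polls until cancelled; onChange is where the proxy configuration would be updated.
    public static async Task PollAsync(
        string host, TimeSpan interval,
        Action<IPAddress[], IPAddress[]> onChange, CancellationToken cancellationToken)
    {
        var known = Array.Empty<IPAddress>();
        using var timer = new PeriodicTimer(interval);
        do
        {
            var resolved = await Dns.GetHostAddressesAsync(host, cancellationToken);
            var (added, removed) = Compare(known, resolved);
            if (added.Length > 0 || removed.Length > 0)
            {
                onChange(added, removed);
            }
            known = resolved;
        }
        while (await timer.WaitForNextTickAsync(cancellationToken));
    }
}
```

Removed addresses may already have been taken out of rotation by passive health checks, so the callback mostly matters for adding new entries.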

microsoft / reverse-proxy

How to configure load balancing of HTTP2? #1726