square / okhttp

Square’s meticulous HTTP client for the JVM, Android, and GraalVM.
https://square.github.io/okhttp/
Apache License 2.0
45.74k stars 9.15k forks source link

Document ConnectionPool behavior with different IP sets #7675

Open costela opened 1 year ago

costela commented 1 year ago

I was a bit reluctant between a feature request and a bug, but since it's technically docummented, let's go with feature request for now.

We've run into some corner-case behavior which might warrant either some documentation improvement or maybe even some new feature. Context:

The documentation for ConnectionPool mentions this:

HTTP requests that share the same Address may share a Connection.

This means if we have two services whose DNS records point to some overlapping set of IPs (e.g. because they share a load-balancer), we might reuse a connection first used for service A when accessing service B later. If we're using HTTP2 this can even happen over with TLS (HTTP1 would force a different connection to allow for SNI; HTTP2 can multiplex).

This seems like an unlikely scenario at first, but we ran into it while rolling out IPv6 support for some of our services:

Looking back it's clear we were reusing an existing v4 connection to some previous service to talk to a dualstacked service, which "shadowed" v6 support in almost all cases (except when we happened to run into connection-pool limits between calls).

I'm reluctanct to call this bug because in most cases an existing connection is what you want. However, I still see two possible improvements:

  1. expand the docs to include a mention to this edge case. Something like:

    "HTTP requests that share the same Address may share a Connection. Conversely, if one request resolves to address A, a subsequent request resolving to addresses A and B may never see any connections to address B."

  2. optionally extend the ip-to-connection mapping to consider all resolved IPs (becoming maybe "hash-of-ips"-to-connection?). This would incur some minor overhead at the advantage of ensuring connection reuse is tried only when it's sane.

wdyt?

yschimke commented 1 year ago

I'm not clear the connection coalescing is wrong, or should be improved. And while it could be configurable, we avoid this and try to implement something correct and inline with user agents like Firefox and Chrome. It's deliberate that if there is some overlap and and existing connection in that overlap, we will use it. From memory we are less aggressive that Firefox or Chrome (but I forget which).

So I think updating the documentation and perhaps adding a sample to highlight this is the way to go. The other aspect of this is observability, how using the EventListener or the new ConnectionListener to make it obvious what is going on.

costela commented 1 year ago

Yes, I wouldn't say it's wrong either. Just slightly misleading if the setup is non-orthodox.

In hindsight the problem we ran into is pretty obvious, but unfortunately if you start debugging from the other end (it's always DNS, right? :grin: ), it takes a while to get down to the client's connection pooling.

yschimke commented 1 year ago

Do you want to go back on forth on a sample that would explain the behaviour using ConnectionListener and some hosts known to work with coalescing?

I also wonder whether Dns and Certificates should be exposed in the https://github.com/square/okhttp/blob/5402ebaaa67ae05975e7d02490ef4003fdf3a4da/okhttp/src/jvmMain/kotlin/okhttp3/ConnectionListener.kt for this reason.

yschimke commented 1 year ago

BTW not sure these are still the implementations but from https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/

The Firefox way

Host A has two addresses, host B has two addresses. The lists of addresses are not the same, but there is an overlap – both lists contain 192.168.0.2. And the host A has already stated that it is authoritative for B as well. In this situation, Firefox will not make a second connect to host B. It will reuse the connection to host A and ask for host B’s content over that single shared connection. This is the most aggressive coalescing method in use.

The Chrome way

Chrome features a slightly less aggressive coalescing. In the example above, when the browser has connected to 192.168.0.1 for the first host name, Chrome will require that the IPs for host B contains that specific IP for it to reuse that connection. If the returned IPs for host B really are 192.168.0.2 and 192.168.0.3, it clearly doesn’t contain 192.168.0.1 and so Chrome will create a new connection to host B.

Chrome will reuse the connection to host A if resolving host B returns a list that contains the specific IP of the connection host A is already using.

swankjesse commented 1 year ago

Yep, I can believe this was very surprising! In general we’re also nondeterministic for dualstacked hosts. With OkHttp 5 and happy eyeballs OkHttp will attempt IPv6 first, and IPv4 concurrently after a brief delay.

I think we could write a bunch more documentation here. What did you read while you were looking into this? We should put the docs where people are already looking.