dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.26k stars 4.73k forks source link

Improve handling of DNS changes by revalidating pooled SocketsHttpHandler connections periodically #60390

Open antonfirsov opened 3 years ago

antonfirsov commented 3 years ago

Copying @geoffkizer 's idea from https://github.com/dotnet/dotnet-api-docs/issues/7285#issuecomment-942695230 into a new issue so we can discuss and track it:

The problem occurs when you switch your DNS mappings and remove an existing IP entry (or entries), but the server at that IP address continues to handle requests on the previously-established connections. In that case, SocketsHttpHandler will just keep using those previously-established connections, unaware that they are no longer valid according to the latest DNS values.

PooledConnectionLifetime was introduced to solve this problem. It does so by simply discarding connections once they hit a certain age. Eventually the old connections will be discarded and new ones will be created, and the new connections will respect the new DNS values when they are created.

This isn't ideal for a couple reasons: (1) Connections get discarded regardless of whether they are still valid per current DNS or not. (2) This introduces a trade-off between performance and correctness. If you want SocketsHttpHandler to respond quickly to DNS changes, then you need to set this to a relatively small value. But the smaller you set it, the more often you'll need to create new connections, which will impact performance.

On top of that, we clearly don't do a good job of explaining this.

I wonder if we could handle this better in SocketsHttpHandler. One idea is to add a new setting like PooledConnectionDNSRevalidationPeriod. This would be similar to PooledConnectionLifetime, except that when the specified time period expires, we would "revalidate" the connection by checking whether the IP address used to establish it is still valid in DNS or not. If it is, we can keep using it. If not, just discard it, like we do with PooledConnectionLifetime today.

The nice thing about this approach is that, assuming the revalidation logic is relatively cheap, we could enable this by default with a relatively low time period, such that it would hopefully just work for most users. If you don't care about DNS changes, then there's a small added cost to the revalidation, but hopefully it's so small you wouldn't notice (and you can always disable it completely if you want). If you do care about DNS changes, then the default behavior hopefully works reasonably well for you, and you can tune it further if necessary.

ghost commented 3 years ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details
Putting @geoffkizer 's idea from https://github.com/dotnet/dotnet-api-docs/issues/7285#issuecomment-942695230 into a new issue so we can discuss and track it: > The problem occurs when you switch your DNS mappings and remove an existing IP entry (or entries), but the server at that IP address continues to handle requests on the previously-established connections. In that case, SocketsHttpHandler will just keep using those previously-established connections, unaware that they are no longer valid according to the latest DNS values. > > `PooledConnectionLifetime` was introduced to solve this problem. It does so by simply discarding connections once they hit a certain age. Eventually the old connections will be discarded and new ones will be created, and the new connections will respect the new DNS values when they are created. > > This isn't ideal for a couple reasons: > (1) Connections get discarded regardless of whether they are still valid per current DNS or not. > (2) This introduces a trade-off between performance and correctness. If you want SocketsHttpHandler to respond quickly to DNS changes, then you need to set this to a relatively small value. But the smaller you set it, the more often you'll need to create new connections, which will impact performance. > > On top of that, we clearly don't do a good job of explaining this. > > I wonder if we could handle this better in SocketsHttpHandler. One idea is to add a new setting like `PooledConnectionDNSRevalidationPeriod`. This would be similar to `PooledConnectionLifetime`, except that when the specified time period expires, we would "revalidate" the connection by checking whether the IP address used to establish it is still valid in DNS or not. If it is, we can keep using it. If not, just discard it, like we do with `PooledConnectionLifetime` today. > > The nice thing about this approach is that, assuming the revalidation logic is relatively cheap, we could enable this by default with a relatively low time period, such that it would hopefully just work for most users. If you don't care about DNS changes, then there's a small added cost to the revalidation, but hopefully it's so small you wouldn't notice (and you can always disable it completely if you want). If you do care about DNS changes, then the default behavior hopefully works reasonably well for you, and you can tune it further if necessary.
Author: antonfirsov
Assignees: -
Labels: `area-System.Net.Http`, `untriaged`
Milestone: -
karelz commented 3 years ago

Triage: If we have DNS client (with TTL) - #19443, we won't need this.

edit by Cory: the full strategy we agreed to is in https://github.com/dotnet/runtime/issues/60390#issuecomment-952361723.

geoffkizer commented 3 years ago

Even if we could get TTL info from DNS (and I'm skeptical we will be able to do this any time soon), the ability to "revalidate" a connection is still useful.

With TTL info, we would know when a connection may have become invalid because the DNS info has changed. But just because the TTL has expired doesn't necessarily mean the DNS info has actually changed. Most of the time, it's the same, with an extended TTL.

If we always just drop the connection when the DNS TTL expires, then we still suffer from this problem above:

Connections get discarded regardless of whether they are still valid per current DNS or not.

So I think the ideal solution would be having TTL info + revalidation -- that is, when the TTL expires, we revalidate the DNS info exactly as described above.

In the absence of TTL info, it seems like having the user provide something like PooledConnectionDNSRevalidationPeriod is a pretty good fallback. Roughly speaking, the user is manually giving us the TTL info instead of automatically retrieving the TTL from DNS.

scalablecory commented 3 years ago

So I think the ideal solution would be having TTL info + revalidation -- that is, when the TTL expires, we revalidate the DNS info exactly as described above.

In the absence of TTL info, it seems like having the user provide something like PooledConnectionDNSRevalidationPeriod is a pretty good fallback. Roughly speaking, the user is manually giving us the TTL info instead of automatically retrieving the TTL from DNS.

The triage comment does not reflect this, but this is the agreement we came to during triage.

Gladskih commented 4 months ago

DNS is not TCP. DNS TTL is not a reason to close a connection. If a server is not able to handle connection anymore it should close it. If it does not - it a bug of the server, not a client. "PooledConnectionLifetime workaround" is just a step in a wrong direction.