When using Stubby as a system DNS-over-TLS resolver on an Internet
connection that disconnects and reconnects from time to time, there is
often a long wait (~20 minutes) after the connection comes back before
DNS queries start to work again.
This is because all the upstream TLS TCP connections in Stubby are stuck
waiting for a response from the upstream server. That response will never
arrive, since the host's external IP address might have changed and / or
the NAT router's connection tracking entries for these TCP connections
might have been removed when the Internet connection reconnected.
By default Linux tries to retransmit data on a TCP connection 15 times
(the tcp_retries2 sysctl) before finally terminating it.
This takes 16 - 20 minutes, which is a very long time to wait for system
DNS resolution to work again.
This is a real problem on weak mobile connections.
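
To see where a figure of this size comes from, here is a rough sketch of
the retransmission back-off arithmetic. It assumes the usual Linux
defaults (15 retries, a 200 ms minimum and 120 s maximum retransmission
timeout) and a timeout that simply doubles after every attempt; the
function name is made up for illustration and the result is only a
hypothetical lower bound, not a measurement.

```c
/* Hypothetical helper: total time until the kernel gives up on a
 * connection, assuming the retransmission timeout starts at rto_start
 * seconds and doubles after every attempt up to rto_max seconds. */
double tcp_give_up_after_sec(int retries, double rto_start, double rto_max)
{
    double total = 0.0;
    double rto = rto_start;

    /* The connection is dropped when the timer that follows the last
     * retransmission expires, so retries + 1 intervals are summed. */
    for (int i = 0; i <= retries; i++) {
        total += rto;
        rto = (rto * 2 > rto_max) ? rto_max : rto * 2;
    }
    return total;
}
```

With these assumptions tcp_give_up_after_sec(15, 0.2, 120.0) gives about
924.6 seconds (roughly 15 minutes), and starting from a 1 second timeout
instead gives 1207 seconds (roughly 20 minutes). Real connections start
from a timeout derived from the measured round-trip time, so the actual
wait varies.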
Thankfully, there is a "TCP_USER_TIMEOUT" per-socket option that allows
explicitly setting how long the network stack will wait in such cases.
Let's add a matching "tcp_send_timeout" option to getdns that allows
setting this option on outgoing TCP sockets.
For backward compatibility the code won't try to set it by default.
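
As an illustration only (this is not the getdns code), setting the socket
option could look roughly like the sketch below. The helper name and the
convention that 0 means "not configured" are assumptions made for this
example; TCP_USER_TIMEOUT itself takes an unsigned int in milliseconds.

```c
#include <errno.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_USER_TIMEOUT (Linux) */
#include <sys/socket.h>

/* Hypothetical helper: apply a send timeout to an outgoing TCP socket.
 * timeout_ms == 0 is treated as "not configured" and leaves the kernel
 * default untouched. Returns 0 on success, -1 on failure (errno set). */
int set_tcp_send_timeout(int fd, unsigned int timeout_ms)
{
    if (timeout_ms == 0)
        return 0;
#ifdef TCP_USER_TIMEOUT
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
#else
    /* Platform does not provide the option. */
    (void)fd;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

A call such as set_tcp_send_timeout(fd, 15000) right after socket() would
make the kernel abort the connection once transmitted data has stayed
unacknowledged for 15 seconds.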
With this option set to, for example, 15 seconds, Stubby recovers pretty
much instantly in such cases.