NLnetLabs / unbound

Unbound is a validating, recursive, and caching DNS resolver.
https://nlnetlabs.nl/unbound
BSD 3-Clause "New" or "Revised" License

unbound is not friendly enough to support certain non-standard zones. #1074

Open ShipeiXu opened 5 months ago

ShipeiXu commented 5 months ago

Describe the bug
When an authoritative server does not answer queries for a specific domain name, Unbound penalizes that server very severely, and the process of marking its nameservers as timed out is very fast because the backoff is exponential.

Example with taobao.com: when I send Unbound an HTTPS-type query for taobao.com, it tries the zone's four nameservers (ns4.taobao.com., ns5.taobao.com., ns6.taobao.com., ns7.taobao.com.) and all of the requests time out, which doubles the rto. When I query again, the rto doubles again. Soon all of taobao.com's nameservers are marked as timed out in the infra cache, and normal queries under the taobao.com domain then quickly get SERVFAIL as well. This is a loophole: we cannot control client-side requests, but client requests must not be able to disrupt Unbound's normal service.

2401:b180:4100::5 taobao.com. ttl 3299 ping 0 var 94 rtt 376 rto 376 tA 0 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
140.205.122.36 taobao.com. ttl 2184 ping 3 var 78 rtt 315 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 1 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
106.11.41.149 taobao.com. ttl 3290 ping 0 var 94 rtt 376 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
140.205.122.34 taobao.com. ttl 3289 ping 0 var 94 rtt 376 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
106.11.41.150 taobao.com. ttl 2095 ping 0 var 72 rtt 288 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 1 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
2401:b180:4100::6 taobao.com. ttl 3293 ping 0 var 94 rtt 376 rto 376 tA 0 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
106.11.35.26 taobao.com. ttl 3291 ping 0 var 94 rtt 376 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 1 lame dnssec 0 rec 0 A 0 other 0
140.205.122.33 taobao.com. ttl 2095 ping 5 var 64 rtt 261 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 1 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
2401:b180:4100::4 taobao.com. ttl 3295 ping 0 var 94 rtt 376 rto 376 tA 0 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
2401:b180:4100::7 taobao.com. ttl 3293 ping 0 var 94 rtt 376 rto 376 tA 0 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
106.11.35.25 taobao.com. ttl 3292 ping 0 var 94 rtt 376 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
47.88.74.35 taobao.com. ttl 3293 ping 0 var 94 rtt 376 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
47.241.207.13 taobao.com. ttl 3290 ping 0 var 94 rtt 376 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
47.88.74.33 taobao.com. ttl 2095 ping 21 var 113 rtt 473 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 1 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
47.241.207.15 taobao.com. ttl 3291 ping 0 var 94 rtt 376 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 0 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0
140.205.122.35 taobao.com. ttl 2094 ping 8 var 54 rtt 250 rto 120000 tA 3 tAAAA 0 tother 0 ednsknown 1 edns 0 delay 0 lame dnssec 0 rec 0 A 0 other 0

To reproduce
Steps to reproduce the behavior:

  1. dig @127.0.0.1 -p 53 https://a.taobao.com
  2. dig @127.0.0.1 -p 53 https://b.taobao.com
  3. dig @127.0.0.1 -p 53 https://c.taobao.com
  4. ... and so on, until the rto reaches the upper limit

Expected behavior
Like BIND: the rto grows, but does not reach the upper limit quickly. When the infra record's TTL expires, the rto is reset to a smaller value.

System:

Configure line: --with-libnghttp2 --prefix=/usr/unbound/ --enable-subnet --with-pthreads --with-libevent --enable-dnstap --enable-cachedb --enable-ipsecmod --enable-ipset --enable-linux-ip-local-port-range --enable-dnscrypt --enable-systemd --with-pythonmodule
Linked libs: libevent 2.0.21-stable (it uses epoll), OpenSSL 1.0.2k-fips 26 Jan 2017
Linked modules: dns64 python cachedb ipsecmod subnetcache ipset respip validator iterator
DNSCrypt feature available

BSD licensed, see LICENSE in source package for details. Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues

**Additional information**

The problem is probably in the rtt_lost function:

void rtt_lost(struct rtt_info* rtt, int orig)
{
	/* exponential backoff */

	/* if a query succeeded and put down the rto meanwhile, ignore this */
	if(rtt->rto < orig)
		return;

	/* the original rto is doubled, not the current one to make sure
	 * that the values in the cache are not increased by lots of
	 * queries simultaneously as they time out at the same time */
	orig *= 2;
	if(rtt->rto <= orig) {
		rtt->rto = orig;
		if(rtt->rto > RTT_MAX_TIMEOUT)
			rtt->rto = RTT_MAX_TIMEOUT;
	}
}

ShipeiXu commented 5 months ago

There are several IPv6 addresses. Since my machine has no public IPv6 connectivity, the UDP connect fails. In that case the rto of the IPv6 addresses should also grow, but in fact it does not.

ShipeiXu commented 5 months ago
(screenshot attachment)
gthess commented 5 months ago

As I understand it, the servers are intentionally dropping packets, which triggers the timeout logic in Unbound. Unbound then applies the exponential backoff timer, waiting longer and longer between packets, until the servers reach the configured upper limit. At that point Unbound considers the servers offline and does not waste traffic on them.

infra-keep-probing: yes allows Unbound to send probes to a down server in case it is up and useful again. The probes are guaranteed to happen at least each infra-host-ttl (900 seconds by default). (The probe can also happen earlier, but if the server is still down the next probe will likely be at infra-host-ttl.)
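For reference, a minimal unbound.conf sketch of the options mentioned above; the values are illustrative, not recommendations:

```
server:
    # Keep probing servers that have been marked down, instead of
    # only reconsidering them when the infra record expires.
    infra-keep-probing: yes
    # Lower the infra record lifetime (default 900 seconds) so a
    # down server is reconsidered sooner.
    infra-host-ttl: 60
```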

Now with that out of the way: upstream nameservers that fail to communicate and thereby introduce resolver timeouts are not Unbound's problem. There is also a relevant RFC about this: https://datatracker.ietf.org/doc/rfc8906/.

The exponential backoff logic kicks in when a request times out; if your system does not support IPv6, the query is likely not getting out at all. In that case do-ip6: no (or prefer-ip4: yes if you want to serve IPv6 clients) helps with upstream server selection.
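Assuming a host without working IPv6, the suggested options would look like this in unbound.conf (pick one of the two):

```
server:
    # Do not use IPv6 to contact upstream servers at all.
    do-ip6: no

    # Or, to keep answering IPv6 clients while preferring IPv4 upstreams:
    # prefer-ip4: yes
```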

I would like to close this as a non-issue, but I'll leave it open in case I misunderstood something from your text :)

ShipeiXu commented 5 months ago

You're right, I understand the logic. But I still expect better behavior. I tested BIND and knot-resolver; in the same scenario they resist this kind of attack better. As a DNS resolver we cannot control the requests we receive. When many clients send queries for https://*.taobao.com, Unbound marks all of taobao.com's nameservers as timed out, and this happens very quickly. Is exponential backoff really best practice? I have doubts about this implementation. BIND seems to use a logarithmic-style backoff, so even if the authoritative servers drop many packets, SERVFAIL does not happen quickly.

ShipeiXu commented 5 months ago

The infra-keep-probing: yes configuration does not do what I expected. I set infra-cache-max-rtt to 2000 to reproduce the problem faster. When all nameservers have timed out and I query a normal domain name again, Unbound always answers SERVFAIL. I don't understand what role infra-keep-probing plays; my understanding was that after each incoming request, Unbound should send an additional probe.

gthess commented 5 months ago

infra-keep-probing: yes will not probe on every request. You can try lowering infra-host-ttl to see it probing once the infra record has expired.

If servers time out, Unbound first backs off aggressively to give them time to come back up or to deal with a potential traffic spike. If they still time out after the configured maximum (12 seconds by default), they are considered down and may be probed in the future. This is fine for a server under pressure: it stays out of Unbound's selection and may be re-probed depending on configuration.

This case, however, concerns misbehaving upstream nameservers. Fixing it just for them would bring unneeded server-selection shenanigans and retries to Unbound for all the other nameservers that are simply down.

I understand the "attack" you are talking about, but I don't see how this is Unbound's problem.

catap commented 5 months ago

@ShipeiXu see https://github.com/NLnetLabs/unbound/issues/908, where do-ip6: no is used as a kind of workaround, and https://github.com/NLnetLabs/unbound/issues/362, where we have a very long discussion about Unbound's logic.

ShipeiXu commented 5 months ago

#362

ShipeiXu commented 5 months ago

The exponential backoff of the infra rto is not a good algorithm. I insist this should be a logarithmic-style backoff: one that approaches the maximum rto asymptotically but never reaches it.

gthess commented 5 months ago

With exponential backoff, Unbound tries to be generous to poorly connected nameservers by doubling the timeout while waiting for an answer. It also makes Unbound quick to drop non-responsive nameservers from server selection, since the configured top timeout is reached faster. Non-responsive nameservers are either under load and unable to keep up, or simply broken. For the former it is good that Unbound stops contacting them and prefers other nameservers; for the latter it is good because they are broken and waste Unbound's time.

Also, a server's timeout needs to reach the top configured timeout (after several timeouts), since that is the criterion for a non-responsive nameserver; only then is the server removed from selection, and Unbound can spend its time on responsive nameservers.

There are two distinct cases in your issue though: a) a server explicitly drops packets based on qname, qtype or similar instead of replying with an appropriate answer like REFUSED for example, and b) all the nameservers for a delegation are considered down.

For a) there is nothing for Unbound to do. There is an RFC that clearly states this is wrong behavior; the fault lies with the upstream.

For b) maybe Unbound needs to behave differently and allow more attempts (rather than the current none) to such a delegation, but this needs some thought because it can have unexpected results in certain scenarios.

We have plans to augment server selection for configured forward/stub zones in the future and we can also revisit server selection for common nameservers.