Open phonedph1 opened 11 months ago
FWIW recursor does packet cache these.
for x in $(seq 1 1337); do dig @127.0.0.1 -p 5555 txt apple.com +nocookie +ignore; done
packetcache-hits 1336
packetcache-misses 1
We do see truncation happen when the backend recursive resolver sends a truncate. But as soon as the packet cache has filled, the back-end resolvers are not consulted and all truncation duties are handled by dnsdist.
We see attackers advertising very large buffer sizes, which means that spoofed UDP queries with large answers (say, TXT apple.com) end up being answered over UDP even though the response is larger than our preferred 1232 bytes. I will say I'm still confused as to why TXT apple.com does not trigger a truncate with these large buffer sizes, but TXT atlassian.com does. I will note that atlassian.com seems to always be handed back to the back-end systems (which truncate) while apple.com gets cached by the dnsdist packet cache, and I'm missing the reason for the different behaviors.
Another possible dimension to this would be to have dnsdist have a maximum configurable UDP size cap, and to send back tc=1 if that cap is exceeded by an answer that is known in the packet cache. That seems overly simplistic, but maybe it isn't.
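To make the proposed cap concrete, here is a minimal Python sketch of the idea, not dnsdist code. The helper names (`encode_name`, `question_end`, `cap_udp_response`) and the `max_udp_size` parameter are hypothetical, and the sketch assumes a single uncompressed question. It also drops the OPT record, which a real implementation would probably want to keep.

```python
import struct

TC_BIT = 0x0200  # truncation flag inside the 16-bit DNS header flags word


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with the root label."""
    labels = name.rstrip(".").split(".")
    return b"".join(bytes([len(l)]) + l.encode() for l in labels) + b"\x00"


def question_end(packet: bytes) -> int:
    """Return the offset just past the first question (name, qtype, qclass)."""
    pos = 12  # the fixed DNS header is 12 bytes long
    while packet[pos] != 0:  # walk the length-prefixed labels
        pos += 1 + packet[pos]
    return pos + 1 + 4  # root label, then 2-byte qtype and 2-byte qclass


def cap_udp_response(response: bytes, max_udp_size: int = 1232) -> bytes:
    """Return the response unchanged if it fits, else a TC=1 reply with only the question."""
    if len(response) <= max_udp_size:
        return response
    msg_id, flags, qdcount, _, _, _ = struct.unpack("!6H", response[:12])
    header = struct.pack("!6H", msg_id, flags | TC_BIT, qdcount, 0, 0, 0)
    return header + response[12:question_end(response)]


# A stand-in for a cached 1810-byte answer: real header and question,
# followed by filler bytes in place of the actual answer records.
question = encode_name("apple.com") + struct.pack("!2H", 16, 1)  # TXT, IN
big = struct.pack("!6H", 0x1234, 0x8180, 1, 18, 0, 1) + question + b"\x00" * 1783
capped = cap_udp_response(big)
print(len(big), len(capped))
```

With this shape, the client's advertised EDNS buffer size never enters the decision, which is the "override" behaviour asked for here.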
ph1 points out the answer to my prior question, which is why TXT atlassian.com is not packet-cached but TXT apple.com is:
./dnsdist.hh:static const size_t s_maxPacketCacheEntrySize{4096}; // don't cache responses larger than this value
This may also want to be a configurable setting somewhere. There are risks either way - very large values present cache exhaustion risks, but smaller values may create forced-resource usage attacks against backend recursive resolvers. Maybe some operators would actually want to use a smaller value, some larger.
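As a rough illustration of what a configurable limit would change, here is a small Python model of the size check. The `PacketCache` class and its `max_entry_size` parameter are made up for this sketch and are not the dnsdist API. The 1810-byte figure is the TXT apple.com size mentioned in this thread; the 5000-byte response is a hypothetical stand-in for an answer above the 4096 limit.

```python
class PacketCache:
    """Toy cache that refuses entries above a configurable size."""

    def __init__(self, max_entry_size: int = 4096):
        self.max_entry_size = max_entry_size
        self.entries = {}

    def insert(self, key: str, response: bytes) -> bool:
        """Store the response and return True, or return False if it is too large."""
        if len(response) > self.max_entry_size:
            return False  # too large: every such query keeps going to the backend
        self.entries[key] = response
        return True


default_cache = PacketCache()                    # mirrors the hard-coded 4096
small_cache = PacketCache(max_entry_size=1232)   # the "recompile with 1232" variant

apple = b"\x00" * 1810   # size reported for TXT apple.com in this thread
large = b"\x00" * 5000   # hypothetical response above the 4096 limit

print(default_cache.insert("apple.com|TXT", apple))   # cached
print(default_cache.insert("large.example|TXT", large))  # not cached
print(small_cache.insert("apple.com|TXT", apple))     # not cached with the lower limit
```

The trade-off described above shows up directly: lowering the limit pushes the 1810-byte answer back to the backends on every query, while raising it lets larger entries occupy cache memory.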
> We do see truncation happen when the backend recursive resolver sends a truncate. But as soon as the packet cache has filled, the back-end resolvers are not consulted and all truncation duties are handled by dnsdist.
This does not make sense to me, sorry. Which version of dnsdist are you running? Is there any way I can reproduce this locally? Because dnsdist does not decide whether or not to truncate, unless told to do so by an explicit rule/dynamic block action. The packet cache works by hashing the content of the query, so a different query will not match, even if the only difference is the UDP maximum payload size. There is no code in dnsdist that knows how to look for a similar answer with a different UDP maximum payload size. So if a query received over UDP gets a non-truncated answer larger than 1232, that answer came from the backend. The only possible source of confusion that I see in dnsdist would be a bug mixing responses received over UDP with ones received over TCP, where the UDP maximum payload size does not apply, but looking at the 1.8.x code I don't see it.
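The point about hashing can be illustrated with a simplified Python sketch. It is not the real dnsdist hash: the use of SHA-256 and the choice to skip only the 2-byte message ID are assumptions made for the example, and `encode_name`, `build_query` and `cache_key` are hypothetical helpers. The EDNS UDP payload size sits in the CLASS field of the OPT record, so it is part of the hashed bytes.

```python
import hashlib
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with the root label."""
    labels = name.rstrip(".").split(".")
    return b"".join(bytes([len(l)]) + l.encode() for l in labels) + b"\x00"


def build_query(name: str, qtype: int, udp_payload_size: int, msg_id: int) -> bytes:
    """Build a minimal DNS query carrying an EDNS0 OPT record."""
    header = struct.pack("!6H", msg_id, 0x0100, 1, 0, 0, 1)  # RD set, 1 question, 1 additional
    question = encode_name(name) + struct.pack("!2H", qtype, 1)  # class IN
    # OPT pseudo-record: root name, type 41, CLASS = UDP payload size, TTL 0, RDLEN 0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, 0, 0)
    return header + question + opt


def cache_key(query: bytes) -> str:
    """Hash everything except the message ID (assumption: the ID is the only skipped field)."""
    return hashlib.sha256(query[2:]).hexdigest()


TXT = 16
key_1232 = cache_key(build_query("apple.com", TXT, 1232, msg_id=1))
key_10000 = cache_key(build_query("apple.com", TXT, 10000, msg_id=1))
key_10000_other_id = cache_key(build_query("apple.com", TXT, 10000, msg_id=2))

print(key_1232 == key_10000)            # False: different payload size, different entry
print(key_10000 == key_10000_other_id)  # True: only the message ID differs
```

Under this model an answer cached for a +bufsize=1232 query can never be served to a +bufsize=10000 query, and vice versa.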
> Another possible dimension to this would be to have dnsdist have a maximum configurable UDP size cap, and to send back tc=1 if that cap is exceeded by an answer that is known in the packet cache. That seems overly simplistic, but maybe it isn't.
It looks like we are missing a selector on the DNS payload size, I'll open a PR adding such a selector shortly.
> ph1 points out the answer to my prior question, which is why TXT atlassian.com is not packet-cached but TXT apple.com is:
> ./dnsdist.hh:static const size_t s_maxPacketCacheEntrySize{4096}; // don't cache responses larger than this value
> This may also want to be a configurable setting somewhere. There are risks either way - very large values present cache exhaustion risks, but smaller values may create forced-resource usage attacks against backend recursive resolvers. Maybe some operators would actually want to use a smaller value, some larger.
> tc=1 responses when received over udp are not cached. Maybe they should be?
Yep, I think it makes sense indeed. If one really does not want this behaviour, it can be changed via a SetSkipCacheAction rule.
A (maybe) better place to check this would be to have getDNSPacketMinTTL() return a value for TC=1 packets, as we are already looking at the header in that function. There are very few callers of this function, so it looks like it will have no bad consequences elsewhere.
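A very rough Python sketch of that suggestion follows. It is not the actual getDNSPacketMinTTL() code: the function name `min_ttl_for_caching`, the `truncated_ttl` parameter and the `record_min_ttl` callback are all hypothetical, and the 60-second value is an arbitrary placeholder. The only part taken from the discussion is the idea of checking the TC bit in the header and returning a TTL for truncated packets instead of refusing to cache them.

```python
import struct

TC_BIT = 0x0200  # truncation flag inside the 16-bit DNS header flags word


def min_ttl_for_caching(packet: bytes, truncated_ttl: int, record_min_ttl) -> int:
    """Return the TTL to cache this response for.

    TC=1 responses get a fixed, operator-chosen TTL; everything else falls
    through to the normal "smallest record TTL" computation.
    """
    (flags,) = struct.unpack("!H", packet[2:4])
    if flags & TC_BIT:
        return truncated_ttl
    return record_min_ttl(packet)


# Two 12-byte headers only: one truncated reply, one normal reply.
truncated_reply = struct.pack("!6H", 1, 0x8380, 1, 0, 0, 0)
normal_reply = struct.pack("!6H", 1, 0x8180, 1, 18, 0, 1)

# Stand-in for the real record walk; 2813 is the TTL shown in the dig output in this thread.
fake_record_walk = lambda packet: 2813

print(min_ttl_for_caching(truncated_reply, 60, fake_record_walk))
print(min_ttl_for_caching(normal_reply, 60, fake_record_walk))
```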
The behavior is seen on dnsdist 1.8.2, with unbound 19.2 and powerdns 4.7.3 in the backend in a pool. I think the discussion above is becoming disconnected from the core issue and finding other problems, so it may be more useful just to create the experiment yourself given what I'm hoping to achieve. My goal is to have dnsdist send truncation responses even if the back end resolver is not involved.
Try this:
dig +bufsize=10000 @
Doing that several times will eventually get no truncation in the response.
My hope is that there is a configuration option such that even repeatedly doing this query within the TTL of the record should result in a truncate reply from dnsdist, even if the packet is in the packet cache and not handed to the backend recursive resolver. This should override the client's maximum EDNS reply size declaration.
Currently, in my test environment, the back-end resolvers always properly reply with a truncate reply, even when clients set bufsize=10000. So far, so good. But after dnsdist has filled its cache, future replies coming from the packet cache get transmitted to the client as UDP, with no truncation reply.
Due to the issue described in https://github.com/PowerDNS/pdns/issues/11563 there is some relief when the replies are very large - above 4096 - but that is because the packet cache isn't used for those large replies, so the truncation is being demanded by the back-end pool recursive resolver. This is not the case, though, for our example of TXT apple.com which is only ~1810 bytes.
PS: While I initially thought my goal could be accomplished using Lua and a packet action, I don't think it can be. Is there a rule that allows taking an action based on the response size?
I suppose the brute-force way of doing this today would be to do this:
./dnsdist.hh:static const size_t s_maxPacketCacheEntrySize{1232};
and recompile, which would get the desired result in a very primitive way that would shift burden to the back-end recursive resolvers.
> But after dnsdist has filled its cache,
Do you mean filled to max size here?
No, just after cache entries have been created for "TXT apple.com".
> But after dnsdist has filled its cache, future replies coming from the packet cache get transmitted to the client as UDP, with no truncation reply.
If this is indeed the case, there is a very problematic bug in dnsdist. Unfortunately I have tried to reproduce this several times and I have not been able to, so I have no clue what's going on.
Weirdly enough, if I send these queries to 9.9.9.9 I get different responses (different ordering of the TXT records in the RRset) with the same NSID:
$ dig +bufsize=10000 @9.9.9.9 TXT apple.com +nsid
; <<>> DiG 9.18.20 <<>> +bufsize @9.9.9.9 TXT apple.com +nsid
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25369
;; flags: qr rd ra; QUERY: 1, ANSWER: 18, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; NSID: 72 65 73 37 33 30 2e 63 64 67 2e 72 72 64 6e 73 2e 70 63 68 2e 6e 65 74 ("res730.cdg.rrdns.pch.net")
;; QUESTION SECTION:
;apple.com. IN TXT
;; ANSWER SECTION:
apple.com. 2813 IN TXT "webexdomainverification.8C462=b728ec3f-dfc9-42f9-92cb-9ba8853cbee8"
apple.com. 2813 IN TXT "google-site-verification=8M6XjQCzydT62jk8HY3VXPAG-nKDllTRV-JpA3-Ktyw"
apple.com. 2813 IN TXT "google-site-verification=L5kkMdiFI8npvb6KlHui84fJaCw5G64DWhaDRIAT4_c"
apple.com. 2813 IN TXT "google-site-verification=zBSq1mG5ssu2If-C17UAz_MzSZDcx03MVxmeDwMNc5w"
apple.com. 2813 IN TXT "cisco-ci-domain-verification=6f3bfb849796a518061f8e8c4356f687a138502d86db742791685059176547dd"
apple.com. 2813 IN TXT "Dynatrace-site-verification=7d881a7c-c13f-4146-9d27-2731459e2509__iqls0105tagglcsaul0m16ibrf"
apple.com. 2813 IN TXT "atlassian-domain-verification=mLabq99iaT8kquJechF6l31FAYoNUe3WB7tLpLFUiUYVJCse9SKq83hOJzFkwqrh"
apple.com. 2813 IN TXT "json:eyJ3aHkiOiJUaGlzIGlzIHRvIHRydW5jYXRlIFVEUCByZXNwb25zZXMgZm9yIFRYVCBxdWVyaWVzIHRvIGFwcGxlLmNvbSIsInBhZGRpbmciOiJpZW4wYWVHaGF0aG9oNmhhaHZpZWphaTNlYXkwYWh2YWhjaGFocXVhZWxlZTBZdWw0cGhpZXRoMHNvNXZpZXllZWNvaDRpZThzaGVlcGllVDNwYWVjaGVpVjZqb2h3aWVwaG82In0K"
apple.com. 2813 IN TXT "json:eyJ3aHkiOiJUaGlzIGlzIHRvIHRydW5jYXRlIFVEUCByZXNwb25zZXMgZm9yIFRYVCBxdWVyaWVzIHRvIGFwcGxlLmNvbSIsInBhZGRpbmciOiJxdWFoMGVpamFhNGVlajh0aWVkYWlnaG9jZWljaGFlOGVUb3ppZTVmdTVhaFRoMldlaU00aWsyaHVxdThpZXBoaWVxdW9oc2hlaXBhZWdoOUthZWw3b2NoaWVuZ2llem9lc2g1In0K"
apple.com. 2813 IN TXT "77a4a6de-da14-449c-83c4-85366e0f55f9"
apple.com. 2813 IN TXT "apple-domain-verification=X5Jt76bn3Dnmgzjj"
apple.com. 2813 IN TXT "cerner-client-id=22dd1d8a-5e8b-4e1e-80ef-39bcdfd42798"
apple.com. 2813 IN TXT "cerner-client-id=ce3abf18-ee87-43b9-9927-9eb24b4bac4a"
apple.com. 2813 IN TXT "ValidationTokenValue=77a4a6de-da14-449c-83c4-85366e0f55f9"
apple.com. 2813 IN TXT "miro-verification=2494d255c4c50b1e521650a0659cbf3fa08b0072"
apple.com. 2813 IN TXT "facebook-domain-verification=n6cqjfucq6plswmtfbwnbbeu1qiq3v"
apple.com. 2813 IN TXT "v=spf1 include:_spf.apple.com include:_spf-txn.apple.com ~all"
apple.com. 2813 IN TXT "adobe-idp-site-verification=6bd5e74c-a3a0-4781-b2e1-e95399b5e11c"
;; Query time: 16 msec
;; SERVER: 9.9.9.9#53(9.9.9.9) (UDP)
;; WHEN: Fri Nov 24 21:54:14 CET 2023
;; MSG SIZE rcvd: 1838
$ dig +bufsize=10000 @9.9.9.9 TXT apple.com +nsid
; <<>> DiG 9.18.20 <<>> +bufsize @9.9.9.9 TXT apple.com +nsid
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32165
;; flags: qr rd ra; QUERY: 1, ANSWER: 18, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; NSID: 72 65 73 37 33 30 2e 63 64 67 2e 72 72 64 6e 73 2e 70 63 68 2e 6e 65 74 ("res730.cdg.rrdns.pch.net")
;; QUESTION SECTION:
;apple.com. IN TXT
;; ANSWER SECTION:
apple.com. 2812 IN TXT "google-site-verification=8M6XjQCzydT62jk8HY3VXPAG-nKDllTRV-JpA3-Ktyw"
apple.com. 2812 IN TXT "google-site-verification=L5kkMdiFI8npvb6KlHui84fJaCw5G64DWhaDRIAT4_c"
apple.com. 2812 IN TXT "google-site-verification=zBSq1mG5ssu2If-C17UAz_MzSZDcx03MVxmeDwMNc5w"
apple.com. 2812 IN TXT "cisco-ci-domain-verification=6f3bfb849796a518061f8e8c4356f687a138502d86db742791685059176547dd"
apple.com. 2812 IN TXT "Dynatrace-site-verification=7d881a7c-c13f-4146-9d27-2731459e2509__iqls0105tagglcsaul0m16ibrf"
apple.com. 2812 IN TXT "atlassian-domain-verification=mLabq99iaT8kquJechF6l31FAYoNUe3WB7tLpLFUiUYVJCse9SKq83hOJzFkwqrh"
apple.com. 2812 IN TXT "json:eyJ3aHkiOiJUaGlzIGlzIHRvIHRydW5jYXRlIFVEUCByZXNwb25zZXMgZm9yIFRYVCBxdWVyaWVzIHRvIGFwcGxlLmNvbSIsInBhZGRpbmciOiJpZW4wYWVHaGF0aG9oNmhhaHZpZWphaTNlYXkwYWh2YWhjaGFocXVhZWxlZTBZdWw0cGhpZXRoMHNvNXZpZXllZWNvaDRpZThzaGVlcGllVDNwYWVjaGVpVjZqb2h3aWVwaG82In0K"
apple.com. 2812 IN TXT "json:eyJ3aHkiOiJUaGlzIGlzIHRvIHRydW5jYXRlIFVEUCByZXNwb25zZXMgZm9yIFRYVCBxdWVyaWVzIHRvIGFwcGxlLmNvbSIsInBhZGRpbmciOiJxdWFoMGVpamFhNGVlajh0aWVkYWlnaG9jZWljaGFlOGVUb3ppZTVmdTVhaFRoMldlaU00aWsyaHVxdThpZXBoaWVxdW9oc2hlaXBhZWdoOUthZWw3b2NoaWVuZ2llem9lc2g1In0K"
apple.com. 2812 IN TXT "77a4a6de-da14-449c-83c4-85366e0f55f9"
apple.com. 2812 IN TXT "apple-domain-verification=X5Jt76bn3Dnmgzjj"
apple.com. 2812 IN TXT "cerner-client-id=22dd1d8a-5e8b-4e1e-80ef-39bcdfd42798"
apple.com. 2812 IN TXT "cerner-client-id=ce3abf18-ee87-43b9-9927-9eb24b4bac4a"
apple.com. 2812 IN TXT "ValidationTokenValue=77a4a6de-da14-449c-83c4-85366e0f55f9"
apple.com. 2812 IN TXT "miro-verification=2494d255c4c50b1e521650a0659cbf3fa08b0072"
apple.com. 2812 IN TXT "facebook-domain-verification=n6cqjfucq6plswmtfbwnbbeu1qiq3v"
apple.com. 2812 IN TXT "v=spf1 include:_spf.apple.com include:_spf-txn.apple.com ~all"
apple.com. 2812 IN TXT "adobe-idp-site-verification=6bd5e74c-a3a0-4781-b2e1-e95399b5e11c"
apple.com. 2812 IN TXT "webexdomainverification.8C462=b728ec3f-dfc9-42f9-92cb-9ba8853cbee8"
;; Query time: 3 msec
;; SERVER: 9.9.9.9#53(9.9.9.9) (UDP)
;; WHEN: Fri Nov 24 21:54:15 CET 2023
;; MSG SIZE rcvd: 1838
$ dig +bufsize=10000 @9.9.9.9 TXT apple.com +nsid
; <<>> DiG 9.18.20 <<>> +bufsize @9.9.9.9 TXT apple.com +nsid
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22263
;; flags: qr rd ra; QUERY: 1, ANSWER: 18, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; NSID: 72 65 73 37 33 30 2e 63 64 67 2e 72 72 64 6e 73 2e 70 63 68 2e 6e 65 74 ("res730.cdg.rrdns.pch.net")
;; QUESTION SECTION:
;apple.com. IN TXT
;; ANSWER SECTION:
apple.com. 2710 IN TXT "adobe-idp-site-verification=6bd5e74c-a3a0-4781-b2e1-e95399b5e11c"
apple.com. 2710 IN TXT "webexdomainverification.8C462=b728ec3f-dfc9-42f9-92cb-9ba8853cbee8"
apple.com. 2710 IN TXT "google-site-verification=8M6XjQCzydT62jk8HY3VXPAG-nKDllTRV-JpA3-Ktyw"
apple.com. 2710 IN TXT "google-site-verification=L5kkMdiFI8npvb6KlHui84fJaCw5G64DWhaDRIAT4_c"
apple.com. 2710 IN TXT "google-site-verification=zBSq1mG5ssu2If-C17UAz_MzSZDcx03MVxmeDwMNc5w"
apple.com. 2710 IN TXT "cisco-ci-domain-verification=6f3bfb849796a518061f8e8c4356f687a138502d86db742791685059176547dd"
apple.com. 2710 IN TXT "Dynatrace-site-verification=7d881a7c-c13f-4146-9d27-2731459e2509__iqls0105tagglcsaul0m16ibrf"
apple.com. 2710 IN TXT "atlassian-domain-verification=mLabq99iaT8kquJechF6l31FAYoNUe3WB7tLpLFUiUYVJCse9SKq83hOJzFkwqrh"
apple.com. 2710 IN TXT "json:eyJ3aHkiOiJUaGlzIGlzIHRvIHRydW5jYXRlIFVEUCByZXNwb25zZXMgZm9yIFRYVCBxdWVyaWVzIHRvIGFwcGxlLmNvbSIsInBhZGRpbmciOiJpZW4wYWVHaGF0aG9oNmhhaHZpZWphaTNlYXkwYWh2YWhjaGFocXVhZWxlZTBZdWw0cGhpZXRoMHNvNXZpZXllZWNvaDRpZThzaGVlcGllVDNwYWVjaGVpVjZqb2h3aWVwaG82In0K"
apple.com. 2710 IN TXT "json:eyJ3aHkiOiJUaGlzIGlzIHRvIHRydW5jYXRlIFVEUCByZXNwb25zZXMgZm9yIFRYVCBxdWVyaWVzIHRvIGFwcGxlLmNvbSIsInBhZGRpbmciOiJxdWFoMGVpamFhNGVlajh0aWVkYWlnaG9jZWljaGFlOGVUb3ppZTVmdTVhaFRoMldlaU00aWsyaHVxdThpZXBoaWVxdW9oc2hlaXBhZWdoOUthZWw3b2NoaWVuZ2llem9lc2g1In0K"
apple.com. 2710 IN TXT "77a4a6de-da14-449c-83c4-85366e0f55f9"
apple.com. 2710 IN TXT "apple-domain-verification=X5Jt76bn3Dnmgzjj"
apple.com. 2710 IN TXT "cerner-client-id=22dd1d8a-5e8b-4e1e-80ef-39bcdfd42798"
apple.com. 2710 IN TXT "cerner-client-id=ce3abf18-ee87-43b9-9927-9eb24b4bac4a"
apple.com. 2710 IN TXT "ValidationTokenValue=77a4a6de-da14-449c-83c4-85366e0f55f9"
apple.com. 2710 IN TXT "miro-verification=2494d255c4c50b1e521650a0659cbf3fa08b0072"
apple.com. 2710 IN TXT "facebook-domain-verification=n6cqjfucq6plswmtfbwnbbeu1qiq3v"
apple.com. 2710 IN TXT "v=spf1 include:_spf.apple.com include:_spf-txn.apple.com ~all"
;; Query time: 6 msec
;; SERVER: 9.9.9.9#53(9.9.9.9) (UDP)
;; WHEN: Fri Nov 24 21:55:57 CET 2023
;; MSG SIZE rcvd: 1838
$
dnsdist doesn't know how to shuffle records, so it might indicate that the response is not cached by dnsdist and that the query is hitting a backend. I have the same result by adding +nocookie, for what it is worth. But it might also indicate that there are several dnsdists with the same NSID, I don't know.
The current quad9 unbound instances don’t truncate until 4097 from my testing.
Yeah unbound had some issues, but those are solved in the latest release (which is not in production yet on Quad9's network) so just ignore those truncate failures towards unbound, I think. There are multiple recursive resolvers behind the same dnsdist instance (both powerdns and unbound) which may be some of the issue, and also we send to N (usually 4) identical (same order, same weight) backends to spread load across threads.
I tried this with dnsdist 1.8.3 and two pdns rec's (4.9.2) as backend. I do not see the full replies over UDP. On the dnsdist side (with verbose logging) I see repeatedly:
packet cache miss for query for apple.com|TXT from 127.0.0.1:17888 (DoUDP, 38 bytes)
Got query for apple.com|TXT from 127.0.0.1:17888, relayed to 192.168.178.6:53
Got answer from 192.168.178.6:53, relayed to 127.0.0.1:17888 (UDP), took 948.693 usec
Got TCP connection from 127.0.0.1:19688
Packet cache hit for query for apple.com|TXT from 127.0.0.1:19688 (DoTCP, 1810 bytes)
Closing TCP client connection with 127.0.0.1:19688: EOF while reading message
So this is all as expected. I do wonder what's different at q9. Configs, logs and/or packet captures might help.
In my further tests I only managed to reproduce if I mix in a resolver that does send large UDP replies, eg. quad8:
// first UDP query with empty cache
Packet cache miss for query for apple.com|TXT from 127.0.0.1:32804 (DoUDP, 38 bytes)
Got query for apple.com|TXT from 127.0.0.1:32804, relayed to 192.168.178.28:53
Got answer from 192.168.178.28:53, relayed to 127.0.0.1:32804 (UDP), took 3931.78 usec
// Followup over TCP gets cached
Got TCP connection from 127.0.0.1:5638
Packet cache miss for query for apple.com|TXT from 127.0.0.1:5638 (DoTCP, 38 bytes)
Got query for apple.com|TXT from 127.0.0.1:5638 (TCP, 40 bytes), relayed to 192.168.178.28:53
Got answer from 192.168.178.28:53, relayed to 127.0.0.1:5638 (TCP, 1812 bytes), took 7264.28 usec
Closing TCP client connection with 127.0.0.1:5638: EOF while reading message
// 2nd UDP query, cache miss
Packet cache miss for query for apple.com|TXT from 127.0.0.1:28170 (DoUDP, 38 bytes)
Got query for apple.com|TXT from 127.0.0.1:28170, relayed to 192.168.178.28:53
Got answer from 192.168.178.28:53, relayed to 127.0.0.1:28170 (UDP), took 3762.23 usec
// Followup over TCP, cache hit
Got TCP connection from 127.0.0.1:37705
Packet cache hit for query for apple.com|TXT from 127.0.0.1:37705 (DoTCP, 1810 bytes)
Closing TCP client connection with 127.0.0.1:37705: EOF while reading message
// 3rd query, packet cache miss, gets routed to quad8 which answers with a big UDP reply that gets cached
Packet cache miss for query for apple.com|TXT from 127.0.0.1:10886 (DoUDP, 38 bytes)
Got query for apple.com|TXT from 127.0.0.1:10886, relayed to 8.8.8.8:53
Got answer from 8.8.8.8:53, relayed to 127.0.0.1:10886 (UDP), took 12285.5 usec
// 4th and 5th UDP query get answer from PC
Packet cache hit for query for apple.com|TXT from 127.0.0.1:25341 (DoUDP, 1810 bytes)
Packet cache hit for query for apple.com|TXT from 127.0.0.1:9854 (DoUDP, 1810 bytes)
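The sequence in this log can be replayed with a toy Python model. Everything here is a simplification and not dnsdist code: the cache key is just (query, transport), which is inferred from the log showing a UDP miss right after the TCP answer was cached; TC=1 responses are not cached, as stated in the issue description; and the backend chosen for each miss is scripted by hand rather than picked by a load-balancing policy.

```python
TRUNCATED = "tc=1"
FULL = "full answer, 1810 bytes"


def truncating_backend(over_tcp: bool) -> str:
    """Behaves like the local recursor: full answer over TCP, TC=1 over UDP."""
    return FULL if over_tcp else TRUNCATED


def permissive_backend(over_tcp: bool) -> str:
    """Behaves like the resolver that sends the large answer over UDP."""
    return FULL


def handle(cache: dict, qname: str, over_tcp: bool, backend) -> tuple:
    """Serve from cache if possible, otherwise ask the backend and cache non-TC answers."""
    key = (qname, over_tcp)
    if key in cache:
        return "hit", cache[key]
    response = backend(over_tcp)
    if response != TRUNCATED:  # TC=1 responses are not cached
        cache[key] = response
    return "miss", response


cache = {}
script = [
    (False, truncating_backend),  # first UDP query, empty cache
    (True, truncating_backend),   # follow-up over TCP, gets cached
    (False, truncating_backend),  # 2nd UDP query, still a miss
    (True, truncating_backend),   # follow-up over TCP, cache hit
    (False, permissive_backend),  # 3rd UDP query, routed to the permissive backend
    (False, truncating_backend),  # 4th UDP query
    (False, truncating_backend),  # 5th UDP query
]
results = [handle(cache, "apple.com|TXT", tcp, backend) for tcp, backend in script]
for (tcp, _), (status, response) in zip(script, results):
    print("TCP" if tcp else "UDP", status, "->", response)
```

Once a single backend in the pool answers with the large UDP reply, every later UDP query with the same key is served the full answer from the cache, regardless of what the other backends would have done.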
Otto: Are you trying your queries with +bufsize=10000 ? If you are then I'll send you via mail the full (password redacted) config files for rec and dnsdist.
Yes I am testing with +bufsize=10000
Mail sent with example configs.
It's going to be Monday at earliest for me. You might want to set verbose mode in dnsdist and peek (or post) the logs, maybe they reveal something.
I don't know where we left this - was there any further examination of this issue after the mail I sent? I know it's getting quite dusty in my mind, but I don't recall a solution.
As far as I can remember we never managed to reproduce the issue, but Otto might know more.
I remember looking at this and not coming to any conclusive answers.
Short description
tc=1 responses when received over udp are not cached. Maybe they should be?
Usecase
Some operators see lots of requests for, say, txt apple.com, which produces a tc=1 response.
Description
Perhaps cache for the temp failure ttl time, or perhaps this idea is stupid.
Similar to the below, but perhaps more performant :)