spamhaus / rbldnsd

A small and fast DNS daemon especially made to serve DNSBL zones.
https://rbldnsd.io/
GNU General Public License v2.0

rbldnsd returns NXDOMAIN for ENTs (Empty Non-Terminals) #17

Open twesterhever opened 4 years ago

twesterhever commented 4 years ago

Quoted from RFC 7816, section 3:

A problem can also appear when a name server does not react properly to ENTs (Empty Non-Terminals). If ent.example.com has no resource records but foobar.ent.example.com does, then ent.example.com is an ENT. Whatever the QTYPE, a query for ent.example.com must return NODATA (NOERROR / ANSWER: 0). However, some name servers incorrectly return NXDOMAIN for ENTs. If a resolver queries only foobar.ent.example.com, everything will be OK, but if it implements QNAME minimisation, it may query ent.example.com and get an NXDOMAIN. See also Section 3 of [DNS-Res-Improve] for the other bad consequences of this bad behaviour.

It seems like rbldnsd shows exactly the same behaviour:

user@work:~$ host -t A 2.0.0.127.zen.spamhaus.org
2.0.0.127.zen.spamhaus.org has address 127.0.0.4
2.0.0.127.zen.spamhaus.org has address 127.0.0.2
2.0.0.127.zen.spamhaus.org has address 127.0.0.10
user@work:~$ host -t A 127.zen.spamhaus.org
Host 127.zen.spamhaus.org not found: 3(NXDOMAIN)

As mentioned above, the latter one must return NODATA instead of NXDOMAIN as some data below 127.zen.spamhaus.org is listed in the RBL indeed. The current behaviour was found to render applications behind resolvers using strict QNAME minimization (where no fallbacks using the FQDN queried by the client in the first place happen) unusable as the resolver stops after having received NXDOMAIN for the first ENT.
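To make the distinction concrete, here is a minimal sketch of how the three possible outcomes relate to the names present in a zone. It is only an illustration of the RFC wording quoted above, not rbldnsd code; the function name `classify` and the set-of-names representation are assumptions made for the example.

```python
def classify(qname: str, owner_names: set[str]) -> str:
    """Return the response class expected for qname within a single zone.

    owner_names is a plain set of names that own at least one record
    (a hypothetical stand-in for whatever the server really stores).
    """
    qname = qname.rstrip(".").lower()
    names = {n.rstrip(".").lower() for n in owner_names}
    if qname in names:
        return "ANSWER"       # NOERROR with records (if the QTYPE matches)
    suffix = "." + qname
    if any(n.endswith(suffix) for n in names):
        return "NODATA"       # empty non-terminal: NOERROR / ANSWER: 0
    return "NXDOMAIN"         # nothing at or below this name


zone = {"2.0.0.127.zen.spamhaus.org"}
print(classify("2.0.0.127.zen.spamhaus.org", zone))  # ANSWER
print(classify("127.zen.spamhaus.org", zone))        # NODATA
print(classify("128.zen.spamhaus.org", zone))        # NXDOMAIN
```

The linear scan is only for readability; a real server would need an index that can answer "does anything exist below this name" cheaply.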

Worse, as RFC 5782, section 5, does not specify testing entries for URIBLs below the first hierarchy (such as dbltest.com.dbl.spamhaus.org), it is impossible to determine whether a URIBL is actually usable or not as test.dbl.spamhaus.org will always return NOERROR, while more realistic queries like example.com.dbl.spamhaus.org will silently fail as com.dbl.spamhaus.org returns NXDOMAIN instead of NODATA.

As far as I am concerned, RFC 7816 requires rbldnsd to return NODATA for Empty Non-Terminals. In my humble opinion, its current behaviour is RFC-ignorant.

ammammita commented 3 years ago

Hello,

thanks for the feedback.

RBLDNSD was written in the last century, when QNAME minimization did not yet exist, so the code does not take it into account at all.

There is also quite a lot of consensus in the SMTP World that qname minimization shouldn't be used on the resolvers used by mail servers. See http://postfix.1071664.n5.nabble.com/qname-minimization-and-privacy-breaks-dnsbl-in-postfix-tt103456.html#a103458 for an example.

A change to support this would require some effort, and we'd really like to learn about real-world cases where this behaviour has caused issues.

ammammita commented 3 years ago

Going a bit deeper from the technical point of view, the dnset dataset stores its data in such a way that the requested feature is impossible to implement. So, for domain names, if this feature is requested, a completely new dataset would have to be implemented.

For the IP(v4 and v6) datasets, all of them, we could implement a hackish solution so that when a query for a "partial" ip address is received, rbldnsd doesn't reply NXDOMAIN but NOERROR instead.

For example:

Opinions ?

rfc1036 commented 3 years ago

Please do: I see no downsides to this and an incomplete solution will still be better than being totally broken.

I also think that lack of support for ENTs for some data types should be well documented, because it is and has always been a bug: IIRC the first time that returning NXDOMAIN for ENTs was widely discussed as being broken was at the time of DNSSEC standardization work.

ammammita commented 3 years ago

I'm digging a bit more into the DN dataset. The proper solution would be to rewrite the entire dataset or, better, to replace it with another one that supports ENTs (no regressions allowed). This would be a huge effort that I'm not sure I want to take on at the moment.

Another hackish solution would be that the DN dataset always returns NOERROR for every query. For example, it would return NOERROR both for

* com.dbl.spamhaus.com
  and

* thistldisfake.dbl.spamhaus.org

This would most probably break caching for NXDOMAIN entities, though (and this consideration applies also for the IP datasets with the hack applied).

This solution would probably be too aggressive. Opinions are welcomed on this topic as well.

Regarding the code: should this feature be always enabled or should it be enabled by a configure option ?

dennywatson commented 3 years ago

I am reviewing RFC7816 and currently pondering its ramifications. I'll probably have a better idea what I would want to do later after a bit of testing.

Currently, I have concerns that NODATA (NOERROR / ANSWER: 0) responses would be interpreted by client implementations as a non-listing (generating false negatives). IIRC, years ago there was discussion that DNSBL clients should treat NODATA (NOERROR / ANSWER: 0) as a non-listing. Again, I need to review and better understand the RFC to see where edge cases may exist for clients.

rfc1036 commented 3 years ago

Unless this is subject to serious testing then I believe that the evil we know (NXDOMAIN for ENTs) is better than trying something new like unexpected NODATA.

But I still suggest that you fix NODATA handling for the IP-related datasets, for which it should be easy and a correct solution.

dennywatson commented 3 years ago

Unless this is subject to serious testing then I believe that the evil we know (NXDOMAIN for ENTs) is better than trying something new like unexpected NODATA.

Informational;

Way back in 2006 one of the moderators of NANA.Blacklisting outlined how DNSBL clients should respond to "Empty non-error" responses. https://groups.google.com/g/news.admin.net-abuse.blocklisting/c/UIYFltOT4mA/m/59sRQw4UqwoJ "While unusual, client implementations should ensure that responses where RCODE has been set to 0, and no answer is given, are treated as a negative listing."

Also, it appears that neither RFC5782 nor RFC6471 outlines specifically how to handle empty non-error responses.

However, RFC8904 does suggest that returning empty non-error responses should be considered a non-listing; "2. Method Details

The result of the method states how the query did, up to the interpretation of the returned data. The method has four possible results: [...] none: The query worked but yielded no A record or returned NXDOMAIN, so the sender is not whitelisted."

dennywatson commented 3 years ago

But I still suggest that you fix NODATA handling for the IP-related datasets, for which it should be easy and a correct solution.

Given that the rbldns-client passes the full QNAME to the resolver, and the resolver MUST respond with that full QNAME back to the client, with any QNAME minimization schemes (RFC7816 is categorized as "Experimental") being done by the resolver before responding to the client, this is unlikely to have an impact on the clients.

dennywatson commented 3 years ago

(snip)

Another hackish solution would be that the DN dataset always returns NOERROR for every query. For example, it would return NOERROR both for

* com.dbl.spamhaus.com
  and

* thistldisfake.dbl.spamhaus.org

This would most probably break caching for NXDOMAIN entities, though (and this consideration applies also for the IP datasets with the hack applied).

Actually, I would believe that these entries would be cached properly. The resolver has an answer, and the answer is non-error. I would believe that resolvers would cache these replies just like any other replies.

Regarding the code: should this feature be always enabled or should it be enabled by a configure option ?

I would suggest that because RFC7816 is listed as experimental that this be presented as a command line flag to always return "empty non-error" instead of NXDOMAIN.

ammammita commented 3 years ago

I have pushed an implementation of the qname minimization feature to the qname-minimization branch.

First of all you have to compile rbldnsd with the --enable-qnmin option in order to enable this feature. Once done, nothing should change.

The qname minimization behaviour is activated ONLY for specific datasets. To enable it for a dataset, you have to use the $QNMIN special entry that may be true or false.

As an example, this entry

#$QNMIN true

would enable a specific dataset to show the qname minimization behaviour. This feature has been implemented in all datasets, including dnset. The caveats described above will still apply.

This feature needs extensive testing, so help and feedback from the community would be much appreciated.

rfc1036 commented 3 years ago

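The documentation is not clear:

* Maybe a better name would be "correct handling of empty non-terminal answers", since this is really about basic DNS rules and QNAME minimization just happens to expose the bug.
* Possibly breaking backward compatibility is mentioned, but the documentation should explain exactly what would be broken.
* Are there any other downsides when enabling this feature? E.g., is there a performance regression?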

And what is the rationale for making this a compile-time option? Would building rbldnsd with --enable-qnmin cause a performance regression for a zone even without actually enabling the feature on it?

I do not think that the DNSBL RFCs need to specifically mention how NOERROR answers should be treated, because there are no deviations from the usual DNS standards and behaviours.

I have been pondering this a bit and now I agree with @dennywatson: since a NOERROR answer would be cached using the same rules of a NXDOMAIN answer then there will be no caching lifetime changes for "correct" queries. And certainly no clients should ever consider a NOERROR answer as a listing: I highly doubt that such a broken client ever existed because (at least in the past) some people used to use BIND to serve DNSBLs and it would have returned correct answers for ENTs.

The only change that I can see when answering NOERROR instead of NXDOMAIN for all non-listings is that a resolver receiving a NOERROR for faketld.dnsbldomain.net would still send a query for domain.faketld.dnsbldomain.net instead of correctly deducing that no subdomains exist. I do not know if this actually would have a practical impact, but it should be easy to measure for Spamhaus by turning on and off the feature for a while.

If no bad effects due to caching are measured then I even think that correct support of ENTs should be enabled by default because not breaking name servers implementing QNAME minimization (which is probably soon going to be "all of them") is much more important than not breaking already broken clients of which we are not even sure if any exists.

NOERROR answers should definitely be always turned on for queries to IP datasets, because we know exactly which queries should return NXDOMAIN and which ones NOERROR and because legitimate clients are not supposed to query for incomplete IPs, so there can be no concern about incorrect handling of NOERROR answers.
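As a rough illustration of why the IP case is decidable, the sketch below treats a partial reversed-octet query as an IPv4 prefix and checks whether any listed entry falls inside it. This is not how rbldnsd's ip4set is implemented; `ip4_rcode`, its arguments and the flat list of entries are assumptions made for the example.

```python
import ipaddress


def ip4_rcode(labels_below_zone: str, listed: list[str]) -> str:
    """Decide the response class for the reversed-octet part of a QNAME.

    labels_below_zone: e.g. "0.0.127" for the query 0.0.127.bl.test.com.
    listed: addresses or CIDR ranges in the dataset (hypothetical format).
    """
    octets = labels_below_zone.split(".")[::-1]          # back to normal order
    valid = all(o.isascii() and o.isdigit() and int(o) <= 255 for o in octets)
    if not valid or not 1 <= len(octets) <= 4:
        return "NXDOMAIN"                                # not an IPv4 prefix
    octets = [str(int(o)) for o in octets]               # drop leading zeros
    padded = ".".join(octets + ["0"] * (4 - len(octets)))
    nets = [ipaddress.ip_network(e, strict=False) for e in listed]
    if len(octets) == 4:                                 # complete address
        addr = ipaddress.ip_address(padded)
        return "NOERROR" if any(addr in n for n in nets) else "NXDOMAIN"
    prefix = ipaddress.ip_network(f"{padded}/{8 * len(octets)}")
    # A partial address is an ENT only if something is listed beneath it.
    return "NODATA" if any(n.overlaps(prefix) for n in nets) else "NXDOMAIN"


listed = ["127.0.0.2"]
for q in ("127", "0.0.127", "2.0.0.127", "3.0.0.127", "128"):
    print(q, ip4_rcode(q, listed))
```

The same idea would carry over to IPv6 with nibble labels instead of octets.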

(Hi @dennywatson, you may remember me from the Brussels or Dublin M3AAWG...)

dennywatson commented 3 years ago

(parts snipped and reordered)

(Hi @dennywatson, you may remember me from the Brussels or Dublin M3AAWG...)

Hi, and yes. Also, you and I are aware of each other in other forums -- going back decades.

The documentation is not clear:

* Maybe a better name would be "correct handling of empty non-terminal answers", since this is really about basic DNS rules and QNAME minimization just happens to expose the bug.

Perhaps. I would also like to avoid RFC7816, as I feel it is poorly written. In its attempt at advocacy, I feel that it suffers from some logical fallacies, glosses over some potential problems, purports to solve more than it actually does, and potentially misunderstands what was written in RFCs 1034 and 1035. Reading it with a critical eye, I'm not a fan. I don't have the time to dissect RFC7816 to reveal all of its potential flaws.

Having said that, yes ENTs probably shouldn't respond NXDOMAIN.

* Possibly breaking backward compatibility is mentioned, but the documentation should explain exactly what would be broken.

(reordered)

I do not think that the DNSBL RFCs need to specifically mention how NOERROR answers should be treated, because there are no deviations from the usual DNS standards and behaviours.

In a past life, I had maintained a qmail install, and can think of one example... though this is an unusual one that most likely suffers from a host of other issues. DJB chains could be constructed where DNSBLs are queried and qmail's rblsmtpd is triggered based on the setting of the RBLSMTPD variable. Decades ago there was a DNS package called firedns (might still exist, I haven't bothered to look) and its command line client would exit non-zero on NXDOMAIN. One could set this up with a string of logical ANDs, so that a listing in two or more DNSBLs was required for actual blocking of email. This strategy could suffer from other problems such as wildcarding because a domainer has bought the domain, and/or SERVERROR problems, but it is an example of how someone may have implemented a DNSBL check in a way that could break.

!!! Implementation of this feature is a policy decision that should be expressed to its userbase !!!

(reordered)

I have been pondering this a bit and now I agree with @dennywatson: since a NOERROR answer would be cached using the same rules of a NXDOMAIN answer then there will be no caching lifetime changes for "correct" queries. And certainly no clients should ever consider a NOERROR answer as a listing: I highly doubt that such a broken client ever existed because (at least in the past) some people used to use BIND to serve DNSBLs and it would have returned correct answers for ENTs.

* Are there any other downsides when enabling this feature? E.g., is there a performance regression?

Increased protocol traffic, and slower responses to the client.

Tested against Unbound. I am somewhat concerned that Unbound doesn't appear to cache empty non-error responses in any way and appears to always want to traverse the full path when it sees one. I would need to take a critical look at 1034 and 1035 to determine if this behavior is broken. My gut says, "You have received an authoritative non-error answer, cache that! If you are going to implement an experimental RFC, then you need to add code to accommodate what you are doing." Again, I need to review 1034 and 1035.

Over-query would appear to always be the case for NXDOMAIN.

And what is the rationale for making this a compile-time option? Would building rbldnsd with --enable-qnmin cause a performance regression for a zone even without actually enabling the feature on it?

I'm not opposed to adding it into default (perhaps at a later date) as the behavior is controlled by the zonefile.

The only change that I can see when answering NOERROR instead of NXDOMAIN for all non-listings is that a resolver receiving a NOERROR for faketld.dnsbldomain.net would still send a query for domain.faketld.dnsbldomain.net instead of correctly deducing that no subdomains exist. I do not know if this actually would have a practical impact, but it should be easy to measure for Spamhaus by turning on and off the feature for a while.

If no bad effects due to caching are measured then I even think that correct support of ENTs should be enabled by default because not breaking name servers implementing QNAME minimization (which is probably soon going to be "all of them") is much more important than not breaking already broken clients of which we are not even sure if any exists.

NOERROR answers should definitely be always turned on for queries to IP datasets, because we know exactly which queries should return NXDOMAIN and which ones NOERROR and because legitimate clients are not supposed to query for incomplete IPs, so there can be no concern about incorrect handling of NOERROR answers.

I see this as more of a known query width issue, and rework of the existing data structures to accommodate searching that structure. Yes, for an IP based either IPv4 or IPv6 this is a known width. For domainnames, less so.

Overall, I have opinions. These are only opinions, and they are only mine;

RFC7816;

Unbound;

Debian;

dennywatson commented 3 years ago

One condition that I neglected to point out.

Against a stock build of rbldnsd; After receiving its first NXDOMAIN Unbound appears to then query the full QNAME against the last NS it is working with. I.e. there exists the possibility for significant query reduction for IPv6 datasets.
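The difference in query volume can be estimated with a small model. This is a sketch under stated assumptions: nothing is cached, the address is listed, the resolver already knows the zone cut, and on NXDOMAIN it falls back to the full QNAME exactly as described in the observation above (which is one commenter's test result, not documented Unbound behaviour). The function names are made up for the example.

```python
def minimized_names(qname: str, zone: str) -> list[str]:
    """Names a minimising resolver would ask for inside zone, shortest first."""
    extra = qname[: -len(zone) - 1].split(".")      # labels below the zone apex
    return [".".join(extra[i:] + [zone]) for i in range(len(extra) - 1, -1, -1)]


def count_queries(qname: str, zone: str, ent_returns_nodata: bool) -> int:
    """Queries sent to the zone's name server for one uncached lookup."""
    sent = 0
    for name in minimized_names(qname, zone):
        sent += 1
        if name == qname:
            break                   # reached the full name: final answer
        if not ent_returns_nodata:
            sent += 1               # NXDOMAIN seen: retry once with full QNAME
            break
    return sent


q, z = "2.0.0.127.bl.test.com", "bl.test.com"
print(minimized_names(q, z))
print(count_queries(q, z, ent_returns_nodata=True))   # walks every label: 4
print(count_queries(q, z, ent_returns_nodata=False))  # one probe + fallback: 2
```

With 32 nibble labels for an IPv6 entry, the same model gives 32 queries in the NODATA case and still 2 with the fallback.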

ammammita commented 3 years ago

The patch has been reviewed: 1) the special label now is called "$ENT" and accepts true or false 2) the configure option is now --enable-ent 3) the manpage has been edited to reflect this change and has been integrated with a few considerations.

Notably: 1) the performance regressions are unnoticeable, if any 2) the configure option stays until full tests are performed and we've received feedback from the real world. 3) I don't expect regressions when rbldnsd is used through a resolver. It will cause regressions when queried directly and the code interprets NOERROR as a successful listing. 4) When querying IPv6 addresses, up to 32 queries could be needed to obtain the proper final response. This is a waste of resources.
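The figure of 32 comes from the nibble-reversed encoding of IPv6 addresses in DNSBL queries: each hex digit becomes its own label. A quick way to check this with Python's standard ipaddress module (the helper names and the example zone are assumptions for illustration):

```python
import ipaddress


def dnsbl_labels(ip: str) -> list[str]:
    """Reversed labels a DNSBL query uses for ip (octets for v4, nibbles for v6)."""
    rev = ipaddress.ip_address(ip).reverse_pointer
    rev = rev.removesuffix(".in-addr.arpa").removesuffix(".ip6.arpa")
    return rev.split(".")


def dnsbl_name(ip: str, zone: str) -> str:
    """Full query name for ip in the given DNSBL zone."""
    return ".".join(dnsbl_labels(ip) + [zone])


print(len(dnsbl_labels("127.0.0.2")))      # 4 labels below the zone
print(len(dnsbl_labels("2001:db8::1")))    # 32 labels below the zone
print(dnsbl_name("127.0.0.2", "bl.test.com"))
```

A strictly minimising resolver that adds one label per query therefore needs one query per nibble before it reaches the full name.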

twesterhever commented 3 years ago

Hi all. Sorry for my tardy reply.

It might be perhaps useful to provide more information on the setup where I first bumped into this: Infrastructures querying public DNS resolvers usually quickly exceed query rate limits on common DNSBLs - if not even put off with a "you are querying our DNSBL via a public resolver" answer entirely.

In such cases, I frequently observed DNS forwarding setups for common RBL zones, directed against their nameservers directly. This way, rate limits or policy-based decisions on public resolvers can be relatively reliably avoided.

This was the environment where I noticed a script of mine apparently worked with an URIBL, but never blocked anything, despite conducting the sanity checks mentioned in RFC 5782, section 5. Since I was completely unaware of (strict) QNAME minimisation on the infrastructures' resolver, it took quite a while to figure things out.

While I certainly appreciate the patch by @ammammita, the URIBL sanity test(s) in RFC 5782 should be changed to a more realistic query (perhaps for example.com) - but that is out of scope for this issue. While people or operating systems using an experimental DNS feature are somewhat to blame as well, rbldnsd's behaviour seems to be the root cause for this. (No offense intended, though. :-) )

It would be nice to see selective QNAME minimisation settings possible in Unbound, by the way. At the moment, they have not implemented that, and it's probably hard to change.

  3. i don'[t] expect regressions when rbldnsd is used through a resolver. It will cause regressions when queried directly and the code interprets NOERROR as a successful listing.

Agreed, and it unfortunately looks like my code is interpreting empty NOERROR replies as a listing. :-/ Thanks for bringing this aspect up.

  4. When querying ipv6 addresses, up to 32 queries could be needed to obtain the proper final response. This is a waste of resources.

Absolutely. This is something I did not have in mind, and unfortunately, I cannot think of an elegant, regression-free solution to this.

sgtchains commented 3 years ago

On 4/1/2021 2:47 PM, twesterhever wrote:

Hi all. Sorry for my tardy reply.

It might be perhaps useful to provide more information on the setup where I first bumped into this: Infrastructures querying public DNS resolvers usually quickly exceed query rate limits on common DNSBLs - if not even put off with a "you are querying our DNSBL via a public resolver" answer entirely.

I believe that this touches on one of RFC7816's (QNAME minimisation) biggest problems: it does not address the wishes and/or policies of the site you are pushing DNS queries to. I have questions regarding the RFC in the matter, and testing with Unbound's implementation suggests that RFC7816 needs an errata document published. Such errata should cover caching of empty responses (i.e. RFC2308 must be implemented) and a method for the authoritative NS to explicitly opt out of such silliness. The latter actually appears to work currently (at least from my testing) by issuing NXDOMAIN, with the caching resolver then switching to the full QNAME.

.. but then again, my testing may have had flaws.

In such cases, I frequently observed DNS forwarding setups for common RBL zones, directed against their nameservers directly. This way, rate limits or policy-based decisions on public resolvers can be relatively reliably avoided.

This was the environment where I noticed a script of mine https://github.com/twesterhever/squid-dnsbl apparently worked with an URIBL, but never blocked anything, despite conducting the sanity checks mentioned in RFC 5782, section 5. Since I was completely unaware of (strict) QNAME minimisation on the infrastructures' resolver, it took quite a while to figure things out.

It is somewhat worse than what I suspect you've seen. Because these "empty non-terminals" are empty, they do not provide TTL data. The RFC does not address this at all, and my testing with Unbound suggests that they are not cached. In other words, it appeared that a query for 2.2.0.192.dnsbl.example.com would result in five queries against the NS server for the zone 'dnsbl.example.com' to retrieve its answer, but an immediate subsequent request for 200.2.0.192.dnsbl.example.com still has to issue five queries against the zone, as the returned (null) values for 'dnsbl.example.com', '192.dnsbl.example.com', '0.192.dnsbl.example.com', and '2.0.192.dnsbl.example.com' were not cached.

While I certainly appreciate the patch by @ammammita, the URIBL sanity test(s) in RFC 5782 should be changed to a more realistic query (perhaps for example.com) - but that is out of scope for this issue. While people or operating systems using an experimental DNS feature are somewhat to blame as well, rbldnsd's behaviour seems to be the root cause for this. (No offense intended, though. :-) )

My understanding from your original statement is that you appeared to have been ACLed for over-querying. If this is the case, then I do not see the latest patch helping in that regard... In fact it might actually hurt: by not returning NXDOMAIN and allowing a QNAME-minimization-enabled server to walk the full tree, the remote system will log that many more queries.

It would be nice to see selective QNAME minimisation settings possible in Unbound, by the way. At the moment, they have not implemented that, and it's probably hard to change.

 3. i don'[t] expect regressions when rbldnsd is used through a
    resolver. It will cause regressions when queried directly and
    the code interprets NOERROR as a successful listing.

Agreed, and it unfortunately looks like my code is interpreting empty NOERROR replies as a listing (https://github.com/twesterhever/squid-dnsbl/blob/master/dnsbl-ip.py#L205-L217). :-/ Thanks for bringing this aspect up.

Please also quantify the responses received to specific answers. Should a DNSBL start sending flagged administrative codes, or the domain slips out of registration and is picked up by a domainer wild-carding every thing in the domain, then I would believe that things could get ugly.
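A client-side check along these lines might look as follows. It is a minimal sketch, not the squid-dnsbl code: it assumes the caller has already extracted the RCODE and the A records from the response, treats only answers inside 127.0.0.0/8 as listings (the range RFC 5782 describes for DNSBL return values), and uses a hypothetical 127.255.255.0/24 range for administrative codes that a real deployment would have to take from the list's own documentation.

```python
import ipaddress

NOERROR = 0
DNSBL_NET = ipaddress.ip_network("127.0.0.0/8")
# Assumption for illustration: codes in this range signal errors, not listings.
ADMIN_NET = ipaddress.ip_network("127.255.255.0/24")


def is_listed(rcode: int, a_records: list[str]) -> bool:
    """True only for NOERROR responses carrying a plausible listing address."""
    if rcode != NOERROR:
        return False                      # NXDOMAIN, SERVFAIL, ...: not listed
    for record in a_records:
        addr = ipaddress.ip_address(record)
        if addr in DNSBL_NET and addr not in ADMIN_NET:
            return True
    return False                          # NODATA or unexpected answers


print(is_listed(0, []))                   # NODATA for an ENT: False
print(is_listed(0, ["127.0.0.2"]))        # regular listing: True
print(is_listed(0, ["203.0.113.7"]))      # wildcarded parked domain: False
```

Checking the answer value rather than just the presence of a response covers both the empty NOERROR case and the wildcarding scenario mentioned above.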

 4. When querying ipv6 addresses, up to 32 queries could be needed
    to obtain the proper final response. This is a waste of resources.

Absolutely. This is something I did not have in mind, and unfortunately, I cannot think of an elegant, regression-free solution to this.

Actually, I can; see errata statement above where that NXDOMAIN should be viewed as opting out of the silliness. ;)

steadramon commented 7 months ago

Hi there,

Thanks for the updates to rbldnsd regarding this, when testing I've come across a few quirks, I'm hoping you can shed some light as to what I'm doing wrong.

With the following setup things seem to work fine - (initially setting #$QNMIN true caused an error, changing to #$ENT fixed this so happy I have built with the patch fine)

Filename: test_ip

$SOA 3600 ns0.bl.test.com test.test.com 0 3600 900 7200 900
#$ENT true
127.0.0.2

Startup:

./rbldnsd -b 127.0.0.1 -4 -t 600 bl.test.com:ip4set:test_ip

Dig Outcome

# dig A 127.bl.test.com @127.0.0.1 +norec

; <<>> DiG 9.16.1-Ubuntu <<>> A 127.bl.test.com @127.0.0.1 +norec
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34427
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;127.bl.test.com.                  IN      A

;; AUTHORITY SECTION:
test.com.               900     IN      SOA     ns0.bl.test.com. test.test.com. 1704124468 3600 900 7200 900

However I prefer to have my $SOA/$NS in a generic file with other records:

Filename: test_generic

$SOA 3600 ns0.bl.test.com test.test.com 0 3600 900 7200 900
@ A 1.2.3.4
test A 1.2.3.4

With the SOA line omitted from test_ip -

Startup:

./rbldnsd -b 127.0.0.1 -4 -t 600 bl.test.com:ip4set:test_ip bl.test.com:generic:test_generic

I now notice that unless #$ENT true is also set in test_generic, ENT handling seems to be disabled:

# dig A 127.bl.test.com @127.0.0.1 +norec
...
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 31334

I also notice that trying to look up the "@" record causes rbldnsd to exit:

dig A bl.test.com @127.0.0.1
...
Cannot minimize qname as the packet already contains a response rr
Aborted

Upon adding a third file -

Filename: test_dn

#$ENT true
blockedexample.com
.anotherblockedexample.com

Startup:

./rbldnsd -b 127.0.0.1 -4 -t 600 bl.test.com:ip4set:test_ip bl.test.com:generic:test_generic bl.test.com:generic:test_generic bl.test.com:dnset:test_dn

Queries for "com.bl.test.com" return NXDOMAIN, which doesn't seem to match the expected outcome of this patch. Queries for the domains as listed seem to return fine.

Interestingly, if I start up rbldnsd without the other files (./rbldnsd -b 127.0.0.1 -4 -t 600 bl.test.com:dnset:test_dn), I still get NXDOMAIN for "com.bl.test.com".

Hoping you can help look into these issues, please feel free to ask for clarification on anything.

steadramon commented 7 months ago

Spotted another weird thing, which might be down to defining different dataset types on the same domain...

Using similar files to above

File: test_dn

$SOA 3600 ns0.bl.test.com test.test.com 0 3600 900 7200 900
#$ENT true
blockedexample.com
.anotherblockedexample.com

Startup:

./rbldnsd -n -l +/tmp/rbl.log -b 127.0.0.1 -4 -t 600 bl.test.com:dnset:test_dn

Dig Outcome

# dig blockedexample.com.bl.test.com @127.0.0.1 +norec

; <<>> DiG 9.16.1-Ubuntu <<>> blockedexample.com.bl.test.com @127.0.0.1 +norec
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50755
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;blockedexample.com.bl.test.com.    IN  A

;; ANSWER SECTION:
blockedexample.com.bl.test.com. 600 IN  A   127.0.0.2

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 01 20:57:33 UTC 2024
;; MSG SIZE  rcvd: 64

On stopping rbldnsd I get the following -

rbldnsd: stats for 8secs zone bl.test.com: tot=1 ok=1 nxd=0 err=0 in=71 out=64
rbldnsd: stats for 8sec: tot=1 ok=1 nxd=0 err=0 in=71 out=64

However if I add the additional "generic" file:

Filename test_generic

@ A 1.2.3.4
test A 1.2.3.4

Startup -

./rbldnsd -n -l +/tmp/rbl.log -b 127.0.0.1 -4 -t 600 bl.test.com:dnset:test_dn bl.test.com:generic:test_generic

I get an answer from dig, however it seems to be marked "NXDOMAIN"

# dig blockedexample.com.bl.test.com @127.0.0.1 +norec

; <<>> DiG 9.16.1-Ubuntu <<>> blockedexample.com.bl.test.com @127.0.0.1 +norec
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 20166
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;blockedexample.com.bl.test.com.    IN  A

;; ANSWER SECTION:
blockedexample.com.bl.test.com. 600 IN  A   127.0.0.2

;; AUTHORITY SECTION:
bl.test.com.        900 IN  SOA ns0.bl.test.com. test.test.com. 1704142627 3600 900 7200 900

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 01 20:58:21 UTC 2024
;; MSG SIZE  rcvd: 109

and the log from rbldnsd -

...
rbldnsd: listening on 127.0.0.1/53
rbldnsd: dnset:test_dn: 20240101 205658: e/w=2/1
rbldnsd: generic:test_generic: 20240101 205707: e=2
rbldnsd: zones reloaded, time 0.0e/0.0u sec, mem arena=284 free=66 mmap=0 Kb
rbldnsd: rbldnsd version 0.999 (Still not official, to be released) started (1 socket(s), 1 zone(s))
(Ctrl+C)
rbldnsd: terminating
rbldnsd: stats for 6secs zone bl.test.com: tot=1 ok=0 nxd=1 err=0 in=71 out=109
rbldnsd: stats for 6sec: tot=1 ok=0 nxd=1 err=0 in=71 out=109

Note the OK=0/NXD=1

pspacek commented 1 month ago

I just became aware of this discussion and I want to provide a different angle on query name minimization, as seen from the DNS world. (I'm a DNS software developer, formerly working on Knot Resolver and now working on BIND and various other DNS tools.)

I can perfectly understand that if your data structures provide only an exact-match operation, returning a proper NXDOMAIN is hard. If that's the case, then the best DNS-protocol-compliant answer is the so-called "NODATA", i.e. RCODE=NOERROR + empty ANSWER section. You can put a SOA RR into the AUTHORITY section of such an answer to make it cacheable the same way as you would with RCODE=NXDOMAIN.

This approach is compliant with the DNS spec and allows efficient caching. You could even put a larger TTL on response SOAs higher in the tree, so top-level nodes like 192.subtree can get cached as NODATA for a day while 1.2.0.192.subtree could be cached for only a couple of seconds if you so desire. That way you can populate the cache and quickly get rid of the extra queries generated by the QMIN algorithm.
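A small sketch of the arithmetic involved: under RFC 2308, a resolver caches a negative answer for the smaller of the SOA record's own TTL and the SOA MINIMUM field, so both values have to be raised for the shallow nodes. The depth-based policy function is purely hypothetical (its name, thresholds and TTL values are assumptions, not anything rbldnsd offers today).

```python
def negative_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """Negative-cache lifetime per RFC 2308: min(SOA TTL, SOA MINIMUM)."""
    return min(soa_ttl, soa_minimum)


def soa_ttl_for_depth(labels_below_zone: int, full_depth: int = 4,
                      shallow_ttl: int = 86400, leaf_ttl: int = 5) -> int:
    """Hypothetical policy: long TTL for partial names, short for full ones.

    full_depth is the label count of a complete entry (4 for IPv4 octets).
    """
    return leaf_ttl if labels_below_zone >= full_depth else shallow_ttl


# SOA as shown in the dig output earlier in the thread: TTL 900, MINIMUM 900.
print(negative_ttl(900, 900))
# Raising only the record TTL does not help; MINIMUM still caps it.
print(negative_ttl(86400, 900))
# NODATA for 192.subtree (1 label) vs. a non-listing at full depth (4 labels).
print(soa_ttl_for_depth(1), soa_ttl_for_depth(4))
```

With a policy like this, the SOA returned for a node such as 192.subtree would carry both TTL and MINIMUM set to the larger value.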

A side-note: The DNS Query Name Minimisation spec (currently RFC9156, a new Internet Standard) does not prescribe any changes to compliant DNS servers. RFCs which describe QMIN:

I understand this gets convoluted quickly. I offer help with clarifying this further.