NLnetLabs / nsd

The NLnet Labs Name Server Daemon (NSD) is an authoritative, RFC compliant DNS nameserver.
https://nlnetlabs.nl/nsd
BSD 3-Clause "New" or "Revised" License
463 stars 105 forks source link

different TTLs for a RRSet prevent zone loading #396

Closed and0x000 closed 1 week ago

and0x000 commented 1 month ago

different TTLs in the same RRset prevent a zone from being loaded in NSD 4.10.1

minimal reproducible example:

$ORIGIN example.com.
$TTL 86400

@      IN SOA   ns1.example.com. dns.example.com. 2024102468 86400 10800 3600000 3600
@      IN NS    ns1.example.com.
@      IN NS    ns1.example.com.

@  1   IN TXT   foo
@  2   IN TXT   bar

result:

~ # dig example.com @localhost

; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> example.com @localhost
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 59043
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; EDE: 14 (Not Ready): (Zone is configured but not loaded)
;; QUESTION SECTION:
;example.com.           IN  A

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(localhost) (UDP)
;; WHEN: Fri Oct 25 12:10:31 CEST 2024
;; MSG SIZE  rcvd: 79

Note the ; EDE: 14 (Not Ready): (Zone is configured but not loaded).

Log message:

2024-10-25T12:10:23.614609+02:00 test-dns nsd[2253841]: control cmd:  addzone example.com primary
2024-10-25T12:10:23.616598+02:00 test-dns nsd[2253843]: primary_zonefiles/e/example.com:9: example.com. TTL 2 does not match TTL 1 of TXT RRset
2024-10-25T12:10:23.616714+02:00 test-dns nsd[2253843]: zone example.com file primary_zonefiles/e/example.com read with 1 errors
2024-10-25T12:12:05.048216+02:00 test-dns nsd[2253841]: control cmd:  zonestatus example.com

Yet, zonestatus doesn't hint an issue.

# nsd-control zonestatus example.com
zone:   example.com
    pattern: primary
    state: primary

Personally I'd see this as a regression over the behavior from 4.9.1. In 4.9.1 the warning messages were present as well. However, NSD did load the zone file and didn't resort to SERVFAIL responses.

he32 commented 1 month ago

different TTLs in the same RRset prevent a zone from being loaded in NSD 4.10.1

As it should, I would argue.

See https://datatracker.ietf.org/doc/rfc2181/ section 5.2.

In our local setup (we use BIND), we run named-checkzone and insist on a clean bill of health before feeding the zone to a running name server.

Regards,

and0x000 commented 1 month ago

I'm well aware of the RFC and I don't argue with it. What I argue about is the (from what I found when digging through the changelogs) unnoted change from a lax to a strict parsing. And it's (imo) inconsequential handling. If this is considered a problem, NSD should probably not allow for the zone to be configured at all. Not allow it's state of being configured but not loaded.

he32 @.***> schrieb am Fr., 25. Okt. 2024, 19:06:

different TTLs in the same RRset prevent a zone from being loaded in NSD 4.10.1

As it should, I would argue.

See https://datatracker.ietf.org/doc/rfc2181/ section 5.2.

In our local setup (we use BIND), we run named-checkzone and insist on a clean bill of health before feeding the zone to a running name server.

Regards,

  • Håvard

— Reply to this email directly, view it on GitHub https://github.com/NLnetLabs/nsd/issues/396#issuecomment-2438355565, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFNZQ4KDB5422ZJAPB2CDRDZ5J3CBAVCNFSM6AAAAABQTAI4AKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMZYGM2TKNJWGU . You are receiving this because you authored the thread.Message ID: @.***>

he32 commented 1 month ago

I'm well aware of the RFC and I don't argue with it. What I argue about is the (from what I found when digging through the changelogs) unnoted change from a lax to a strict parsing. And it's (imo) inconsequential handling. If this is considered a problem, NSD should probably not allow for the zone to be configured at all. Not allow it's state of being configured but not loaded.

I must admit that I do not quite understand what change in behaviour you would like to have. E.g. should the zone load, but with the offending RRSet automatically converted so that all records in the set get the lower of all the different TTLs? Or are you suggesting the zone not be loaded at all due to the error, but NSD just continue without it? Both of those could be argued as valid choices, but have different downsides when it comes to supporting the behaviour. In particular the latter might come as a nasty surprise to you, e.g. when the secondaries all expire the offending zone possibly some 14 days after you made the mistake without checking the result, and poof, your domain is properly gone from the rest of the Internet, and at that time you may have a hard time connecting the dots. "Why didn't NSD earlier clearly indicate that this was a problem" might then be a valid criticism.

and0x000 commented 1 month ago

I must admit that I do not quite understand what change in behaviour you would like to have.

To be honest, I'm not sure neither. But I'm rather unhappy with the way this change was introduced.

Prior to 4.10 NSD served all the records with their respective individual TTLs. Personally I'd prefer this behavior to remain (albeit violating the RFC) until their is a new major release.

I presume the change came with the introduction of simdzone. I don't know to what degree semantic versioning is considered in this project. In my opinion, the change from serving with the individual TTLs to configuring but not loading a zone is a change, that would mandate a major version increase.

E.g. should the zone load, but with the offending RRSet automatically converted so that all records in the set get the lower of all the different TTLs?

If there are problems with implementing the pre-4.10-behavior in the new simdzone parsing, this sounds like a reasonable middle ground to me while sticking with the 4.x major release.

k0ekk0ek commented 1 month ago

Hi @and0x000. Thanks for reporting! This is a semantic error and previously always reported as a warning rather than an error. Most semantic errors in process_rr where/are either reported as an error or warning based on whether the server is a secondary or a primary. In itself, it has nothing to do with simdzone. However, the function processing each RR previously contained an if {} else {} with separate log statements for primary vs. secondary. When I included simdzone I simply used the same pattern for this one too. So, honest mistake on my part. As I believe @wcawijngaards prefers keeping this one a semantic error, I'll update the code to always log this as a warning.

and0x000 commented 1 month ago

Thank you for considering this report.

If I read this correctly, this adds a warning log message but doesn't change the behavior? i.e. the zone still won't be loaded if there is a TTL mismatch in the RRset?