Closed anandb-ripencc closed 2 months ago
I cannot see the difference either. I can confirm that zonemaster.net gives the same result. The same issue does not arise with other zones I have tested. What is special about kapper.net?
We will investigate.
Actually, I don't know what is special about kapper.net. I got this report from a user, investigated, and then opened this issue. Hopefully, you can figure it out.
Just for comparison: https://dnsviz.net/d/kapper.net/dnssec/
DNSViz didn't find any issues with the zone, and the MX record check was fine.
Hi all, I briefly started to look into this.
It seems that Zonemaster sees inconsistent TTL values between the MX records returned by different name servers, which triggers the warning. Here's an excerpt of Zonemaster-CLI output where I dumped the values (name server name/IP pair and MX resource record) from the responses:
$ zonemaster-cli kapper.net --test Zone09 --show-testcase --level=INFO --no-ipv6
Seconds Level Testcase Message
======= ======== ============== =======
0.00 INFO Unspecified Using version v6.0.0 of the Zonemaster engine.
ns1.kapper.net/94.136.1.127
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns2.kapper.net/94.16.111.51
kapper.net. 3565 IN MX 10 inbound.kapper.net.
ns3.kapper.net/103.241.67.58
kapper.net. 3565 IN MX 10 inbound.kapper.net.
ns4.kapper.net/139.99.239.52
kapper.net. 3564 IN MX 10 inbound.kapper.net.
ns5.kapper.net/94.136.22.5
kapper.net. 3564 IN MX 10 inbound.kapper.net.
ns6.kapper.net/97.74.83.192
kapper.net. 3564 IN MX 10 inbound.kapper.net.
ns7.kapper.net/144.217.92.144
kapper.net. 3565 IN MX 10 inbound.kapper.net.
ns8.kapper.net/195.200.6.20
kapper.net. 3564 IN MX 10 inbound.kapper.net.
11.20 WARNING Zone09 The MX RRset data is inconsistent between the name servers.
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "97.74.83.192".
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "94.136.1.127".
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "103.241.67.58".
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "144.217.92.144".
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "195.200.6.20".
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "139.99.239.52".
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "94.136.22.5".
11.20 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "94.16.111.51".
It happens almost every time, but not always. Here's a second excerpt, taken just seconds after the previous one:
$ zonemaster-cli kapper.net --test Zone09 --show-testcase --level=INFO --no-ipv6
Seconds Level Testcase Message
======= ======== ============== =======
0.00 INFO Unspecified Using version v6.0.0 of the Zonemaster engine.
ns1.kapper.net/94.136.1.127
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns2.kapper.net/94.16.111.51
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns3.kapper.net/103.241.67.58
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns4.kapper.net/139.99.239.52
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns5.kapper.net/94.136.22.5
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns6.kapper.net/97.74.83.192
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns7.kapper.net/144.217.92.144
kapper.net. 3600 IN MX 10 inbound.kapper.net.
ns8.kapper.net/195.200.6.20
kapper.net. 3600 IN MX 10 inbound.kapper.net.
6.09 INFO Zone09 Mail targets in the MX RRset "inbound.kapper.net." returned from name servers "94.136.1.127;103.241.67.58;144.217.92.144;195.200.6.20;97.74.83.192;94.136.22.5;94.16.111.51;139.99.239.52".
I was able to reproduce the results with dig:
$ dig MX @94.16.111.51 kapper.net +nord
; <<>> DiG 9.18.24-1-Debian <<>> MX @94.16.111.51 kapper.net +nord
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52322
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 3
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;kapper.net. IN MX
;; ANSWER SECTION:
kapper.net. 3586 IN MX 10 inbound.kapper.net.
;; ADDITIONAL SECTION:
inbound.kapper.net. 3586 IN A 94.136.1.122
inbound.kapper.net. 886 IN AAAA 2a02:ab8:4::107
;; Query time: 29 msec
;; SERVER: 94.16.111.51#53(94.16.111.51) (UDP)
;; WHEN: Tue Jul 23 12:18:22 CEST 2024
;; MSG SIZE rcvd: 107
$ dig MX @139.99.239.52 kapper.net +nord
; <<>> DiG 9.18.24-1-Debian <<>> MX @139.99.239.52 kapper.net +nord
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44864
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 3
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;kapper.net. IN MX
;; ANSWER SECTION:
kapper.net. 3578 IN MX 10 inbound.kapper.net.
;; ADDITIONAL SECTION:
inbound.kapper.net. 3578 IN A 94.136.1.122
inbound.kapper.net. 878 IN AAAA 2a02:ab8:4::107
;; Query time: 269 msec
;; SERVER: 139.99.239.52#53(139.99.239.52) (UDP)
;; WHEN: Tue Jul 23 12:20:03 CEST 2024
;; MSG SIZE rcvd: 107
It seems that doing consecutive DNS queries to these name servers reduces the TTL in the MX records each time:
$ date && dig MX @139.99.239.52 kapper.net +nord +noall +answer
Tue Jul 23 12:25:05 PM CEST 2024
kapper.net. 3276 IN MX 10 inbound.kapper.net.
$ date && dig MX @139.99.239.52 kapper.net +nord +noall +answer
Tue Jul 23 12:25:10 PM CEST 2024
kapper.net. 3271 IN MX 10 inbound.kapper.net.
$ date && dig MX @139.99.239.52 kapper.net +nord +noall +answer
Tue Jul 23 12:25:11 PM CEST 2024
kapper.net. 3270 IN MX 10 inbound.kapper.net.
@anandb-ripencc I'll respond to this zone's administrator by email, since he contacted us directly too.
Thanks for looking into this, Tom, and for figuring out that the TTLs are different. I will also respond to him. It may be that we're getting these responses out of a cache of some kind, and the problem isn't in Zonemaster.
We should either ignore the TTL or split the check into TTL and RDATA, respectively. The message only mentions RDATA which is misleading.
So it appears that there is indeed a bug in the name server implementation for that zone regarding the TTL value of resource records. Specifically, the TTL of any resource record appears to be the remaining TTL from the name server's own response cache:
$ date && dig MX @139.99.239.52 kapper.net +nord +noall +answer
Tue Jul 23 01:57:09 PM CEST 2024
kapper.net. 3600 IN MX 10 inbound.kapper.net.
$ date && dig MX @139.99.239.52 kapper.net +nord +noall +answer
Tue Jul 23 01:57:11 PM CEST 2024
kapper.net. 3598 IN MX 10 inbound.kapper.net.
$ date && dig MX @139.99.239.52 kapper.net +nord +noall +answer
Tue Jul 23 01:59:09 PM CEST 2024
kapper.net. 3480 IN MX 10 inbound.kapper.net.
As you can see, for each passing second (as shown by the date command), the TTL value in the resource record returned by the name server decreases by one. This seems to hold for any type of resource record (not just MX) and for most (if not all) name servers of that zone.
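The countdown can be checked with simple arithmetic. The sketch below is only an illustration: the (time, TTL) pairs are copied from the three dig runs above, with the clock times rewritten in 24-hour form, and `ttl_tracks_clock` is a hypothetical helper name, not part of Zonemaster or dig.

```python
from datetime import datetime

# (wall-clock time, TTL) pairs from the dig runs against 139.99.239.52,
# with the "PM" timestamps converted to 24-hour notation.
SAMPLES = [
    ("13:57:09", 3600),
    ("13:57:11", 3598),
    ("13:59:09", 3480),
]


def ttl_tracks_clock(samples, tolerance=1):
    """Return True if every TTL drop matches the elapsed seconds.

    `tolerance` (an assumption of this sketch) absorbs rounding between
    the local clock and the moment the server built its response.
    """
    parsed = [(datetime.strptime(t, "%H:%M:%S"), ttl) for t, ttl in samples]
    for (t0, ttl0), (t1, ttl1) in zip(parsed, parsed[1:]):
        elapsed = (t1 - t0).total_seconds()
        dropped = ttl0 - ttl1
        if abs(elapsed - dropped) > tolerance:
            return False
    return True


print(ttl_tracks_clock(SAMPLES))
```

An authoritative server that always answers with the zone's configured TTL would fail this check, since its TTL would stay at 3600 no matter how much time passes between queries.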
To be exact, for that test case implementation (Zone09) all fields in the resource record are used, so also the owner name, type, class, and RDLENGTH are used to make the hash. See:
The wording in the Zone09 test case specification is unfortunately not correct. I think it was meant to compare only the RDATA, but as written, all data including the TTL is compared. The specification should be updated, either by limiting the comparison to RDATA or by splitting it into separate TTL and RDATA checks.
After that the implementation should be updated.
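To make the two options concrete, here is a minimal sketch of grouping name servers by the record they returned, once with the TTL included and once without. It is not Zonemaster's implementation (which is not shown in this thread), and it does not try to reproduce the exact grouping in the output above. `group_servers` and its `include_ttl` parameter are hypothetical names; the data is copied from the first Zonemaster-CLI excerpt.

```python
from collections import defaultdict

# (name server IP, owner, TTL, class, type, RDATA) from the first excerpt.
OWNER, RDATA = "kapper.net.", "10 inbound.kapper.net."
RESPONSES = [
    ("94.136.1.127", OWNER, 3600, "IN", "MX", RDATA),
    ("94.16.111.51", OWNER, 3565, "IN", "MX", RDATA),
    ("103.241.67.58", OWNER, 3565, "IN", "MX", RDATA),
    ("139.99.239.52", OWNER, 3564, "IN", "MX", RDATA),
    ("94.136.22.5", OWNER, 3564, "IN", "MX", RDATA),
    ("97.74.83.192", OWNER, 3564, "IN", "MX", RDATA),
    ("144.217.92.144", OWNER, 3565, "IN", "MX", RDATA),
    ("195.200.6.20", OWNER, 3564, "IN", "MX", RDATA),
]


def group_servers(responses, include_ttl):
    """Group name server IPs by the record fields they returned."""
    groups = defaultdict(list)
    for ip, owner, ttl, rrclass, rrtype, rdata in responses:
        key = (owner, rrclass, rrtype, rdata)
        if include_ttl:
            key += (ttl,)  # the TTL becomes part of the comparison
        groups[key].append(ip)
    return dict(groups)


# More than one group means the servers are reported as inconsistent.
print(len(group_servers(RESPONSES, include_ttl=True)))
print(len(group_servers(RESPONSES, include_ttl=False)))
```

With the TTL in the key, the eight servers fall into several groups even though the mail target is identical everywhere; with the TTL left out, they form a single group. A split check could run both groupings and report TTL differences under a separate message.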
Yes, we can improve the test case in that regard. However, if we decide not to limit it to just RDATA, we shouldn't stop at the TTL. By the same logic, I think the other fields become just as relevant.
Class is not explicitly checked, but if the MX record (or records) in the answer section does not have the same owner name as the zone name it is ignored, as the specification is written.
Thx for figuring this out! Indeed there is DDoS protection in front of these auth-servers - I guess this is where the different TTL is coming from. Will investigate this further if we can manipulate the responses, though I'm not sure we will be able to get this changed.
Fun fact: it was a bug in the DDoS protection. It's fixed on our side now. Thx again for your effort!
No problem, glad it's fixed! I will close this issue then.
Please refer to the test result here: https://zonemaster.ripe.net/en/result/bc66381e5ac1b0c5
I'm using these versions:
Look at the Zone section, and the warning in there about inconsistent MX RRset data. Just below that are the actual results of the MX RRset responses from all name servers. I squinted, zoomed and looked hard, but I cannot see the inconsistency. I also did queries on the command line using dig, against all the name servers of the zone. I couldn't find the inconsistency. Are you able to shed any light on this please?