PowerDNS / pdns

PowerDNS Authoritative, PowerDNS Recursor, dnsdist
https://www.powerdns.com/
GNU General Public License v2.0

NSEC chain bug with LMDB backend #11612

Open hlindqvist opened 2 years ago

hlindqvist commented 2 years ago

### Short description

Adding a record at a name that previously returned NXDOMAIN and then removing that record again leaves a stale entry in the NSEC chain: the removed name still owns an NSEC record pointing to the next name, with only RRSIG NSEC in its type list (types: RRSIG NSEC ??).

This first surfaced when running the "dyndns" regression test suite with the LMDB backend (see #11611), causing the 1dyndns-update-in-between test to fail, but the same behavior reproduces on master using pdnsutil edit-zone to make the changes.

### Environment

### Steps to reproduce

This description is based on running pdns as in the regression test suite, working with the test.dyndns zone included there (unmodified at the start of these steps). These steps essentially mimic the existing 1dyndns-update-in-between test, but using pdnsutil to make changes instead of nsupdate in order to decouple this issue from dnsupdate support.

An alternative is running `./start-test-stop 15353 lmdb wait 0 1dyndns-update-in-between` in the branch of #11611.

  1. Observable state before changes: no d.host.test.dyndns A record, NSEC a-e

    ~/git/pdns/regression-tests% dig @127.0.0.1 -p 15353 d.host.test.dyndns A +norec +dnssec

    ; <<>> DiG 9.16.28-RH <<>> @127.0.0.1 -p 15353 d.host.test.dyndns A +norec +dnssec
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 65424
    ;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags: do; udp: 1232
    ;; QUESTION SECTION:
    ;d.host.test.dyndns.            IN      A

    ;; AUTHORITY SECTION:
    test.dyndns.            3600    IN      SOA     ns1.test.dyndns. ahu.example.dyndns. 2012060701 28800 7200 604800 86400
    test.dyndns.            3600    IN      RRSIG   SOA 13 2 3600 20220519000000 20220428000000 50164 test.dyndns. l6/iUz021utEPs/jutTtumRRQhKmQsxG5j452P8rfwwo+Xt4UL1KHoe2 hXMrSRwuMjwOVGGMjugRJ9ZhqXIkRw==
    a.host.test.dyndns.     3600    IN      NSEC    e.host.test.dyndns. A RRSIG NSEC
    a.host.test.dyndns.     3600    IN      RRSIG   NSEC 13 4 3600 20220519000000 20220428000000 50164 test.dyndns. YCC3tCYYU9DMA6m+opBMITHj/GXrd40VgtD3HVT5iuVlAuCfc2jzJYtR fepW1S5TesStxJ+NYmZEeMhrKtVMOA==
    delete-add.test.dyndns. 3600    IN      NSEC    a.host.test.dyndns. A TXT RRSIG NSEC
    delete-add.test.dyndns. 3600    IN      RRSIG   NSEC 13 3 3600 20220519000000 20220428000000 50164 test.dyndns. YrZr0j8zZJWzz7qfmR9wD8rvX1FkTOKma/DXQ6e+UviTvJ+HkuXZ1ObT 3jywaEOBmOW2T6QhXtfD6/HlaWqohQ==

    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#15353(127.0.0.1)
    ;; WHEN: Sun May 08 17:39:42 CEST 2022
    ;; MSG SIZE  rcvd: 513

    ~/git/pdns/regression-tests%
  2. Change: add d.host.test.dyndns A record

    ~/git/pdns/regression-tests% ../pdns/pdnsutil --config-dir=. --config-name=lmdb edit-zone test.dyndns
    Checked 26 records of 'test.dyndns', 0 errors, 0 warnings.
    Detected the following changes:
    +d.host.test.dyndns 3600 IN A 127.0.0.1

    You have not updated the SOA record! Would you like to increase-serial?
    (y)es - increase serial, (n)o - leave SOA record as is, (e)dit your changes, (q)uit: y
    (a)pply these changes, (e)dit again, (r)etry with original zone, (q)uit: a
    Adding NSEC ordering information for zone 'test.dyndns', 22 updates
    ~/git/pdns/regression-tests%


  3. Observable state after change: d.host.test.dyndns A record exists

    ~/git/pdns/regression-tests% dig @127.0.0.1 -p 15353 d.host.test.dyndns A +norec +dnssec

    ; <<>> DiG 9.16.28-RH <<>> @127.0.0.1 -p 15353 d.host.test.dyndns A +norec +dnssec
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57233
    ;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags: do; udp: 1232
    ;; QUESTION SECTION:
    ;d.host.test.dyndns.            IN      A

    ;; ANSWER SECTION:
    d.host.test.dyndns.     3600    IN      A       127.0.0.1
    d.host.test.dyndns.     3600    IN      RRSIG   A 13 4 3600 20220519000000 20220428000000 50164 test.dyndns. ZZt4wckVcTtYjCy9iks+uY/vQZA+rOnjP8CAfh/XgXQFKfISodoLV/rK xmhRQ0Haass2kK5g9S3/IiJ39hudpg==

    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#15353(127.0.0.1)
    ;; WHEN: Sun May 08 17:43:18 CEST 2022
    ;; MSG SIZE  rcvd: 170

    ~/git/pdns/regression-tests%


  5. Additional observable state after change: looking up a name between d and e, like d2, we see NSEC d-e with types A RRSIG NSEC (looks legit)

    ~/git/pdns/regression-tests% dig @127.0.0.1 -p 15353 d2.host.test.dyndns A +norec +dnssec

    ; <<>> DiG 9.16.28-RH <<>> @127.0.0.1 -p 15353 d2.host.test.dyndns A +norec +dnssec
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 38171
    ;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags: do; udp: 1232
    ;; QUESTION SECTION:
    ;d2.host.test.dyndns.           IN      A

    ;; AUTHORITY SECTION:
    test.dyndns.            3600    IN      SOA     ns1.test.dyndns. ahu.example.dyndns. 2012060702 28800 7200 604800 86400
    test.dyndns.            3600    IN      RRSIG   SOA 13 2 3600 20220519000000 20220428000000 50164 test.dyndns. mHD3jkr866K53VGftXWOsLqFDP9LFnG49TlX6gpY4KXMlErfWNZzUypJ 35Y8O+sy24MJ8gvstnuBCcd/wnO6tA==
    d.host.test.dyndns.     3600    IN      NSEC    e.host.test.dyndns. A RRSIG NSEC
    d.host.test.dyndns.     3600    IN      RRSIG   NSEC 13 4 3600 20220519000000 20220428000000 50164 test.dyndns. UDinWxqj4CN6eizyJe/CJM4+fpMu0U1q6TUMlR33eFS76vPgneyRushM XkYeHv2eyinNMYPpoi1WisaxqPVaug==
    delete-add.test.dyndns. 3600    IN      NSEC    a.host.test.dyndns. A TXT RRSIG NSEC
    delete-add.test.dyndns. 3600    IN      RRSIG   NSEC 13 3 3600 20220519000000 20220428000000 50164 test.dyndns. YrZr0j8zZJWzz7qfmR9wD8rvX1FkTOKma/DXQ6e+UviTvJ+HkuXZ1ObT 3jywaEOBmOW2T6QhXtfD6/HlaWqohQ==

    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#15353(127.0.0.1)
    ;; WHEN: Sun May 08 17:43:37 CEST 2022
    ;; MSG SIZE  rcvd: 514

    ~/git/pdns/regression-tests%


  6. Change: remove d.host.test.dyndns A record

    ~/git/pdns/regression-tests% ../pdns/pdnsutil --config-dir=. --config-name=lmdb edit-zone test.dyndns
    Checked 25 records of 'test.dyndns', 0 errors, 0 warnings.
    Detected the following changes:
    -d.host.test.dyndns 3600 IN A 127.0.0.1

    You have not updated the SOA record! Would you like to increase-serial?
    (y)es - increase serial, (n)o - leave SOA record as is, (e)dit your changes, (q)uit: y
    (a)pply these changes, (e)dit again, (r)etry with original zone, (q)uit: a
    Adding NSEC ordering information for zone 'test.dyndns', 21 updates
    ~/git/pdns/regression-tests%


  7. Observable state after change: no d.host.test.dyndns record, but still NSEC d-e, with types RRSIG NSEC ??

    ~/git/pdns/regression-tests% dig @127.0.0.1 -p 15353 d.host.test.dyndns A +norec +dnssec

    ; <<>> DiG 9.16.28-RH <<>> @127.0.0.1 -p 15353 d.host.test.dyndns A +norec +dnssec
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 19342
    ;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags: do; udp: 1232
    ;; QUESTION SECTION:
    ;d.host.test.dyndns.            IN      A

    ;; AUTHORITY SECTION:
    test.dyndns.            3600    IN      SOA     ns1.test.dyndns. ahu.example.dyndns. 2012060703 28800 7200 604800 86400
    test.dyndns.            3600    IN      RRSIG   SOA 13 2 3600 20220519000000 20220428000000 50164 test.dyndns. 4eVs5bwmJKmS0LTsPUwjJhPeimCMr69fFlzMATnyZtcAbALHOD6+xdAY aJk2KkCcxF5l27tM9ZoqX6GL5vg0Ig==
    d.host.test.dyndns.     3600    IN      NSEC    e.host.test.dyndns. RRSIG NSEC
    d.host.test.dyndns.     3600    IN      RRSIG   NSEC 13 4 3600 20220519000000 20220428000000 50164 test.dyndns. Sm6D/vwW9OkAa2m6v1DKxcyF5lApGLbcoXxS0CJ9lN2YAyHL9EXa12FC xRpo0vV5zh+yHCRoD034iZMsZif3+g==
    delete-add.test.dyndns. 3600    IN      NSEC    a.host.test.dyndns. A TXT RRSIG NSEC
    delete-add.test.dyndns. 3600    IN      RRSIG   NSEC 13 3 3600 20220519000000 20220428000000 50164 test.dyndns. YrZr0j8zZJWzz7qfmR9wD8rvX1FkTOKma/DXQ6e+UviTvJ+HkuXZ1ObT 3jywaEOBmOW2T6QhXtfD6/HlaWqohQ==

    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#15353(127.0.0.1)
    ;; WHEN: Sun May 08 17:50:14 CEST 2022
    ;; MSG SIZE  rcvd: 511

    ~/git/pdns/regression-tests%



### Expected behaviour
We ought to get back to the original state: no d.host.test.dyndns A record, and NSEC a-e as in step 1 above.
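
To make the expectation concrete, here is a minimal Python sketch of how an NSEC chain follows from the set of owner names. It is an illustration only, not PowerDNS code: the three-name `zone` dictionary is a hypothetical excerpt of test.dyndns (the real zone has more names), and `canonical_key` and `nsec_chain` are names invented for this example. Because the chain is derived purely from the names that own records, adding and then removing d.host.test.dyndns should reproduce the starting chain.

```python
def canonical_key(name):
    """Sort key for RFC 4034 canonical ordering: labels compared right to left, lowercased."""
    labels = name.rstrip(".").lower().split(".")
    return [label.encode("ascii") for label in reversed(labels)]


def nsec_chain(zone):
    """Return {owner: (next_owner, types)} for a zone given as {owner: set of types}."""
    owners = sorted(zone, key=canonical_key)
    chain = {}
    for i, owner in enumerate(owners):
        next_owner = owners[(i + 1) % len(owners)]  # the last owner wraps to the first
        chain[owner] = (next_owner, sorted(zone[owner] | {"RRSIG", "NSEC"}))
    return chain


# Hypothetical three-name excerpt of test.dyndns, for illustration only.
zone = {
    "delete-add.test.dyndns.": {"A", "TXT"},
    "a.host.test.dyndns.": {"A"},
    "e.host.test.dyndns.": {"A"},
}

before = nsec_chain(zone)

zone["d.host.test.dyndns."] = {"A"}  # corresponds to the "add" change
during = nsec_chain(zone)

del zone["d.host.test.dyndns."]      # corresponds to the "remove" change
after = nsec_chain(zone)

print(before["a.host.test.dyndns."])
print(during["a.host.test.dyndns."], during["d.host.test.dyndns."])
print(after == before)
```

In this model the a-e link splits into a-d and d-e while the record exists and merges back once it is gone, which is the round trip the regression test expects.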

### Actual behaviour
We get no d.host.test.dyndns A record, but NSEC d-e with types RRSIG NSEC.
Note that the NSEC chain clearly was updated after the record was removed: the d-e entry was rewritten with types RRSIG NSEC instead of being removed.
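
The symptom can be checked mechanically: in an NXDOMAIN response, an NSEC record owned by the queried name itself is contradictory, since the NSEC asserts that the name exists. The following Python sketch scans dig-style text for that condition. It is a rough helper under stated assumptions, not part of the pdns test suite: `nsec_records` and `stale_owners` are hypothetical names, the parser assumes the simple one-record-per-line layout shown above, and the sample strings are the NSEC lines from steps 1 and 7 with the RRSIG lines left out.

```python
def nsec_records(dig_text):
    """Yield (owner, next_name, types) for each NSEC record line in dig output."""
    for line in dig_text.splitlines():
        if line.startswith(";"):
            continue  # skip dig comments and section headers
        fields = line.split()
        # Expected layout: owner TTL class type rdata...
        if len(fields) >= 5 and fields[3] == "NSEC":
            yield fields[0], fields[4], fields[5:]


def stale_owners(qname, dig_text):
    """Return NSEC owners equal to qname, which should not occur in an NXDOMAIN answer."""
    return [owner for owner, _, _ in nsec_records(dig_text)
            if owner.lower() == qname.lower()]


STEP_1 = """\
a.host.test.dyndns.     3600    IN      NSEC    e.host.test.dyndns. A RRSIG NSEC
delete-add.test.dyndns. 3600    IN      NSEC    a.host.test.dyndns. A TXT RRSIG NSEC
"""

STEP_7 = """\
d.host.test.dyndns.     3600    IN      NSEC    e.host.test.dyndns. RRSIG NSEC
delete-add.test.dyndns. 3600    IN      NSEC    a.host.test.dyndns. A TXT RRSIG NSEC
"""

print(stale_owners("d.host.test.dyndns.", STEP_1))
print(stale_owners("d.host.test.dyndns.", STEP_7))
```

Only the step 7 text reports d.host.test.dyndns. as a stale owner; the step 1 text yields an empty list.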

### Other information

I do not know whether this behavior is specific to the LMDB backend; that is just where the pdns regression tests made me aware of it.
The regression test exposing this, `1dyndns-update-in-between`, is only run with mysql in current master, so it is unclear whether the other, non-mysql backends are affected.

hlindqvist commented 2 years ago

A few notes:

The problem is caused by the "NSEC3" qtype LMDB entries. These do not represent actual NSEC3 records; they store a mapping between the owner name of an actual recordset and an NSEC(3) name. They are not cleaned up when the original recordset is removed, which leaves that name in existence but without any visible records. Because these "NSEC3" qtype LMDB entries are also created in NSEC zones, the "NSEC3" name is merely a potential point of confusion when investigating this.

I have confirmed that if one deletes the "NSEC3" LMDB entry at the supposedly empty name, the NSEC chain looks as expected afterwards. Another aspect is that a name emptied like this could become a new ENT (empty non-terminal), which may or may not be handled already. Ideally, I think `updateEmptyNonTerminals` and `updateDNSSECOrderNameAndAuth` could each be fixed to deal with these cases, but that remains to be verified. Failing that, cleaning up the "NSEC3" entry directly when the last record at a name is deleted is at least a possibility.
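
The mechanism described above can be sketched with a toy model in Python. Everything here is hypothetical: `ToyBackend` is not the LMDB backend's schema or any pdns API, its `ordering` set merely stands in for the "NSEC3" qtype entries, plain string sorting stands in for canonical ordering, and the 192.0.2.1 addresses are placeholders. The `cleanup` flag models the last option mentioned, dropping the ordering entry when the final recordset at a name is deleted.

```python
class ToyBackend:
    """Hypothetical stand-in for a backend; not the real LMDB schema or pdns API."""

    def __init__(self):
        self.rrsets = {}       # (name, qtype) -> list of rdata strings
        self.ordering = set()  # names holding an ordering entry ("NSEC3" qtype entries)

    def add(self, name, qtype, rdata):
        self.rrsets[(name, qtype)] = [rdata]
        self.ordering.add(name)

    def remove(self, name, qtype, cleanup):
        self.rrsets.pop((name, qtype), None)
        name_is_empty = all(n != name for n, _ in self.rrsets)
        if cleanup and name_is_empty:
            self.ordering.discard(name)  # drop the ordering entry with the last rrset

    def nsec_view(self):
        """Return {owner: types} as a chain built from the ordering entries would show."""
        view = {}
        for owner in sorted(self.ordering):  # plain sort stands in for canonical order
            types = {t for n, t in self.rrsets if n == owner}
            view[owner] = sorted(types | {"RRSIG", "NSEC"})
        return view


def add_then_remove(cleanup):
    backend = ToyBackend()
    backend.add("a.host.test.dyndns.", "A", "192.0.2.1")  # placeholder address
    backend.add("e.host.test.dyndns.", "A", "192.0.2.1")  # placeholder address
    backend.add("d.host.test.dyndns.", "A", "127.0.0.1")
    backend.remove("d.host.test.dyndns.", "A", cleanup=cleanup)
    return backend.nsec_view()


buggy = add_then_remove(cleanup=False)
fixed = add_then_remove(cleanup=True)
print(buggy.get("d.host.test.dyndns."))
print("d.host.test.dyndns." in fixed)
```

Without cleanup, the model keeps d.host.test.dyndns. as an NSEC owner with only NSEC and RRSIG in its type list, matching the observed output; with cleanup, the name disappears from the chain. The model does not cover the ENT case, where a name must remain in some form because names below it still exist.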