NLnetLabs / nsd

The NLnet Labs Name Server Daemon (NSD) is an authoritative, RFC compliant DNS nameserver.
https://nlnetlabs.nl/nsd
BSD 3-Clause "New" or "Revised" License

Assertion `token->code == RIGHT_PAREN' failed #345

Open jaredmauch opened 2 weeks ago

jaredmauch commented 2 weeks ago

Updated to head and seeing this issue.

2024-06-28T06:13:19.059699-04:00 puck nsd[28550]: nsd: ./src/generic/parser.h:473: maybe_take: Assertion `token->code == RIGHT_PAREN' failed.
2024-06-28T06:13:21.071231-04:00 puck systemd-coredump[28553]: Process 28550 (nsd: main) of user 116 dumped core.#012#012Module libcap.so.2 from rpm libcap-2.69-8.fc40.x86_64#012Module libnss_systemd.so.2 from rpm systemd-255.7-1.fc40.x86_64#012Module libz.so.1 from rpm zlib-ng-2.1.6-5.fc40.x86_64#012Module libevent-2.1.so.7 from rpm libevent-2.1.12-12.fc40.x86_64#012Module libcrypto.so.3 from rpm openssl-3.2.1-2.fc40.x86_64#012Module libfstrm.so.0 from rpm fstrm-0.6.1-10.fc40.x86_64#012Module libprotobuf-c.so.1 from rpm protobuf-c-1.5.0-3.fc40.x86_64#012Module libssl.so.3 from rpm openssl-3.2.1-2.fc40.x86_64#012Stack trace of thread 28550:#012#0  0x00007fd48b4ab144 __pthread_kill_implementation (libc.so.6 + 0x98144)#012#1  0x00007fd48b45365e raise (libc.so.6 + 0x4065e)#012#2  0x00007fd48b43b902 abort (libc.so.6 + 0x28902)#012#3  0x00007fd48b43b81e __assert_fail_base.cold (libc.so.6 + 0x2881e)#012#4  0x00007fd48b44b977 __assert_fail (libc.so.6 + 0x38977)#012#5  0x0000561071abbfb3 maybe_take.lto_priv.1 (nsd + 0x96fb3)#012#6  0x0000561071aeb1ae parse (nsd + 0xc61ae)#012#7  0x0000561071a886e9 zone_parse (nsd + 0x636e9)#012#8  0x0000561071a888b6 zonec_read (nsd + 0x638b6)#012#9  0x0000561071a8913f namedb_read_zonefile (nsd + 0x6413f)#012#10 0x0000561071b2dd20 namedb_check_zonefiles.constprop.0.isra.0 (nsd + 0x108d20)#012#11 0x0000561071a42f6b main (nsd + 0x1df6b)#012#12 0x00007fd48b43d088 __libc_start_call_main (libc.so.6 + 0x2a088)#012#13 0x00007fd48b43d14b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2a14b)#012#14 0x0000561071a447e5 _start (nsd + 0x1f7e5)#012ELF object binary architecture: AMD x86-64

I'll submit the triggering zonefile in a moment. I have more than one zone file triggering this, each for a secondary zone.

jaredmauch commented 2 weeks ago

backtrace:

[kadimperium.com.txt](https://github.com/user-attachments/files/16028093/kadimperium.com.txt)
(gdb) bt
#0  0x00007fd48b4ab144 in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007fd48b45365e in raise () from /lib64/libc.so.6
#2  0x00007fd48b43b902 in abort () from /lib64/libc.so.6
#3  0x00007fd48b43b81e in __assert_fail_base.cold () from /lib64/libc.so.6
#4  0x00007fd48b44b977 in __assert_fail () from /lib64/libc.so.6
#5  0x0000561071abbfb3 in maybe_take (parser=0x7ffee4b5e400, token=0x7ffee4b4dfd0) at simdzone/./src/generic/parser.h:473
#6  0x0000561071aeb1ae in take (parser=<optimized out>, token=0x7ffee4b4dfd0) at simdzone/./src/generic/parser.h:514
#7  parse_rr (parser=0x7ffee4b5e400, token=0x7ffee4b4dfd0) at simdzone/./src/generic/format.h:225
#8  parse (parser=0x7ffee4b5e400) at simdzone/./src/generic/format.h:385
#9  0x0000561071a886e9 in parse (parser=0x7ffee4b5e400, user_data=0x7ffee4b4e1f0) at simdzone/./src/zone.c:99
#10 zone_parse (parser=0x7ffee4b5e400, options=0x7ffee4b21550, buffers=0x7ffee4b214f0, path=0x7ffee4b4e1f0 "\240\302P\204\020V", 
    user_data=0x7ffee4b4e1f0) at simdzone/./src/zone.c:462
#11 zone_parse (parser=parser@entry=0x7ffee4b5e400, options=options@entry=0x7ffee4b4e230, buffers=buffers@entry=0x7ffee4b4e1d0, 
    path=path@entry=0x5610841a42f8 "/etc/nsd/secondaries/kadimperium.com", user_data=user_data@entry=0x7ffee4b4e1f0)
    at simdzone/./src/zone.c:451
#12 0x0000561071a888b6 in zonec_read (database=<optimized out>, domains=<optimized out>, name=0x5610841a4090 "kadimperium.com", 
    zonefile=zonefile@entry=0x5610841a42f8 "/etc/nsd/secondaries/kadimperium.com", zone=zone@entry=0x5610888cae90)
    at /usr/src/debug/nsd-4.10.1-5.fc40.x86_64/zonec.c:400
#13 0x0000561071a8913f in namedb_read_zonefile (nsd=<optimized out>, zone=<optimized out>, taskudb=0x0, last_task=0x0)
    at /usr/src/debug/nsd-4.10.1-5.fc40.x86_64/dbaccess.c:261
#14 0x0000561071b2dd20 in namedb_check_zonefile (nsd=<optimized out>, taskudb=<optimized out>, last_task=<optimized out>, 
    zopt=0x5610841a3f20) at /usr/src/debug/nsd-4.10.1-5.fc40.x86_64/dbaccess.c:326
#15 namedb_check_zonefiles.constprop.0.isra.0 (taskudb=0x0, last_task=0x0, nsd=<optimized out>, opt=<optimized out>)
    at /usr/src/debug/nsd-4.10.1-5.fc40.x86_64/dbaccess.c:335
#16 0x0000561071a42f6b in server_prepare (nsd=<optimized out>) at /usr/src/debug/nsd-4.10.1-5.fc40.x86_64/server.c:1530
#17 main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/nsd-4.10.1-5.fc40.x86_64/nsd.c:1760
(gdb) up
#1  0x00007fd48b45365e in raise () from /lib64/libc.so.6
(gdb) up
#2  0x00007fd48b43b902 in abort () from /lib64/libc.so.6
(gdb) up
#3  0x00007fd48b43b81e in __assert_fail_base.cold () from /lib64/libc.so.6
(gdb) up
#4  0x00007fd48b44b977 in __assert_fail () from /lib64/libc.so.6
(gdb) up
#5  0x0000561071abbfb3 in maybe_take (parser=0x7ffee4b5e400, token=0x7ffee4b4dfd0) at simdzone/./src/generic/parser.h:473
473       assert(token->code == RIGHT_PAREN);
(gdb) up
#6  0x0000561071aeb1ae in take (parser=<optimized out>, token=0x7ffee4b4dfd0) at simdzone/./src/generic/parser.h:514
514       maybe_take(parser, token);
(gdb) up
#7  parse_rr (parser=0x7ffee4b5e400, token=0x7ffee4b4dfd0) at simdzone/./src/generic/format.h:225
225   take(parser, token);
(gdb) up
#8  parse (parser=0x7ffee4b5e400) at simdzone/./src/generic/format.h:385
385       code = parse_rr(parser, &token);
(gdb) up
#9  0x0000561071a886e9 in parse (parser=0x7ffee4b5e400, user_data=0x7ffee4b4e1f0) at simdzone/./src/zone.c:99
99    return kernel->parse(parser);
(gdb) up
#10 zone_parse (parser=0x7ffee4b5e400, options=0x7ffee4b21550, buffers=0x7ffee4b214f0, path=0x7ffee4b4e1f0 "\240\302P\204\020V", 
    user_data=0x7ffee4b4e1f0) at simdzone/./src/zone.c:462
462   code = parse(parser, user_data);
(gdb) up
#11 zone_parse (parser=parser@entry=0x7ffee4b5e400, options=options@entry=0x7ffee4b4e230, buffers=buffers@entry=0x7ffee4b4e1d0, 
    path=path@entry=0x5610841a42f8 "/etc/nsd/secondaries/kadimperium.com", user_data=user_data@entry=0x7ffee4b4e1f0)
    at simdzone/./src/zone.c:451
451 int32_t zone_parse(
(gdb) up
#12 0x0000561071a888b6 in zonec_read (database=<optimized out>, domains=<optimized out>, name=0x5610841a4090 "kadimperium.com", 
    zonefile=zonefile@entry=0x5610841a42f8 "/etc/nsd/secondaries/kadimperium.com", zone=zone@entry=0x5610888cae90)
    at /usr/src/debug/nsd-4.10.1-5.fc40.x86_64/zonec.c:400
400     if (zone_parse(&parser, &options, &buffers, zonefile, &state) != 0) {
(gdb) print zonefile
$1 = 0x5610841a42f8 "/etc/nsd/secondaries/kadimperium.com"

jaredmauch commented 2 weeks ago

kadimperium.com.txt

k0ekk0ek commented 2 weeks ago

Thanks for reporting @jaredmauch. Was about to ask if we could have the zonefile that triggered it, but you beat me to it :slightly_smiling_face:. I'll probably get to this next week.

jaredmauch commented 2 weeks ago

@k0ekk0ek there are numerous zone files triggering it

zone.slave=6900

jaredmauch commented 1 week ago

Curious if you have figured it out yet, so I can grab a branch and test.

k0ekk0ek commented 1 week ago

Not yet, I'm a bit pressed for time and another PR was first on my list. I'll have a look tomorrow.

k0ekk0ek commented 1 week ago

@jaredmauch, I believe the indexer state is corrupted when the first OPENPGP RR is encountered. The RDATA size exceeds the buffer size, so the buffer must be enlarged and the indexer must resume where it left off, but the state is likely not managed correctly in that particular case. I've created NLnetLabs/simdzone#213 to track progress. Thanks for reporting! I'll keep you posted.

k0ekk0ek commented 1 week ago

Fixed it. Upon resize, the partial token reference was not properly updated. @jaredmauch, can you give https://github.com/k0ekk0ek/simdzone/tree/fix-213 a try? If you add a remote in the nsd/simdzone Git submodule and check out the fix-213 branch, things should work.
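
For readers unfamiliar with this class of bug, here is a minimal sketch of a reference into a buffer going stale when the buffer is resized. It is not simdzone's code: `struct buffer`, its `partial` member, and `buffer_grow` are hypothetical names chosen for illustration. The idea is to record the partial token's position as an offset before `realloc` can move the storage, then rebase the pointer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified buffer; NOT simdzone's actual data structure. */
struct buffer {
  char *data;          /* heap-allocated storage */
  size_t size;         /* number of bytes allocated */
  size_t length;       /* number of bytes in use */
  const char *partial; /* start of a token that is still being read, or NULL */
};

/* Grow the storage to at least `want` bytes and rebase the partial token
 * reference. Returns 0 on success, -1 if allocation fails (the buffer is
 * left unchanged in that case). */
static int buffer_grow(struct buffer *buf, size_t want)
{
  size_t offset = 0;
  int has_partial = (buf->partial != NULL);
  char *data;

  if (want <= buf->size)
    return 0;

  /* Remember the position as an offset while the old block is still valid. */
  if (has_partial)
    offset = (size_t)(buf->partial - buf->data);

  data = realloc(buf->data, want);
  if (data == NULL)
    return -1;

  buf->data = data;
  buf->size = want;

  /* Skipping this step leaves `partial` pointing into the old block
   * whenever realloc moved the data. */
  if (has_partial)
    buf->partial = buf->data + offset;
  return 0;
}
```

A caller would invoke `buffer_grow` whenever a token (for example a long RDATA field) does not fit in the remaining space, then continue scanning from `buf->partial`.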

jaredmauch commented 1 week ago

I'm now seeing this:

[2024-07-05 06:39:58.235] nsd[2510162]: info: rehash of zone bloom-legal.fr. with parameters 1 0 1 -
nsd: ./src/generic/parser.h:687: maybe_take_quoted: Assertion `*parser->file->fields.head > *parser->file->delimiters.head' failed.
[2024-07-05 06:40:01.808] nsd[2510161]: error: did not get start signal from main

bloom-legal.fr.txt

k0ekk0ek commented 1 week ago

Thanks for testing. I'll tackle the new one next.

jaredmauch commented 1 week ago

If you send me an email offline I can give you a tar of my entire config/directory that might help if there are others. I can't find your email but mine should be pretty easy to find :-)

jaredmauch commented 1 week ago

I also opened the related https://github.com/NLnetLabs/nsd/pull/349 to ease troubleshooting.

k0ekk0ek commented 1 week ago

The last one was simply a bad test in the assert, i.e. the test is meant to verify that the start of the field comes before the end of the field, but in this case it verified that the end comes before the start. As it's in maybe_take_quoted, the assert only triggers if the buffer needs refilling, which explains why you'd see it intermittently. I've updated the PR. I'll run with all the configs you provided to see if I can find any more bugs. I'll try to think of more edge cases myself too.
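
To illustrate the inverted check, here is a small sketch assuming a hypothetical `struct field` with `start` and `end` pointers. This is not the actual simdzone layout, which tracks fields and delimiters through `parser->file->fields.head` and `parser->file->delimiters.head`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field span: `start` points at the first byte of the field,
 * `end` points one past its last byte. */
struct field {
  const char *start;
  const char *end;
};

static size_t field_length(const struct field *field)
{
  /* Intended invariant: the start comes before the end. The inverted form,
   * assert(field->end < field->start), would abort on every valid field. */
  assert(field->start < field->end);
  return (size_t)(field->end - field->start);
}
```

Because `assert` compiles to nothing when `NDEBUG` is defined, a wrong test like this only aborts in builds that have assertions enabled.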