NLnetLabs / unbound

Unbound is a validating, recursive, and caching DNS resolver.
https://nlnetlabs.nl/unbound
BSD 3-Clause "New" or "Revised" License

local root zone can't resolving some domain #646

Closed ziojacky closed 2 years ago

ziojacky commented 2 years ago

I am using unbound 1.15 as a resolver with a local copy of the root zone, but some domains fail to resolve. The main problem domains are everything under root-servers.net, plus icann and internic; other domains resolve normally. unbound is compiled with libevent. My unbound conf is:

server:
    verbosity: 1
    num-threads: 2
    port: 53
    interface: 127.0.0.1
    do-ip6: no
    access-control: 0.0.0.0/0 allow
    logfile: "unbound.log"
    root-hints: "root.hints"
    auto-trust-anchor-file: "root.key"

auth-zone:
    name: "."
    for-downstream: no
    for-upstream: yes
    zonefile: "root.zone"

wcawijngaards commented 2 years ago

For me, such a configuration resolves a.root-servers.net just fine. Have you tried setting the verbosity to 4, and then log the first resolution to see what is going wrong? The details then end up in the log file unbound.log.

ziojacky commented 2 years ago

Judging by the log, resolution works normally, but dig gets no answer. If I disable auto-trust-anchor-file, dig gets normal answers for icann and internic. But no matter how I set up the conf file, nothing under root-servers.net gets an answer in dig. It looks like an endless loop...

wcawijngaards commented 2 years ago

So the auto-trust-anchor-file statement is causing trouble, and turning it off fixes some resolutions? Then something must be wrong with the file. It would be interesting to set verbosity to 4, so that unbound logs what happens in more detail. Then start the server again and look up one of the failing domains. That should give a lot of detail on what is wrong.
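
A minimal sketch of the logging change suggested here, assuming the rest of the reporter's `server:` clause stays as posted. Only `verbosity` changes; `logfile` is repeated from the original configuration for context.

```conf
server:
    # Level 4 logs algorithm-level detail for each query
    verbosity: 4
    # Same log file as in the configuration posted earlier in this thread
    logfile: "unbound.log"
```

After restarting unbound, a single lookup of one failing name (for example a.root-servers.net) should be enough to capture the detailed trace in unbound.log.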

ziojacky commented 2 years ago

Hmm, I set verbosity to 4. The log seems to show nothing wrong, and it contains the right answer. After starting a new unbound process I queried the icann.org A record only once, yet the log keeps growing even though I did nothing in between. The log content is the same queries repeating in a loop...

Here are some parts:

`[1647468850] unbound[975:5] info: sending query: icann.org. A IN
[1647468850] unbound[975:5] debug: sending to target: 199.19.56.1#53
[1647468850] unbound[975:5] debug: dnssec status: expected
[1647468850] unbound[975:5] debug: mesh_run: iterator module exit state is module_wait_reply
[1647468850] unbound[975:5] info: mesh_run: end 1 recursion states (1 with reply, 0 detached), 1 waiting replies, 0 recursion replies sent, 0 replies dropped, 0 states jostled out
[1647468850] unbound[975:5] info: 0RDd mod1 rep icann.org. A IN
[1647468850] unbound[975:5] debug: cache memory msg=66072 rrset=70274 infra=7808 val=66352
[1647468850] unbound[975:5] debug: serviced send timer
[1647468850] unbound[975:5] debug: EDNS lookup known=0 vs=0
[1647468850] unbound[975:5] debug: serviced query UDP timeout=376 msec
[1647468850] unbound[975:5] debug: inserted new pending reply id=6744
[1647468850] unbound[975:5] debug: opened UDP if=0 port=4219
[1647468850] unbound[975:5] debug: comm point start listening 52 (-1 msec)
[1647468850] unbound[975:5] debug: answer cb
[1647468850] unbound[975:5] debug: Incoming reply id = 6744
[1647468850] unbound[975:5] debug: Incoming reply addr = ip4 199.19.56.1 port 53 (len 16)
[1647468850] unbound[975:5] debug: lookup size is 1 entries
[1647468850] unbound[975:5] debug: received udp reply.
[1647468850] unbound[975:5] debug: udp message[471:0] 674480100001000000080003056963616E6E036F72670000010001C00C0002000100015180001501620D6963616E6E2D73657276657273036E657400C00C00020001000151800005026E73C00CC00C000200010001518000040161C029C00C000200010001518000040163C029C00C002B0001000151800024468C07026BE021818B9F10ED981A03ACBF74F01E31FB15C58680AD0C4BAA464BF37A7523C00C002B00010001518000245C200702AF57A492640102809209AA005B93C32B7ACC83734BC785CFA50B51688299CD61C00C002B000100015180002443600702885CF8A6CF35FD5C619E1D48B59AFB23063BBA9FEC52FF25F99094CBA10910A2C00C00
[1647468850] unbound[975:5] debug: udp message[471:256] 2E0001000151800097002B080200015180624476216228B891776D036F7267004C57B0826D121BEA3B861F1B110E36491785E08CB8233B572DF657F228E7416A9AAA0951D19A61477C777A3F38FFF2CC97423ACA7BECF143BBBC0EDC72FEC9E056AEBC75F4793E38DA050F49DD099AE6FDCED6FCD9BFF296A0EBC883E094D8EF9DB5AB0B2173824210E8A510730A356012710033994D0BFFB0F3CF6F1888C6ADC04800010001000151800004C7048A35C048001C00010001518000102001050000890000000000000000005300002904D0000080000000
[1647468850] unbound[975:5] debug: outnet handle udp reply
[1647468850] unbound[975:5] debug: serviced query: EDNS works for ip4 199.19.56.1 port 53 (len 16)
[1647468850] unbound[975:5] debug: measured roundtrip at 50 msec
[1647468850] unbound[975:5] debug: svcd callbacks start
[1647468850] unbound[975:5] debug: worker svcd callback for qstate 0x7f8b2c4ad650
[1647468850] unbound[975:5] debug: mesh_run: start
[1647468850] unbound[975:5] debug: iterator[module 1] operate: extstate:module_wait_reply event:module_event_reply
[1647468850] unbound[975:5] info: iterator operate: query icann.org. A IN
[1647468850] unbound[975:5] debug: process_response: new external response event
[1647468850] unbound[975:5] info: scrub for org. NS IN
[1647468850] unbound[975:5] info: response for icann.org. A IN
[1647468850] unbound[975:5] info: reply from 199.19.56.1#53
[1647468850] unbound[975:5] info: incoming scrubbed packet:
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 0
;; flags: qr ; QUERY: 1, ANSWER: 0, AUTHORITY: 8, ADDITIONAL: 2
;; QUESTION SECTION:
icann.org. IN A

;; ANSWER SECTION:

;; AUTHORITY SECTION:
icann.org. 4 IN NS a.icann-servers.net.
icann.org. 4 IN NS c.icann-servers.net.
icann.org. 4 IN NS b.icann-servers.net.
icann.org. 4 IN NS ns.icann.org.
icann.org. 4 IN DS 23584 7 2 AF57A492640102809209AA005B93C32B7ACC83734BC785CFA50B51688299CD61
icann.org. 4 IN DS 17248 7 2 885CF8A6CF35FD5C619E1D48B59AFB23063BBA9FEC52FF25F99094CBA10910A2
icann.org. 4 IN DS 18060 7 2 6BE021818B9F10ED981A03ACBF74F01E31FB15C58680AD0C4BAA464BF37A7523
icann.org. 4 IN RRSIG DS 8 2 86400 20220330152417 20220309142417 30573 org. TFewgm0SG+o7hh8bEQ42SReF4Iy4IztXLfZX8ijnQWqaqglR0ZphR3x3ej84//LMl0I6ynvs8UO7vA7ccv7J4FauvHX0eT442gUPSd0Jmub9ztb82b/ylqDryIPglNjvnbWrCyFzgkIQ6KUQcwo1YBJxADOZTQv/sPPPbxiIxq0= ;{id = 30573}

;; ADDITIONAL SECTION:
ns.icann.org. 4 IN A 199.4.138.53
ns.icann.org. 4 IN AAAA 2001:500:89::53
;; MSG SIZE rcvd: 460`

wcawijngaards commented 2 years ago

I do not see anything wrong in the logs you quote. One thing is suspicious: the TTL on the RRs in the reply you get from the .org server is 4 seconds, but in reality those RRs have a TTL of about 86400. For a reply from the .org servers I would expect the full TTL. The content looks okay otherwise.

ziojacky commented 2 years ago

Today I found an error in my log:

unbound[2253:7] debug: tcp error for address ip4 192.5.6.30 port 53 (len 16)

If I set do-tcp: yes, all the previous errors are resolved, but I don't want to open a TCP listening port locally. Is there any way to solve this?

wcawijngaards commented 2 years ago

Apart from the weird TTL, you may also have an MTU problem: you receive only 460 bytes, less than 512, where replies of up to about 1500 bytes (a normal MTU) would be expected. Firewalls and similar settings can cause this. When the MTU is too small, TCP is needed as well; unbound then uses TCP to fetch the larger message. You could lower edns-buffer-size: to something like 512, but that is too small and will likely cause a lot of fallback to TCP, for which you then have to enable TCP anyway. This fallback to TCP seems to be happening by itself already, since you receive shorter replies, which also have altered TTLs.

If there is an upstream firewall you can fix, e.g. by allowing DNS messages larger than 512 bytes on port 53, that would probably fix a lot of issues. Having TCP enabled is also a good idea. There is no way to disable TCP for downstream while enabling it for upstream. Having TCP enabled is the usual default, and several RFCs encourage it.
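
A hedged sketch of the settings discussed in this reply, assuming the resolver only needs to be reachable from the local machine. The option names are standard unbound.conf options. The loopback-only `access-control` line and the explicit `edns-buffer-size` value are illustrative assumptions, not taken from this thread.

```conf
server:
    # Listen on loopback only, so the TCP port is not exposed to other hosts
    interface: 127.0.0.1
    # Assumption: restrict clients to the local machine
    access-control: 127.0.0.0/8 allow
    # Enable TCP so that truncated UDP replies can be retried over TCP
    do-tcp: yes
    # Assumption: a conservative EDNS buffer size; adjust for your network path
    edns-buffer-size: 1232
```

With `interface: 127.0.0.1`, the TCP listening socket is bound to loopback only, which may address the concern about opening a TCP port while still letting unbound use TCP towards upstream servers.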