plevart closed this issue 2 years ago
Great catch, thanks @plevart. Can you check whether #125 behaves correctly with BIND?
I can confirm that it now works correctly with BIND 9 in front of CoreDNS/k8s_gateway. Thanks for the quick fix.
I was hitting the same issue, and the issue-124 branch appears to fix it (with Unbound in front)! Thanks as well for the quick fix!
The fix is in the latest v0.3.1 release.
Hi,
I have set up a dedicated CoreDNS instance with the k8s_gateway plugin to serve a special zone for the external IP addresses obtained from LoadBalancer services and Ingresses. The following is the Corefile config file:
I noticed strange behavior when resolving through a BIND 9 caching nameserver that has this CoreDNS server configured as a forwarding zone target. Resolving a name that should be present would sometimes yield "Could not resolve host" and sometimes succeed for the same name. Digging deeper, I found that this CoreDNS/k8s_gateway combo does not return correct answers to AAAA queries when the requested name has no AAAA record but does have an A record. In such a scenario the DNS server should return status NOERROR with an empty answer section. For example, here is a query against a CoreDNS instance with the kubernetes plugin for internal k8s names:
And here is the response to a query against the CoreDNS instance with the k8s_gateway plugin, for the special zone and an existing name that has an A record but no AAAA record:
I think this is exactly the problem that this document describes:
https://datatracker.ietf.org/doc/html/rfc4074
TL;DR: Returning status NXDOMAIN (3) for an AAAA query means that there are no records of any type for the requested name. Returning NOERROR (0) with an empty answer section for an AAAA query merely means that there are no AAAA records for the requested name, although there are records of other types (such as A, for example).
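To make the distinction concrete, here is a minimal Go sketch of the decision described above. It is not the k8s_gateway code: the `zone` type, the `answer` method, and the name `app.example.test.` are all hypothetical, and the only thing it is meant to show is that the response code depends on whether the name exists at all, not on whether the requested type exists.

```go
package main

import "fmt"

// Response codes as defined in RFC 1035.
const (
	RcodeNoError  = 0
	RcodeNXDomain = 3
)

// zone is a hypothetical in-memory zone: name -> record type -> values.
type zone map[string]map[string][]string

// answer returns the response code and answer section for (name, qtype).
// A name that is unknown for every type yields NXDOMAIN. A known name
// without records of the requested type yields NOERROR with an empty
// answer section (often called NODATA).
func (z zone) answer(name, qtype string) (int, []string) {
	types, ok := z[name]
	if !ok {
		return RcodeNXDomain, nil
	}
	// Indexing a missing type returns a nil slice, i.e. an empty answer.
	return RcodeNoError, types[qtype]
}

func main() {
	// Hypothetical name with an A record but no AAAA record.
	z := zone{"app.example.test.": {"A": {"192.0.2.10"}}}

	queries := []struct{ name, qtype string }{
		{"app.example.test.", "A"},
		{"app.example.test.", "AAAA"},
		{"missing.example.test.", "AAAA"},
	}
	for _, q := range queries {
		rcode, ans := z.answer(q.name, q.qtype)
		fmt.Printf("%s %s -> rcode=%d answers=%v\n", q.name, q.qtype, rcode, ans)
	}
}
```

In this sketch the AAAA query for the existing name gets NOERROR with no answers, and only the name that is absent from the zone gets NXDOMAIN.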
The Linux resolver typically asks for A and AAAA records in two concurrent queries when resolving a name. Depending on which answer arrives first, it may cache the A answer, or it may cache the negative AAAA answer, in which case it may report "Can't resolve name". When an AAAA request returns status NOERROR with an empty answer section, that answer does not "override" the positive answer to the A query, and resolution always works correctly.
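The following Go sketch illustrates why the two negative answers behave so differently once a cache is involved. It is a deliberately simplified model, not the implementation used by BIND, Unbound, or the Linux resolver. It assumes the negative-caching rules of RFC 2308, where an NXDOMAIN is remembered for the whole name and an empty NOERROR answer only for the (name, type) pair. The `negCache` type and its methods are hypothetical.

```go
package main

import "fmt"

const rcodeNXDomain = 3 // RFC 1035 "Name Error"

// cacheKey identifies a NODATA entry: one name and one query type.
type cacheKey struct{ name, qtype string }

// negCache is a simplified negative cache without TTL handling.
type negCache struct {
	nxdomain map[string]bool   // name does not exist for any type
	nodata   map[cacheKey]bool // name exists, but not with this type
}

func newNegCache() *negCache {
	return &negCache{nxdomain: map[string]bool{}, nodata: map[cacheKey]bool{}}
}

// store records a response that carried no answers.
func (c *negCache) store(name, qtype string, rcode int) {
	if rcode == rcodeNXDomain {
		c.nxdomain[name] = true // applies to every type for this name
		return
	}
	c.nodata[cacheKey{name, qtype}] = true // applies to this type only
}

// blocked reports whether a query would be answered negatively from cache.
func (c *negCache) blocked(name, qtype string) bool {
	return c.nxdomain[name] || c.nodata[cacheKey{name, qtype}]
}

func main() {
	name := "app.example.test." // hypothetical name with an A record only

	// Upstream wrongly answers the AAAA query with NXDOMAIN.
	buggy := newNegCache()
	buggy.store(name, "AAAA", rcodeNXDomain)
	fmt.Println("NXDOMAIN cached, A lookup blocked:", buggy.blocked(name, "A"))

	// Upstream answers the AAAA query with NOERROR and an empty answer.
	fixed := newNegCache()
	fixed.store(name, "AAAA", 0)
	fmt.Println("NODATA cached, A lookup blocked:", fixed.blocked(name, "A"))
}
```

Under these assumptions, a cached NXDOMAIN for the AAAA query also suppresses the later A lookup, while a cached empty NOERROR answer leaves the A lookup untouched, which matches the intermittent failures described above.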