raffaelespazzoli opened this issue 5 months ago
Thanks for reaching out, I will try to reproduce the root cause of this issue. However, if you could provide CoreDNS logs from before the crash, that would be very valuable to me. In the meantime, I can give a piece of advice regarding your configuration. The rewrite configuration that you made causes CoreDNS to return answers that are inconsistent with the original questions (and I suppose that such behavior does not adhere to the RFC) - please take a look at rewriting responses.
To put it simply, instead of an ANSWER section like this in the response:
dig @10.89.0.225 -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.cluster1
...
;; ANSWER SECTION:
_peers._tcp.etcd-headless.h2.svc.cluster.local. 30 IN SRV 0 100 2379 etcd-headless.h2.svc.cluster.local.
you should receive something like this:
dig @10.89.0.225 -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.cluster1
...
;; ANSWER SECTION:
_peers._tcp.etcd-headless.h2.svc.cluster.cluster1. 30 IN SRV 0 100 2379 etcd-headless.h2.svc.cluster.cluster1.
My suggestion is to use the configuration snippet below for the rewrite plugin in each of the cluster[1-3] zones - the example below is for cluster1:
rewrite stop {
    name suffix .cluster.cluster1 .cluster.local answer auto
}
good call. I made those changes, but I'm still getting the same behavior. So now I'm getting something like this:
dig @10.89.0.225 -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.cluster2
; <<>> DiG 9.18.24 <<>> @10.89.0.225 -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.cluster2
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4600
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: d1c25b7278f911e0 (echoed)
;; QUESTION SECTION:
;_peers._tcp.etcd-headless.h2.svc.cluster.cluster2. IN SRV
;; ANSWER SECTION:
_peers._tcp.etcd-headless.h2.svc.cluster.cluster2. 10 IN SRV 0 100 2379 etcd-headless.h2.svc.cluster.cluster2.
;; ADDITIONAL SECTION:
etcd-headless.h2.svc.cluster.cluster2. 10 IN A 10.96.1.114
;; Query time: 3 msec
;; SERVER: 10.89.0.225#53(10.89.0.225) (UDP)
;; WHEN: Tue Apr 02 13:44:24 EDT 2024
;; MSG SIZE rcvd: 249
I have prepared a minimal standalone config to reproduce the problem in local docker:
.:{$DNS_PORT} {
    log . "catch-all logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    template IN SRV local {
        match (_[^.]+\.)*(?P<record>.*)$
        answer "{{ .Name }} 10 IN SRV 0 100 2379 {{ .Group.record }}"
        fallthrough
    }
}
cluster.cluster1:{$DNS_PORT} {
    log . "cluster1 logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    rewrite stop {
        name suffix .cluster.cluster1 .cluster.local answer auto
    }
    forward . 127.0.0.1:{$DNS_PORT}
}
cluster.cluster2:{$DNS_PORT} {
    log . "cluster2 logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    rewrite stop {
        name suffix .cluster.cluster2 .cluster.local answer auto
    }
    forward . 127.0.0.1:{$DNS_PORT}
}
cluster.cluster3:{$DNS_PORT} {
    log . "cluster3 logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    rewrite stop {
        name suffix .cluster.cluster3 .cluster.local answer auto
    }
    forward . 127.0.0.1:{$DNS_PORT}
}
cluster.all:{$DNS_PORT} {
    gathersrv cluster.all. {
        cluster.cluster1. c1-
        cluster.cluster2. c2-
        cluster.cluster3. c3-
    }
    log . "sub-query logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    forward . 127.0.0.1:{$DNS_PORT}
}
When I ask for a SRV record in the zone cluster.all, I get all three records as a result:
dig -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.all -p5300 @127.0.0.1
; <<>> DiG 9.18.20 <<>> -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.all -p5300 @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26628
;; flags: qr aa rd; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 355c70260c63fcfe (echoed)
;; QUESTION SECTION:
;_peers._tcp.etcd-headless.h2.svc.cluster.all. IN SRV
;; ANSWER SECTION:
_peers._tcp.etcd-headless.h2.svc.cluster.all. 10 IN SRV 0 100 2379 c3-etcd-headless.h2.svc.cluster.all.
_peers._tcp.etcd-headless.h2.svc.cluster.all. 10 IN SRV 0 100 2379 c1-etcd-headless.h2.svc.cluster.all.
_peers._tcp.etcd-headless.h2.svc.cluster.all. 10 IN SRV 0 100 2379 c2-etcd-headless.h2.svc.cluster.all.
;; Query time: 2 msec
;; SERVER: 127.0.0.1#5300(127.0.0.1) (UDP)
;; WHEN: Tue Apr 02 20:37:03 CEST 2024
;; MSG SIZE rcvd: 382
In my snippet, there are additional logs that show which requests were handled by each zone - so the output from CoreDNS is as follows:
.:5300
cluster.all.:5300
cluster.cluster1.:5300
cluster.cluster2.:5300
cluster.cluster3.:5300
CoreDNS-1.11.1
linux/amd64, go1.21.3, v1.11.1
[INFO] catch-all logger: 127.0.0.1:33758 - 7774 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.local. udp 87 false 1232 NOERROR qr,aa,rd 164 0.000226256s
[INFO] catch-all logger: 127.0.0.1:58998 - 2692 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.local. udp 87 false 1232 NOERROR qr,aa,rd 164 0.000373521s
[INFO] cluster3 logger: 127.0.0.1:54775 - 35088 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster3. udp 87 false 1232 NOERROR qr,aa,rd 196 0.000551974s
[INFO] cluster1 logger: 127.0.0.1:54817 - 53917 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster1. udp 87 false 1232 NOERROR qr,aa,rd 196 0.00057642s
[INFO] catch-all logger: 127.0.0.1:43712 - 44374 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.local. udp 87 false 1232 NOERROR qr,aa,rd 164 0.000238325s
[INFO] sub-query logger: 192.168.65.1:42032 - 26628 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster3. udp 90 false 1232 NOERROR qr,aa,rd 196 0.000767918s
[INFO] cluster2 logger: 127.0.0.1:44086 - 55398 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster2. udp 87 false 1232 NOERROR qr,aa,rd 196 0.00054577s
[INFO] sub-query logger: 192.168.65.1:42032 - 26628 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster1. udp 90 false 1232 NOERROR qr,aa,rd 196 0.000871329s
[INFO] sub-query logger: 192.168.65.1:42032 - 26628 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster2. udp 90 false 1232 NOERROR qr,aa,rd 196 0.000836313s
[INFO] type=SRV, question=_peers._tcp.etcd-headless.h2.svc.cluster.all., response=;; opcode: QUERY, status: NOERROR, id: 26628, answer-records=3, extra-records=1, gathered=3, not-gatherer=0, duration=1.239167ms
Plugins in CoreDNS are processed in the order defined during compilation - so it is crucial to set the order of the `gathersrv` plugin properly for the `cluster.all` zone.

I put the plugin at the end of the list. Retrying with the right order...
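For reference, that compile-time order comes from `plugin.cfg` in the CoreDNS source tree; below is a sketch of the relevant fragment (the `github.com/ziollek/gathersrv` import path is an assumption based on the plugin's repo, and the neighboring entries vary between CoreDNS versions):

```
# plugin.cfg (fragment) - the line order here defines the runtime
# processing order. gathersrv must come before forward so it can
# split a cluster.all query into per-cluster sub-queries before
# anything gets forwarded.
rewrite:rewrite
gathersrv:github.com/ziollek/gathersrv
forward:forward
```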
no changes, I am still getting the error. Enabling the logs, I see that the pods enter an infinite loop:
[INFO] catch-all logger: 127.0.0.1:36210 - 63423 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.00097998s
[INFO] catch-all logger: 127.0.0.1:41054 - 57808 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000907888s
[INFO] catch-all logger: 127.0.0.1:42035 - 41998 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.00057028s
[INFO] catch-all logger: 127.0.0.1:46771 - 34200 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000373977s
[INFO] catch-all logger: 127.0.0.1:58456 - 25621 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000408552s
[INFO] catch-all logger: 127.0.0.1:39111 - 18325 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000520405s
[INFO] catch-all logger: 127.0.0.1:36067 - 37192 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000569562s
[INFO] catch-all logger: 127.0.0.1:57640 - 43318 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000514316s
[INFO] catch-all logger: 127.0.0.1:51773 - 7138 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000478651s
This is my config now:
.:53 {
    log . "catch-all logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    errors
    health {
        lameduck 5s
    }
    ready
    rewrite name suffix .cluster.cluster1 .cluster.local answer auto
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
cluster.cluster2:53 {
    log . "cluster2 logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    rewrite name suffix .cluster.cluster2 .cluster.local answer auto
    forward . ${cluster2_coredns_ip}:53 {
        expire 10s
        policy round_robin
    }
    cache 10
}
cluster.cluster3:53 {
    log . "cluster3 logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    rewrite name suffix .cluster.cluster3 .cluster.local answer auto
    forward . ${cluster3_coredns_ip}:53 {
        expire 10s
        policy round_robin
    }
    cache 10
}
cluster.all:53 {
    gathersrv cluster.all. {
        cluster.cluster1. c1-
        cluster.cluster2. c2-
        cluster.cluster3. c3-
    }
    log . "sub-query logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    forward . 127.0.0.1:53
}
It seems that you still have the wrong order of plugins in your binary (if you use the CoreDNS Makefile to build the binary, make sure to clean the environment before rebuilding CoreDNS: `make clean`).
Please filter the logs generated by the `cluster.all` zone, i.e. the logs with the `sub-query logger` prefix. Such logs should contain queries to the sub-zones: `SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster[1-3].`. If you notice a query to the `cluster.all` zone there, it indicates that the `gathersrv` plugin was triggered too late.
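A quick sketch of that filtering - the sample lines below are copied from the logs earlier in this thread, standing in for `kubectl logs` output from the CoreDNS pod:

```shell
# Keep only the lines emitted by the cluster.all zone's logger.
# In a real cluster, pipe `kubectl logs -n kube-system <coredns-pod>`
# into grep instead of this sample here-doc.
grep "sub-query logger" <<'EOF'
[INFO] catch-all logger: 127.0.0.1:33758 - 7774 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.local. udp 87 false 1232 NOERROR qr,aa,rd 164 0.000226256s
[INFO] sub-query logger: 192.168.65.1:42032 - 26628 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster1. udp 90 false 1232 NOERROR qr,aa,rd 196 0.000871329s
[INFO] cluster2 logger: 127.0.0.1:44086 - 55398 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster2. udp 87 false 1232 NOERROR qr,aa,rd 196 0.00054577s
EOF
```

If the surviving line names a sub-zone such as `cluster.cluster1`, gathersrv fanned the query out correctly; a `cluster.all` name here would mean the plugin ran too late.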
I recompiled from a clean workspace. Now I am not getting the infinite loop anymore; the pod just dies. This is what I see in the logs:
.:53
cluster.all.:53
cluster.cluster2.:53
cluster.cluster3.:53
[INFO] plugin/reload: Running configuration SHA512 = 7201ab82faa86d13333e01a35249e80eb96ffa6a71e98d0529443f52a02a584a215e95297d90d6c31983f0344b9c503b24de6222464b2fb1abe6104b3e5dac3c
CoreDNS-1.11.2
linux/arm64, go1.21.8, e3f83cb1f-dirty
[INFO] catch-all logger: 127.0.0.1:42128 - 19759 HINFO IN 6028198551586158430.7082859719790006129. udp 57 false 512 NXDOMAIN qr,rd,ra 132 0.047698373s
That `catch-all logger` line is there before I run the query.
I gave the pod a bit more memory and it started the loop again until it died... here is the log with more memory:
.:53
cluster.all.:53
cluster.cluster2.:53
cluster.cluster3.:53
[INFO] plugin/reload: Running configuration SHA512 = 7201ab82faa86d13333e01a35249e80eb96ffa6a71e98d0529443f52a02a584a215e95297d90d6c31983f0344b9c503b24de6222464b2fb1abe6104b3e5dac3c
CoreDNS-1.11.2
linux/arm64, go1.21.8, e3f83cb1f-dirty
[INFO] catch-all logger: 127.0.0.1:60357 - 35183 HINFO IN 723505535130710133.1386775834635033436. udp 56 false 512 NXDOMAIN qr,rd,ra 131 0.048782517s
[INFO] catch-all logger: 10.89.0.1:51558 - 47673 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster1. udp 90 false 1232 NOERROR qr,aa,rd 226 0.006197935s
[INFO] cluster2 logger: 10.89.0.1:33265 - 34423 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.cluster2. udp 90 false 1232 NOERROR qr,aa,rd 226 0.008317279s
[INFO] catch-all logger: 127.0.0.1:40103 - 20964 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000856663s
[INFO] catch-all logger: 127.0.0.1:48758 - 12068 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000814903s
[INFO] catch-all logger: 127.0.0.1:39548 - 29014 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000367905s
[INFO] catch-all logger: 127.0.0.1:53146 - 31204 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000374262s
[INFO] catch-all logger: 127.0.0.1:52731 - 1049 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000407602s
[INFO] catch-all logger: 127.0.0.1:46820 - 37443 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000438854s
[INFO] catch-all logger: 127.0.0.1:48116 - 44858 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.000390648s
[INFO] catch-all logger: 127.0.0.1:43784 - 23307 NS IN . udp 17 false 512 NXDOMAIN qr,rd,ra 17 0.00040763s
As you can see, I tried with `.cluster.local` and `.cluster.cluster2` before doing `.cluster.all`, which is where the loop started. Again, no sign of the `sub-query logger`.
That is probably because the log statement is after the gathersrv statement - that is where it was in your original example. Is that correct?
The `catch-all logger` line with HINFO is a query made by the `loop` plugin while spinning up the server.
Were you able to grep the lines generated by the `cluster.all` zone before the crash? You can use `kubectl logs` with the `--previous` flag to examine the output of the container before the crash. If there are no such lines, it means that the OOM killer killed the container earlier.
To eliminate the loop, you can also forward queries in the `cluster.all` zone to a wrong port, e.g. 5353:
cluster.all:53 {
    gathersrv cluster.all. {
        cluster.cluster1. c1-
        cluster.cluster2. c2-
        cluster.cluster3. c3-
    }
    log . "sub-query logger: {remote}:{port} - {>id} {type} {class} {name} {proto} {size} {>do} {>bufsize} {rcode} {>rflags} {rsize} {duration}"
    forward . 127.0.0.1:5353
}
It allows us to check more easily what exactly happens.
What I shared is all of the logs; it does not seem to ever print anything with `sub-query logger`. Perhaps I wasn't clear on that.
I'll try with the wrong port.
with the wrong port, I got this:
[INFO] Reloading complete
[INFO] sub-query logger: 10.89.0.1:41081 - 3290 SRV IN _peers._tcp.etcd-headless.h2.svc.cluster.all. udp 85 false 1232 - - 0 1.00269468s
and this answer:
; <<>> DiG 9.18.24 <<>> @10.89.0.225 -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.all
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 3290
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: fbb6a1a411f95bc1 (echoed)
;; QUESTION SECTION:
;_peers._tcp.etcd-headless.h2.svc.cluster.all. IN SRV
;; Query time: 1005 msec
;; SERVER: 10.89.0.225#53(10.89.0.225) (UDP)
;; WHEN: Wed Apr 03 10:02:13 EDT 2024
;; MSG SIZE rcvd: 85
So, as I supposed - the problem is caused by the wrong order of plugins. Please verify:
I don't know how to verify, is this what you are looking for:
kubectl --context kind-cluster1 exec -n kube-system coredns-598c664574-jpfct -- /coredns -plugins
Server types:
dns
Caddyfile loaders:
flag
default
Other plugins:
dns.acl
dns.any
dns.auto
dns.autopath
dns.azure
dns.bind
dns.bufsize
dns.cache
dns.cancel
dns.chaos
dns.clouddns
dns.debug
dns.dns64
dns.dnssec
dns.dnstap
dns.erratic
dns.errors
dns.etcd
dns.file
dns.forward
dns.gathersrv
dns.geoip
dns.grpc
dns.header
dns.health
dns.hosts
dns.k8s_external
dns.kubernetes
dns.loadbalance
dns.local
dns.log
dns.loop
dns.metadata
dns.minimal
dns.nsid
dns.pprof
dns.prometheus
dns.ready
dns.reload
dns.rewrite
dns.root
dns.route53
dns.secondary
dns.sign
dns.template
dns.timeouts
dns.tls
dns.trace
dns.transfer
dns.tsig
dns.view
dns.whoami
on
unfortunately, it prints the plugins in alphabetical order instead of processing order
You wrote: "I recompiled from a clean workspace" - what does that mean? Have you cloned the CoreDNS source to a new directory?
As far as I can see, there is a problem: the CoreDNS Makefile does not clean the generated files that contain the order of plugins. It means you have to either:
- clean the generated files and regenerate them (`make gen`), or
- clone a fresh copy of the source and adjust `plugin.cfg` before building CoreDNS.

I will prepare a repo with an automated building process.
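The rebuild sequence implied above can be sketched as follows (run from a CoreDNS source checkout; what exactly `make gen` regenerates may differ between CoreDNS versions):

```shell
# After editing plugin.cfg to add gathersrv in the right position:
make clean   # drop stale build artifacts
make gen     # regenerate the Go files derived from plugin.cfg (plugin order)
make         # build the coredns binary with the new plugin order
```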
Here you can find a simple repo that allows building docker with a properly configured order of plugins: https://github.com/ziollek/gathersrv-docker
You can also find there Corefile
that allows testing the behavior locally.
it worked - it was the `make gen` step. I have two questions:
;; ANSWER SECTION:
_peers._tcp.etcd-headless.h2.svc.cluster.all. 26 IN SRV 0 100 2379 c2-etcd-headless.h2.svc.cluster.all.
_peers._tcp.etcd-headless.h2.svc.cluster.all. 10 IN SRV 0 100 2379 c1-etcd-headless.h2.svc.cluster.all.
_peers._tcp.etcd-headless.h2.svc.cluster.all. 10 IN SRV 0 100 2379 c3-etcd-headless.h2.svc.cluster.all.
;; ADDITIONAL SECTION:
c2-etcd-headless.h2.svc.cluster.all. 26 IN A 10.96.1.107
c1-etcd-headless.h2.svc.cluster.all. 10 IN A 10.96.0.234
c3-etcd-headless.h2.svc.cluster.all. 10 IN A 10.96.2.74
1. Now `c2-etcd-headless.h2.svc.cluster.all` cannot actually be resolved. Isn't that going to be a problem?
2. I find this aggregating-result plugin very useful - why stop at SRV records and not support any record type?
Ad 1. The A/AAAA records both for `c2-etcd-headless.h2.svc.cluster.all` and for `etcd-headless.h2.svc.cluster.all` should be resolved as well. Have you tried:
dig c2-etcd-headless.h2.svc.cluster.all @your-coredns-ip
Ad 2. As mentioned above, it supports `SRV`, `A`, `AAAA` - in contrast to k8s multicluster DNS, which is much more complicated to set up and supports only `A` and `AAAA`. But indeed, I see that such information is missing from the README.
I have added an example of resolving the hostnames returned by a SRV query to the previously prepared demo - it requires rebuilding the image because of the changes in the Corefile.
Ad 1. I tried and it works, thanks. I didn't understand this feature.
So when configuring these stateful workloads and referring to my example, should one use the `c2-etcd-headless.h2.svc.cluster.all` or the `etcd-0.etcd-headless.h2.svc.cluster.cluster2` notation? If I had two instances of etcd per cluster, how would the first notation work?
Ad 2.
> contrast to k8s multicluster DNS which is much more complicated to set up

what are you referring to here? The MCS specification?
Ad 1. To be honest, I do not understand why your headless service does not point to a particular node. I am referring to your first comment, where you pasted:
dig @10.89.0.225 -t SRV _peers._tcp.etcd-headless.h2.svc.cluster.local
...
;; ANSWER SECTION:
_peers._tcp.etcd-headless.h2.svc.cluster.local. 30 IN SRV 0 100 2379 etcd-headless.h2.svc.cluster.local.
It should be resolved to `etcd-0.etcd-headless.h2.svc.cluster.local` instead of `etcd-headless.h2.svc.cluster.local`. Are you sure that you configured a headless service? It looks rather like a `ClusterIP` one.
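For comparison, a minimal headless Service sketch matching the names in this thread (the `app: etcd` selector label is an assumption - match it to the StatefulSet's pod labels). `clusterIP: None` is what makes DNS expose per-pod records such as `etcd-0.etcd-headless.h2.svc.cluster.local`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: etcd-headless
  namespace: h2
spec:
  clusterIP: None      # headless: no VIP, DNS returns the individual pods
  selector:
    app: etcd          # assumed label - must match the etcd pods
  ports:
    - name: peers      # yields the _peers._tcp SRV records
      protocol: TCP
      port: 2379
```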
Ad 2. I am referring to the current implementation in GKE:
> MCS only supports ClusterSetIP and headless Services. Only DNS "A" records are available.
hello, I am trying to use this plugin, but my coredns pods get OOMKilled. I am probably misconfiguring it, possibly creating a loop... I'd like someone to review my config and possibly help me troubleshoot.
I have three clusters, each with a modified coredns config. This is one of them as an example:
so `cluster.local` is the local cluster, `cluster.cluster[1..3]` is rewritten as `cluster.local` and forwarded to the pertinent coredns, and finally `cluster.all` should gather SRV records from all of the clusters.
Pointing to the cluster1 coredns IP, I can resolve `_peers._tcp.etcd-headless.h2.svc.cluster.local` and `_peers._tcp.etcd-headless.h2.svc.cluster.cluster1`, which result in the same response, correctly so. I can also try with cluster2, which still works but resolves to a different IP. However, if I try cluster.all, I get a timeout and the coredns pod gets OOMKilled.