Closed: ravilr closed this issue 6 years ago.
Record TTL is set to 30 seconds currently. Are you seeing inconsistencies beyond that length of time?
Yes, I was seeing inconsistent results, but I forgot to capture the dig output while it was happening. I have restarted all kube-dns pods since then.
Before restarting kube-dns (10.10.10.109 is the correct clusterIP):
[qa.default.zk2@tachyon-qa-bf1]# host qa-default-zk1
qa-default-zk1.tachyon.svc.starfleet.local has address 10.10.10.33
[qa.default.zk2@tachyon-qa-bf1]# host qa-default-zk1
qa-default-zk1.tachyon.svc.starfleet.local has address 10.10.10.33
[qa.default.zk2@tachyon-qa-bf1]# host qa-default-zk1
qa-default-zk1.tachyon.svc.starfleet.local has address 10.10.10.109
[qa.default.zk2@tachyon-qa-bf1]# host qa-default-zk1
qa-default-zk1.tachyon.svc.starfleet.local has address 10.10.10.109
[qa.default.zk2@tachyon-qa-bf1]# host qa-default-zk1
qa-default-zk1.tachyon.svc.starfleet.local has address 10.10.10.109
[qa.default.zk2@tachyon-qa-bf1]# host qa-default-zk1
dig output after restarting kube-dns (shows TTL=30):
[qa.default.zk2@tachyon-qa-bf1]# dig qa-default-zk1.tachyon.svc.starfleet.local
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.4 <<>> qa-default-zk1.tachyon.svc.starfleet.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56373
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;qa-default-zk1.tachyon.svc.starfleet.local. IN A
;; ANSWER SECTION:
qa-default-zk1.tachyon.svc.starfleet.local. 30 IN A 10.10.10.109
;; Query time: 1 msec
;; SERVER: 10.10.10.10#53(10.10.10.10)
;; WHEN: Sat Feb 18 20:44:22 2017
;; MSG SIZE rcvd: 76
I'll see if I can reproduce this and report back here.
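For the next occurrence, a rough diagnostic sketch (the kube-dns pod IP below is a placeholder, e.g. from kubectl get pods -n kube-system -o wide): it compares what dnsmasq answers on the cluster DNS service IP (10.10.10.10, per the dig output above) with what kube-dns/skydns answers directly on port 10053, which should show whether the stale A record lives in the dnsmasq cache or in kube-dns itself.
# Sketch: query dnsmasq (port 53 on the cluster DNS service IP) and
# kube-dns/skydns directly (port 10053 on the kube-dns pod) side by side.
# <kube-dns-pod-ip> is a placeholder for the actual pod IP.
for i in $(seq 1 10); do
  dig +short qa-default-zk1.tachyon.svc.starfleet.local @10.10.10.10
  dig +short -p 10053 qa-default-zk1.tachyon.svc.starfleet.local @<kube-dns-pod-ip>
  sleep 5
done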
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Prevent issues from auto-closing with an /lifecycle frozen comment.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale
/close
kubedns: gcr.io/google_containers/kubedns-amd64:1.8
dnsmasq: gcr.io/google_containers/kube-dnsmasq-amd64:1.4
A Service, say qa-svc1, was created and deleted after some time. If the same qa-svc1 is recreated and gets assigned a different ClusterIP, we are seeing that pods using kube-dns with the ClusterFirst dnsPolicy continue to resolve qa-svc1 to the old ClusterIP. I believe this comes from the dnsmasq cache. Should a max-cache-ttl setting be applied to all records dnsmasq caches, or can kube-dns invalidate the dnsmasq cache when a Service changes?
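For illustration only, a sketch of what the first option could look like: the dnsmasq container's command line with a cap on cached TTLs, assuming the dnsmasq build in kube-dnsmasq-amd64:1.4 supports --max-cache-ttl. The flags other than --max-cache-ttl are representative of the usual kube-dns manifest, not copied from this cluster.
# Sketch, not the actual manifest: --no-resolv/--server forward all queries to
# kube-dns (skydns) on 127.0.0.1:10053; --max-cache-ttl=30 (assumption) caps
# every cached record at the 30s TTL kube-dns itself hands out.
dnsmasq --cache-size=1000 --no-resolv --server=127.0.0.1#10053 \
        --log-facility=- --max-cache-ttl=30
This would only bound the staleness window; it would not actively invalidate an entry when a Service is recreated.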
@bowei @thockin