Open JohnStrunk opened 6 years ago
Added the initContainer for glusterd2, but I am still facing the same issue. Here are the spec and the logs.
initContainer:
initContainers:
  - name: check-dns
    command: ["/bin/sh","-c","until nslookup gluster-{{ kube_hostname }}-0.glusterd2.{{ gcs_namespace }}.svc.cluster.local; do echo waiting for gluster-{{ kube_hostname }}-0.glusterd2.{{ gcs_namespace }}.svc.cluster.local; sleep 2;done;"]
    image: busybox:1.28.0-glibc
    imagePullPolicy: IfNotPresent
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
logs from initContainer:
kubectl logs gluster-kube1-0 check-dns -ngcs
Server: 10.233.0.3
Address 1: 10.233.0.3 coredns.kube-system.svc.cluster.local
Name: gluster-kube1-0.glusterd2.gcs.svc.cluster.local
Address 1: 10.233.64.8 gluster-kube1-0.glusterd2.gcs.svc.cluster.local
time="2018-11-15 06:45:56.004341" level=fatal msg="failed to create gd2-muxsrv listener" error="listen tcp: lookup gluster-kube1-0.glusterd2.gcs on 10.233.0.3:53: no such host" source="[server.go:24:muxsrv.newMuxSrv]"
Does gluster-kube1-0.glusterd2.gcs vs gluster-kube1-0.glusterd2.gcs.svc.cluster.local make a difference?
@Madhu-1 any progress made on this?
Worked on John's proposal, but no luck so far. Will take another look.
Does gluster-kube1-0.glusterd2.gcs vs gluster-kube1-0.glusterd2.gcs.svc.cluster.local make a difference?
@JohnStrunk gluster-kube1-0.glusterd2.gcs is not reachable from the initContainer, but gluster-kube1-0.glusterd2.gcs.svc.cluster.local is reachable from the initContainer.
It appears gd2 is using .gcs as opposed to .gcs.svc.cluster.local. Does the init container fix the problem if we wait on the same DNS name as gd2? (hence my initial cryptic question :disappointed:)
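For illustration, waiting on the same short name that gd2 looks up could be sketched roughly as below. This is only a sketch, not what the GCS manifests actually do: the function name wait_for_dns and the RESOLVE_CMD override are hypothetical, and the default resolver is nslookup as in the existing check-dns container. Unlike the original until loop, it gives up after a bounded number of attempts, so a name that never resolves shows up as a failed init container instead of a pod stuck in Init forever.

```shell
#!/bin/sh
# wait_for_dns NAME [ATTEMPTS] [DELAY_SECONDS]
# Retry a DNS lookup for NAME until it succeeds or ATTEMPTS is exhausted.
# RESOLVE_CMD is a hypothetical override for the lookup command
# (defaults to nslookup, as used by the check-dns initContainer).
wait_for_dns() {
    name=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Success as soon as the resolver exits with status 0.
        if ${RESOLVE_CMD:-nslookup} "$name" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        echo "waiting for $name ($i/$attempts)"
        sleep "$delay"
    done
    # The name never resolved within the allowed attempts.
    return 1
}

# Example call (commented out), using the short name from the gd2 error log:
# wait_for_dns gluster-kube1-0.glusterd2.gcs 30 2
```

If the short name never resolves from the init container, this exits non-zero rather than masking the difference between the two names.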
I tried the other name as well. With .gcs the host is not reachable from the initContainer (maybe due to headless services and StatefulSets). With .gcs.svc.cluster.local it was able to resolve the DNS name, but that still did not fix the issue.
@JohnStrunk I am out of ideas. I tried with both initContainers and the pod's spec.containers.lifecycle.postStart, but no luck.
If I run the same script inside the container, it works as expected, but it does not work with postStart.
The initContainer works as expected (nslookup was successful), but I am still seeing the issue when the pod restarts.
In gluster/glusterd2#1324, it looks like DNS isn't ready by the time gd2 tries to resolve its own hostname.
We should add an init container like etcd does to wait for DNS.
Example: