gluster / gcs

Check github.com/heketi, github.com/gluster/gluster-containers, or github.com/kadalu/kadalu as active alternatives
https://gluster.org
Apache License 2.0

Add init container to glusterd2 pod to wait for DNS #68

Open JohnStrunk opened 6 years ago

JohnStrunk commented 6 years ago

In gluster/glusterd2#1324, it looks like DNS isn't ready by the time gd2 tries to resolve its own hostname.

We should add an init container like etcd does to wait for DNS.

Example:

$ kubectl -n gcs get po/etcd-sv9sxbvm7j -oyaml
apiVersion: v1
kind: Pod
metadata:
...
spec:
...
  initContainers:
  - command:
    - /bin/sh
    - -c
    - "\n\t\t\t\t\twhile ( ! nslookup etcd-sv9sxbvm7j.etcd.gcs.svc )\n\t\t\t\t\tdo\n\t\t\t\t\t\tsleep
      2\n\t\t\t\t\tdone"
    image: busybox:1.28.0-glibc
    imagePullPolicy: IfNotPresent
    name: check-dns
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
...
Madhu-1 commented 6 years ago

I added the init container to the glusterd2 pod but am still facing the same issue. Here is the spec and the logs.

initContainer

initContainers:
      - name: check-dns
        command: ["/bin/sh","-c","until nslookup gluster-{{ kube_hostname }}-0.glusterd2.{{ gcs_namespace }}.svc.cluster.local; do echo waiting for gluster-{{ kube_hostname }}-0.glusterd2.{{ gcs_namespace }}.svc.cluster.local; sleep 2;done;"]
        image: busybox:1.28.0-glibc
        imagePullPolicy: IfNotPresent
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File

logs from initContainer:

kubectl logs gluster-kube1-0 check-dns -ngcs
Server:    10.233.0.3
Address 1: 10.233.0.3 coredns.kube-system.svc.cluster.local
Name:      gluster-kube1-0.glusterd2.gcs.svc.cluster.local
Address 1: 10.233.64.8 gluster-kube1-0.glusterd2.gcs.svc.cluster.local
JohnStrunk commented 6 years ago

Does gluster-kube1-0.glusterd2.gcs vs gluster-kube1-0.glusterd2.gcs.svc.cluster.local make a difference?

atinmu commented 5 years ago

@Madhu-1 any progress made on this?

Madhu-1 commented 5 years ago

I worked on John's proposal, but no luck so far. I will take another look.

Madhu-1 commented 5 years ago

> Does gluster-kube1-0.glusterd2.gcs vs gluster-kube1-0.glusterd2.gcs.svc.cluster.local make a difference?

@JohnStrunk gluster-kube1-0.glusterd2.gcs is not reachable from the init container, but gluster-kube1-0.glusterd2.gcs.svc.cluster.local is reachable from the init container.
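
For anyone reproducing this, a small helper can list the fully qualified names that the short name could expand to. This is only a diagnostic sketch: it assumes the pod's /etc/resolv.conf carries a `search` line (typical for pods using the default cluster DNS policy), and `candidates` is a hypothetical helper name, not part of gd2 or GCS. Whether a given resolver actually applies the search list to a name like this is not confirmed in this thread.

```shell
#!/bin/sh
# candidates SHORT_NAME [RESOLV_CONF]
# Prints SHORT_NAME followed by SHORT_NAME joined with each domain on the
# "search" line of RESOLV_CONF (default: /etc/resolv.conf).
# Hypothetical helper for diagnosis only.
candidates() {
    short=$1
    conf=${2:-/etc/resolv.conf}
    echo "$short"
    for domain in $(awk '$1 == "search" { for (i = 2; i <= NF; i++) print $i }' "$conf"); do
        echo "$short.$domain"
    done
}

# Example use inside the pod (commented out so sourcing this file has no side effects):
# for n in $(candidates gluster-kube1-0.glusterd2.gcs); do
#     nslookup "$n" >/dev/null 2>&1 && echo "resolves: $n" || echo "fails:    $n"
# done
```

Running the commented loop from both the init container and the glusterd2 container would show whether the two see the same search domains and the same results.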

JohnStrunk commented 5 years ago

It appears gd2 is using .gcs as opposed to .gcs.svc.cluster.local. Does the init container fix the problem if we wait on the same DNS name as gd2? (hence my initial cryptic question :disappointed:)
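
As a rough sketch of that idea, the init container's command could wait on the short name gd2 uses and on the fully qualified name, with a retry limit so a name that never resolves shows up as a failure instead of an endless loop. The `wait_for_name` function, the `RESOLVE_CMD` override, and the retry defaults are assumptions made for illustration; only `nslookup` and the two host names come from this thread.

```shell
#!/bin/sh
# wait_for_name NAME [MAX_TRIES] [DELAY_SECONDS]
# Polls until NAME resolves, or gives up after MAX_TRIES failed attempts.
# RESOLVE_CMD is a hypothetical override (default: nslookup) so the loop
# can be exercised without a cluster.
wait_for_name() {
    name=$1
    max_tries=${2:-30}
    delay=${3:-2}
    tries=0
    until ${RESOLVE_CMD:-nslookup} "$name" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            echo "gave up waiting for $name" >&2
            return 1
        fi
        echo "waiting for $name"
        sleep "$delay"
    done
    echo "resolved $name"
}

# Example use (commented out so sourcing this file has no side effects):
# wait_for_name gluster-kube1-0.glusterd2.gcs &&
#     wait_for_name gluster-kube1-0.glusterd2.gcs.svc.cluster.local
```

If the short name never resolves from the init container, the function returns non-zero and the init container fails visibly, which at least separates "DNS is slow" from "this name is not resolvable here".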

Madhu-1 commented 5 years ago

I tried the other name as well. The .gcs name is not reachable from the init container (maybe due to headless services and stateful sets). With .gcs.svc.cluster.local the init container was able to resolve the DNS name, but that still does not fix the problem.

Madhu-1 commented 5 years ago

@JohnStrunk I am out of ideas. I tried both initContainers and the pod's spec.containers.lifecycle.postStart, but no luck.

If I run the same script inside the container, it works as expected, but it does not work with postStart. The init container works as expected (nslookup was successful), but I am still seeing the issue when the pod restarts.