gliderlabs / registrator

Service registry bridge for Docker with pluggable adapters
http://gliderlabs.com/registrator
MIT License

Registrator doesn't report IP of Kubernetes Pod #506

Open iluxame opened 7 years ago

iluxame commented 7 years ago

Description of the problem: A Kubernetes pod contains a namespace container and one or more user-defined containers. The user-defined containers use the namespace container to communicate with the outside world, so their NetworkMode is set to container: and they don't have an IP in their Docker metadata. Since Registrator uses docker.inspect on the container, it doesn't pick up an IP and passes all the info to Consul without the container IP.

How reproducible: Deterministic

Steps to Reproduce: Install the Registrator container directly via Docker on each Kubernetes host. Create a Kubernetes pod with the same data as if it were a Docker container, including the Registrator-related env vars. Go to Consul or run curl via the agent and see that all the data you've passed is there but the IP is not reported.

Actual Results: All the data you've passed shows up in Consul (or via curl against the agent), but the IP is not reported.

Expected Results: To see all the data, including the IP.

Additional info:

A possible solution is to add a lookup that checks the value of NetworkMode and, for a container network, retrieves the IP information from the referenced container.
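A rough sketch of that lookup, in Python rather than Registrator's Go, and not taken from any actual patch. It assumes inspect data shaped like `docker inspect` output (`HostConfig.NetworkMode`, `NetworkSettings.IPAddress`, `NetworkSettings.Networks.<name>.IPAddress`). The names `first_ip`, `resolve_ip`, `max_hops` and the `FAKE` data are hypothetical, made up for illustration.

```python
from typing import Callable, Dict

CONTAINER_PREFIX = "container:"


def first_ip(info: Dict) -> str:
    """Return the first non-empty IP in a docker-inspect-style dict, or ""."""
    settings = info.get("NetworkSettings") or {}
    if settings.get("IPAddress"):
        return settings["IPAddress"]
    for network in (settings.get("Networks") or {}).values():
        if network and network.get("IPAddress"):
            return network["IPAddress"]
    return ""


def resolve_ip(container_id: str,
               inspect: Callable[[str], Dict],
               max_hops: int = 5) -> str:
    """Follow NetworkMode "container:<id>" links until some container has an IP."""
    current = container_id
    for _ in range(max_hops):  # hypothetical guard against reference loops
        info = inspect(current)
        ip = first_ip(info)
        if ip:
            return ip
        mode = (info.get("HostConfig") or {}).get("NetworkMode", "")
        if not mode.startswith(CONTAINER_PREFIX):
            return ""  # not a container network, nothing more to follow
        current = mode[len(CONTAINER_PREFIX):]
    return ""


# Hypothetical inspect data: "app" borrows the network of "pause".
FAKE = {
    "app": {
        "HostConfig": {"NetworkMode": "container:pause"},
        "NetworkSettings": {"IPAddress": "", "Networks": {}},
    },
    "pause": {
        "HostConfig": {"NetworkMode": "default"},
        "NetworkSettings": {"IPAddress": "172.17.0.2", "Networks": {}},
    },
}

print(resolve_ip("app", FAKE.__getitem__))  # 172.17.0.2
```

If the referenced container carries no IP in its own metadata either, this returns an empty string, so the caller still has to decide what to register in that case.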

balexx commented 7 years ago

PR #507 should solve the issue.

ganeshkaila commented 7 years ago

@balexx I am having a similar problem here. I am not sure whether the problem is caused by my current approach or is really an issue, so I wanted to describe the flow here.

Steps to reproduce the problem:

Additional logs that may help:

Note:

FYI, consul and registrator run with hostNetwork enabled, whereas tomcat runs with default options.

cc @iluxame

balexx commented 7 years ago

@meganesh containers inside kubernetes pods don't see their IP address properly. See the following from your 'docker inspect':

        "NetworkMode": "container:1c00a4d939d46742bf8f502b29b79122647fbde5190c2944371560ad901a4545",

What this means is that networking of the tomcat container is "offloaded" to container 1c00a4d939d46742bf8f502b29b79122647fbde5190c2944371560ad901a4545, which is the one carrying the IP address.

PR #507 solves this by grabbing the IP of the linked container.

ganeshkaila commented 7 years ago

@balexx Did you test https://github.com/gliderlabs/registrator/pull/507 in GKE?

FYI, the logs corresponding to my tomcat service registration with registrator:

  2017/03/22 06:08:44 tomcat: detected container NetworkMode, linked to: a7823230f585
  2017/03/22 06:08:44 tomcat: using network container IP 
  2017/03/22 06:08:44 added: c94f42796593 NODE_NAME:k8s_tomcat.4f2a469_tomcat-66122773-g6xt1_default_f19b47ec-0ec5-11e7-bc48-42010af00249_0be74575:8080

The registered tomcat service on the consul side:

  [my-user@consul-client ~]$ curl http://localhost:8500/v1/catalog/service/tomcat | jq
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed
  100   489  100   489    0     0  81581      0 --:--:-- --:--:-- --:--:-- 97800
  [
    {
      "ID": "4a223fd7-652f-727c-dff2-f0f3f689c218",
      "Node": "NODE_NAME",
      "Address": "10.0.4.4",
      "TaggedAddresses": {
        "lan": "10.0.4.4",
        "wan": "10.0.4.4"
      },
      "NodeMeta": {},
      "ServiceID": "NODE_NAME:k8s_tomcat.4f2a469_tomcat-66122773-g6xt1_default_f19b47ec-0ec5-11e7-bc48-42010af00249_0be74575:8080",
      "ServiceName": "tomcat",
      "ServiceTags": [],
      "ServiceAddress": "",
      "ServicePort": 8080,
      "ServiceEnableTagOverride": false,
      "CreateIndex": 27546,
      "ModifyIndex": 27546
    }
  ]

The service resolves to the node IP instead of the container IP.

  [my-user@consul-client ~]$ ping -n -c2 tomcat.service.consul
  PING tomcat.service.consul (10.0.4.4) 56(84) bytes of data.
  64 bytes from 10.0.4.4: icmp_seq=1 ttl=64 time=0.845 ms
  64 bytes from 10.0.4.4: icmp_seq=2 ttl=64 time=0.344 ms
  --- tomcat.service.consul ping statistics ---
  2 packets transmitted, 2 received, 0% packet loss, time 1000ms
  rtt min/avg/max/mdev = 0.344/0.594/0.845/0.251 ms

Note:

The ping is expected to resolve tomcat.service.consul to 10.136.1.4 instead of 10.0.4.4.
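For anyone reading the catalog output above: as far as I understand Consul, when `ServiceAddress` is empty a lookup falls back to the node's `Address`, which would explain the 10.0.4.4 answer. A minimal sketch of that fallback, using only the field names from the JSON above. The helper name `effective_address` is made up, and this models the behaviour rather than calling Consul.

```python
from typing import Dict


def effective_address(entry: Dict) -> str:
    """Address a client ends up with for one /v1/catalog/service entry.

    Models the fallback only: an empty ServiceAddress means the node
    Address is used instead.
    """
    return entry.get("ServiceAddress") or entry["Address"]


# Fields copied from the catalog entry shown above.
entry = {
    "Node": "NODE_NAME",
    "Address": "10.0.4.4",
    "ServiceName": "tomcat",
    "ServiceAddress": "",
    "ServicePort": 8080,
}

print(effective_address(entry))  # 10.0.4.4
```

So the DNS answer is a symptom: the entry was registered with an empty service address, which matches the empty "using network container IP" log line.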

balexx commented 7 years ago

@meganesh according to this line:

2017/03/22 06:08:44 tomcat: detected container NetworkMode, linked to: a7823230f585

You should look at the container a7823230f585 and see what network address it holds. I believe what you're seeing is the IP used by that container.

The patch I provided was not tested on GKE, so YMMV.

ganeshkaila commented 7 years ago

@balexx I just re-ran the service.

Newly generated logs on the registrator side while tomcat is being registered:

  2017/03/22 09:53:15 tomcat: detected container NetworkMode, linked to: bd52929abcf2
  2017/03/22 09:53:15 tomcat: using network container IP 
  2017/03/22 09:53:15 added: d47a662b20af NODE_NAME:k8s_tomcat.a284bf2f_tomcat-37895574829v7cv_default_ad071f33-0ee1-11e7-bc48-42010af00249_43936e18:8080

When I ran docker inspect bd52929abcf2, I found no IP address at either NetworkSettings.IPAddress or NetworkSettings.Networks.none.IPAddress.

ganeshkaila commented 7 years ago

@balexx FYI,

registrator-ds.yml:

  apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    creationTimestamp: null
    labels:
      run: registrator
    name: registrator
  spec:
    selector:
      matchLabels:
        run: registrator
    template:
      metadata:
        creationTimestamp: null
        labels:
          run: registrator
      spec:
        hostNetwork: true
        containers:
        - image: gliderlabs/registrator:master
          name: registrator
          command: ["/bin/sh"]
          args: ["-c", "registrator -internal consul://<NODE_IP>:8500"]
          lifecycle:
            postStart:
              exec:
                command:
                  - "/bin/sh"
                  - "-c"
                  - "echo \"nameserver <NODE_IP>\" > /etc/resolv.conf"
          env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          volumeMounts:
          - mountPath: /tmp/docker.sock
            name: docker-sock
        volumes:
        - name: docker-sock
          hostPath:
            path: /var/run/docker.sock

tomcat-deploy.yml

  apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    creationTimestamp: null
    labels:
      run: tomcat
    name: tomcat
  spec:
    replicas: 1
    selector:
      matchLabels:
        run: tomcat
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          run: tomcat
      spec:
        containers:
        - image: tomcat:8.0
          name: tomcat
          ports:
          - containerPort: 8080
          lifecycle:
            postStart:
              exec:
                command:
                  - "/bin/sh"
                  - "-c"
                  - "echo \"nameserver <NODE_IP>\" > /etc/resolv.conf"
          env:
          - name: SERVICE_NAME
            value: tomcat
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          resources: {}
  status: {}

Please let me know if I have misconfigured anything in the registrator or tomcat specs.
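Side note on the tomcat spec: it exposes the pod IP through the `POD_IP` env var (fieldRef `status.podIP`). Purely as an illustration, and assuming that value shows up in the container's `Config.Env` list in inspect output, a tool could read it from there. Nothing in this thread says Registrator does this; `env_to_dict` and `pod_ip_from_env` are hypothetical names and the sample value is made up.

```python
from typing import Dict, List


def env_to_dict(env: List[str]) -> Dict[str, str]:
    """Turn ["KEY=VALUE", ...] (the Config.Env shape) into a dict."""
    result: Dict[str, str] = {}
    for item in env:
        key, _, value = item.partition("=")
        result[key] = value
    return result


def pod_ip_from_env(info: Dict) -> str:
    """Return POD_IP from a docker-inspect-style dict, or "" if it is absent."""
    env = (info.get("Config") or {}).get("Env") or []
    return env_to_dict(env).get("POD_IP", "")


# Hypothetical inspect fragment for the tomcat container.
sample = {"Config": {"Env": ["SERVICE_NAME=tomcat", "POD_IP=10.136.1.4"]}}
print(pod_ip_from_env(sample))  # 10.136.1.4
```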

ganeshkaila commented 7 years ago

@balexx I recently tested it on my local Kubernetes cluster as well, and it didn't work there either. Please point out any configuration I am missing in the spec files.

balexx commented 7 years ago

Then it seems like GKE is behaving differently than bare metal, which is what we're using.

Maybe if you post the docker inspect output for the tomcat container and the pause container it's linked to, we can find where the IP address is actually stated.
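To speed up that search, here is a small sketch that walks parsed `docker inspect` JSON and lists every non-empty string field whose key contains "IPAddress". The function name `find_ip_fields` and the sample data are made up for illustration; real output has many more fields.

```python
import json
from typing import Any, Iterator, Tuple


def find_ip_fields(node: Any, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (path, value) for non-empty string fields whose key contains "IPAddress"."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if isinstance(value, str):
                if "ipaddress" in key.lower() and value:
                    yield child, value
            else:
                yield from find_ip_fields(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_ip_fields(value, f"{path}[{index}]")


# Hypothetical sample; docker inspect prints a JSON list of containers.
sample = json.loads("""
[{"Id": "pause",
  "NetworkSettings": {
    "IPAddress": "",
    "Networks": {"bridge": {"IPAddress": "172.17.0.2"}}}}]
""")

for field_path, value in find_ip_fields(sample):
    print(field_path, value)
```

Saving the output of `docker inspect <id>` to a file and loading it with `json.load` instead of the inline sample should work the same way. If nothing is printed for either container, the IP is not in the Docker metadata at all.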

gvenka008c commented 7 years ago

@meganesh Did you fix the issue? Can you share what change was made on the registrator end?

ganeshkaila commented 7 years ago

@gvenka008c If you are having a similar problem, this Docker Hub repo may help.

raffian commented 6 years ago

@iluxame

What does your Tomcat application connect with to do service discovery? Registrator, or the back end provider?