d2iq-archive / mesos-dns

DNS-based service discovery for Mesos.
https://mesosphere.github.com/mesos-dns
Apache License 2.0

mesos-dns resolving the docker container ip instead of the agent ip for a task when using custom top level domain? #506

Closed nizamik closed 7 years ago

nizamik commented 7 years ago

Hello,

When we try to resolve a task running on our DC/OS cluster, mesos-dns responds with the Docker container IP rather than the agent IP address we expect. It resolves the leader and the master addresses correctly.

mesos-dns configuration file:

    {
      "domain": "mesos.hackathon-rd-dev",
      "externalon": false,
      "listener": "10.134.11.40",
      "masters": ["10.134.8.128:5050", "10.134.9.44:5050", "10.134.9.64:5050"],
      "port": 8053,
      "recurseon": false,
      "refreshSeconds": 60,
      "resolvers": ["8.8.8.8", "8.8.4.4"],
      "SOAExpire": 86400,
      "SOAMinttl": 60,
      "SOAMname": "ns1.mesos.hackathon-rd-dev",
      "SOARefresh": 60,
      "SOARetry": 600,
      "SOARname": "root.ns1.mesos.hackathon-rd-dev",
      "ttl": 60,
      "zk": "zk://leader.mesos:2181/mesos"
    }

dig for the grafana task against the mesos-dns instance running on the agent node:

    ip-10-134-11-131 etc # dig @localhost -p 8053 grafana.marathon.mesos.hackathon-rd-dev

    ; <<>> DiG 9.10.2-P4 <<>> @localhost -p 8053 grafana.marathon.mesos.hackathon-rd-dev
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9280
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;; WARNING: recursion requested but not available

    ;; QUESTION SECTION:
    ;grafana.marathon.mesos.hackathon-rd-dev. IN A

    ;; ANSWER SECTION:
    grafana.marathon.mesos.hackathon-rd-dev. 60 IN A 172.17.0.2

    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#8053(127.0.0.1)
    ;; WHEN: Wed Jul 26 17:34:19 UTC 2017
    ;; MSG SIZE rcvd: 73

dig for masters, responds correctly:

    ip-10-134-11-131 etc # dig @localhost -p 8053 master.mesos.hackathon-rd-dev

    ; <<>> DiG 9.10.2-P4 <<>> @localhost -p 8053 master.mesos.hackathon-rd-dev
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29930
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
    ;; WARNING: recursion requested but not available

    ;; QUESTION SECTION:
    ;master.mesos.hackathon-rd-dev. IN A

    ;; ANSWER SECTION:
    master.mesos.hackathon-rd-dev. 60 IN A 10.134.8.128
    master.mesos.hackathon-rd-dev. 60 IN A 10.134.9.64
    master.mesos.hackathon-rd-dev. 60 IN A 10.134.9.44

    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#8053(127.0.0.1)
    ;; WHEN: Wed Jul 26 17:39:25 UTC 2017
    ;; MSG SIZE rcvd: 95

Inspecting the container on the agent node running grafana shows that the container IP is the address mesos-dns is reporting:

        "Networks": {
            "bridge": {
                "IPAMConfig": null,
                "Links": null,
                "Aliases": null,
                "NetworkID": "1d2976b7f288f15a0b099513b5b4c7b42c1084a4a63df7bccf0efe85afd2ec60",
                "EndpointID": "7cf116257efc0e39c895ae5de44fdf385d8cf16b82c5fdca0a7d894e51081f4d",
                "Gateway": "172.17.0.1",
                "IPAddress": "172.17.0.2",
                "IPPrefixLen": 16,
                "IPv6Gateway": "",
                "GlobalIPv6Address": "",
                "GlobalIPv6PrefixLen": 0,
                "MacAddress": "02:42:ac:11:00:02"

If I query mesos-dns running on the master nodes for grafana I get the correct response:

    ip-10-134-11-131 etc # dig grafana.marathon.mesos

    ; <<>> DiG 9.10.2-P4 <<>> grafana.marathon.mesos
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31526
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;grafana.marathon.mesos. IN A

    ;; ANSWER SECTION:
    grafana.marathon.mesos. 60 IN A 10.134.8.152

    ;; Query time: 2 msec
    ;; SERVER: 198.51.100.1#53(198.51.100.1)
    ;; WHEN: Wed Jul 26 17:42:35 UTC 2017
    ;; MSG SIZE rcvd: 56

ip-10-134-11-131 etc #
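
For anyone comparing answers like the two above, a quick way to tell a container-internal address from an agent address is to test it against the Docker bridge network reported by the container inspect output (gateway 172.17.0.1, prefix length 16). This is only a minimal sketch: the helper name `on_docker_bridge` is made up for illustration, and the addresses are copied from the outputs in this report.

```python
import ipaddress

# Bridge network derived from the inspect output: Gateway 172.17.0.1, IPPrefixLen 16.
# ip_interface(...).network normalizes this to 172.17.0.0/16.
BRIDGE = ipaddress.ip_interface("172.17.0.1/16").network


def on_docker_bridge(addr: str) -> bool:
    """Return True if addr falls inside the Docker bridge network (hypothetical helper)."""
    return ipaddress.ip_address(addr) in BRIDGE


# Answers taken from the dig outputs in this issue.
answers = {
    "grafana.marathon.mesos.hackathon-rd-dev": "172.17.0.2",
    "grafana.marathon.mesos": "10.134.8.152",
}

for name, addr in answers.items():
    kind = "container IP (bridge)" if on_docker_bridge(addr) else "not on the bridge"
    print(f"{name} -> {addr}: {kind}")
```

An address on the bridge network is only reachable from the agent that hosts the container, which is why the first answer is not useful to other nodes.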

nizamik commented 7 years ago

I found the answer: the IPSources order needs to have "host" first, as shown in #477.
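
A rough sketch of that change, for readers who do not have #477 open. It assumes the configuration is loaded as a JSON object and that, when `IPSources` is absent, mesos-dns falls back to the order `["netinfo", "mesos", "host"]` (this default is recalled from the mesos-dns documentation and should be checked against your version). The function name `put_host_first` is hypothetical, and the exact list used in #477 is not reproduced here.

```python
import json

# Assumption: default order used by mesos-dns when IPSources is not set.
ASSUMED_DEFAULT_SOURCES = ["netinfo", "mesos", "host"]


def put_host_first(config: dict) -> dict:
    """Return a copy of config whose IPSources list starts with "host"."""
    sources = list(config.get("IPSources", ASSUMED_DEFAULT_SOURCES))
    # Keep the remaining sources in their original relative order as fallbacks.
    reordered = ["host"] + [s for s in sources if s != "host"]
    return {**config, "IPSources": reordered}


# Abbreviated version of the configuration posted above.
config = {"domain": "mesos.hackathon-rd-dev", "port": 8053, "ttl": 60}

print(json.dumps(put_host_first(config), indent=2))
```

With "host" first, task records should resolve to the agent address, and the other sources are only consulted if no host address is available. mesos-dns needs to be restarted with the updated file for the change to take effect.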