mesosphere / mesos-dns

DNS-based service discovery for Mesos.
https://mesosphere.github.com/mesos-dns
Apache License 2.0

nslookup in docker container failed! #527

Closed junneyang closed 6 years ago

junneyang commented 6 years ago

On the slave node: ping helloworld.marathon.mesos succeeds.

In a Docker app on the same slave: ping helloworld.marathon.mesos fails, and nslookup reports: can't resolve '(null)': Name does not resolve

jdef commented 6 years ago

Can you provide more context re: your environment? See https://github.com/mesosphere/mesos-dns/blob/master/CONTRIBUTING.md#issues

junneyang commented 6 years ago

@jdef

mesos: 1.7.0 marathon: 1.7.50

helloworld app:

{
  "id": "/helloworld",
  "cmd": "python -m SimpleHTTPServer $PORT",
  "cpus": 1,
  "mem": 128,
  "disk": 0,
  "instances": 2,
  "acceptedResourceRoles": ["*"],
  "healthChecks": [
    {
      "gracePeriodSeconds": 300,
      "ignoreHttp1xx": false,
      "intervalSeconds": 60,
      "maxConsecutiveFailures": 3,
      "path": "/",
      "portIndex": 0,
      "protocol": "HTTP",
      "ipProtocol": "IPv4",
      "timeoutSeconds": 20,
      "delaySeconds": 15
    }
  ],
  "portDefinitions": [
    { "port": 10000, "protocol": "tcp" }
  ]
}

nginx-docker app:

{
  "id": "/helloworld-nginx",
  "cmd": null,
  "cpus": 1,
  "mem": 128,
  "disk": 0,
  "instances": 1,
  "constraints": [
    ["hostname", "CLUSTER", "10.240.185.53"]
  ],
  "acceptedResourceRoles": [],
  "container": {
    "type": "DOCKER",
    "docker": {
      "forcePullImage": false,
      "image": "nginx:1.15-alpine",
      "parameters": [],
      "privileged": false
    },
    "volumes": [
      {
        "containerPath": "/usr/share/nginx/html",
        "hostPath": "/opt/paas/data/nginx",
        "mode": "RO"
      }
    ],
    "portMappings": [
      {
        "containerPort": 80,
        "hostPort": 0,
        "labels": {},
        "protocol": "tcp",
        "servicePort": 10001
      }
    ]
  },
  "healthChecks": [
    {
      "gracePeriodSeconds": 300,
      "ignoreHttp1xx": false,
      "intervalSeconds": 60,
      "maxConsecutiveFailures": 3,
      "path": "/",
      "portIndex": 0,
      "protocol": "HTTP",
      "ipProtocol": "IPv4",
      "timeoutSeconds": 20,
      "delaySeconds": 15
    }
  ],
  "networks": [
    { "mode": "container/bridge" }
  ],
  "portDefinitions": []
}

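On the slave node, nslookup helloworld.marathon.mesos succeeds: [image: tim 20180927215647] https://user-images.githubusercontent.com/6802322/46150968-5e73bf80-c2a0-11e8-9bfe-2212691c6b17.png

In the nginx-docker container, nslookup helloworld.marathon.mesos fails: [image: tim 20180927215908] https://user-images.githubusercontent.com/6802322/46151088-9975f300-c2a0-11e8-9f8f-a4aeebe143ad.png

dig reply: [image]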

mesos-dns:v0.6.0 config.json:

{
  "zk": "zk://yangjun-centos7-4-0001.novalocal:2181,yangjun-centos7-4-0002.novalocal:2181,yangjun-centos7-4-0003.novalocal:2181/mesos",
  "masters": [
    "yangjun-centos7-4-0001.novalocal:5050",
    "yangjun-centos7-4-0002.novalocal:5050",
    "yangjun-centos7-4-0003.novalocal:5050"
  ],
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["8.8.8.8"],
  "timeout": 5,
  "httpon": true,
  "dnson": true,
  "httpport": 8123,
  "externalon": true,
  "listener": "0.0.0.0",
  "SOAMname": "felix.mesos",
  "SOARname": "admin.felix.mesos",
  "SOARefresh": 60,
  "SOARetry": 600,
  "SOAExpire": 86400,
  "SOAMinttl": 60,
  "IPSources": ["netinfo", "mesos", "host"]
}

However, from inside nginx-docker, curl http://10.240.185.53:8123/v1/hosts/helloworld.marathon.mesos does return the correct result:

[
  { "host": "helloworld.marathon.mesos.", "ip": "10.240.185.53" },
  { "host": "helloworld.marathon.mesos.", "ip": "10.240.185.6" }
]

thanks for your help!
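
The curl check in the comment above shows that the mesos-dns HTTP API is reachable from the container even though DNS resolution is not. A minimal Python sketch of the same check follows, assuming only the `/v1/hosts/<name>` endpoint and the response shape quoted in that comment. The function names `ips_from_hosts_response` and `lookup_via_http` are hypothetical helpers for illustration, not part of mesos-dns.

```python
import json
from typing import List
from urllib.request import urlopen


def ips_from_hosts_response(body: str) -> List[str]:
    """Extract the "ip" values from a /v1/hosts/<name> JSON response body."""
    records = json.loads(body)
    return [record["ip"] for record in records if record.get("ip")]


def lookup_via_http(api_base: str, hostname: str, timeout: float = 5.0) -> List[str]:
    """Query the mesos-dns HTTP API (hypothetical helper) and return the IPs."""
    url = f"{api_base.rstrip('/')}/v1/hosts/{hostname}"
    with urlopen(url, timeout=timeout) as response:
        return ips_from_hosts_response(response.read().decode("utf-8"))


# Response body as quoted in the comment above.
SAMPLE = (
    '[{"host": "helloworld.marathon.mesos.", "ip": "10.240.185.53"},'
    ' {"host": "helloworld.marathon.mesos.", "ip": "10.240.185.6"}]'
)

if __name__ == "__main__":
    print(ips_from_hosts_response(SAMPLE))
```

Against a live cluster, `lookup_via_http("http://10.240.185.53:8123", "helloworld.marathon.mesos")` would perform the same request as the curl command. If this succeeds while nslookup fails, the records themselves are fine and the problem lies in the DNS path between the container and mesos-dns.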

jdef commented 6 years ago

on which node is mesos-dns running? have you tried enabling verbose logging for mesos-dns to observe what it's doing?

junneyang commented 6 years ago

@jdef mesos-dns runs on 10.240.185.53

In the container, dig says: [image]

Why 172.17.0.1#53? 172.17.0.1 is the address of docker0.
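
One way to narrow down a question like this is to check which nameservers the container is actually configured to use, and compare them with the address dig reports. The sketch below parses resolv.conf-style text in Python. The sample contents are hypothetical, since the thread does not show the container's `/etc/resolv.conf`, and `nameservers` is an illustrative helper name.

```python
from typing import List


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses from "nameserver" lines, in file order."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical sample; the real file in the container is not shown in the thread.
SAMPLE_RESOLV_CONF = """\
# example only
search marathon.mesos
nameserver 10.240.185.53
options ndots:1
"""

if __name__ == "__main__":
    # Inside the container one would read the real file instead:
    #   with open("/etc/resolv.conf") as handle:
    #       print(nameservers(handle.read()))
    print(nameservers(SAMPLE_RESOLV_CONF))
```

If the configured nameserver is the host address but dig mentions the docker0 address, the mismatch is being introduced somewhere between the container and the DNS server rather than by the container's resolver configuration.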

jdef commented 6 years ago

can you include the output of docker version?

junneyang commented 6 years ago

@jdef docker version:

Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:03 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:25:29 2018
  OS/Arch:          linux/amd64
  Experimental:     false

junneyang commented 6 years ago

@jdef Is this problem related to iptables? It seems that docker0 replies to the dig request directly, rather than mesos-dns.

jdef commented 6 years ago

I see you're trying to use docker bridge-mode networking. Are you using the default docker bridge, as auto-configured by docker OOTB? If not, how have you configured the default bridge network for docker?

Docker networking is a bit funny. See https://docs.docker.com/network/bridge/#connect-a-container-to-the-default-bridge-network

As an aside, consider changing your IPSources to ["host", "netinfo"] unless you're sure you really want what you have.

junneyang commented 6 years ago

@jdef solved!

I changed the listener from 0.0.0.0 to 10.240.185.53 and the problem is solved.

thanks very much!
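
The fix reported above amounts to a one-field change in config.json. The thread does not say why it works. One plausible reading, stated here as an assumption, is that with a wildcard listener the UDP replies to a container on the default bridge leave with the docker0 address (172.17.0.1) as their source instead of the address that was queried, so the client discards them. A minimal Python sketch of the edit, using a trimmed copy of the posted config and a hypothetical helper named `with_listener`:

```python
import copy
import ipaddress
import json


def with_listener(config: dict, address: str) -> dict:
    """Return a copy of a mesos-dns config dict with "listener" set to address."""
    ipaddress.ip_address(address)  # raises ValueError if not a valid IP address
    updated = copy.deepcopy(config)
    updated["listener"] = address
    return updated


# Trimmed copy of the config posted earlier in the thread.
original = {"port": 53, "httpport": 8123, "listener": "0.0.0.0"}
fixed = with_listener(original, "10.240.185.53")

if __name__ == "__main__":
    print(json.dumps(fixed, indent=2))
```

mesos-dns would need to be restarted with the updated file for the change to take effect.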

jdef commented 6 years ago

Glad you got it working. Closing this out as resolved.