mesosphere / mesos-dns

DNS-based service discovery for Mesos.
https://mesosphere.github.com/mesos-dns
Apache License 2.0

Non-unique hostnames in SRV records #115

Closed kensipe closed 9 years ago

kensipe commented 9 years ago

Previously, the SRV records provided unique hostnames, such as:

;; ANSWER SECTION:
_nginx._tcp.marathon.mesos. 60  IN  SRV 0 0 31000 demo-37r7.c.massive-bliss-781.internal.
_nginx._tcp.marathon.mesos. 60  IN  SRV 0 0 31000 demo-0ug4.c.massive-bliss-781.internal.
_nginx._tcp.marathon.mesos. 60  IN  SRV 0 0 31000 demo-6ke4.c.massive-bliss-781.internal.

However, the latest version of mesos-dns (0.1.1) returns non-unique values:

;; ANSWER SECTION:
_nginx._tcp.marathon.mesos. 60  IN  SRV 0 0 31000 nginx.marathon.mesos.
_nginx._tcp.marathon.mesos. 60  IN  SRV 0 0 31000 nginx.marathon.mesos.
_nginx._tcp.marathon.mesos. 60  IN  SRV 0 0 31000 nginx.marathon.mesos.

The name nginx.marathon.mesos does not resolve to a single unique host.

A query for the A record shows:

;; ANSWER SECTION:
nginx.marathon.mesos.   60  IN  A   10.240.43.175
nginx.marathon.mesos.   60  IN  A   10.240.72.62
nginx.marathon.mesos.   60  IN  A   10.240.122.212
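
To make the problem concrete, here is a minimal Python sketch (not part of mesos-dns; the function names `candidate_hosts` and `is_ambiguous` are hypothetical) that pairs the SRV targets above with the A records for the same name. Because every SRV record points at the same name, and that name has three addresses, a client cannot tell which port belongs to which host. With identical ports, as here, this is mostly harmless, but it matters as soon as tasks get different ports.

```python
from collections import defaultdict

# Records copied from the dig output above.
# SRV records are (port, target); A records are (name, ip).
srv_records = [
    (31000, "nginx.marathon.mesos."),
    (31000, "nginx.marathon.mesos."),
    (31000, "nginx.marathon.mesos."),
]
a_records = [
    ("nginx.marathon.mesos.", "10.240.43.175"),
    ("nginx.marathon.mesos.", "10.240.72.62"),
    ("nginx.marathon.mesos.", "10.240.122.212"),
]


def candidate_hosts(srv_records, a_records):
    """For each SRV record, list every IP its target name could resolve to."""
    ips_by_name = defaultdict(list)
    for name, ip in a_records:
        ips_by_name[name].append(ip)
    return [(port, target, ips_by_name[target]) for port, target in srv_records]


def is_ambiguous(srv_records, a_records):
    """Return True if any SRV target resolves to more than one distinct IP."""
    return any(
        len(set(ips)) > 1
        for _, _, ips in candidate_hosts(srv_records, a_records)
    )


if __name__ == "__main__":
    for port, target, ips in candidate_hosts(srv_records, a_records):
        print(port, target, ips)
    print("ambiguous:", is_ambiguous(srv_records, a_records))
```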

kozyraki commented 9 years ago

can you give this a try: https://github.com/mesosphere/mesos-dns/tree/srvfix

It does the following:

mwl commented 9 years ago

I just tried the srvfix branch, but from what I can see the output is exactly the same. Or am I testing this the wrong way?

$ docker run --rm -it --link mesosdns:dns tutum/dnsutils dig @dns _nginx._tcp.marathon.mesos SRV

Here, mesosdns is a Docker container running mesos-dns.

kozyraki commented 9 years ago

@mwl I am not sure what problem you are having. Can you tell me a little about

mwl commented 9 years ago

These are the dig commands and their output for our Elasticsearch service:

$ docker run --rm -it --link mesosdns:dns tutum/dnsutils dig @dns _elasticsearch-mesos._tcp.marathon.mesos SRV

; <<>> DiG 9.9.5-3ubuntu0.2-Ubuntu <<>> @dns _elasticsearch-mesos._tcp.marathon.mesos SRV
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44069
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_elasticsearch-mesos._tcp.marathon.mesos. IN SRV

;; ANSWER SECTION:
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31820 elasticsearch-mesos.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31887 elasticsearch-mesos.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31585 elasticsearch-mesos.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31816 elasticsearch-mesos.marathon.mesos.

;; Query time: 7 msec
;; SERVER: 172.17.0.33#53(172.17.0.33)
;; WHEN: Tue Mar 31 09:12:42 UTC 2015
;; MSG SIZE  rcvd: 434

$ docker run --rm -it --link mesosdns:dns tutum/dnsutils dig @dns elasticsearch-mesos.marathon.mesos

; <<>> DiG 9.9.5-3ubuntu0.2-Ubuntu <<>> @dns elasticsearch-mesos.marathon.mesos
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8062
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;elasticsearch-mesos.marathon.mesos. IN A

;; ANSWER SECTION:
elasticsearch-mesos.marathon.mesos. 60 IN A 10.126.142.159
elasticsearch-mesos.marathon.mesos. 60 IN A 10.170.113.147

;; Query time: 4 msec
;; SERVER: 172.17.0.33#53(172.17.0.33)
;; WHEN: Tue Mar 31 09:13:04 UTC 2015
;; MSG SIZE  rcvd: 152

And finally, here is one mesos-dns update round trip from the logs. From what I can see, mesos-dns appears to see things correctly?

VERY VERBOSE: 2015/03/31 09:11:40 generator.go:127: reloading from master 10.11.142.163
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._tcp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31585
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._udp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31585
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] elasticsearch-mesos.marathon.mesos.: 10.126.142.159
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._tcp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31887
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._udp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31887
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] elasticsearch-mesos.marathon.mesos.: 10.126.142.159
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._tcp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31816
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._udp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31816
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] elasticsearch-mesos.marathon.mesos.: 10.170.113.147
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._tcp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31820
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _elasticsearch-mesos._udp.marathon.mesos.: elasticsearch-mesos.marathon.mesos:31820
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] elasticsearch-mesos.marathon.mesos.: 10.170.113.147
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] mesos-dns.mesos.: 172.17.0.33
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] leader.mesos.: 10.11.142.163
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] master.mesos.: 10.11.142.163
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _leader._tcp.mesos.: leader.mesos:5050
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _leader._udp.mesos.: leader.mesos:5050
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _master._tcp.mesos.: master.mesos:5050
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _master._udp.mesos.: master.mesos:5050
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] master.mesos.: 10.11.142.163
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [A] master0.mesos.: 10.11.142.163
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _master._tcp.mesos.: master.mesos:5050
VERY VERBOSE: 2015/03/31 09:11:40 generator.go:434: [SRV]   _master._udp.mesos.: master.mesos:5050
VERY VERBOSE: 2015/03/31 09:11:40 logging.go:61: {MesosRequests:4 MesosSuccess:4 MesosNXDomain:0 MesosFailed:0 NonMesosRequests:0 NonMesosSuccess:0 NonMesosNXDomain:0 NonMesosFailed:0 NonMesosRecursed:0}

kozyraki commented 9 years ago

@mwl This does not seem to be the srvfix branch. @kensipe Did you have any success with this branch?

I suspect the following issue: when you run git clone -b srvfix, you get a copy of that mesos-dns branch in your current working directory. However, when you run "make build", it builds whatever mesos-dns branch is checked out in $GOPATH ($GOPATH/src/github.com/mesosphere/mesos-dns). Can you check?
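
A quick way to check for this mismatch is sketched below in Python. It is only an illustration, assuming the build resolves the package through $GOPATH as described above; the helper names `gopath_package_dir` and `is_gopath_checkout` are hypothetical and not part of mesos-dns or its Makefile.

```python
import os

# Import path taken from the $GOPATH location quoted above.
IMPORT_PATH = "github.com/mesosphere/mesos-dns"


def gopath_package_dir(gopath, import_path=IMPORT_PATH):
    """Directory where a GOPATH-style build expects the package source.

    Only the first entry of a multi-entry GOPATH is considered.
    """
    first_entry = gopath.split(os.pathsep)[0]
    return os.path.join(first_entry, "src", *import_path.split("/"))


def is_gopath_checkout(working_dir, gopath, import_path=IMPORT_PATH):
    """Return True if working_dir is the directory the build would use."""
    expected = os.path.realpath(gopath_package_dir(gopath, import_path))
    return os.path.realpath(working_dir) == expected


if __name__ == "__main__":
    gopath = os.environ.get("GOPATH", "")
    cwd = os.getcwd()
    if gopath and is_gopath_checkout(cwd, gopath):
        print("The current directory is the GOPATH checkout.")
    else:
        print("The current directory is NOT the GOPATH checkout;")
        print("check which branch is checked out in the GOPATH copy.")
```

If the check reports a mismatch, running git rev-parse HEAD inside the $GOPATH copy shows which commit actually gets built.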

mwl commented 9 years ago

I'm pretty sure I checked out the right branch:

git rev-parse HEAD
593037921fbd6d53063583fd59be9a565c53f0a5

That said, I may or may not be building mesos-dns the correct way. Is there any way you could provide a Docker image with your verified mesos-dns build?

mwl commented 9 years ago

Sorry, you're right. I misunderstood the build system. Now I get the correct answer:

;; ANSWER SECTION:
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31887 elasticsearch-mesos-s0.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31820 elasticsearch-mesos-s1.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31585 elasticsearch-mesos-s0.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31816 elasticsearch-mesos-s1.marathon.mesos.

;; ADDITIONAL SECTION:
elasticsearch-mesos-s0.marathon.mesos. 60 IN A  10.126.142.159
elasticsearch-mesos-s0.marathon.mesos. 60 IN A  10.126.142.159
elasticsearch-mesos-s1.marathon.mesos. 60 IN A  10.170.113.147
elasticsearch-mesos-s1.marathon.mesos. 60 IN A  10.170.113.147

Thanks! That's indeed much more useful.
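
With unique SRV targets, a client can pair each port with exactly one address. The following Python sketch (an illustration only; `parse_records` and `endpoints` are hypothetical helper names) parses the answer and additional sections shown above and builds the list of (ip, port) endpoints. It raises an error if a target maps to zero or several distinct addresses, which is the situation the old output produced.

```python
# Sections copied from the dig output above.
ANSWER = """\
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31887 elasticsearch-mesos-s0.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31820 elasticsearch-mesos-s1.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31585 elasticsearch-mesos-s0.marathon.mesos.
_elasticsearch-mesos._tcp.marathon.mesos. 60 IN SRV 0 0 31816 elasticsearch-mesos-s1.marathon.mesos.
"""
ADDITIONAL = """\
elasticsearch-mesos-s0.marathon.mesos. 60 IN A  10.126.142.159
elasticsearch-mesos-s0.marathon.mesos. 60 IN A  10.126.142.159
elasticsearch-mesos-s1.marathon.mesos. 60 IN A  10.170.113.147
elasticsearch-mesos-s1.marathon.mesos. 60 IN A  10.170.113.147
"""


def parse_records(text):
    """Parse dig-style lines into (name, type, rdata_fields) tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Expect: name, ttl, class, type, rdata...
        if len(fields) < 5 or fields[2] != "IN":
            continue
        records.append((fields[0], fields[3], fields[4:]))
    return records


def endpoints(answer, additional):
    """Pair every SRV record with the single IP of its target name."""
    ips_by_name = {}
    for name, rtype, rdata in parse_records(additional):
        if rtype == "A":
            ips_by_name.setdefault(name, set()).add(rdata[0])

    result = []
    for _, rtype, rdata in parse_records(answer):
        if rtype != "SRV":
            continue
        _priority, _weight, port, target = rdata
        ips = ips_by_name.get(target, set())
        if len(ips) != 1:
            raise ValueError(f"{target} maps to {len(ips)} addresses")
        result.append((next(iter(ips)), int(port)))
    return result


if __name__ == "__main__":
    for ip, port in endpoints(ANSWER, ADDITIONAL):
        print(f"{ip}:{port}")
```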

kozyraki commented 9 years ago

@mwl I've made this mistake about 100 times so far :) Glad this works for you. Let me know if you find any bugs to fix before I merge this into master.

3h4x commented 9 years ago

This is exactly the same issue I'm having right now. Good to know it's already fixed. I will try the srvfix branch.

mwl commented 9 years ago

@3h4x I've built a Docker image from the srvfix branch:

docker run -d --name mesosdns -v "$PWD/config.json:/config.json" mwldk/mesos-dns:0.1.1-srvfix mesos-dns -config=/config.json

It'll be removed as soon as the srvfix is released.

3h4x commented 9 years ago

@mwl sorry mate, I won't use a Docker image without a Dockerfile that I can see. I just believe in open source. My automated build on Docker Hub is at https://github.com/3h4x/docker-mesos-dns

kozyraki commented 9 years ago

@mwl @3h4x I merged srvfix into master yesterday. While it has not been tested as much as other updates, it seems stable. Please let me know if you find bugs or other issues.

mwl commented 9 years ago

Thanks @kozyraki! The only thing I've noticed is that dnsjava isn't compatible with the update:

mvn dependency:get -Dartifact="dnsjava:dnsjava:2.1.7"
java -cp ~/.m2/repository/dnsjava/dnsjava/2.1.7/dnsjava-2.1.7.jar dig @dns _nginx._tcp.marathon.mesos SRV

It isn't an issue for us, though, as we're using the built-in JNDI resolver for now.

3h4x commented 9 years ago

@kozyraki thanks, now all good in the hood

kozyraki commented 9 years ago

@mwl I cannot reproduce the dnsjava problem. See what I get below. Can you provide some more info?

$ java dig _nginx._tcp.marathon.mesos ANY
; java dig 0.0
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2822
;; flags: qr aa rd ra ; qd: 1 an: 3 au: 0 ad: 3
;; QUESTIONS:
;; _nginx._tcp.marathon.mesos., type = ANY, class = IN

;; ANSWERS:
_nginx._tcp.marathon.mesos. 60 IN SRV 0 0 31644 nginx-s2.marathon.mesos.
_nginx._tcp.marathon.mesos. 60 IN SRV 0 0 31667 nginx-s1.marathon.mesos.
_nginx._tcp.marathon.mesos. 60 IN SRV 0 0 31880 nginx-s0.marathon.mesos.

;; AUTHORITY RECORDS:

;; ADDITIONAL RECORDS:
nginx-s1.marathon.mesos. 60 IN A 10.190.238.173
nginx-s2.marathon.mesos. 60 IN A 10.249.219.155
nginx-s0.marathon.mesos. 60 IN A 10.156.230.230

;; Message size: 368 bytes
;; Query time: 25 ms