Closed juwi closed 7 years ago
Just curious, are you using DC/OS, or Mesos directly?
It's a DC/OS cluster that was upgraded from 1.7 to 1.8.
Can you use this: https://dcos.io/docs/1.8/administration/overlay-networks/
taskname.marathon.containerip.dcos.thisdcos.directory
taskname.marathon.agentip.dcos.thisdcos.directory
The way it is documented there, without the .thisdcos.directory domain, I never got an answer back. Using it the way you just described, I at least get an authority section that looks right, but still no answer.
Assuming overlay-testcontainer is the container I want to resolve, I did the following:
dig overlay-testcontainer.marathon.containerip.dcos.thisdcos.directory
; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.3 <<>> overlay-testcontainer.marathon.containerip.dcos.thisdcos.directory
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 65147
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;overlay-testcontainer.marathon.containerip.dcos.thisdcos.directory. IN A
;; AUTHORITY SECTION:
dcos.thisdcos.directory. 1 IN SOA ns.spartan. support.mesosphere.com. 1 60 180 86400 1
;; Query time: 11 msec
;; SERVER: 198.51.100.1#53(198.51.100.1)
;; WHEN: Tue Sep 27 08:16:36 CEST 2016
;; MSG SIZE rcvd: 170
Resolving the same container on mesos-dns using the .marathon.mesos domain now works like a charm, though.
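For anyone comparing several lookups like the one above, here is a rough Python sketch that pulls the status and section counts out of dig's text output. The function name `summarize_dig` is made up for illustration, and the sample string is just the two header lines from the output quoted above. It assumes the usual dig header layout and is not a full parser.

```python
import re


def summarize_dig(output: str) -> dict:
    """Extract the response status and the ANSWER/AUTHORITY counts from dig text output."""
    status = re.search(r"status:\s*(\w+)", output)
    counts = re.search(r"ANSWER:\s*(\d+),\s*AUTHORITY:\s*(\d+)", output)
    return {
        "status": status.group(1) if status else None,
        "answers": int(counts.group(1)) if counts else None,
        "authority": int(counts.group(2)) if counts else None,
    }


# Header lines copied from the dig run quoted above.
sample = (
    ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 65147\n"
    ";; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1\n"
)

print(summarize_dig(sample))
```

An NXDOMAIN status with zero answers and one authority record matches what is described above: the zone is served (the SOA comes back), but the name itself is not known.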
I've created a ticket to fix the doc.
Does overlay-testcontainer.marathon.agentip.dcos.thisdcos.directory come back?
No, neither of the two does.
When I look through the journals, I see lots of errors in the minuteman log, though:
Sep 27 10:20:51 purplesubmarine2.local minuteman-env[12436]: 10:20:51.025 [error] Failed to parse task: {function_clause,[{mesos_state_client,protocol,[<<"client">>],[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,439}]},{mesos_state_client,discovery_ports,2,[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,423}]},{mesos_state_client,discovery,1,[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,411}]},{mesos_state_client,task,3,[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,233}]},{mesos_state_client,tasks,5,[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,211}]},{mesos_state_client,executors,5,[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,191}]},{mesos_state_client,frameworks,4,[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,181}]},{mesos_state_client,tasks,1,[{file,"/pkg/src/minuteman/_build/default/lib/mesos_state/src/mesos_state_client.erl"},{line,169}]}]}
So maybe those are related. Doesn't seem to be a mesos-dns issue anymore at this point, though.
@juwi Are you running 1.8.4?
@sargun: Indeed, I am.
Hi,
I just spent about two days trying to resolve an issue with mesos-dns resolving the host IP instead of the overlay-network IP. In the end I found the solution here: https://github.com/projectcalico/calico-containers/blob/master/docs/mesos/DCOS.md. Basically, I switched the order in IPSources from ["host", "netinfo"] to ["netinfo", "host"]. It would be nice if this were documented more prominently (if it is in the mesos-dns docs at all) so others won't have as much trouble tracking down their problem.
Regards
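In case it helps others, the IPSources switch described above could be sketched in Python roughly like this. The helper name `prefer_netinfo` is hypothetical, and the config is treated as an already-loaded dict, since where the mesos-dns config file lives depends on your install. The sketch only moves "netinfo" to the front and leaves every other setting untouched.

```python
import json


def prefer_netinfo(config: dict) -> dict:
    """Return a copy of the config with "netinfo" first in IPSources.

    Other sources keep their relative order; "netinfo" is added if it was missing.
    """
    sources = list(config.get("IPSources", []))
    reordered = ["netinfo"] + [s for s in sources if s != "netinfo"]
    return {**config, "IPSources": reordered}


# The ordering from the comment above that resolved to the host IP.
before = {"IPSources": ["host", "netinfo"]}
after = prefer_netinfo(before)

print(json.dumps(after))
```

Returning a copy instead of editing in place makes it easy to compare the old and new settings before writing anything back.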