stevvooe opened this issue 8 years ago
@aluzzardi This is related to what we discussed today.
@mavenugo Is there any progress on making this happen?
@stevvooe How does this map to #1193, since we've adopted `<e.ServiceAnnotations.Name>.<Slot>.<TaskID>` as a model?
@aluzzardi That is just the naming convention, which goes left to right. DNS goes right to left. Completely compatible.
The only open item is the consistency of active slots. We need a way for tasks to discover all of the hostnames of the other tasks, via DNS or otherwise, regardless of DNS-RR or VIP mode. This will allow us to support host-identity-based services, like zk, etcd, nats, etc.
Perhaps I'm confused, but wouldn't DNS SRV records perfectly match this use case?
Regarding the ZooKeeper config: as stated in the documentation, the config file should look like:

```
...
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
...
```
A ZooKeeper image with a defined entrypoint could execute:

```
nslookup -querytype=srv _zookeeper._tcp.swarmmode.com
# _service._proto.name. TTL class SRV priority weight port target.
_zookeeper._tcp.swarmmode.com. 100 IN SRV 10 10 2888 zookeeper.1.swarmmode.com
_zookeeper._tcp.swarmmode.com. 100 IN SRV 10 10 2888 zookeeper.2.swarmmode.com
_zookeeper._tcp.swarmmode.com. 100 IN SRV 10 10 2888 zookeeper.3.swarmmode.com
```
And echo a record for each server into the config file, as sketched below.
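A minimal sketch of such an entrypoint, assuming the swarm DNS published SRV records like the above (it does not today; the domain, the ports, and the availability of `dig` in the image are all illustrative):

```sh
#!/bin/sh
# Hypothetical: turn SRV answers into zoo.cfg server entries.
i=1
dig +short SRV _zookeeper._tcp.swarmmode.com | sort | \
while read -r priority weight port target; do
    echo "server.${i}=${target%.}:2888:3888" >> /conf/zoo.cfg   # ${target%.} strips the trailing dot
    i=$((i + 1))
done
```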
Each time a service task starts, it should publish SRV entries in the swarm-internal DNS, and when it stops, they should be removed. Of course, I dare say this without having any clue about Docker's internal DNS (or whatever it uses).
Regards.
@mostolog That is somewhat the goal here, but we first need a plan for mapping these into SRV records in a consistent manner. At this point, it is fairly ad hoc, which is very disappointing.
I have been reading #192 and trying to reply, but in the end I think I probably lack the knowledge needed to discuss this.
I'll only say that, IMHO, it makes much more sense to have abcdef-1.job-0.cluster-0 rather than abcdef.1.job0.production.cluster0.
@stevvooe Somewhat related to this:

```
docker run -h myhost ...
```

adds an entry to `/etc/hosts` with `ip container-name`. Is there any way to specify a domain, or to set this entry to an FQDN, i.e. `ip myhost.domain.com` in `/etc/hosts`?
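Something like this conventional hosts-file layout (address, canonical FQDN, then the short alias; the IP here is purely illustrative):

```
10.0.1.5    myhost.domain.com    myhost
```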
It seems that doing

```
docker run -h myhost.domain.com ...
```

isn't very polite, since the hostname should be just `myhost`.

Is this a missing feature I could request at https://github.com/docker/docker? Am I missing something? Thanks
Like I've said before, this isn't about a single fix. There needs to be a concerted effort to manage the DNS name mapping.
Will this allow the routing mesh to route requests to different services running on the same port? Is that even being planned as part of the routing mesh DNS work?
@vovimayhem No, this is more about mapping services into DNS. To multiplex services on the same port, you'll need to introduce an L7 load balancer to manage that in your infrastructure.
Thank you for clarifying that! I've been searching for clear info (a confirmation) about this for almost a month now!
@stevvooe I'm back! :stuck_out_tongue_closed_eyes: Have you considered enabling DNS registration in an out-of-Docker DNS server?

It would also be interesting to be able to register under a specific domain tree. A background question related to this: could a Docker node create multiple swarm clusters (each having its own domain scope)?
@mostolog Let's keep the discussion focused on the proposal at hand. The ideas presented are interesting, but they are orthogonal to the goal of creating clear DNS-based service discovery, which is the topic here.
I like your naming scheme proposal. What I'm wondering is whether a container's canonical fully qualified hostname could be the proposed task DNS name instead of simply the container ID. For example, if some application in the container does the equivalent of `gethostname()` and then `getaddrinfo()`, it should return `abcdef.1.job0.production.cluster0` instead of simply `abcdef`. This would allow applications in containers to provide more useful hostnames to other applications, e.g. Apache Spark.
@doxxx Currently, Docker containers have an expectation that their hostname is the container ID (not the task ID). This is insufficient, as it is not unique (truncated to only 48 bits of entropy!) and provides no notion of location. I am not sure if we can change this to something more correct inside the container.

With the way UTS namespaces work, I would expect us to set the hostname to `abcdef` and the domainname to `1.job0.production.cluster0` (corresponding to the slot). Assembling these would require calling `gethostname` and `getdomainname`, resulting in the FQDN for the task.
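A minimal sketch of that assembly from inside a container, assuming the engine actually set the UTS hostname and NIS domainname this way (it does not today):

```sh
host="$(hostname)"                          # gethostname(2)    -> abcdef
dom="$(cat /proc/sys/kernel/domainname)"    # getdomainname(2)  -> 1.job0.production.cluster0
echo "${host}.${dom}"                       # abcdef.1.job0.production.cluster0
```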
Don't know if this is closely related, or related enough to be considered...

I'm starting to have some "unfriendly" experiences with containers having overly long hostnames, like project-service-swarmnodeidwhichactuallyisquitelong-#slot, created using 2.13 template naming, e.g. {{.Node.ID}}.

Perhaps it would be great to allow "nearest" container resolution of a partial name (not an FQDN), i.e. a configuration asking for "mysql" from a container under "com.domain.app" should look for "com.domain.mysql", while another running under "com.domain.sub.whatever.app" should look for "com.domain.sub.whatever.mysql".

Is that already designed that way? Does it make sense? Is it not related at all? Thanks ;)
@mostolog I think that is a reasonable assumption. One example might be having task `1` hit task `2` with just `2`, since they are in the same job. We already do this to some degree, but there will need to be clever setup in the domain naming approach.
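A hypothetical illustration of how that could fall out of ordinary resolver search domains, assuming a task's `resolv.conf` were scoped to its own position in the naming tree (not today's behavior):

```sh
$ cat /etc/resolv.conf
nameserver 127.0.0.11
search job0.production.cluster0 production.cluster0 cluster0
# A lookup for the bare name "2" would then try 2.job0.production.cluster0 first.
```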
Thanks!
Hi again.
Reviewing my notes, I just confirmed that `docker stack deploy --compose-file stack.yml mystack` creates services named like:

```
mystack_fooservice
mystack_barservice
```
I already asked on the forums whether a dash can be configured instead of the underscore, but I was wondering whether this will have any effect on this issue (i.e. defining an FQDN for a host containing "_") or anything else. Just to let you know.
This is a HUGE bug.
@dnephin Are you guys going to fix this? Supporting `_` in service names is a huge no-no and will break any hope for a reasonable future. I'm surprised these passed validation.
Until we have server-side stacks, I think it's a mistake to change this. We kept it consistent with Compose, knowing that either namespaces or server-side stacks would be a major change, and we didn't want to change it twice.
Services in a stack should be referenced by the scoped name (without the underscore) anyway, so the underscore shouldn't be relevant for anything except for the CLI.
> Services in a stack should be referenced by the scoped name (without the underscore) anyway, so the underscore shouldn't be relevant for anything except for the CLI.
Phew!
Are there underscores in the actual service names? Because we want to avoid having anything in the resolution path that won't be supportable in the future.
> Services in a stack should be referenced by the scoped name (without the underscore) anyway, so the underscore shouldn't be relevant for anything except for the CLI.

If a service provides its fully qualified domain name to another service (e.g. Apache Spark master and workers), that name would include the underscore, which has been known to cause problems in URL validation code in Spark.
> Services in a stack should be referenced by the scoped name (without the underscore) anyway, so the underscore shouldn't be relevant for anything except for the CLI.

Although Compose works correctly when using link/depends_on, wouldn't this still be a problem when setting service names within container configuration files? E.g.:
```yaml
services:
  mysql:
    ...
  php:
    ...
```

```
docker stack deploy --compose-file file.yml mystack
```

creates:

```
mystack_mysql.1...
mystack_php.1...
```

What name should I set in PHP's mysql_connect? IIUC, I have to actually use "mystack_mysql".
> Are there underscores in the actual service names? Because we want to avoid having anything in the resolution path that won't be supportable in the future.
Not to me...
> Although Compose works correctly when using link/depends_on, wouldn't this still be a problem when setting service names within container configuration files?

I don't think links or depends_on are supported in a distributed environment. There are odd scheduling issues that come up when you start introducing these kinds of features.
I tried this out, and it looks like we are pushing underscores into service names:

```
$ docker service ps redis-test_redis
ID            NAME                IMAGE         NODE                DESIRED STATE  CURRENT STATE           ERROR  PORTS
ehi3v814uudq  redis-test_redis.1  redis:latest  docker-XPS-13-9343  Running        Running 2 hours ago
w2iti40qz180  redis-test_redis.2  redis:latest  docker-XPS-13-9343  Running        Running 58 seconds ago
vu3bgifqim3r  redis-test_redis.3  redis:latest  docker-XPS-13-9343  Running        Running 58 seconds ago
i34cxloom9sq  redis-test_redis.4  redis:latest  docker-XPS-13-9343  Running        Running 58 seconds ago
raqf7lhkp9b0  redis-test_redis.5  redis:latest  docker-XPS-13-9343  Running        Running 58 seconds ago
```
This might hold this proposal back.
Screw others! Change the world! Power to the people! DNS schema! ;)
Will this address the following: I have a Docker swarm on which, say, I deploy multiple instances of MongoDB. On doing `docker service ls`, the instances of `services_mongo` will show something along these lines:

```
service_mongo.1
service_mongo.2
service_mongo.3
```

My web services need individually pingable names for the mongo service, and the URL for mongo is composed by taking all 3 names. On swarm, pinging an individual instance is not supported today, as I understand it from https://github.com/moby/moby/issues/30546.

If I had the above solution, I could just set the replicas to 3 in the mongo service block of my YAML compose file and know I can access the services via the named entries above. Is this possible? Without this, I am creating 3 separate service blocks to achieve the same.
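Concretely, what's being asked for is something like this (hypothetical port and options, assuming the per-slot names were resolvable):

```sh
# One replicated service, three stable per-slot names composing the client URL:
MONGO_URL="mongodb://service_mongo.1:27017,service_mongo.2:27017,service_mongo.3:27017/mydb?replicaSet=rs0"
```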
Hi @stevvooe, is there any progress here? It seems like other attempts aimed at simple addressing for `{{.Task.Slot}}.{{.Service.Name}}` within swarm have been abandoned while waiting for this... Is there any way to resurrect https://github.com/moby/moby/pull/24973 to alleviate the problem in the short term, in a way that would be compatible with your future work here?

Yes, this would be really helpful! It should be possible to resolve through `{{.Task.Slot}}.{{.Service.Name}}` for cluster setups like a Redis cluster join.
It is not very useful currently:

```
root@redis:/# nslookup 10.0.16.5
Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
5.16.0.10.in-addr.arpa  name = echo_app.3.8tppck42ohyz5j1dq49al80r3.echo_net.

Authoritative answers can be found from:

root@redis:/# nslookup echo_app.3.8tppck42ohyz5j1dq49al80r3.echo_net
Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
Name:   echo_app.3.8tppck42ohyz5j1dq49al80r3.echo_net
Address: 10.0.16.5

root@redis:/# nslookup echo_app.3.8tppck42ohyz5j1dq49al80r3.
Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
Name:   echo_app.3.8tppck42ohyz5j1dq49al80r3
Address: 10.0.16.5

root@redis:/# nslookup echo_app.3.
Server:         127.0.0.11
Address:        127.0.0.11#53

** server can't find echo_app.3: NXDOMAIN
```
(thanks for pointing me here @thaJeztah )
Overall, it is good to get this standardized. I think most of the naming convention here makes sense - essentially following the DNS rule of most-specific-to-least-specific from left-to-right. I see a few issues.
1. `<task>.<slot>` feels reversed. Although the how-it-works docs say that a task is analogous to a slot (implying 1:1), the language here speaks more of the slot name. If we have 3 replicas of nginx, then the grouping is nginx, and we have slots `1`, `2`, `3` (or `0`, `1`, `2`, if you prefer). So if we want to dot-separate, it would make more sense to be `1.nginx.production...` and `2.nginx.production...` etc., rather than `nginx.0`.
2. `<task>.<slot>`: if having just the single digit feels strange (it does to me, but that's my bias), then give it a name, `<task>-<slot>` (or `<task>_<slot>`), and make it an atomic part of the name. FWIW kube does that with statefulsets and it works pretty well (although consistency across restarts is important there; more below).
3. `<task>-<slot>` or similar as the hostname makes a lot of sense. I also think injecting some other vars that would be in the FQDN above (namespace, etc.) might be useful.

I will admit I am cheating; I have done some heavy kube lifting, run into some of these issues, and seen how the availability of things like hostnames and env vars makes a big difference.
One more point: if we start having predictable hostnames (or at least resolvable names) for instances, people will come to expect that they are consistent. If I can reach a particular container via `nginx.0` (or `0.nginx` or `nginx-0`), then if it dies and swarm starts a new one, is that `nginx-0` too? Or `nginx-3`? It might not matter for stateless `nginx`, but it sure does for things like `etcd` or `zookeeper` or `redis` etc.
Finally, is there any intent to put these into non-swarm mode (compose)? Or is the general thrust that, over time, non-swarm mode will cease to exist and a single docker engine is just a single-node swarm?
To follow this up, I would like to drop docker/libnetwork#1855 in here. At some point we need predictable hostnames, because otherwise setups like the cluster setups mentioned in moby/moby#30546 wouldn't be possible in a dynamic environment. It's not about the container itself, because the container is only an envelope for the running process, e.g. Redis inside. This process is defined by the given configuration, and in this configuration you would like to use resolvable hostnames, because they stay static in the config (`nginx.0` or `0.nginx` or `whatever.0.you.want.1`) while the IP will not.
The same is true for config mounting. You have a share with:

```
/share/0.nginx
/share/1.nginx
...
```

Then maybe you want to use `-v /share:/share`, and in the config inside the container you can then use (pseudo config) `datadir=/share/$HOSTNAME`. This is possible via `--hostname={{.Service.Name}}-{{.Task.Slot}}` already today, but you will not be able to resolve these hostnames via DNS. Why?
Yes, I know that there is a problem with scaling such setups, but you won't scale setups which are stateful, like a cluster with three nodes. You still want such a service to be deployable via Swarm, for example, with zero configuration. And therefore hostnames should also be registered in DNS, as in my PR.
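For illustration, the half that already works today (a sketch; the image and paths are arbitrary): the hostname template gives each replica a predictable UTS hostname, but, as described above, that hostname is not registered in DNS:

```sh
docker service create \
  --name redis \
  --replicas 3 \
  --hostname '{{.Service.Name}}-{{.Task.Slot}}' \
  --mount type=bind,source=/share,target=/share \
  redis
# Inside a replica, $HOSTNAME is redis-1, redis-2, or redis-3, so a config line
# like datadir=/share/$HOSTNAME stays stable across restarts of that slot.
```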
I have a fork that I'm testing in my local environments with the following changes to Docker Engine 17.06 and 17.07 (some hard-coded logic for now, but with the intention to move to templates and submit applicable PRs):
The shared IPAM server processes Docker IPAM driver calls and also handles API calls from a PowerDNS authoritative resolver. I am planning to delegate a subdomain, such as `swarm.mycompany.com`, to my Swarm-aware, IPAM-backed PowerDNS server, so our Docker hosts and development workstations can resolve instantiated containers even if they move between Docker Swarm hosts (with the above changes, the volume names, hostnames, container names, and IPs remain static for the instantiated slot instance).
Ideally, we would resolve something like `1.consul.uat.swarm.mycompany.com`. This slot and service ID ordering, however, as I believe was mentioned, goes against the current standard, which is for the service name to come first, then the task slot ID, then the task ID -- such as `consul.1.yp6q0y0f48qz24z90mlt7oq7t`.
I would love a more integrated solution, however -- especially before I get too deep into using this. Anything Moby can offer to assist in these areas is very much welcomed.
Please note that I have quite a bit of time available to work on items in these areas, as this is an area my company is extremely invested in for managing our next infrastructure design; so if you have thoughts on items here, I may be able to provide time: at minimum for testing, and at most partial or complete code.
CC @mavenugo, as I've spoken with him about my environment in the past, and he may be interested in more (updated) scope
I also agree that DNS should be arranged such as: [slot id].[service id].[stack].[cluster] -- in order of smallest to largest, as that is how domains are organized
Hi @stevvooe
We are making a few tests with swarm, stacks, and DNS. If our stack is deployed "normally", containers will see each other and they'll have randomized hostnames. However, when setting hostnames, they aren't able to resolve/dig each other. There seem to be a bunch of related issues already open.
It seems this topic has spread into multiple/zillions of issues, making it hard to follow/maintain. What about an [epic] issue to summarize all of them?
IMHO this topic is also taking too long to define, although it doesn't seem to be an implementation blocker. Is there anything we could do to push it forward?
Thanks
This issue is the overall discussion, the big-picture solution. Look at docker/libnetwork#1855; there I've made a PR with a possible solution, which I guess is what you need.
@kleinsasserm yes, but it has been idle for quite a while, hasn't it? Why hasn't it been approved/merged? Are we missing some roadmap plans? Is that a bad practice we should avoid?
To sum up: we are in the same car and, I don't know why, it seems we aren't moving.
Yes, sad but true. Every six days or so I try to refresh my PR. I was on vacation recently. I don't even know why my PR is not getting merged. But I will keep trying. Thank you for your response on my PR!
How could we get a view of the DNS records stored within Docker?
https://github.com/moby/moby/pull/31710
Something like the following might do the trick, using `network inspect --verbose` from the PR above. It prints info for containers on attachable networks, and the VIP for services using that endpoint mode (it does not print everything):

```sh
docker network inspect --verbose --format '{{range $name, $service := .Services}}{{if eq $name ""}}{{range .Tasks}}{{.Name}} {{.EndpointIP}}{{printf "\n"}}{{end}}{{else}}{{$name}} {{$service.VIP}}{{end}}{{end}}' $NETWORK | column -t
```
Thanks, @trapier.

As you said, it prints the services' VIPs, but it doesn't contain any reference to the container (IDs) that could help me understand why containers with random default hostnames are able to resolve each other, while those with a specific hostname set can't.
@stevvooe Any comments on this issue/PR/behavior?
Oh, I just needed some time to find it again, and maybe I didn't explain it clearly enough in my PR. So I debugged it once again. Let's assume the following two swarm stacks, `echo` and `echoa`, which upon starting will create these containers:

```
CONTAINER ID   IMAGE                   COMMAND                 CREATED         STATUS         PORTS                NAMES
4740fde7c5ac   n0r1skcom/echo:latest   "python3 -u /echo.py"   3 minutes ago   Up 3 minutes   3333/tcp, 3333/udp   echoa_app.2.z8weq0rralpx71to3luhi2oss
e2a90f29b495   n0r1skcom/echo:latest   "python3 -u /echo.py"   3 minutes ago   Up 3 minutes   3333/tcp, 3333/udp   echoa_app.1.7fk8h5sct8png4r7xbu5ox34w
f80aa02479fb   n0r1skcom/echo:latest   "python3 -u /echo.py"   5 minutes ago   Up 5 minutes   3333/tcp, 3333/udp   echo_app.2.yx5j98hhcgk6pnev8hqauq2to
e7c58268f4b4   n0r1skcom/echo:latest   "python3 -u /echo.py"   5 minutes ago   Up 5 minutes   3333/tcp, 3333/udp   echo_app.1.og46rf0bp8tsl1v4z8ultomnt
```
The stack `echo` is started with no `hostname` parameter in the swarm compose file, so the container ID is equal to the hostname inside the container itself. This point is important!

```
# docker exec echo_app.2.yx5j98hhcgk6pnev8hqauq2to hostname
f80aa02479fb
```
If I do the same for a container which is started with the `hostname` parameter set in the swarm compose file, it will be different:

```
# docker exec echoa_app.1.7fk8h5sct8png4r7xbu5ox34w hostname
echoa_app1
```
BUT I am still able to nslookup the container ID from inside these containers:

```
# docker exec echoa_app.1.7fk8h5sct8png4r7xbu5ox34w nslookup 4740fde7c5ac
Server:    127.0.0.11
Address:   127.0.0.11#53

Non-authoritative answer:
Name:      4740fde7c5ac
Address:   10.0.1.3
```
Do you see it? I can look up the container ID of the echoa_app.2 container from inside the echoa_app.1 container without any problems. What I found is that only the container ID is ever used for DNS registration, never the hostname.
What I did in my PR is use `sb.config.hostName` from the container sandbox object to register the hostname in addition. Normally the code only uses `n.ID()`. If the hostname is the same as the container ID, nothing happens. If they are different, then the hostname is registered in DNS additionally.

Conclusion: as it still looks to me from the code, only the container ID is used for DNS registration, never the hostname (which is what my PR adds). But any other hint is welcome. And sorry if I am wrong.
@kleinsasserm Thanks for such a clarifying explanation. Indeed, everything leads to that (now scientifically supported) conclusion.

I guess the main concern with registering hostnames is that they can be non-unique and their IPs can change over time... but it still makes a lot of sense to me to merge your PR. Or maybe it's just a case of I'm-busy-with-more-important-stuff-at-this-moment.
Anyway, I think we still have to wait for an answer from @docker-team.
Thank you for your response, you are welcome! True, we will have to wait. :innocent:
@mostolog This issue is about the schema of mapping tasks to DNS in a more sane manner. For discussions regarding the current behavior, open another issue in moby.
😄 I have already opened one here, moby/moby#34239, and there, docker/swarmkit#2325, which is referenced in the PR docker/libnetwork#1855 that I mentioned here. No need to open another one, which would probably get closed as a duplicate; what it really needs is for one of the maintainers to have a look there.
@stevvooe IMHO it's related enough to be discussed here, as we are discussing why node names should be added into the DNS tree.

AKA: if I mess up my containers by giving them the same names, Docker could complain about it, or even refuse to work at all; but if they are going to be unique (AKA: using the {{.Task.Slot}} template)... let me do it! https://github.com/docker/swarmkit/issues/1242#issuecomment-272039199 https://github.com/docker/swarmkit/issues/1242#issuecomment-272110280 https://github.com/docker/swarmkit/issues/1242#issuecomment-319943259 (bullet #3)
@mostolog @kleinsasserm Sorry but this is not a general hostname discussion thread. This issue is about the schema for mapping names to DNS. If there are other relevant discussions already open, have them there. I know it is annoying, but we need to keep the discussion topical.
The issues you are describing should be discussed on the issues in moby. If you want me to join those discussions, pull me in. I have no clue why your PR isn't merged and I am not familiar with the current behavior. I'll try to help as much as I can, but please be patient.
As far as getting this done, I am not really working in this area any more. If someone wants to take on a more complete proposal, I would be more than willing to support and advise.
OK.
I agree with the schema proposed here in the first message, and will close moby/moby#30546. As I said in that issue, I think each component should support multiple names (e.g. service.taskid, cluster.taskid, etc.).
When will this feature be available? I would like to use `<Slot ID>.<Service>` or `<Service>.<Slot ID>`.
I'd also like to echo support for this. It's been 3 years since this issue was opened, and there's still no easy way to configure things like ZooKeeper.
If I understand this issue correctly, there is still no way to assign predictable DNS names to tasks inside a Swarm. Is that correct?
I was expecting to be able to use the `{{.Task.Name}}` template and resolve my tasks at `mytask.1`, `mytask.2`... but those task names are always suffixed with a random ID (`mytask.1.p8d7aufb80h8f8dwtfcmsyzy4`).
While there has been discussion in https://github.com/docker/docker/pull/24973 and https://github.com/docker/swarmkit/issues/192, we have yet to adopt a clear schema for mapping service resources into the DNS space.
The following presents a schema for mapping cluster-level FQDNs from various components:
| Component | FQDN | Examples |
| --- | --- | --- |
| `<cluster>` | `<cluster>` | `local`, `cluster0` |
| `<namespace>` | `<namespace>.<cluster>` | `production.cluster0`, `development.local`, `system` |
| `<node>` | `<node>.<cluster>` | `node0.local` |
| `<job>` | `<job>.<namespace>.<cluster>` | `job0.production.cluster0` |
| `<slot>` | `<slot id>.<job>.<namespace>.<cluster>` | `1.job0.production.cluster0` |
| `<task>` | `<task id>.<slot id>.<job>.<namespace>.<cluster>` | `abcdef.1.job0.production.cluster0` |
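To make the mapping concrete, a hedged sketch of how lookups might behave under this schema (names taken from the examples above; behavior at the job level would depend on VIP vs. DNS-RR mode):

```sh
dig +short abcdef.1.job0.production.cluster0   # the specific task instance
dig +short 1.job0.production.cluster0          # whichever task currently holds slot 1
dig +short job0.production.cluster0            # the job/service: VIP, or all tasks under DNS-RR
```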
@mavenugo @mrjana @aluzzardi