@far-blue I am trying to think how you are making the containers talk to other containers without either a. using host network namespace or b. port mapping. For either of these, the scheduler needs to allocate a port to the container on the node where it is running so that no other process uses the same port.
I don't quite see how you might be able to use something like docker link in nomad since we don't allow that directly.
You don't need to map a port on a docker container to the host to talk to it. Any other container on the same docker network can talk to an exposed port on a container. In my case, I have lots of web services and they are in different containers and they all expose port 80. Then I have a single proxy container that is port mapped to port 80 on the host and which accepts and redirects traffic to the appropriate service containers. This is a very common design pattern.
In fact, with Consul and proxies like fabio, or with nginx and consul-template, it is so easy it takes about 15 minutes to set up. As long as the docker network is the same between containers they don't need to be on the same host either - overlay networking or an underlying 'mesh' network of bridges and tunnels (or Ubuntu's new Fan networking setup) will all allow this.
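To illustrate, here's a hedged sketch of the 'single proxy with a static host port' half of that pattern as a Nomad task (the image name and fabio's internal port are assumptions); the backend web containers themselves need no host mapping at all:

task "proxy" {
  driver = "docker"

  config {
    # assumed image; nginx + consul-template works the same way
    image = "fabiolb/fabio"
    port_map {
      # fabio's proxy port inside the container
      http = 9999
    }
  }

  resources {
    network {
      port "http" {
        # the only host-mapped port in the whole setup
        static = 80
      }
    }
  }
}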
docker link is pretty much deprecated as it only works with the legacy 'bridge' network driver, so the more common pattern these days is exposed ports and service discovery of some sort - whether that be env vars set up via docker-compose, consul, docker's built-in DNS, or some other mechanism.
@far-blue Oh I see your point: on the same host all the containers are running on the same bridge, so they can talk to each other. And since every container gets a unique IP from the bridge (or from an overlay network) they don't have any port conflicts either.
So this is something we were going to address when we add support for multiple IP addresses on the same machine, or make it possible for a container to get its own unique IP address; in that case the scheduler doesn't need to allocate any ports, since the container can use any port on the IP it is allocated.
Let me think about this a little bit more.
yes, you've got it :) think of the various discussions about Docker and IPv6 in other tickets as being a special case of the more generic situation of all docker containers getting their own IP address. The difference with IPv6 is that the addresses are natively routable. With IPv4, containers have traditionally not had routable addresses but if you use multi-host overlay networking then they become 'internally' routable across the overlay. If you go one step further and create an 'underlay' network below docker then you can create the IPv4 equivalent of the IPv6 network where addresses are routable. In my particular case, containers on any of the hosts in my cluster can route to each other and the hosts themselves can also route to any container (useful for Consul health checks) but access to the network is still only through the hosts at the 'edges' of the cluster. I have a setup similar to those you could create using Ubuntu's new fan network.
In all these cases, ports don't need mapping to the host and the container's IP address and exposed port information is very useful to other services via Consul. Of course, Nomad itself really doesn't need to worry about all this because it can still do its job without any of it. But as Nomad integrates so neatly with Consul it would be much nicer if Nomad could supply this IP/Port data to Consul rather than users being forced to use something like Registrator. Depending on how the network is configured Consul itself will also need the IP/Port info in order to carry out health checks.
Assuming the ability to register the internal container IP with Consul rather than the host IP is not something that has made it into v0.4, does anyone have any ideas when we could see the functionality being made available?
Once this ability is available I can start testing Nomad for use in my production environment for all containers not backed by persistent storage (about 40% of the total) which would be a great step forward for me :)
@far-blue Sorry this hasn't made it to 0.4. We need to re-work the client first to make it handle multiple IPs. It's probably going to land in the next few releases. We don't have an exact time frame for this just yet.
I totally understand :) I'll keep an eye on future releases and look forward to when the functionality is available :)
The ability to register exposed port[s] in Consul would be very helpful for using Nomad as a container orchestrator with Calico networking.
For the moment I'm using Registrator on each node, set to register the 'internal' container ip with consul. I use labels in the nomad config to control how registrator works.
Overall, it all works very well but it's extra moving parts and I can't even deploy Registrator via Nomad yet because it needs the docker.sock socket file mounted into the registrator container - which Nomad can't yet do.
Hopefully over time these extra bits of complexity can be removed as Nomad catches up with our needs :)
For reference, I'm using Ubuntu's FanNetwork capability which creates a uniform, routable ip space much like Calico will :)
We are looking to use the docker MACVLAN driver and assign each container an individual IP. How is this looking at the moment with regards to Nomad?
I've found one irritation with using Registrator as a Nomad service is that it doesn't have a chance to remove services from Consul when the node is drained (because all the containers are closed down at once). This leaves stale service data and failing health checks behind which is a pain to tidy up.
I'm really looking forward to being able to use Nomad directly to register services - please, please can someone look at implementing this?
We have had success with Joyent's ContainerPilot: as it runs as PID 1 it will manage registration and de-registration. It handles signals sent to it and allows for a bunch of other options as well. We only use it for Consul reg/de-reg at the moment though. Setting "deregisterserviceafter: XXm" takes care of stuff that isn't gracefully shut down. That said, having Nomad inspect the container for IP & ports would be ideal.
I generally prefer to avoid cross-cutting concerns in my containers and so don't want service registration to be part of the container. Nomad also already supports service registration through the Service {} block in the job spec. The only things it can't currently do are:

1. register the container's own IP address with Consul rather than the host's, and
2. register an exposed container port that isn't mapped to the host in the resources block.
The first should be very easy to fix through a request for IP details from Docker and a parameter in the Service block to state which IP to use during registration.
The second could be sorted either by supporting a new 'internal' type for registered ports in a Job or by providing a way in the Service block to specify directly (rather than by name) which port should be used in the service registration.
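Purely as an illustration, a sketch of what those two options might look like in a job spec - the field names here are invented and none of this is current Nomad syntax:

service {
  name = "kibana"

  # hypothetical: register the container's IP reported by Docker rather than the host IP
  address_source = "container"

  # hypothetical: a literal exposed port instead of a mapped port label
  port = 80

  check {
    type     = "http"
    path     = "/api/status"
    interval = "25s"
    timeout  = "2s"
  }
}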
@far-blue Agreed! I just deployed a test Nomad cluster with an overlay and ran into this. Having the options you describe would allow things to work exactly how I'd like.
This is currently an issue for our ipv6-only container deployment.
@far-blue: I completely agree with this. I'm only proposing this as a temporary workaround to address the current limitations of Nomad.
For anyone else requiring a simple-to-implement workaround for the direct addressing issue, here's an example of the setup:
{
  "consul": "{{.NOMAD_NODE_HOSTNAME}}:8500",
  "services": ...
}

"command": "/local/containerpilot",
"args": [
  "-config",
  "file:///local/containerpilot.json",
  "npm",
  "start"
]

"Artifacts": [
  {
    "GetterSource": "http://minio:9000/packages/containerpilot-2.6.0.tar.gz"
  }...
That's it. The container will end up on the correct node in Consul, but with its own IP address.
Is it ideal? No. But it works for now.
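For reference, here's a rough sketch of how those fragments fit together in the HCL job format (the image name is my assumption, and the containerpilot.json shown above would also need to be delivered into local/, e.g. via a second artifact or a template stanza):

task "web" {
  driver = "docker"

  config {
    # assumed image name - substitute your own service image
    image   = "example/web"
    command = "/local/containerpilot"
    args    = [
      "-config",
      "file:///local/containerpilot.json",
      "npm",
      "start"
    ]
  }

  artifact {
    source = "http://minio:9000/packages/containerpilot-2.6.0.tar.gz"
  }
}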
@rickardrosen: It's an interesting idea that will likely work well for some people but isn't really great for my needs because you need to install containerpilot into the containers.
As an alternative, I currently run Registrator. It needs to run on each node as it directly monitors Docker through the event system and, as I've learnt the hard way, you can't schedule Registrator through Nomad because it will be shut down before being able to react to other container shutdowns and will leave a mess in Consul. However, I use Salt to manage my nodes, so I just run Registrator via systemd and manage it with Salt.
Registrator reacts to labels or environment vars in the container, so I use labels like this within the Nomad job spec:
task "kibana" {
driver = "docker"
config {
image = "kibana:4.5"
labels {
SERVICE_NAME = "kibana"
SERVICE_CHECK_HTTP = "/api/status"
SERVICE_CHECK_INTERVAL = "25s"
SERVICE_CHECK_TIMEOUT = "2s"
}
}
This sets up a service called kibana with an http health check checking every 25s and with a timeout on the check of 2s. You can also use a 'SERVICE_TAGS' label to setup consul tag data that, for instance, Fabio can use.
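For example, a fabio routing tag can be added the same way (a sketch - urlprefix- is fabio's routing-tag convention):

labels {
  SERVICE_NAME = "kibana"
  # Registrator turns this label into a Consul tag that fabio routes on
  SERVICE_TAGS = "urlprefix-/kibana"
}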
In order for this to work, you do need to run Registrator with the --internal startup parameter so it registers the container's IP address rather than the host's.
This is working and is stable for me but it would still be much better to have Nomad handle it natively.
Yeah, it's the never ending spiral of requirements and available features that constantly have to be patched together... =)
We looked into Registrator, and it seemed to do its thing, even though I wasn't fond of a "third party" keeping track of events. And I was kind of right in this assessment: when docker 1.12 showed up and allowed us to use MACVLAN, Registrator stopped working, as there was no events interface in docker > 1.10. Has this interoperability been worked out yet, btw?
To expand a bit on our use case, we actually use containerpilot to achieve some of what you describe:
{
  "consul": "{{.NOMAD_NODE_HOSTNAME}}:8500",
  "services": [
    {
      "name": "{{.NOMAD_JOB_NAME}}",
      "tags": ["{{.SCOPE}}"],
      "port": "{{.SERVICE_PORT}}",
      "health": "curl --fail --silent http://localhost:{{.SERVICE_PORT}}{{.HEALTH_CHECK_URI}}",
      "poll": 10,
      "ttl": 50,
      "consul": {
        "enableTagOverride": true,
        "deregisterCriticalServiceAfter": "90m"
      }
    }
  ]
}
Injecting containerpilot and its configuration as artifacts allows us to keep the containers "unmodified". Tags and other service details are simply injected as env vars and interpolated by containerpilot.
Always interesting to see what others are up to though, thanks for sharing!
I use registrator with docker 1.12 without issues - although with Ubuntu Fan networking, not MacVLan :)
I think what this does emphasise is that for lots of different reasons in lots of different configurations, Nomad really should support registering the container's IP.
@far-blue: I had to test registrator again and as you stated it works well with docker > 1.10 again. Great! But how do you configure the exposed ports properly? I end up with a udp service for every container. Any idea why?
The udp services are mainly down to Nomad. Rather than behaving like 'normal' and just registering the port, it separately registers both tcp and udp mappings. You can see this using docker ps. Registrator doesn't have any facility to ignore the udp port mappings so it registers both. However, as both point to the same IP address and port I've not found it to have any impact on my cluster :) I have mentioned the odd udp behaviour of Nomad in the Gitter chat and apparently it's 'expected' behaviour. If it bothers you I'm sure Glider Labs would be open to the suggestion of some form of control over the behaviour of Registrator. I didn't bother raising it because it has no impact for me and because I'm eternally optimistic Nomad will start being able to register services natively... :)
One thing to add for anyone using Registrator (and it's something I've mentioned before) - do NOT under any circumstances set up Registrator to point at anything other than a single consul agent per node - preferably a consul agent running on the same node, but at a minimum it must be deterministic. I made the mistake of thinking I could just point it at a round-robin setup of consul agents and things went horribly wrong. While round-robin works for things like DNS lookup and the key/value store, it is essential that a service is only ever registered with a single agent and that the same agent is used for de-registration and health checks.
I'm sure this is obvious and no-one would do any different ... unlike me... I speak from experience ;)
If you use a Docker network driver other than host or bridge (default) then that IP:Port will automatically be advertised. You can control this behavior on a per-service basis by setting address_mode: {auto,driver,host}.
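For example, a minimal sketch of a service stanza using driver addressing as described (exact field behaviour can vary between Nomad versions, so check the service stanza docs for yours):

service {
  name         = "web"
  # advertise the IP assigned by the Docker network driver instead of the host IP
  address_mode = "driver"
  # a port label from the task's port_map; driver mode also accepts a numeric container port
  port         = "http"
}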
Looking forward to being able to ditch Registrator and just use Nomad's Service registration support directly :D
Believe this can be closed now because of the new features @schmichael references that have landed in 0.6.0
At the moment (v0.3.2), it isn't possible to specify a port in the service config that has not been registered in the resource block as either a static or dynamic port. When using Docker, however, containers can talk to each other via exposed ports without them needing to be mapped to the host. When using proxies such as fabio, and alongside the IP suggestions in #511, it would be very helpful to be able to register an exposed port with Consul rather than being limited to only mapped ports.
One way to do this might be to allow port numbers rather than only port identifiers for the service port key. Alternatively, a new type of port (e.g. 'virtual' or 'exposed') in the resources config could allow for the definition of a port identifier that can then be used in the service config.
Maybe I have mis-remembered, but I'm sure I saw mention of Consul gaining an 'internal / external' split capability. These suggested enhancements for ports (and IPs, as per #511) would work well with that - registering an 'exposed' port identifier could be allowed only for 'internal' services with an 'internal' IP address, etc.
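To make the 'exposed' port idea concrete, a purely hypothetical sketch (this syntax does not exist in Nomad; it is invented for illustration only):

resources {
  network {
    port "http" {
      # hypothetical: an exposed-but-unmapped container port
      exposed = 80
    }
  }
}

service {
  name = "web"
  # the service would register the exposed port (and the container IP) with Consul
  port = "http"
}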