cloudfoundry-community / cf-containers-broker

A generic "Containers" broker for the Cloud Foundry v2 services API
Apache License 2.0

Host ports being reassigned on restart #17

Closed drnic closed 9 years ago

drnic commented 9 years ago

@frodenas we're observing that docker/broker is not using the same host:container port combination on restart (which means that apps cannot connect to the services, and need to recreate the binding). Is this the old behavior? Or a regression from upgrading to docker 1.6?

/cc @djsplice

frodenas commented 9 years ago

Yep, this has been the behavior since we introduced the restart property in Docker 1.2 (this behavior is mentioned in the container.restart property description). We also discussed this with @jbayer in issue https://github.com/cf-platform-eng/cf-containers-broker/issues/2.

I don't see any straightforward solution to this unless we introduce some state to the broker, i.e., for every container created, it stores some metadata (like the host port, ...).

drnic commented 9 years ago

Workaround of the day - dodgy script to rebind apps https://gist.github.com/drnic/cb695417db200bd10f6c

frodenas commented 9 years ago

Awesome!

drnic commented 9 years ago

@frodenas so I'm thinking about https://github.com/cf-platform-eng/cf-containers-broker/blob/master/app/models/docker_manager.rb#L350-L359, where we don't explicitly allocate a host port; as a result, Docker later doesn't respect the contract.

So perhaps at this point we discover the next N ports, explicitly assign them, and then Docker will respect the contract.

Would need a lock around the "get next X ports" step so that two "create service" calls don't discover the same ports as available at the same time.
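
A minimal sketch of that locking idea, assuming an in-process allocator (the `PortAllocator` class and its starting port are illustrative, not broker code):

```ruby
# Hypothetical sketch: hand out the next N host ports under a mutex so two
# concurrent "create service" calls can never be given the same ports.
class PortAllocator
  def initialize(start_port)
    @next_port = start_port
    @lock = Mutex.new
  end

  # Reserve n consecutive ports atomically.
  def allocate(n)
    @lock.synchronize do
      ports = (@next_port...(@next_port + n)).to_a
      @next_port += n
      ports
    end
  end
end
```

Usage would look like `PortAllocator.new(10_000).allocate(2)`, returning two ports that no concurrent caller can also receive. Note this only serializes calls within one broker process.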

drnic commented 9 years ago

Am now thinking of using a small file on /var/vcap/store to track the last port used; it will also act as the lock around allocations. Starting at 10000, this would give us 53000 ports for the lifetime of the server's persistent disk.
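
A rough sketch of that counter file, where an exclusive `flock` doubles as the allocation lock (the path, file name, and starting port below are assumptions from this thread, not actual broker code):

```ruby
# Hypothetical counter file: records the last port handed out and, via an
# exclusive flock, serializes concurrent allocations (even across processes).
PORT_FILE  = '/var/vcap/store/cf-containers-broker/last_port'
FIRST_PORT = 10_000

def next_host_port(path = PORT_FILE)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX)  # the lock doubles as the allocation mutex
    last = f.read.strip
    port = last.empty? ? FIRST_PORT : last.to_i + 1
    f.rewind
    f.truncate(0)
    f.write(port.to_s)
    port                    # File.open returns the block's value
  end
end
```

Because the lock lives on disk rather than in memory, this also survives broker restarts and works across multiple broker processes on the same VM.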

drnic commented 9 years ago

Combine this with double-checking whether the next port is actually available, and it should remove the limit and work forever: if we get to port 63000, resume at 10000.
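
That availability check plus wrap-around could look roughly like this (a sketch, assuming a bind probe is an acceptable "is it free?" test; range bounds are from the comments above):

```ruby
require 'socket'

LOW_PORT  = 10_000
HIGH_PORT = 63_000

# Probe a candidate port by trying to bind it, then releasing it immediately.
def port_free?(port)
  server = TCPServer.new('0.0.0.0', port)
  server.close
  true
rescue Errno::EADDRINUSE, Errno::EACCES
  false
end

# Find the next free port at or after `candidate`, wrapping at HIGH_PORT.
def next_free_port(candidate)
  (HIGH_PORT - LOW_PORT).times do
    candidate = LOW_PORT if candidate > HIGH_PORT
    return candidate if port_free?(candidate)
    candidate += 1
  end
  raise 'no free host ports in range'
end
```

One caveat with a bind probe: there is a small race window between closing the probe socket and Docker actually binding the port, so it narrows the problem rather than eliminating it.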

frodenas commented 9 years ago

I'm not sure how this is going to prevent using a different port on restarts. The logic you're proposing is what Docker already implements: it looks for the next available port.

What we need is a way to store, for each service instance, what host port has been allocated. In case of a restart, the broker must query this store, and if the service instance already existed, use the previously allocated port instead of letting Docker pick a random one.

Also, this won't help if you're not using static IPs. As the IP can change, the service instance must be bound again in order to pick up the new IPs.
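
The per-instance store described above could be as simple as a YAML file mapping service instance GUIDs to host ports, consulted before allocating anything new. A sketch under those assumptions (the `PortRegistry` class and file layout are illustrative only):

```ruby
require 'yaml'

# Hypothetical persistent map: service instance GUID -> allocated host port.
# On restart, a new PortRegistry reloads the file and returns the same ports.
class PortRegistry
  def initialize(path)
    @path = path
    @ports = File.exist?(path) ? (YAML.load_file(path) || {}) : {}
  end

  # Return the instance's previously allocated port, or record a new one
  # obtained from the caller's block.
  def port_for(instance_guid)
    @ports[instance_guid] ||= yield
    save
    @ports[instance_guid]
  end

  private

  def save
    File.write(@path, YAML.dump(@ports))
  end
end
```

Usage would be something like `registry.port_for(instance_guid) { next_free_port }`: the block only runs for instances the registry has never seen.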

drnic commented 9 years ago

@frodenas the untested assumption is that if I explicitly tell the Docker daemon what host port to use, it will restart containers on the same host port; my proposal works off this assumption.

Whether the IPs are static or not is outside of scope - deployments IRL could use static IPs or a TCP proxy to manage the continuation of the hostname over time.
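
"Explicitly tell the Docker daemon what host port to use" maps to the `PortBindings` section of the container-create payload in the Docker remote API. A hedged Ruby sketch of building that structure (the `port_bindings` helper is hypothetical; the commented-out create call assumes the docker-api gem the broker appears to use):

```ruby
# Build the HostConfig PortBindings fragment of a Docker container-create
# request: each exposed container port maps to a fixed host port.
def port_bindings(mapping)
  # mapping example: { '9200/tcp' => 10_000, '9300/tcp' => 10_001 }
  mapping.each_with_object({}) do |(container_port, host_port), bindings|
    bindings[container_port] = [{ 'HostIp' => '0.0.0.0', 'HostPort' => host_port.to_s }]
  end
end

# Roughly how it would be used (not runnable without a Docker daemon):
# Docker::Container.create(
#   'Image'        => 'cfcommunity/logstash',
#   'ExposedPorts' => { '9200/tcp' => {}, '9300/tcp' => {} },
#   'HostConfig'   => { 'PortBindings' => port_bindings('9200/tcp' => 10_000,
#                                                       '9300/tcp' => 10_001) }
# )
```

With a fixed `HostPort` instead of an empty binding, the daemon has no port to re-pick on restart, which is the contract the `docker ps` output below demonstrates.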

frodenas commented 9 years ago

But my question is still the same: how do you know what host port to expose when you have more than one service instance of the same plan?

drnic commented 9 years ago

Sorry for not explaining that well enough above. I'll try to think of a way to better describe it.

As an aside, I've tested that docker will restart processes with the same host port if you explicitly allocate the host ports at run time:

```
# docker --host unix:///var/vcap/sys/run/docker/docker.sock ps
CONTAINER ID        IMAGE                         COMMAND              CREATED             STATUS              PORTS                                                       NAMES
253ee092e502        cfcommunity/logstash:latest   "/scripts/run.sh "   19 seconds ago      Up 18 seconds       514/tcp, 0.0.0.0:10004->9200/tcp, 0.0.0.0:10005->9300/tcp   modest_euclid
5330ace84e4d        cfcommunity/logstash:latest   "/scripts/run.sh "   28 seconds ago      Up 27 seconds       514/tcp, 0.0.0.0:10002->9200/tcp, 0.0.0.0:10003->9300/tcp   goofy_bell
8eab6128a963        cfcommunity/logstash:latest   "/scripts/run.sh "   40 seconds ago      Up 39 seconds       514/tcp, 0.0.0.0:10000->9200/tcp, 0.0.0.0:10001->9300/tcp   determined_mcclintock
# monit stop all
# wait til all stopped
# monit start all
# docker --host unix:///var/vcap/sys/run/docker/docker.sock ps
CONTAINER ID        IMAGE                         COMMAND              CREATED              STATUS              PORTS                                                       NAMES
253ee092e502        cfcommunity/logstash:latest   "/scripts/run.sh "   About a minute ago   Up 2 seconds        514/tcp, 0.0.0.0:10004->9200/tcp, 0.0.0.0:10005->9300/tcp   modest_euclid
5330ace84e4d        cfcommunity/logstash:latest   "/scripts/run.sh "   2 minutes ago        Up 2 seconds        514/tcp, 0.0.0.0:10002->9200/tcp, 0.0.0.0:10003->9300/tcp   goofy_bell
8eab6128a963        cfcommunity/logstash:latest   "/scripts/run.sh "   2 minutes ago        Up 2 seconds        514/tcp, 0.0.0.0:10000->9200/tcp, 0.0.0.0:10001->9300/tcp   determined_mcclintock
```

drnic commented 9 years ago

I will code a PR to explain

On Tue, May 26, 2015 at 11:00 AM, Ferran Rodenas notifications@github.com wrote:

But my question is still the same, how do you know what host port to expose when you have more than 1 instance service of the same plan?


drnic commented 9 years ago

@frodenas proposal at https://github.com/cf-platform-eng/cf-containers-broker/pull/20

drnic commented 9 years ago

Ferdy, want to catch up in person and chat about this?

vlerenc commented 9 years ago

We have noticed the same in our Swarm setup, and we also considered implementing something similar to your proposal, @drnic: basically, do not let Docker choose a port on container (re)start (Docker is actually following the contract just fine - it's just that the current implementation asks for automatic port mappings, and that's what Docker provides), but statically allocate ports instead. This works fine with Docker.

However, it is kind of sad to introduce state here in the broker. Also, in a Swarm setup you run out of ports more quickly: in order to statically assign ports, you need to keep track of them, but since you don't know where the Swarm scheduler will deploy the container, you need one port range for all instances, which becomes a limitation sooner or later. You went from 10000 upwards, but maybe it would be safer to stay out of the ephemeral port range? Either way, the number of ports is finite, but the number of Docker nodes in Swarm is not.

For now we haven't patched the issue, because ideally Docker/Swarm itself would find a solution that has the following properties:

In any case, if you or someone else really wants to take this up, please also consider the Swarm setup, as this is a great means to have horizontal scaling on the service containers.

frodenas commented 9 years ago

This issue has been fixed in commit https://github.com/cf-platform-eng/cf-containers-broker/commit/e9232589ed0363ee814dfffb6c0279e4eaa917fe. I ran several tests, using both Docker and Swarm, killing/restarting VMs, and now the host ports are preserved.

Basically, it uses a similar approach to https://github.com/cf-platform-eng/cf-containers-broker/pull/20 but without state. In case the broker is restarted, it queries the Docker API to see what ports are already allocated, so it doesn't reuse them. This approach also works when using Docker Swarm, but there's a scalability issue: it uses the host ephemeral port range (usually from 32768 to 61000), so you can have a theoretical maximum of about 28000 Docker containers running at the same time, because the cluster is seen as a single host (we don't know where the container is going to land, so we cannot check the ports already used on a particular host).
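
The allocation step of this stateless approach reduces to "first port in the ephemeral range not already bound". A sketch of just that logic, assuming the set of used ports has already been collected from the Docker API (names and range bounds are illustrative):

```ruby
require 'set'

# Ephemeral range from the comment above; the true bounds are host-dependent.
EPHEMERAL_RANGE = (32_768..61_000)

# used_ports would come from the Docker API, e.g. by walking each container's
# port bindings on startup; here it is just an Enumerable of Integers.
def first_unallocated_port(used_ports)
  used = used_ports.to_set
  EPHEMERAL_RANGE.find { |p| !used.include?(p) } ||
    raise('all host ports in the ephemeral range are allocated')
end
```

Rebuilding the used-port set from the API on every broker start is what removes the need for a state file, at the cost of the single shared range across the whole (Swarm) cluster.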

I also added an option to enable/disable this behavior via the global allocate_docker_host_ports setting (see SETTINGS.md).

Thanks @drnic for bringing this issue to the table!

vlerenc commented 9 years ago

Great, thank you Ferran. That will definitely buy everybody some time (until the scalability issue hits), and in the meanwhile users will not have to rebind their apps when a node crashes.

Going for a stateless solution is a great idea. I see one caveat, though, with asking the Docker nodes rather than holding state: if, for whatever reason, the broker crashed and a node is also down while the broker initializes and determines the used ports, the broker will not get the full picture. It can then later assign a port to a new container that, by bad luck, gets scheduled on the Docker node that was down, is now up again, and already has a container running on that port. The error message may be somewhat confusing, but the good thing is that the container won't even start.

Thank you again for your fix/change. In the long run I hope Swarm comes up with an idea. Certain features like static port mappings just make little sense in a Swarm cluster if you have no clue (and should have no clue) on which node the container will be scheduled. Ideally, automatic port mappings would preserve the port, or the Docker daemon could be told to rely on "its own port range".

vlerenc commented 8 years ago

Ah, by the way, not sure whether this was intentional: the ports of deleted service instances/containers are not reused. After a restart, I understand the gaps will be filled again, but while the broker runs, the ports aren't released, I believe.