moby / libnetwork

networking for containers
Apache License 2.0

overlay network stops working after stack down/up cycles (possible race condition or locking issue) #2081

Open jcmcote opened 6 years ago

jcmcote commented 6 years ago

By following these steps, you can reproduce the issue in a matter of minutes. All you need is a cluster of 2 nodes.

create a manager node

docker-machine create --driver virtualbox manager
docker-machine ssh manager

add debug setting

echo '{ "debug": true }' > /etc/docker/daemon.json

get dockerd to reload the config

kill -HUP $(pidof dockerd)

check log for releasing of overlay network

tail -f /var/log/docker.log | grep 'releasing IPv4 pools'

start another terminal and do the steps above for a worker node

start another terminal and init the swarm manager

eval $(docker-machine env manager)
docker swarm init --advertise-addr 192.168.99.103

make the worker join the swarm

eval $(docker-machine env worker)
docker swarm join --token SWMTKN-1-2duh1guir5ywynuyz2p4w2 192.168.99.103:2377

you should now have a 2 node cluster

eval $(docker-machine env manager)
docker node ls

run this until the worker log indicates it did not release the overlay network as it should

./up-and-down.sh

Monitor the nodes dockerd logs

tail -f /var/log/docker.log | grep 'releasing IPv4 pools'

You'll notice both nodes release the overlay network, but sometimes (after a few cycles) the worker node does not release it, and then you're in a state where the two nodes do not use the same overlay network id. At this point the services are unable to ping each other.

Files needed

up-and-down.sh: script that brings the stack up and down
ping.sh: used to ping the other service in the overlay network
Dockerfile: creates an image and puts the ping.sh script into it
docker-stack.yml: services to deploy to the swarm

files.tar.gz

jcmcote commented 6 years ago

Related to #1765

selansen commented 6 years ago

@jcmcote, what docker CE version do you use? I don't see version information here.

selansen commented 6 years ago

I am using 18.02 to reproduce this issue. When I try to reproduce it, I see the error below:

failed to create service x_serva: Error response from daemon: network x_mynet not found
Creating service x_serva

I think the script needs modification. There is no delay between "docker stack down x" and "docker stack deploy -c docker-stack.yml x". In general, we need to wait until all cleanup is done before redeploying. Log messages like the ones below indicate that cleanup takes time:

Feb 20 14:54:06 ELANGO-CE18-2-ubuntu-0 dockerd[21841]: time="2018-02-20T14:54:06.877864376-08:00" level=debug msg="Sending kill signal 15 to container 70060435bcaa63195e5b36051eee7da01c7005676832d9f6747c219acdf08f43"
Feb 20 14:54:08 ELANGO-CE18-2-ubuntu-0 dockerd[21841]: time="2018-02-20T14:54"

@mavenugo mentioned in some #issue how writing the script the right way avoids these kinds of issues. I am digging through old issues to find it. Will update again soon.
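The delay suggested above could be sketched roughly as follows. This is only an illustration, not the attached up-and-down.sh: the helper names `network_exists` and `wait_for_network_removal` and the 60 second default timeout are made up here, while the stack name `x` and the network name `x_mynet` are taken from the error message above. It also only checks that the network is gone on the node the docker CLI currently points at, which is a proxy for "cleanup is done" rather than proof of it.

```shell
#!/usr/bin/env bash

# Succeed if a network with exactly this name is still listed.
# The name filter matches substrings, so grep -x narrows it to an exact match.
network_exists() {
  docker network ls --filter "name=$1" --format '{{.Name}}' | grep -qx -- "$1"
}

# wait_for_network_removal NAME [TIMEOUT_SECONDS]
# Poll once per second until the network is gone.
# Returns 1 if it is still present after the timeout (hypothetical default: 60).
wait_for_network_removal() {
  local name="$1" timeout="${2:-60}" waited=0
  while network_exists "$name"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

In a redeploy loop this would sit between the two commands, e.g. `docker stack down x && wait_for_network_removal x_mynet && docker stack deploy -c docker-stack.yml x`.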

jcmcote commented 6 years ago

@selansen I'm using docker version 18.02.0-ce, the latest version used by docker-machine with the virtualbox driver.

When I deploy there is an error saying the network is not yet created. That's fine. It should fail if it can't deploy just yet (after a tear down). However it will eventually deploy with no errors and you'll be in a state where the overlay networks are not cleaned up correctly.

I have reproduced this issue by aggressively deploying (not waiting for stack to come down) but my hunch is that this race condition issue is what we've been experiencing occasionally. Sometimes after an update to our stack (some services or network are changed) we get into a situation where some services can't ping or resolve each other's IP addresses.

I'm hoping someone will be able to use this scenario to explore potential race conditions in the docker network code that might show up occasionally under normal (less aggressive) situations.

The point is that when we deploy aggressively after a tear down it reports an error, which again is fine. But then the system thinks all is ok and the deploy returns successfully, leaving the overlay network in an inconsistent state. Why does the system report a successful deploy if it's not ready to deploy?

selansen commented 6 years ago

May I know how long, or how many iterations, it takes for you to get into this state?

I have been running the same script for almost 45 mins and I am still able to ping between the two containers.

jcmcote commented 6 years ago

It does not take too long (about 10 min). But you have to monitor the releases and stop the script as soon as a release is missing. If you don't, the script will bring things down and up again.

However, if you stop when you see a missing release of the overlay network, you'll notice you can't ping and never will be able to (the 2 nodes will not have the same overlay network id).
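One way to check for the mismatch described above might look like the sketch below. It assumes that "overlay network id" means the ID printed by `docker network ls`, that the machines are the docker-machine nodes `manager` and `worker` from the first post, and that the network is called `x_mynet` as in the error message earlier in the thread. The function names are hypothetical. Keep in mind that a worker normally lists an overlay network only while it runs a task attached to it, so an empty result on one node is reported as a mismatch too.

```shell
#!/usr/bin/env bash

# Print the full ID of network $2 as seen by docker-machine node $1.
# Prints nothing if that node does not currently list the network.
network_id_on_node() {
  local node="$1" network="$2"
  (
    # Point the docker CLI at the node inside a subshell, so the
    # caller's environment is left untouched.
    eval "$(docker-machine env "$node")"
    docker network ls --no-trunc --filter "name=${network}" \
      --format '{{.ID}} {{.Name}}' |
      awk -v n="$network" '$2 == n { print $1 }'
  )
}

# same_network_id NETWORK NODE_A NODE_B
# Succeed only when both nodes report the same, non-empty network ID.
same_network_id() {
  local network="$1" first second
  first=$(network_id_on_node "$2" "$network")
  second=$(network_id_on_node "$3" "$network")
  echo "$2=${first:-<none>} $3=${second:-<none>}"
  [ -n "$first" ] && [ "$first" = "$second" ]
}
```

For example, `same_network_id x_mynet manager worker || echo "nodes disagree"` could be run right after stopping the script.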

jcmcote commented 6 years ago

I've modified the up-and-down script. It now counts the number of releases in the manager and the worker. If the counts are not equal it will stop.
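The attached script is not reproduced in this thread, so the following is only a guess at what such a check could look like. It assumes each node's /var/log/docker.log has been copied to a local file first (for instance with `docker-machine ssh worker cat /var/log/docker.log > worker.log`); the function names and file names are hypothetical.

```shell
#!/usr/bin/env bash

# Count the log lines that report an overlay pool release.
# $1: path to a local copy of a node's dockerd log.
count_releases() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps callers
  # that run under "set -e" from aborting on that.
  grep -c 'releasing IPv4 pools' "$1" || true
}

# releases_match MANAGER_LOG WORKER_LOG
# Print both counts and succeed only when they are equal.
releases_match() {
  local manager_count worker_count
  manager_count=$(count_releases "$1")
  worker_count=$(count_releases "$2")
  echo "manager=${manager_count} worker=${worker_count}"
  [ "$manager_count" -eq "$worker_count" ]
}
```

An up/down loop could then stop on the first mismatch with something like `releases_match manager.log worker.log || break`.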

I'm at 22 iterations and it has not happened yet... It was much easier to reproduce a couple of days ago. I'll keep at it...

Also, I added an init-swarm script which includes the steps I use to create my 2 node swarm cluster.

init-swarm.sh.txt up-and-down.sh.txt

ashish235 commented 6 years ago

Having this issue right now. Tried restarting docker, creating a new swarm, and re-creating the network; the issue still exists.

Using docker version:

docker --version
Docker version 17.12.0-ce, build c97c6d6

OS: ubuntu 16.04

"Error": "subnet sandbox join failed for \"10.0.0.0/24\": error creating vxlan interface: file exists",

I can't even remove a netns file:

rm: cannot remove '/var/run/docker/netns/1-bbosggv6eg': Device or resource busy