gliderlabs / registrator

Service registry bridge for Docker with pluggable adapters
http://gliderlabs.com/registrator
MIT License

Registrator is deregistering just-started containers when using Marathon #254

Open rithumlabs opened 9 years ago

rithumlabs commented 9 years ago

When containers are launched through Mesos Marathon, they start and run fine, but registrator registers (adds) them and then immediately deregisters them. This does not happen for containers started outside of Marathon, i.e. directly via docker.

Note: a weave network is in use as well: 10.2.0.0/16

Example cfg:

consul, where node_instances = 3 and bind_ip is the host node's IP

docker run -d \
  --name="consulagent$host" \
  --net=host \
  gliderlabs/consul-server -bootstrap-expect $node_instances \
  -data-dir /tmp/consul -advertise $bind_ip

registrator

docker run -d \
  --name=registrator \
  --net=host \
  --volume=/var/run/docker.sock:/tmp/docker.sock \
  gliderlabs/registrator:latest \
  consul://localhost:8500

sample marathon json, dynamic port assignment

{
  "id": "ha-zmq",
  "args": ["python", "ha_zmq/zmq_test.py", "-i", "1"],
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "dboss_dev:latest",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 0, "hostPort": 0, "protocol": "tcp" },
        { "containerPort": 0, "hostPort": 0, "protocol": "tcp" }
      ],
      "parameters": [
        {"key": "hostname", "value": "ha-zmq.weave.local"},
        {"key": "env", "value": "SERVICE_NAME=ha-zmq"},
        {"key": "env", "value": "SERVICE_TAGS=zmq-test,haproxy"},
        {"key": "env", "value": "WEAVE_CIDR=net:10.2.0.0/16"}
      ]
    },
    "volumes": [
      {
        "containerPath": "/dboss-apps",
        "hostPath": "/dboss-dev/dboss-apps",
        "mode": "RW"
      }
    ]
  },
  "cpus": 0.25,
  "mem": 100.0,
  "instances": 2,
  "constraints": [
    ["hostname", "LIKE", "172.18.10.10[1-2]"]
  ]
}
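Registrator picks up service metadata from container environment variables such as SERVICE_NAME and SERVICE_TAGS, which this app definition passes as docker "env" parameters. As a quick sanity check that the definition carries what is expected, here is a minimal Python sketch that extracts those variables. The function name `service_env` is hypothetical, and the embedded JSON is a trimmed copy of the definition above.

```python
import json

def service_env(app: dict) -> dict:
    """Collect SERVICE_* variables passed as docker "env" parameters."""
    params = app.get("container", {}).get("docker", {}).get("parameters", [])
    env = {}
    for param in params:
        if param.get("key") != "env":
            continue  # skip non-env parameters such as "hostname"
        name, sep, value = param.get("value", "").partition("=")
        if sep and name.startswith("SERVICE_"):
            env[name] = value
    return env

# Trimmed copy of the Marathon app definition from this issue.
APP_JSON = """
{
  "id": "ha-zmq",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "dboss_dev:latest",
      "network": "BRIDGE",
      "parameters": [
        {"key": "hostname", "value": "ha-zmq.weave.local"},
        {"key": "env", "value": "SERVICE_NAME=ha-zmq"},
        {"key": "env", "value": "SERVICE_TAGS=zmq-test,haproxy"},
        {"key": "env", "value": "WEAVE_CIDR=net:10.2.0.0/16"}
      ]
    }
  }
}
"""

app = json.loads(APP_JSON)
print(service_env(app))
```

Comparing this output with the environment of the running container (for example via docker inspect) would show whether Marathon passes the variables through unchanged.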

from docker logs registrator

2015/09/18 02:29:53 added: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
2015/09/18 02:29:53 added: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
2015/09/18 02:29:53 removed: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
2015/09/18 02:29:53 removed: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
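To spot this add-then-remove pattern in a longer log, a small Python sketch like the following could help. It assumes the log lines keep the "date time added|removed: container-id service-id" shape shown above. The function name `quick_removals` and the 5-second window are hypothetical choices, not anything registrator defines. The sample text reuses the truncated lines from this issue as-is.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# 2015/09/18 02:29:53 added: fb0fdfad2459 devnode-1:mesos-...
LOG_RE = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (added|removed): (\S+) (\S+)"
)

def quick_removals(log_text, window=timedelta(seconds=5)):
    """Return (container, service, seconds) for removals soon after an add."""
    added = {}
    flagged = []
    for match in LOG_RE.finditer(log_text):
        stamp, action, container, service = match.groups()
        when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
        key = (container, service)
        if action == "added":
            added[key] = when
        elif key in added and when - added[key] <= window:
            delay = (when - added[key]).total_seconds()
            flagged.append((container, service, delay))
    return flagged

SAMPLE_LOG = """
2015/09/18 02:29:53 added: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
2015/09/18 02:29:53 added: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
2015/09/18 02:29:53 removed: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
2015/09/18 02:29:53 removed: fb0fdfad2459 devnode-1:mesos-20150917-200119.....
"""

for container, service, delay in quick_removals(SAMPLE_LOG):
    print(f"{container} {service} removed {delay:.0f}s after being added")
```

Because the service IDs in the excerpt are truncated, both entries share one key here; with full IDs each service would be tracked separately.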

If I start the same container outside of Marathon, registrator has no problem registering it. I also ran a test where I stopped registrator, started the containers via Marathon, and then started registrator again: the containers stayed registered, which is very strange. I also tried the -resync option, but nothing happened.

I hope you can shed some light on what's happening here.

andyshinn commented 9 years ago

Can you provide some more information? Versions of components used, host OS, etc?

rithumlabs commented 9 years ago

Tools

Mesos - 0.23.0
Marathon - 0.10.1
Weave - 1.0.1
registrator / consul-server - latest (consul 0.5.2)
Zookeeper - 3.4.6-1569965

Environment

VirtualBox - 5.0.0, Windows 8
Vagrant - 1.74
CentOS 7 Guests

My suspicion is that the problem lies in how Registrator confirms that containers started by Marathon are running. Remember that if I start the containers with Marathon and then start Registrator, they get added. But if I then destroy the containers and recreate them via Marathon, the original issue returns.

I should also mention that I have another single-node vbox/vagrant setup that I used previously, where consul, registrator and all services are started via Marathon, unlike the current setup where only the test service is started via Marathon. I had no registration problems there, but I was using the older sstt versions of registrator / consul and, I believe, Marathon 0.20.0.

rithumlabs commented 9 years ago

any update on the status of this issue? thx

mgood commented 9 years ago

@rithumlabs can you run docker events to see what events are happening when Marathon starts your containers? That might help see what's going on.
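One way to condense that event stream into a per-container timeline is sketched below in Python. It assumes a newer Docker CLI where `docker events --format '{{json .}}'` prints one JSON object per line with `Type`, `Action` and `Actor.ID` fields; the Docker version in use in this thread may only print plain-text events, in which case the parsing would need to change. The function name `actions_by_container` and the sample events are made up for illustration and are not output from this setup.

```python
import json
from collections import defaultdict

def actions_by_container(lines):
    """Group container event actions by short (12-character) container ID."""
    timeline = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("Type") != "container":
            continue  # ignore network, volume and image events
        short_id = event.get("Actor", {}).get("ID", "")[:12]
        timeline[short_id].append(event.get("Action"))
    return dict(timeline)

# Hypothetical sample events, not captured from the reporter's hosts.
SAMPLE_EVENTS = [
    '{"Type": "container", "Action": "create", "Actor": {"ID": "0123456789abcdef"}}',
    '{"Type": "network", "Action": "connect", "Actor": {"ID": "feedfacefeedface"}}',
    '{"Type": "container", "Action": "start", "Actor": {"ID": "0123456789abcdef"}}',
]

for container, actions in actions_by_container(SAMPLE_EVENTS).items():
    print(container, " -> ".join(actions))
```

Since the function accepts any iterable of lines, passing `sys.stdin` while piping the live event stream into the script should work the same way. An unexpected die or stop action right after start would be the first thing to look for.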