mantl / mesos-consul

Mesos to Consul bridge for service discovery
Apache License 2.0

Question about health checks #65

Open kamaradclimber opened 8 years ago

kamaradclimber commented 8 years ago

I'd like to register health checks on services declared by mesos-consul. This would avoid relying on the liveness (and speed) of mesos-consul to clean up dead instances, and would leverage Consul health checking instead.

Mesos already has some health checks (command health checks) and might get HTTP health checks (https://reviews.apache.org/r/36816), but I don't know which endpoint exposes them.

Do other users do that? Would it be possible to register health checks?

rncry commented 8 years ago

There appears to be some health check functionality in mesos-consul based on tags submitted in marathon, but I can't get it to work :(

kamaradclimber commented 8 years ago

Could you point me to the code?

rncry commented 8 years ago

https://github.com/CiscoCloud/mesos-consul/blob/master/mesos/state.go#L23

Seems to imply it will generate checks based on labels?

kamaradclimber commented 8 years ago

Indeed, thanks @rncry. I'll test when I have the chance.

ryane commented 8 years ago

I recently tested this feature and it does work. Here is a very simple example for an app in marathon:

...
  "labels": {
    "check_http": "http://{host}:{port}",
    "check_interval": "10s"
  },
...

Note that you will need the latest release (v0.3.2) for it to properly find the labels.
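
As a rough illustration of what the labels above express, the sketch below expands the `{host}` and `{port}` placeholders and builds a check definition in the shape that Consul's agent API accepts (`HTTP`, `Interval`). It is not mesos-consul's actual code: the function name `build_check`, the label-to-field mapping, and the sample host and port are assumptions made for illustration.

```python
def build_check(labels, host, port):
    """Hypothetical helper: turn check_* labels into a Consul-style check dict."""
    # Assumed mapping from Marathon label names to Consul check fields.
    field_for_label = {
        "check_http": "HTTP",
        "check_interval": "Interval",
    }
    check = {}
    for label, field in field_for_label.items():
        if label in labels:
            # Expand the {host} and {port} placeholders used in the label values.
            value = labels[label].replace("{host}", host).replace("{port}", str(port))
            check[field] = value
    return check


# Sample values only; a real task's address and port come from Mesos.
labels = {"check_http": "http://{host}:{port}", "check_interval": "10s"}
print(build_check(labels, "10.0.0.5", 31000))
# {'HTTP': 'http://10.0.0.5:31000', 'Interval': '10s'}
```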

rncry commented 8 years ago

Aha! Upgrading from 0.3 to 0.3.2 made this work. Thanks, @ryane!

chaen commented 8 years ago

From what I see in the code, Docker checks are not supported yet. Does anyone know whether this is in the pipeline?

bbayani commented 8 years ago

Can I add checks for mesos-consul itself using this?

We deploy mesos-consul as a Marathon application.

I added

...
  "labels": {
    "check_http": "http://{host}:{port}/health"
  }
...

to our Marathon JSON, but I don't see the health checks added in Consul. I have enabled the health check in mesos-consul, and the Marathon health check for the same endpoint works properly.

bbayani commented 8 years ago

The version of mesos-consul I am using is 0.4.0.

gusnuf commented 8 years ago

@bbayani I had the same issue. As @ryane stated, the check_interval label needs to be set also.
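
A quick way to catch this before deploying is to scan the app definition for a `check_http` label that has no matching `check_interval`. The sketch below is a hypothetical lint helper (`missing_check_labels` is not part of mesos-consul or Marathon, and `/my-app` is a placeholder ID). The rule it encodes is only what this thread reports, not documented behaviour.

```python
import json


def missing_check_labels(app_json):
    """Hypothetical lint: list labels this thread says must accompany check_http."""
    labels = json.loads(app_json).get("labels", {})
    missing = []
    # Per the comments above, check_http alone was not enough;
    # check_interval had to be set as well.
    if "check_http" in labels and "check_interval" not in labels:
        missing.append("check_interval")
    return missing


app = '{"id": "/my-app", "labels": {"check_http": "http://{host}:{port}/health"}}'
print(missing_check_labels(app))  # ['check_interval']
```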

mohamedhaleem commented 8 years ago

@gusnuf - can you post a sample Marathon setup for mesos-consul, please?

gusnuf commented 8 years ago

@mohamedhaleem This feels slightly off-topic, but I've provided my mesos-consul Marathon JSON below. I think you should start a new thread if you need further or more general help. Notice that the mesos-consul container provides health checks both for Marathon to monitor it (via the "healthChecks" section) and for Consul (via "labels", which is this thread's topic). FYI, I've removed the private Docker registry information that my Docker engines use to find my own build of mesos-consul.

{
  "id": "/mesos-consul",
  "cmd": "/bin/mesos-consul --zk=zk://zk-1.zk:2181/mesos --healthcheck --healthcheck-ip=0.0.0.0 --healthcheck-port=$PORT0 --log-level=INFO",
  "cpus": 0.5,
  "mem": 256,
  "disk": 0,
  "instances": 1,
  "acceptedResourceRoles": [
    "slave_public",
    "*"
  ],
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "mesos-consul",
      "network": "HOST",
      "privileged": false,
      "parameters": [],
      "forcePullImage": false
    }
  },
  "healthChecks": [
    {
      "path": "/health",
      "protocol": "HTTP",
      "portIndex": 0,
      "gracePeriodSeconds": 10,
      "intervalSeconds": 60,
      "timeoutSeconds": 20,
      "maxConsecutiveFailures": 3,
      "ignoreHttp1xx": false
    }
  ],
  "labels": {
    "check_http": "http://{host}:{port}/health",
    "check_interval": "10s"
  },
  "portDefinitions": [
    {
      "port": 10002,
      "protocol": "tcp",
      "name": "http",
      "labels": {}
    }
  ]
}

mohamedhaleem commented 8 years ago

Sorry about that, @gusnuf - I just wanted to get a working health check configuration. Thanks.

caussourd commented 8 years ago

TCP checks are also missing. Any reason for that?

tungdam commented 6 years ago

I have the same question here. Is there any news about TCP checks? We tried this:

"labels": {
  "check_tcp": "{host}:{port}",
  "check_ttl": "30s"
}

But Consul keeps reporting a failed status even though our service (a simple Redis instance in this case) is still running well.

We're using mesos-consul 0.4.0 (inside a Docker container, simply pulled from Docker Hub) and version 1.4 of both Marathon and Mesos. The Docker container's IP is provided by Calico.
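
One way to narrow this kind of failure down is to confirm that the address is reachable over TCP from the machine running the Consul agent, since a Consul TCP check essentially passes when a connection can be opened. The sketch below does that with Python's standard library. `tcp_reachable` is a hypothetical helper, and the address shown is a placeholder, not a value from this thread.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection completes the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Placeholder address; substitute the task's IP and port (e.g. Redis on 6379).
print(tcp_reachable("127.0.0.1", 6379))
```

If this returns False from the agent's host while the service is healthy, the problem is more likely in networking or in the registered address than in the check labels themselves.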

tungdam commented 6 years ago

The latest image on Docker Hub is somehow incorrect (although the binary still reports version 0.4.0). I built a Docker image myself from the Dockerfile here (I had to modify it slightly with 's/CiscoCloud/mantl/g'), and it works well. JFYI.