confluentinc / cp-docker-images

[DEPRECATED] Docker images for Confluent Platform.
Apache License 2.0

[cp-zookeeper] Include health check script for Kubernetes etc. #358

Open jtv8 opened 7 years ago

jtv8 commented 7 years ago

When deploying with Kubernetes, a means of checking the health of the Zookeeper container is advisable. This can be achieved (to an extent) with a TCP socket probe, but this approach will result in a lot of log warning spam in Zookeeper:

[2017-10-11 13:53:10,276] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
        at java.lang.Thread.run(Thread.java:745)

The official Kubernetes project has solved this with a simple shell script that invokes Zookeeper's RUOK command: https://github.com/kubernetes/contrib/blob/master/statefulsets/zookeeper/zkOk.sh

A simple implementation for cp-zookeeper would be:

#!/bin/bash
# Send "ruok" to the local ZooKeeper client port; a healthy server replies "imok".
OK=$(echo ruok | nc 127.0.0.1 "$ZOOKEEPER_CLIENT_PORT")
if [ "$OK" = "imok" ]; then
    exit 0
else
    exit 1
fi

Could we please have this script, or something like it, included in https://github.com/confluentinc/cp-docker-images/tree/3.3.x/debian/zookeeper/include/etc/confluent/docker in the next release?
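
A slightly more defensive variant of the check can be sketched as below. It assumes `nc` is present in the image and accepts `-w` as a timeout in seconds. The function names `zk_reply_ok` and `zk_healthy` and the fallback to port 2181 are illustrative only and are not part of cp-zookeeper.

```shell
# Sketch only: these function names are hypothetical, not shipped with cp-zookeeper.

# Succeeds only when the reply is exactly "imok".
zk_reply_ok() {
    [ "$1" = "imok" ]
}

# Sends "ruok" to the local client port and checks the reply.
# Assumes nc supports -w (timeout in seconds); falls back to port 2181
# when ZOOKEEPER_CLIENT_PORT is unset.
zk_healthy() {
    zk_reply_ok "$(echo ruok | nc -w 2 127.0.0.1 "${ZOOKEEPER_CLIENT_PORT:-2181}" 2>/dev/null)"
}
```

A probe script built on this could end with `zk_healthy || exit 1`. Keeping the reply comparison in its own function makes it possible to check the logic without a running server.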

gAmUssA commented 6 years ago

@zerogjoe you can do the following in the container spec section of your deployment manifest, without external scripts (assuming you can template the ZooKeeper client port):

          livenessProbe:
            exec:
              command: ['/bin/bash', '-c', 'echo "ruok" | nc -w 2 -q 2 localhost 2181 | grep imok']
            initialDelaySeconds: 15
            timeoutSeconds: 5
          readinessProbe:
            exec:
              command: ['/bin/bash', '-c', 'echo "ruok" | nc -w 2 -q 2 localhost 2181 | grep imok']

reitzig commented 3 years ago

Doing what @gAmUssA suggests in a docker-compose setup, I'm seeing warnings and low performance in the clients.

IsmailMarmoush commented 2 years ago

Here's a one-liner that doesn't produce warnings:

healthcheck:
      test: echo ruok | nc 127.0.0.1 2181 || exit 1
      interval: 10s
      timeout: 5s
      retries: 3

Don't forget to add the following environment variable, as described in https://github.com/confluentinc/cp-docker-images/issues/827:

environment:
      KAFKA_OPTS: "-Dzookeeper.4lw.commands.whitelist=ruok"
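
Note that a check relying only on the exit status of `nc` generally passes whenever the connection succeeds, whatever the server replies. To tell a healthy reply apart from a rejected command, the reply text itself can be inspected. The sketch below uses a hypothetical helper, `zk_classify_reply`, and assumes that a server without `ruok` enabled answers with a message containing the phrase "not in the whitelist"; verify that wording against your ZooKeeper version.

```shell
# Sketch only: zk_classify_reply is a hypothetical helper.
# The "not in the whitelist" wording is an assumption about the server reply.
zk_classify_reply() {
    case "$1" in
        imok) echo "healthy" ;;
        *"not in the whitelist"*) echo "ruok-not-whitelisted" ;;
        "") echo "no-reply" ;;
        *) echo "unexpected-reply" ;;
    esac
}
```

It could be called as `zk_classify_reply "$(echo ruok | nc -w 2 127.0.0.1 2181)"`, assuming `nc` accepts `-w` as a timeout.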