confluentinc / cp-docker-images

[DEPRECATED] Docker images for Confluent Platform.
Apache License 2.0

Unable to override default dataDir and LogDir for zookeeper #608

Open gunjanarora opened 6 years ago

gunjanarora commented 6 years ago

Hi,

I am unable to override the default data and log directories; they appear to be hard-coded in zookeeper.properties.template.

Also, even if they point to /var/lib/zookeeper/data and /var/lib/zookeeper/logs, if my PVC is mounted at /var/lib/zookeeper, the /data and /logs directories are created every time I restart my Kubernetes pods.

This results in loss of any state changes.

prasannakumarpv commented 6 years ago

That's because of the VOLUME directive in the Dockerfile (https://github.com/confluentinc/cp-docker-images/blob/master/debian/zookeeper/Dockerfile):

VOLUME ["/var/lib/${COMPONENT}/data", "/var/lib/${COMPONENT}/log", "/etc/${COMPONENT}/secrets"]

As a result, the data and log dirs are mounted on the host running the container/pod rather than on the storage mounted through the PVC. A fix should start by removing this directive, since both the data and log dirs are mounted in the StatefulSet YAML. The mkdir -p in the RUN step would then also be unnecessary.

GMartinez-Sisti commented 5 years ago

This might help for this and all the issues attached: https://github.com/gdraheim/docker-copyedit (found here https://stackoverflow.com/questions/44020785/remove-a-volume-in-a-dockerfile).

bluebenno commented 5 years ago

I can confirm this behaviour when running under kubes. Don't restart all the zookeeper nodes at once or you will be sorry!

You can modify (perhaps read as hack) the container's entrypoint:

 "ZOOKEEPER_SERVER_ID=$((${HOSTNAME##*-}+1)) && mkdir -p /var/lib/zookeeper/LOG /var/lib/zookeeper/DATA && echo $ZOOKEEPER_SERVER_ID > /var/lib/zookeeper/DATA/myid && /etc/confluent/docker/configure && cat /etc/kafka/zookeeper.properties | sed -e 's/data$/DATA/' -e 's/log$/LOG/'> /tmp/zookeeper.properties && exec /usr/bin/zookeeper-server-start /tmp/zookeeper.properties"

/var/lib/zookeeper is the mounted EBS volume (in our case). I'm chopping off some of the Docker container's startup, i.e. the bits that I don't care about. YMMV.
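
To make the two moving parts of that entrypoint easier to follow, here is a minimal sketch that runs them in isolation. The function names (server_id_from_hostname, rewrite_dirs) are hypothetical, and the sample property lines are an assumption about what the generated zookeeper.properties contains; only the parameter expansion and the sed expressions are taken from the command above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the image.

# Derive a 1-based server id from a StatefulSet-style hostname such as
# "zookeeper-2": keep the text after the last '-' and add one.
server_id_from_hostname() {
    host=$1
    echo $(( ${host##*-} + 1 ))
}

# Rewrite lines ending in "data" or "log" to the upper-case directory
# names, using the same sed expressions as the entrypoint.
rewrite_dirs() {
    sed -e 's/data$/DATA/' -e 's/log$/LOG/'
}

server_id_from_hostname "zookeeper-2"

# Assumed sample of the generated properties file.
printf '%s\n' \
    'dataDir=/var/lib/zookeeper/data' \
    'dataLogDir=/var/lib/zookeeper/log' \
    'clientPort=2181' | rewrite_dirs
```

With these inputs the script prints 3, then the two directory lines ending in DATA and LOG, and leaves clientPort=2181 untouched. The upper-case directories sit directly on the mounted volume and are not the paths declared by the image's VOLUME directive, which is presumably why the workaround uses them.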

GMartinez-Sisti commented 5 years ago

Since we don't want to maintain these images ourselves, we are creating a multi-stage Docker image based on the ones in this repo.

We are adding dumb-init and our custom loggers, and doing this hack to fix the mount issue:

RUN set -eux && \
    echo "Ignore errors with 'sed: cannot rename /etc/*: Device or resource busy'" && \
    for x in zookeeper kafka '${COMPONENT}' '"${COMPONENT}"'; do \
        find /etc -type f -exec sed -i "s|/var/lib/$x/*|/data/|" {} \; ; \
    done

We mount the persistent volumes in /data. This works with confluentinc/cp-kafka:5.0.0 and confluentinc/cp-zookeeper:5.0.0. We'll need to test thoroughly if we need to update them.
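
For anyone wondering what that sed substitution does to a given line, here is a small sketch that applies it to sample input instead of the files under /etc. The function name rewrite_paths and the sample property lines are assumptions for illustration; the sed expression is the one from the RUN step.

```shell
#!/bin/sh
# Sketch only: rewrite_paths is a hypothetical helper, not part of the image.

# Apply the substitution from the RUN step for one component name.
# "/*" matches zero or more slashes, so the prefix "/var/lib/<component>/"
# is replaced by "/data/" and the rest of the path is kept.
rewrite_paths() {
    component=$1
    sed "s|/var/lib/$component/*|/data/|"
}

# Assumed sample lines from a generated ZooKeeper config.
printf '%s\n' \
    'dataDir=/var/lib/zookeeper/data' \
    'dataLogDir=/var/lib/zookeeper/log' | rewrite_paths zookeeper

# Assumed sample line from a generated Kafka config.
echo 'log.dirs=/var/lib/kafka/data' | rewrite_paths kafka
```

With these inputs the paths become /data/data, /data/log and /data/data, so everything ends up one level below the /data mount point. Note that the expression has no g flag, so only the first match on each line is rewritten.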