31z4 / zookeeper-docker

Docker image packaging for Apache Zookeeper
MIT License

Zookeeper auto purge process does not purge files in docker #94

Closed githubacnt closed 3 years ago

githubacnt commented 4 years ago

Hi, I am building a zookeeper docker image from the official docker images. I am running it on docker version 3 and am currently using Zookeeper 3.4.14. I have 3 containers, each running a zookeeper server, and I am setting the environment variables as shown in the docker-compose.yml file:

    environment:
      ZOO_AUTOPURGE_PURGEINTERVAL: 24
      ZOO_AUTOPURGE_SNAPRETAINCOUNT: 3

I can clearly see the values reflected in my zoo.cfg:

    cat /conf/zoo.cfg
    clientPort=xxxx
    dataDir=/data
    dataLogDir=/datalog
    tickTime=2000
    initLimit=5
    syncLimit=2
    autopurge.snapRetainCount=3
    autopurge.purgeInterval=24
    maxClientCnxns=60
    server.1=zoo1:xxxx:xxxx
    server.2=zoo2:xxxx:xxxx
    server.3=0.0.0.0:xxxx:xxxx
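As a quick sanity check on the configuration above, the following is a minimal sketch that parses zoo.cfg-style text and reports whether autopurge is switched on. It is not part of the image; the helper names `parse_zoo_cfg` and `autopurge_enabled` are made up for illustration, and the only ZooKeeper behavior assumed is that a `autopurge.purgeInterval` of 0 (the default) leaves autopurge disabled.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines of zoo.cfg-style text into a dict (hypothetical helper)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def autopurge_enabled(settings):
    """Return True when autopurge.purgeInterval is a positive integer."""
    try:
        return int(settings.get("autopurge.purgeInterval", "0")) > 0
    except ValueError:
        return False


# Values copied from the zoo.cfg shown in this issue.
sample = """\
dataDir=/data
dataLogDir=/datalog
autopurge.snapRetainCount=3
autopurge.purgeInterval=24
"""

cfg = parse_zoo_cfg(sample)
print(autopurge_enabled(cfg), cfg["autopurge.snapRetainCount"])
```

With the values from this issue the check passes, which matches the observation that the settings themselves are reaching the server.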

Running printenv inside the container, I can see the values reflected there as well:

    printenv
    ZOO_DATA_LOG_DIR=/datalog
    HOSTNAME=0536b195f621
    JAVA_HOME=/usr/local/openjdk-8
    ZOO_DATA_DIR=/data
    JAVA_BASE_URL=https://github.com/AdoptOpenJDK/openjdk8-upstream-binaries/releases/download/jdk8u232-b09/OpenJDK8U-jre_
    ZOO_INIT_LIMIT=5
    PWD=/datalog/version-2
    JAVA_URL_VERSION=8u232b09
    ZOO_AUTOPURGE_SNAPRETAINCOUNT=3
    HOME=/root
    LANG=C.UTF-8
    ZOO_SYNC_LIMIT=2
    ZOO_SERVERS=server.1=zoo1:xxxx:xxxx server.2=zoo2:xxxx:xxxx server.3=0.0.0.0:xxxx:xxxx
    SHLVL=1
    ZOO_MY_ID=3
    ZOO_MAX_CLIENT_CNXNS=60
    ZOO_TICK_TIME=2000
    ZOO_CONF_DIR=/conf
    PATH=/usr/local/openjdk-8/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/zookeeper-3.4.14/bin
    ZOOCFGDIR=/conf
    ZOO_AUTOPURGE_PURGEINTERVAL=24
    JAVA_VERSION=8u232
    ZOO_LOG_DIR=/logs
    OLDPWD=/zookeeper-3.4.14
    _=/usr/bin/printenv

I can also see the purge task starting and completing in the logs, using this command:

    docker logs <containerID> -tf | grep "PurgeTask:DatadirCleanupManager"
    2020-02-18T16:49:56.605549689Z 2020-02-18 16:49:56,604 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2020-02-18T16:49:56.636000804Z 2020-02-18 16:49:56,635 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    2020-02-19T16:49:56.606280261Z 2020-02-19 16:49:56,605 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2020-02-19T16:49:56.657389039Z 2020-02-19 16:49:56,657 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    2020-02-20T16:49:56.605362615Z 2020-02-20 16:49:56,604 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2020-02-20T16:49:56.612265088Z 2020-02-20 16:49:56,611 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    2020-02-21T16:49:56.605773207Z 2020-02-21 16:49:56,604 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2020-02-21T16:49:56.643037255Z 2020-02-21 16:49:56,642 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    2020-02-22T16:49:56.605712054Z 2020-02-22 16:49:56,605 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2020-02-22T16:49:56.661826480Z 2020-02-22 16:49:56,661 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    2020-02-23T16:49:56.606569211Z 2020-02-23 16:49:56,604 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2020-02-23T16:49:56.629269327Z 2020-02-23 16:49:56,628 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    2020-02-24T16:49:56.605299157Z 2020-02-24 16:49:56,604 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2020-02-24T16:49:56.606483941Z 2020-02-24 16:49:56,606 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
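To see how long each purge run actually takes, the log lines can be paired up programmatically. The sketch below is only an illustration: the regular expression is written against the exact line format shown in this issue, and `purge_durations` is a hypothetical helper name. A run that finishes within a few milliseconds shows that the task is being scheduled, but it does not by itself prove that any files were removed.

```python
import re
from datetime import datetime, timedelta

# Matches the ZooKeeper-side timestamp and the start/complete marker in the
# log format quoted in this issue (an assumption; other versions may differ).
LINE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[myid:\d+\] - INFO\s+"
    r"\[PurgeTask:DatadirCleanupManager\$PurgeTask@\d+\] - "
    r"Purge task (started|completed)\."
)


def purge_durations(lines):
    """Pair each 'started' with the next 'completed'; return durations in ms."""
    durations = []
    started_at = None
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if match.group(2) == "started":
            started_at = stamp
        elif started_at is not None:
            durations.append((stamp - started_at) // timedelta(milliseconds=1))
            started_at = None
    return durations


# Four of the log lines quoted above.
sample_lines = [
    "2020-02-18T16:49:56.605549689Z 2020-02-18 16:49:56,604 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.",
    "2020-02-18T16:49:56.636000804Z 2020-02-18 16:49:56,635 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.",
    "2020-02-24T16:49:56.605299157Z 2020-02-24 16:49:56,604 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.",
    "2020-02-24T16:49:56.606483941Z 2020-02-24 16:49:56,606 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.",
]

print(purge_durations(sample_lines))  # prints [31, 2]
```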

Expected behavior

After setting autopurge.purgeInterval and autopurge.snapRetainCount, zookeeper should purge the logs and only retain the snapshots as defined.

Actual behavior

Zookeeper is failing to delete the logs, and I can see old snapshots and logs going back 7 days. The zookeeper VM runs out of memory because the purge is not happening.
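One way to quantify what is left behind is to compare the file names in the version-2 directories against what a purge with the configured retain count would be expected to keep. The sketch below is a simplified approximation of that retention rule, not ZooKeeper's actual implementation: it assumes files are named `snapshot.<hex zxid>` and `log.<hex zxid>`, keeps the newest `retain_count` snapshots, and keeps every transaction log from the one covering the oldest retained snapshot onward. The helper names and the sample file names are hypothetical.

```python
def zxid(name):
    """Return the zxid encoded as hex after the dot, e.g. 'snapshot.1a' -> 26."""
    return int(name.split(".", 1)[1], 16)


def purge_candidates(snapshots, logs, retain_count):
    """Approximate which snapshot and log files a purge would be expected to remove."""
    newest_first = sorted(snapshots, key=zxid, reverse=True)
    kept = newest_first[:retain_count]
    old_snapshots = sorted(newest_first[retain_count:], key=zxid)
    if not kept:
        # Without any snapshot there is nothing to anchor the cutoff on.
        return [], []
    oldest_kept = zxid(kept[-1])
    # The newest log starting at or before the oldest kept snapshot may still
    # hold transactions needed to replay from it, so it is retained too.
    covering = max((zxid(l) for l in logs if zxid(l) <= oldest_kept), default=oldest_kept)
    old_logs = sorted((l for l in logs if zxid(l) < covering), key=zxid)
    return old_snapshots, old_logs


# Hypothetical directory listings, not taken from the reporter's containers.
snapshots = ["snapshot.0", "snapshot.10", "snapshot.20", "snapshot.30", "snapshot.40"]
logs = ["log.1", "log.11", "log.21", "log.31"]

print(purge_candidates(snapshots, logs, retain_count=3))
# prints (['snapshot.0', 'snapshot.10'], ['log.1'])
```

If a listing of /data/version-2 and /datalog/version-2 still contains files that such a check flags after a purge run has logged completion, that supports the report that the task runs without deleting anything.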

Steps to reproduce the behavior

The same settings can be used for zoo.cfg as provided above. Docker version 3 is to be used with official docker zookeeper image version 3.4.14. The issue happens only with docker images.

System configuration

The same settings as above can be used.

Additional info

zkCleanup.sh -n x still works, even inside the docker container, and I can see the logs and snapshots being purged. But I shouldn't have to run it manually when there are properties provided for purging.

31z4 commented 4 years ago

My only suggestion is to reach out to upstream. This isn't the first time there have been issues with auto purge (https://github.com/31z4/zookeeper-docker/issues/30, https://github.com/31z4/zookeeper-docker/issues/73, https://github.com/apache/zookeeper/pull/475). Could be a regression, I guess.

Also, have you had a chance to check whether the issue persists in 3.5?

31z4 commented 3 years ago

Closing this because the issue has been inactive for quite some time now.