apache / pulsar-helm-chart

Official Apache Pulsar Helm Chart
https://pulsar.apache.org/
Apache License 2.0

[timeout] <defunct> Zookeeper and Broker Pods Contain Zombie Processes #554

Closed trynocoding closed 2 days ago

trynocoding commented 5 days ago

Describe the bug
There are many zombie processes in the zookeeper pods, and the pulsar-broker pod also has a zombie process.

To Reproduce
Steps to reproduce the behavior:

helm install pulsar apache/pulsar --set volumes.persistence=false --set affinity.anti_affinity=false --version 3.7.0 --set kube-prometheus-stack.enabled=false --set components.pulsar_manager=true

[root@master ~]# helm list
NAME    NAMESPACE  REVISION  UPDATED                                    STATUS    CHART         APP VERSION
pulsar  default    1         2024-11-25 22:35:04.273744005 -0500 -0500  deployed  pulsar-3.7.0  4.0.0
[root@master ~]#

Expected behavior
No zombie processes.

Screenshots
[screenshots attached showing the zombie process lists]

Desktop (please complete the following information):

[root@master ~]# cat /etc/redhat-release
CentOS Stream release 9
[root@master ~]# uname -r
5.14.0-410.el9.x86_64
[root@master ~]#

lhotari commented 5 days ago

There are many zombie processes in the zookeeper pods, and the pulsar-broker pod also has a zombie process.

Please share some examples of the process command lines, as text. Do the zombie processes carry any information, such as a command line?
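
For example, something along these lines would capture the details as text (the pod name and PID are illustrative):

kubectl exec pulsar-zookeeper-0 -- ps -ef                  # full process list with command lines
kubectl exec pulsar-zookeeper-0 -- cat /proc/172/status    # state and parent of one suspect PID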

lhotari commented 5 days ago

Desktop (please complete the following information):
[root@master ~]# cat /etc/redhat-release
CentOS Stream release 9
[root@master ~]# uname -r
5.14.0-410.el9.x86_64
[root@master ~]#

@trynocoding which k8s implementation are you using?

trynocoding commented 5 days ago

k8s version:

[root@master ~]# kubectl get no
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   38d   v1.27.7
[root@master ~]#

UID          PID    PPID  C STIME TTY          TIME CMD
pulsar         1       0  0 Nov26 ?        00:08:28 /opt/jvm/bin/java -Dzookeeper.4lw.commands.whitelist= -Dzookeeper.snapshot.trust.empty=true -Dzookeeper.tcpKeepAlive=true -cp /pulsar/conf:::/pulsar/lib/: -Dlog4j2.formatMsgNoLookups=true -Dorg.xerial.snappy.use.systemlib=true -Dlog4j.configurationFile=log4j2.yaml -Djute.maxbuffer=10485760 -Djava.net.preferIPv4Stack=true -Dzookeeper.clientTcpKeepAlive=true --add-opens java.base/java.io=ALL-UNNAMED --add-opens java.base/java.util.zip=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED -Dio.netty.tryReflectionSetAccessible=true -Dorg.apache.pulsar.shade.io.netty.tryReflectionSetAccessible=true --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/jdk.internal.misc=ALL-UNNAMED --add-opens java.base/jdk.internal.platform=ALL-UNNAMED -Xms64m -Xmx128m -XX:+UseG1GC -XX:MaxGCPauseMillis=10 -Dcom.sun.management.jmxremote -Djute.maxbuffer=10485760 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+DoEscapeAnalysis -XX:+DisableExplicitGC -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem -Xlog:async -Xlog:gc,safepoint:/pulsar/logs/pulsargc%p.log:time,uptime,tags:filecount=10,filesize=20M -Dpulsar.allocator.exit_on_oom=true -Dio.netty.recycler.maxCapacityPerThread=4096 -Dpulsar.log.appender=RoutingAppender -Dpulsar.log.dir=/pulsar/logs -Dpulsar.log.level=info -Dpulsar.log.root.level=info -Dpulsar.log.immediateFlush=false -Dpulsar.routing.appender.default=Console -Dlog4j2.is.webapp=false -Dpulsar.functions.process.container.log.dir=/pulsar/logs -Dpulsar.functions.java.instance.jar=/pulsar/instances/java-instance.jar -Dpulsar.functions.python.instance.file=/pulsar/instances/python-instance/python_instance_main.py -Dpulsar.functions.extra.dependencies.dir=/pulsar/instances/deps -Dpulsar.functions.instance.classpath=/pulsar/conf:::/pulsar/lib/: -Dpulsar.functions.log.conf=/pulsar/conf/functions_log4j2.xml -Dbookkeeper.metadata.bookie.drivers=org.apache.pulsar.metadata.bookkeeper.PulsarMetadataBookieDriver -Dbookkeeper.metadata.client.drivers=org.apache.pulsar.metadata.bookkeeper.PulsarMetadataClientDriver -Dpulsar.log.file=zookeeper.log org.apache.zookeeper.server.quorum.QuorumPeerMain /pulsar/conf/zookeeper.conf
pulsar       172       1  0 Nov26 ?        00:00:00 [timeout]
pulsar       174       1  0 Nov26 ?        00:00:00 [timeout]
pulsar       261       1  0 Nov26 ?        00:00:00 [timeout]
pulsar       262       1  0 Nov26 ?        00:00:00 [timeout]
pulsar       288       1  0 Nov26 ?        00:00:00 [timeout]
pulsar       290       1  0 Nov26 ?        00:00:00 [timeout]
......

pulsar-zookeeper-2:/pulsar$ cat /proc/172/status
Name:   timeout
State:  Z (zombie)
Tgid:   172
Ngid:   0
Pid:    172
PPid:   1
TracerPid:      0
Uid:    10000   10000   10000   10000
Gid:    0       0       0       0
FDSize: 0
Groups: 0
NStgid: 172
NSpid:  172
NSpgid: 172
NSsid:  172
Threads:        1
SigQ:   0/126951
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 00000000a80425fb
CapAmb: 0000000000000000
NoNewPrivs:     0
Seccomp:        0
Seccomp_filters:        0
Speculation_Store_Bypass:       thread vulnerable
SpeculationIndirectBranch:      conditional enabled
Cpus_allowed:   ffffffff,ffffffff,ffffffff
Cpus_allowed_list:      0-95
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        3
nonvoluntary_ctxt_switches:     0
pulsar-zookeeper-2:/pulsar$

[root@master ~]# kubectl get sts pulsar-zookeeper -oyaml | grep livenessProbe -A12
    livenessProbe:
      exec:
        command:

As far as I can see, about two zombie processes are spawned every 30s, on the same cycle as the livenessProbe, so I'm guessing they are caused by the livenessProbe.
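
One way to check that correlation is to watch the zombie count over time (a sketch; the pod name is illustrative):

while true; do
  # count defunct [timeout] entries; growth of ~2 per 30s tracks the probe period
  echo "$(date +%T) $(kubectl exec pulsar-zookeeper-2 -- ps -ef | grep -c '\[timeout\]')"
  sleep 30
done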

lhotari commented 5 days ago

k8s version:
[root@master ~]# kubectl get no
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   38d   v1.27.7
[root@master ~]#

@trynocoding which type of k8s implementation is this? How did you install it? Is it minikube, Kind, microk8s, k3s, Rancher Desktop, or one of the other typical k8s environments used for development?

I haven't yet tried to reproduce this issue so I haven't checked if this reproduces in my environment.

trynocoding commented 2 days ago

@lhotari Hi, I'm using sealos to deploy a k8s environment in a VM.

[root@master images]# kubectl get nodes -o wide
NAME     STATUS   ROLES           AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE          KERNEL-VERSION          CONTAINER-RUNTIME
master   Ready    control-plane   41d   v1.27.7   192.66.111.120                 CentOS Stream 9   5.14.0-533.el9.x86_64   containerd://1.7.23
[root@master images]#

I uninstalled Pulsar 4.0.0, installed Pulsar 3.0.7, ran it for a while, and found no zombie processes.

[root@master images]# helm list
NAME    NAMESPACE  REVISION  UPDATED                                  STATUS    CHART         APP VERSION
pulsar  default    1         2024-12-01 15:47:31.223038065 +0800 CST  deployed  pulsar-3.6.0  3.0.7

[screenshot: no zombie processes with Pulsar 3.0.7]

Then I uninstalled Pulsar 3.0.7, installed Pulsar 4.0.0, ran it for a while, and found zombie processes.

[screenshot: zombie processes with Pulsar 4.0.0]

lhotari commented 2 days ago

Thanks for doing the experiment with 3.0.7, @trynocoding. I found a blog post explaining the possible issue: https://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html . I also found https://github.com/kubernetes/kubernetes/issues/84210 .

I'll try reproducing this in my k8s test environments to see if this is specific to the k8s environment.

lhotari commented 2 days ago

One significant difference between Pulsar 3.0.x and Pulsar 4.0.x is that Pulsar 3.0.x uses an Ubuntu base image while Pulsar 4.0.x uses an Alpine base image. That might be contributing to this issue.
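
A quick way to confirm the base image difference, assuming Docker is available locally and the images run their arguments directly:

docker run --rm apachepulsar/pulsar-all:3.0.7 head -2 /etc/os-release   # should report Ubuntu
docker run --rm apachepulsar/pulsar-all:4.0.0 head -2 /etc/os-release   # should report Alpine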

lhotari commented 2 days ago

I'll try reproducing this in my k8s test environments to see if this is specific to the k8s environment.

I can reproduce the same issue.

pulsar-zookeeper-0:/pulsar$ ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
pulsar         1       0  4 05:51 ?        00:00:07 /opt/jvm/bin/java -Dzookeeper.4lw.commands.whitelist=* -Dzookeeper.snapshot.trus
pulsar       225       1  0 05:51 ?        00:00:00 [timeout] <defunct>
pulsar       237       1  0 05:51 ?        00:00:00 [timeout] <defunct>
pulsar       248       0  0 05:52 pts/0    00:00:00 bash
pulsar       269       1  0 05:52 ?        00:00:00 [timeout] <defunct>
pulsar       281       1  0 05:52 ?        00:00:00 [timeout] <defunct>
pulsar       296       1  0 05:52 ?        00:00:00 [timeout] <defunct>
pulsar       308       1  0 05:52 ?        00:00:00 [timeout] <defunct>
pulsar       326       1  0 05:53 ?        00:00:00 [timeout] <defunct>
pulsar       338       1  0 05:53 ?        00:00:00 [timeout] <defunct>
pulsar       352       1  0 05:53 ?        00:00:00 [timeout] <defunct>
pulsar       364       1  0 05:53 ?        00:00:00 [timeout] <defunct>
pulsar       369     248  0 05:54 pts/0    00:00:00 ps -ef

The zombie processes don't get reaped.

lhotari commented 2 days ago

This problem is related to the Alpine base image. I created a minideb-based image of apachepulsar/pulsar-all:4.0.0 using this solution: https://gist.github.com/lhotari/3ffef8117743f7044e6bbdc3933bc029 . It is pushed to lhotari/pulsar-all:4.0.0-minideb.

The problem doesn't reproduce when installing with this image (--set defaultPulsarImageRepository=lhotari/pulsar-all,defaultPulsarImageTag=4.0.0-minideb).

helm install pulsar apache/pulsar --set defaultPulsarImageRepository=lhotari/pulsar-all,defaultPulsarImageTag=4.0.0-minideb --set volumes.persistence=false --set affinity.anti_affinity=false --version 3.7.0 --set kube-prometheus-stack.enabled=false

We'll have to find a solution that addresses the problem for the Alpine base image. All the resources I found mention https://github.com/krallin/tini or https://github.com/Yelp/dumb-init as the solution. tini is available as an apk package for Alpine.

trynocoding commented 2 days ago

Thank you for providing useful conclusions and information. I'm not very familiar with operating systems. Could you help explain the differences between Alpine and Ubuntu in terms of signal handling and process management? In a container, the business process runs as PID 1, which implies that it lacks the ability to reap child processes, leading to zombie processes. Why can these zombie processes be reaped in the Ubuntu case? This may be beyond the scope of this issue.

lhotari commented 2 days ago

@trynocoding The timeout wrapper added in #214 seems to be problematic with Alpine. Reverting that change is a possible workaround.

Thank you for providing useful conclusions and information. I'm not very familiar with operating systems. Could you help explain the differences between Alpine and Ubuntu in terms of signal handling and process management? In a container, the business process runs as PID 1, which implies that it lacks the ability to reap child processes, leading to zombie processes. Why can these zombie processes be reaped in the Ubuntu case? This may be beyond the scope of this issue.

In the case of Docker containers, the host operating system itself doesn't play a major role; what matters are the libraries and userland tools that the image gets from its operating system base image. A major difference between Alpine and Ubuntu is that Alpine uses the musl C standard library while Ubuntu uses glibc.
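
The reaping mechanics themselves can be reproduced outside Pulsar with a minimal sketch (assumes Docker and the public alpine image; the container name is illustrative). Any PID 1 that never calls wait() accumulates zombies from reparented orphans:

# PID 1 in this container is plain `sleep`, which never calls wait() --
# analogous to the Java process running as PID 1 in the zookeeper pod.
docker run -d --rm --name reap-demo alpine sleep 300

# Background a short-lived child and let its parent shell exit immediately;
# the orphaned `sleep 1` is reparented to PID 1.
docker exec reap-demo sh -c 'sleep 1 & exit 0'

# Give the orphan time to exit, then list processes: it stays <defunct>
# because PID 1 never collects its exit status.
sleep 3
docker exec reap-demo ps -ef

docker rm -f reap-demo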

trynocoding commented 2 days ago

@trynocoding The timeout wrapper added in https://github.com/apache/pulsar-helm-chart/pull/214 seems to be problematic with Alpine. Reverting that change is a possible workaround.

Yeah, it works.

If the k8s version is higher than 1.20, this can be a solution. The best way may be to add dumb-init or tini to the container to manage the business process, as you said. Thank you very much for your help!

lhotari commented 2 days ago

Yeah, it works.

If the k8s version is higher than 1.20, this can be a solution. The best way may be to add dumb-init or tini to the container to manage the business process, as you said. Thank you very much for your help!

It doesn't seem to be necessary to install dumb-init or tini in this case. I ran an experiment where I installed the coreutils package into the image; coreutils includes the timeout utility. By default, the Alpine image uses busybox to provide timeout.

I experimented with a Docker image built with this type of Dockerfile, which adds the coreutils package:

FROM apachepulsar/pulsar-all:4.0.0
# switch to root to install packages
USER 0
# install GNU coreutils, which replaces the busybox-provided timeout
RUN apk add --no-cache coreutils
# switch back to the unprivileged pulsar user
USER 10000
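
Assuming an image built from the Dockerfile above (the tag is illustrative), the swap can be verified by checking which package owns timeout after the build:

docker build -t pulsar-all-coreutils:4.0.0 .
# With coreutils installed, timeout should now be a GNU binary owned by the
# coreutils package rather than a symlink to /bin/busybox.
docker run --rm pulsar-all-coreutils:4.0.0 sh -c 'ls -l "$(which timeout)"; apk info -W "$(which timeout)"'
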
lhotari commented 2 days ago

I created https://github.com/apache/pulsar/pull/23667 to address this problem.

lhotari commented 2 days ago

I have also created #556 to address the issue since the timeout wrapper for the probes isn't needed.

trynocoding commented 2 days ago

It doesn't seem to be necessary to install dumb-init or tini in this case. I ran an experiment where I installed the coreutils package into the image; coreutils includes the timeout utility. By default, the Alpine image uses busybox to provide timeout.

pulsar-400-zookeeper-2:/pulsar$ apk info -W $(which timeout)
/usr/bin/timeout symlink target is owned by busybox-1.36.1-r29
pulsar-400-zookeeper-2:/pulsar$ ls -l $(which timeout)
lrwxrwxrwx    1 root     root            12 Sep  6 11:34 /usr/bin/timeout -> /bin/busybox
pulsar-400-zookeeper-2:/pulsar$ ls -l /bin/busybox
-rwxr-xr-x    1 root     root        808712 Jun 10 07:11 /bin/busybox
pulsar-400-zookeeper-2:/pulsar$

I've learned something, thank you very much. If the timeout wrapper is reverted by https://github.com/apache/pulsar-helm-chart/pull/556, it seems that no other pods use timeout. Using the timeout provided by coreutils is still a good approach.

trynocoding commented 2 days ago

I have also created #556 to address the issue since the timeout wrapper for the probes isn't needed.

Why revert? Does the problem that https://github.com/apache/pulsar-helm-chart/pull/214 was meant to address no longer exist?

lhotari commented 2 days ago

I have also created #556 to address the issue since the timeout wrapper for the probes isn't needed.

Why revert? Does the problem that #214 was meant to address no longer exist?

#214 was added to address a problem in Zookeeper where probes could hang. That problem has been addressed in current ZK versions. It's better to revert #214 so that existing Pulsar releases and the upcoming Pulsar 4.0.1 release won't have issues with zombie processes in Zookeeper pods.