redhat-developer / odo

odo - Developer-focused CLI for fast & iterative container-based application development on Podman and Kubernetes. Implementation of the open Devfile standard.
https://odo.dev
Apache License 2.0

supervisorD is leaving zombie processes #1076

Closed: kadel closed this issue 5 years ago

kadel commented 5 years ago

Restarting httpd via supervisorD leaves zombie processes behind:

ps faxu
1000180+   223  0.0  0.0  11820  1840 ?        Ss   15:07   0:00 /bin/sh
1000180+   376  0.0  0.0  51704  1688 ?        R+   15:10   0:00  \_ ps faxu
1000180+     1  0.0  0.1 291224  5988 ?        Ssl  15:07   0:00 /var/lib/supervisord/bin/supervisord -c /var/lib/supervisord/conf/supervisor.conf
1000180+    41  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+    42  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+    43  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+    44  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+    45  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+    46  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+    49  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+    74  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+    75  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+   149  0.1  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+   165  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+   166  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+   167  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+   168  0.0  0.0      0     0 ?        Z    15:07   0:00 [cat] <defunct>
1000180+   169  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+   170  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+   171  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+   172  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+   173  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
1000180+   217  0.0  0.0      0     0 ?        Z    15:07   0:00 [httpd] <defunct>
kadel commented 5 years ago

I can also see this with Node.js:

16  1716 ?        Ss   17:17   0:00 /bin/sh
1000160+ 18513  0.0  0.0  51704  1692 ?        R+   17:17   0:00  \_ ps afxu
1000160+     1  0.4  0.2 360060  9244 ?        Ssl  17:08   0:02 /var/lib/supervisord/bin/supervisord -c /var/lib/supervisord/conf/supervisor.conf
1000160+ 18275  0.8  0.0      0     0 ?        Z    17:17   0:00 [npm] <defunct>
1000160+ 18299  0.4  0.0      0     0 ?        Z    17:17   0:00 [node] <defunct>
1000160+ 18378  0.6  0.0      0     0 ?        Z    17:17   0:00 [npm] <defunct>
1000160+ 18398  0.3  0.0      0     0 ?        Z    17:17   0:00 [node] <defunct>
1000160+ 18471  0.0  0.0  11688  1408 ?        S    17:17   0:00 /bin/bash /var/lib/supervisord/bin/setup-and-run
1000160+ 18480  1.9  0.7 733196 28476 ?        Sl   17:17   0:00  \_ npm
1000160+ 18500  1.4  0.6 568648 25772 ?        Sl   17:17   0:00      \_ node app.js
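
To make listings like the ones above easier to compare between runs, the defunct entries can be tallied per command. This is only a rough sketch: the helper name `count_zombies` is made up for illustration, and it assumes the usual eleven-column `ps faxu` layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND), skipping any line that has fewer columns.

```python
from collections import Counter


def count_zombies(ps_output):
    """Count defunct processes per command name in `ps faxu`-style output.

    Hypothetical helper; assumes the standard 11-column layout.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # truncated or otherwise incomplete line
        stat, command = fields[7], fields[10]
        if stat.startswith("Z"):
            # Zombies are listed as "[name] <defunct>".
            counts[command.split()[0].strip("[]")] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "1000160+ 18275  0.8  0.0      0     0 ?        Z    17:17   0:00 [npm] <defunct>\n"
        "1000160+ 18299  0.4  0.0      0     0 ?        Z    17:17   0:00 [node] <defunct>\n"
    )
    print(count_zombies(sample))
```

Inside the container, the output of `ps faxu` could be piped or pasted into this function; a count that grows after every restart would match the behaviour reported here.
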
kadel commented 5 years ago

This might be related to https://github.com/ochinchina/supervisord/issues/60

kadel commented 5 years ago

I have a quick fix: https://github.com/redhat-developer/odo-supervisord-image/tree/fix-zombie-problem and https://github.com/kadel/odo/tree/fix-zombie-problem

The problem is that we run supervisorD as PID 1. PID 1 is also responsible for reaping orphaned processes, but supervisorD doesn't do that. Running https://github.com/Yelp/dumb-init as PID 1 solves the problem.
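
For anyone unfamiliar with what "reaping" means here: when a process exits, it stays in the process table as a zombie until its parent collects its exit status with a `wait` call, and orphaned processes are normally re-parented to PID 1. The sketch below shows that collection step in Python. It is not dumb-init's or supervisorD's actual code; the function name `reap_children` is invented for illustration, and a real init would typically run this in response to SIGCHLD rather than on demand.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited, without blocking.

    Illustrative helper (not taken from dumb-init). Returns the reaped PIDs.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns immediately
            # instead of waiting for a child to exit.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(0)  # the child exits at once and becomes a zombie
    time.sleep(0.5)  # give the child time to exit
    print(reap_children())  # the zombie is gone after this call
```

A process running as PID 1 that never makes this kind of `wait` call for children it did not start itself will accumulate `<defunct>` entries like the ones shown in the listings above.
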