Closed hveini closed 1 year ago
Thanks @hveini for this thorough bug report and even finding a possible solution!
I also did some digging and was very puzzled by how differently restarting and general signal handling work in Docker.
Looking at the installed signal handlers in /proc, or rather their absence...
(shell in container) /home/netdisco # cat /proc/1/status
Name: netdisco-backend
...
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000080
SigCgt: 0000000000004000 (binary 0100000000000000: only bit 14 is set, i.e. signal 15, SIGTERM, so no SIGCHLD handler is installed)
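For anyone who wants to decode these masks without counting bits by hand, here is a minimal Python sketch. It is not part of netdisco; `decode_mask` and `signal_name` are hypothetical helper names, and it assumes the Linux convention that bit N (counting from 0) of the mask stands for signal number N+1.

```python
import signal


def decode_mask(hex_mask: str) -> list[int]:
    """Return the signal numbers whose bits are set in a /proc/<pid>/status mask."""
    mask = int(hex_mask, 16)
    # Bit 0 corresponds to signal 1, bit 1 to signal 2, and so on.
    return [bit + 1 for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def signal_name(signum: int) -> str:
    """Map a signal number to its name on this platform, if Python knows it."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return f"signal {signum}"


if __name__ == "__main__":
    # Values copied from the /proc/1/status output above.
    for field, value in {"SigIgn": "0000000000000080",
                         "SigCgt": "0000000000004000"}.items():
        numbers = decode_mask(value)
        print(field, numbers, [signal_name(n) for n in numbers])
```

For the SigCgt value above this yields signal number 15 only, so SIGTERM is the single signal the process catches.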
... I finally found these:
Long story short: to get proper signal handling in a multi-process environment like the netdisco processes, we should run an init process as PID 1 instead of running netdisco-backend as PID 1 directly. With init enabled, Docker starts a small init process as PID 1 that forwards signals and reaps zombie processes. This just means adding init: true to both services in docker-compose:
ram@cicd:/tmp/issue49 $ grep -B 3 init docker-compose.yml
netdisco-backend:
  image: netdisco/netdisco:latest-backend
  init: true
...
netdisco-web:
  image: netdisco/netdisco:latest-web
  init: true
With this, both processes seem to reload fine on config changes, just like when running in a non-containerized environment.
Expected Behavior
netdisco-backend should not leave any zombie processes on config file change.
Current Behavior
netdisco-backend leaves zombie processes on config file change.
--> Each time the config file changes, a new set of workers is created and the old ones are left as zombies.
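As background on how such a zombie arises, here is a minimal Python sketch. It is not netdisco code, it is Linux-only because it reads /proc, and `proc_state` and `zombie_demo` are hypothetical names. A child that has exited stays in state Z until its parent collects the exit status with a wait call.

```python
import os


def proc_state(pid: int):
    """Return the one-letter process state from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        for line in fh:
            if line.startswith("State:"):
                return line.split()[1]
    return None


def zombie_demo():
    """Fork a child that exits at once; report its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before_reap = proc_state(pid)
    os.waitpid(pid, 0)  # reap the child: this removes the zombie entry
    still_listed = os.path.exists(f"/proc/{pid}")
    return state_before_reap, still_listed


if __name__ == "__main__":
    print(zombie_demo())
```

The old workers described above are in the same situation as the child before the `os.waitpid` call: they have exited, but nothing has collected their exit status yet.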
Possible Solution
I managed to fix this by adding "$SIG{CHLD} = 'IGNORE';" in netdisco-backend.
I added a volume for the file in docker-compose.yml:
And copied/changed the file:
Then restarted docker, and tried again:
I'm not sure whether this is a good solution or whether it breaks something else, but in my case the features I'm using seem to be working.
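To illustrate what ignoring SIGCHLD changes at the OS level, here is a Python analogue of the Perl one-liner. This is a sketch only, not netdisco code, and `child_is_auto_reaped` is a hypothetical name. On Linux, when SIGCHLD is set to be ignored, the kernel discards exited children right away, so they never become zombies and a later wait call fails with ECHILD.

```python
import os
import signal


def child_is_auto_reaped() -> bool:
    """Return True if an exited child is discarded by the kernel without a wait call."""
    # Python analogue of Perl's $SIG{CHLD} = 'IGNORE'; must run in the main thread.
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately
        try:
            # With SIGCHLD ignored this blocks until the child has exited and
            # then fails with ECHILD, because nothing is left to reap.
            os.waitpid(pid, 0)
        except ChildProcessError:
            return True
        return False
    finally:
        # Restore the earlier disposition so the rest of the program is unaffected.
        signal.signal(signal.SIGCHLD,
                      previous if previous is not None else signal.SIG_DFL)


if __name__ == "__main__":
    print(child_is_auto_reaped())
```

The trade-off is that the parent can no longer read the exit status of its children, which may matter if any code relies on it.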
Steps to Reproduce
Context
I'm using netdisco only for specific devices, and use my own scheduler (via netdisco-do). So deployment.yml has:
and the "discover_only" list is periodically changed by my own scheduler script. Each change modifies the config file, which now produces a lot of zombie processes over time. I do not use the web interface at all, but read the data through the REST API.
Environment
Config info (deployment.yml and docker env settings)