Bug: netdisco-docker leaves zombie processes on config file change

Expected Behavior

netdisco-backend should not leave any zombie processes when the config file changes.

Current Behavior

netdisco-backend leaves zombie processes behind every time the config file changes.

# docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"
CONTAINER ID   IMAGE                                   NAMES
28d3722f1500   netdisco/netdisco:2.055000-backend      docker-netdisco-backend-1
b055c2f28a9f   netdisco/netdisco:2.055000-web          docker-netdisco-web-1
9427fad0bd24   netdisco/netdisco:2.055000-postgresql   docker-netdisco-postgresql-1
# ps -AF|grep '[<]defunct>'
# touch netdisco/config/deployment.yml
# ps -AF|grep '[<]defunct>'
nd2      21473 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #1 sched: ] <defunct>
nd2      21475 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #3 poll: i] <defunct>
nd2      21476 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #4 poll: i] <defunct>
nd2      21477 21257  0     0     0   3 10:21 ?        00:00:00 [nd2: #5 poll: i] <defunct>
nd2      21478 21257  0     0     0   3 10:21 ?        00:00:00 [nd2: #6 poll: i] <defunct>
nd2      21479 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #7 poll: i] <defunct>
nd2      21480 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #8 poll: i] <defunct>
nd2      21481 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #9 poll: i] <defunct>
nd2      21482 21257  0     0     0   0 10:21 ?        00:00:00 [nd2: #10 poll: ] <defunct>
nd2      21483 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #11 poll: ] <defunct>
nd2      21484 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #12 poll: ] <defunct>
nd2      21485 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #13 poll: ] <defunct>
nd2      21486 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #14 poll: ] <defunct>
nd2      21487 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #15 poll: ] <defunct>
nd2      21488 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #16 poll: ] <defunct>
nd2      21489 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #17 poll: ] <defunct>
nd2      21490 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #18 poll: ] <defunct>
nd2      21492 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #20 poll: ] <defunct>
nd2      21493 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #21 poll: ] <defunct>
nd2      21494 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #22 poll: ] <defunct>
nd2      21495 21257  0     0     0   1 10:21 ?        00:00:00 [nd2: #23 poll: ] <defunct>
nd2      21496 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #24 poll: ] <defunct>
nd2      21497 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #25 poll: ] <defunct>
nd2      21498 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #26 poll: ] <defunct>
nd2      21499 21257  0     0     0   0 10:21 ?        00:00:00 [nd2: #27 poll: ] <defunct>
nd2      21594 21257  0     0     0   3 10:21 ?        00:00:01 [nd2: #2 mgr: id] <defunct>
nd2      21646 21257  0     0     0   2 10:21 ?        00:00:00 [nd2: #19 poll: ] <defunct>
# touch netdisco/config/deployment.yml 
# ps -AF|grep '[<]defunct>'|wc -l
54
# touch netdisco/config/deployment.yml 
# ps -AF|grep '[<]defunct>'|wc -l
81

Each time the config file changes, a new set of workers is created and the old ones are left as zombies: 27 more per change (27, then 54, then 81).
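
To confirm that all the defunct workers hang off a single parent (21257 in the listing above), the zombies can be grouped by parent PID. This is only a minimal sketch: the function name `zombies_by_parent` is made up for illustration, and it assumes a `ps` that accepts the POSIX `-o stat= -o ppid=` form.

```shell
#!/bin/sh
# Hypothetical helper: count zombie processes (state starting with Z)
# per parent PID. Reads "STAT PPID" pairs on stdin and prints
# "PPID COUNT" lines sorted by parent PID.
zombies_by_parent() {
    awk '$1 ~ /^Z/ { n[$2]++ } END { for (p in n) print p, n[p] }' | sort -n
}

# Usage:
#   ps -e -o stat= -o ppid= | zombies_by_parent
```

If every zombie shows up under the same parent PID, that parent is the process that is not reaping its children.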

Possible Solution

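I managed to fix this by adding "$SIG{CHLD} = 'IGNORE';" to the netdisco-backend script. With SIGCHLD ignored, exited children are reaped automatically instead of staying around as zombies.

To get the patched script into the container, I added a volume for the file in docker-compose.yml: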

...
  netdisco-backend:
    image: docker.io/netdisco/netdisco:2.055000-backend
    volumes:
      - "./netdisco/nd-site-local:/home/netdisco/nd-site-local"
      - "./netdisco/config:/home/netdisco/environments"
      - "./netdisco/logs:/home/netdisco/logs"
      - "./netdisco/netdisco-backend:/home/netdisco/perl5/bin/netdisco-backend"
...

Then I copied the script out of the container (keeping an unmodified copy as netdisco-backend-o) and edited it:

# docker exec docker-netdisco-backend-1 cat /home/netdisco/perl5/bin/netdisco-backend > netdisco/netdisco-backend-o
# docker exec docker-netdisco-backend-1 cat /home/netdisco/perl5/bin/netdisco-backend > netdisco/netdisco-backend
# chmod +x netdisco/netdisco-backend 
# nano netdisco/netdisco-backend
# diff netdisco/netdisco-backend netdisco/netdisco-backend-o 
8,9d7
< $SIG{CHLD} = 'IGNORE';
<

Then I restarted the containers and tried again:

# docker-compose down && docker-compose up -d
[+] Running 4/4
 ⠿ Container docker-netdisco-backend-1     Removed 0.4s
 ⠿ Container docker-netdisco-web-1         Removed 0.3s
 ⠿ Container docker-netdisco-postgresql-1  Removed 0.3s
 ⠿ Network docker_default                  Removed 0.2s
[+] Running 4/4
 ⠿ Network docker_default                  Created 0.2s
 ⠿ Container docker-netdisco-postgresql-1  Started 0.5s
 ⠿ Container docker-netdisco-web-1         Started 1.7s
 ⠿ Container docker-netdisco-backend-1     Started 1.6s
# ps -AF|grep '[<]defunct>'
# touch netdisco/config/deployment.yml
# ps -AF|grep '[<]defunct>'
# touch netdisco/config/deployment.yml
# ps -AF|grep '[<]defunct>'

I'm not sure whether this is a good solution or whether it breaks something else, but in my case the features I'm using seem to be working.
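
One way to check that the patched script is really the one running is to look at the SigIgn mask of the backend parent process in /proc. The sketch below is an assumption-laden illustration, not part of netdisco: the helper name `sigchld_ignored` is made up, and it assumes Linux on x86 or ARM, where SIGCHLD is signal 17 and therefore bit 0x10000 of the mask.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a SigIgn mask (hex string, as found in
# /proc/PID/status) has the SIGCHLD bit set.
# Assumption: SIGCHLD is signal 17 (Linux x86/ARM), i.e. mask bit 0x10000.
sigchld_ignored() {
    # Keep only the low 8 hex digits to stay clear of integer overflow.
    low=$(printf '%s' "$1" | tail -c 8)
    [ $(( 0x$low & 0x10000 )) -ne 0 ]
}

# Usage (PID is the parent of the workers, as seen from the host):
#   mask=$(awk '/^SigIgn:/ { print $2 }' "/proc/$PID/status")
#   sigchld_ignored "$mask" && echo "SIGCHLD is ignored"
```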

Steps to Reproduce

1. Deploy netdisco-docker.
2. Change (or just touch) the "netdisco/config/deployment.yml" file.
3. Look for defunct processes, e.g. with ps -AF|grep '[<]defunct>'.
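
The steps above can be scripted roughly as follows. This is a sketch under assumptions: the names `count_defunct` and `repro` are made up, and the 5 second default delay is only a guess at how long the backend needs to notice the change and restart its workers.

```shell
#!/bin/sh
# Count lines containing "<defunct>" in ps-style output read from stdin.
# The bracket in the pattern keeps grep's own command line from matching.
count_defunct() {
    grep -c '[<]defunct>' || true
}

# Hypothetical driver: touch CONFIG several times and print the number
# of defunct processes after each change.
#   $1 = config file, $2 = number of changes (default 3),
#   $3 = seconds to wait after each change (default 5, a guess)
repro() {
    config=$1
    rounds=${2:-3}
    delay=${3:-5}
    i=1
    while [ "$i" -le "$rounds" ]; do
        touch "$config"
        sleep "$delay"
        printf 'change %s: %s defunct\n' "$i" "$(ps -AF | count_defunct)"
        i=$((i + 1))
    done
}

# Usage:
#   repro netdisco/config/deployment.yml
```

A count that grows with every change indicates the leak; a count that stays at zero indicates the children are being reaped.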

Context

I'm using netdisco only for specific devices, and I run my own scheduler (via netdisco-do). So deployment.yml has:

...
schedule:
  discoverall: null
  macwalk: null
  arpwalk: null
  nbtwalk: null
  expire:
    when: '30 23 * * *'
discover_only:
  - 127.0.0.1
...

The "discover_only" list is periodically changed by my own scheduler script. Each change modifies the config file, which now causes a lot of zombie processes to pile up over time. I do not use the web interface at all, but read the data through the REST API.
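
As a stopgap on the scheduler side, the config file could be rewritten only when its content actually differs, which would cut down the number of reloads. It does not fix the zombie leak itself. The helper name `install_if_changed` is hypothetical, and the sketch assumes the scheduler first writes the new config to a temporary file.

```shell
#!/bin/sh
# Hypothetical helper: overwrite TARGET with NEW only if the content
# differs, so an identical config does not trigger a backend reload.
# Prints "unchanged" or "updated". NEW is removed in both cases.
install_if_changed() {
    new=$1
    target=$2
    if [ -f "$target" ] && cmp -s "$new" "$target"; then
        rm -f "$new"
        echo unchanged
    else
        # cp overwrites in place, keeping the same file on the bind mount.
        cp "$new" "$target" && rm -f "$new"
        echo updated
    fi
}

# Usage:
#   install_if_changed /tmp/deployment.yml.new netdisco/config/deployment.yml
```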

Environment

netdisco container versions:
- netdisco-postgresql: 2.055000
- netdisco-backend: 2.055000
- netdisco-web: 2.055000
docker engine version: 20.10.12, build e91ed57
docker-compose version: v2.2.2
host operating system: CentOS 7.9

Config info (deployment.yml and docker env settings)

# cat docker-compose.yml|grep -vE '^\s*#|^\s*$'
version: '3.9'
services:
  netdisco-postgresql:
    image: docker.io/netdisco/netdisco:2.055000-postgresql
    volumes:
      - "./netdisco/pgdata:/var/lib/postgresql/data"
    ports:
      - "5433:5432"
    restart: always
  netdisco-backend:
    image: docker.io/netdisco/netdisco:2.055000-backend
    volumes:
      - "./netdisco/nd-site-local:/home/netdisco/nd-site-local"
      - "./netdisco/config:/home/netdisco/environments"
      - "./netdisco/logs:/home/netdisco/logs"
      - "./netdisco/netdisco-backend:/home/netdisco/perl5/bin/netdisco-backend"
    environment:
      NETDISCO_DOMAIN:  discover
      NETDISCO_DB_HOST: netdisco-postgresql
    depends_on:
      - netdisco-postgresql
    dns_opt:
      - 'ndots:0'
      - 'timeout:1'
      - 'retries:0'
      - 'attempts:1'
      - edns0
      - trustad
    restart: always
  netdisco-web:
    image: docker.io/netdisco/netdisco:2.055000-web
    volumes:
      - "./netdisco/nd-site-local:/home/netdisco/nd-site-local"
      - "./netdisco/config:/home/netdisco/environments"
    environment:
      NETDISCO_DOMAIN:  discover
      NETDISCO_DB_HOST: netdisco-postgresql
    ports:
      - "5000:5000"
    depends_on:
      - netdisco-postgresql
    dns_opt:
      - 'ndots:0'
      - 'timeout:1'
      - 'retries:0'
      - 'attempts:1'
      - edns0
      - trustad
    restart: always

# cat netdisco/config/deployment.yml|grep -vE '^\s*#|^\s*$'
database:
  name: 'netdisco'
  user: 'netdisco'
  pass: 'netdisco'
site_local_files: true
no_auth: false
community:
  - public
snmp_auth:
  - tag: v3u1
    user: netdisco
    auth:
      pass: disconet
      proto: SHA
    priv:
      pass: disconet
      proto: AES
schedule:
  discoverall: null
  macwalk: null
  arpwalk: null
  nbtwalk: null
  expire:
    when: '30 23 * * *'
expire_devices: 2
workers:
  tasks: '25'
  timeout: 600
  sleep_time: 1
  min_runtime: 0.5
  max_deferrals: 0
  retry_after: 0
snmptimeout: 300000
snmpretries: 1
path: '/netdisco/'
log: warning
dns:
  max_outstanding: 50
  hosts_file: '/etc/hosts'
  no: ["0.0.0.0/0","::/0"]
discover_only:
  - 127.0.0.1

netdisco / netdisco-docker