latchset / tang

Tang binding daemon

Tang running in standalone mode is leaving defunct processes after each request is served #109

Closed: phirince closed this issue 1 year ago

phirince commented 1 year ago

There is a "bug" in the signal handling part of the tang code (at least in some environments) that prevents the SIGCHLD signal handler from being invoked after the first request. It seems the signal handler is reset to the default after the first time it is called. Here's how I'm running the daemon:

[sudo] password for pphilip:
Listening on 0.0.0.0:9090
Listening on [::]:9090

Here's the process list after 3 curl requests:

» ps -elf|grep 'tang[d]'
4 S root     24484 22557  0  80   0 - 70068 poll_s 00:44 pts/2    00:00:00 sudo -u tang /usr/libexec/tangd -l /var/db/tang
4 S tang     24507 24484  0  80   0 -  5931 poll_s 00:44 pts/2    00:00:00 /usr/libexec/tangd -l /var/db/tang
1 Z tang     24860 24507  0  80   0 -     0 do_exi 00:45 pts/2    00:00:00 [tangd] <defunct>
1 Z tang     24881 24507  0  80   0 -     0 do_exi 00:45 pts/2    00:00:00 [tangd] <defunct>

You can see only 2 defunct processes after 3 requests, because the first child was reaped by the signal handler. The second and third children were never reaped.

I couldn't find out exactly why this was happening, but according to the man page of signal(2):

The behavior of signal() varies across UNIX versions, and has also varied historically across different versions of Linux. Avoid its use: use sigaction(2) instead.
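
One plausible explanation (an assumption, not something confirmed for this build) is the historical System V behavior of signal(), where the disposition is reset to SIG_DFL as soon as the handler is invoked. The sketch below reproduces that one-shot behavior portably by asking sigaction() for it explicitly with SA_RESETHAND, and compares it with a persistent handler. The names count_delivery and handler_survives_delivery are hypothetical and are not part of tang.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Number of times the handler has run; for illustration only. */
static volatile sig_atomic_t deliveries = 0;

/* Hypothetical handler: just records that SIGCHLD was delivered. */
static void count_delivery(int sig)
{
    (void) sig;
    deliveries++;
}

/*
 * Installs count_delivery for SIGCHLD with the given sa_flags, delivers one
 * SIGCHLD to this process, then checks which handler is installed afterwards.
 * Returns 1 if the handler is still installed, 0 if the disposition was
 * reset, and -1 on error.
 */
static int handler_survives_delivery(int flags)
{
    struct sigaction sa, cur;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = flags;

    if (sigaction(SIGCHLD, &sa, NULL) != 0)
        return -1;

    /* In a single-threaded process the handler runs before raise() returns. */
    if (raise(SIGCHLD) != 0)
        return -1;

    /* Query the current disposition without changing it. */
    if (sigaction(SIGCHLD, NULL, &cur) != 0)
        return -1;

    return cur.sa_handler == count_delivery;
}
```

Calling handler_survives_delivery(SA_RESETHAND) mimics the one-shot semantics and leaves SIGCHLD at its default disposition, so later child exits are no longer handled. Calling it with flags set to 0 keeps the handler installed.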

There are two solutions to the problem:

  1. Reinstall the signal handler at the end of the signal handler call
  2. Use sigaction instead

I have verified that both methods eliminate this problem on my platform. I prefer the second option because it is supposed to be more portable, unless there is some concern about non-Linux platforms.
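
For reference, the second option could look roughly like the sketch below. This is not the code from the PR. It is a minimal illustration of installing a persistent SIGCHLD handler with sigaction() and reaping every exited child with waitpid(). The names on_sigchld, install_sigchld_handler, spawn_and_count and reaped_children are hypothetical, and the choice of SA_RESTART | SA_NOCLDSTOP is an assumption about what a forking server would want.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counts children reaped by the handler; for illustration only. */
static volatile sig_atomic_t reaped_children = 0;

/* Hypothetical handler: reaps every child that has already exited. */
static void on_sigchld(int sig)
{
    int saved_errno = errno;   /* waitpid() may clobber errno */

    (void) sig;
    /* Several exits can coalesce into one SIGCHLD, so loop until none remain. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Installs the handler; it stays installed because SA_RESETHAND is not set. */
static int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Forks up to n children that exit immediately and waits until the handler
 * has reaped all of them. Returns the number of children reaped.
 */
static int spawn_and_count(int n)
{
    sigset_t block, old;
    int start = reaped_children;

    /* Block SIGCHLD so it is only delivered inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {          /* fork failed: wait only for those started */
            n = i;
            break;
        }
        if (pid == 0)
            _exit(0);           /* child: exit at once */
    }

    while (reaped_children - start < n)
        sigsuspend(&old);       /* atomically unblock and wait for a signal */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped_children - start;
}
```

After install_sigchld_handler() succeeds, spawn_and_count(3) should leave no defunct children behind, whereas a one-shot handler would reap only the children that had exited by the time of the first SIGCHLD.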

PR #110 opened to address this issue -PP