Closed Deracination closed 4 years ago
I posted this on Test::More, but posting here as well. I am open tot he possibility I am wrong here, but this is my research and conclusions:
This appears to happen with prove and yath. If you run it with perl it does appear to exit immediately, but a ps
will show you that your perl did not actually exit, it is left as a zombie. I do not think this is an issue with Test::More, Prove, or Yath.
I will also point out that the same exact thing happens if you remove the 'use Test::More' and just have it call system(sleep 5000)
.
I may be wrong in my understanding of signals, but I do not believe the sigchld will be sent by the zombie until the sleep returns. As such waitpid will not return. a good harness will use waitpid on the process it spawns, which is why prove/yath wait. The shell has its own process management, but the fact is the sleep is still running, and the perl is a zombie even thought the shell returned.
I think ultimately the bug is that you are "starting a daemon", but not actually starting a daemon, you are just spawning a child process. There is a difference. https://metacpan.org/pod/Proc::Daemon might be able to help you. Here is some more details: http://world.std.com/~swmcd/steven/tech/daemon.html.
Ultimately I do think everything here is working as desired. And no Test::* code has anything to do with the behavior you are seeing. A non-perl harness in place of prove would have the same issue. I think your tests need to be fixed to manage processes/spawn daemons more correctly.
I just re-read your initial question. You say that this started after updating your libs... but I can reproduce the problem with no libs at all. As I said the following script will still hang forever with prove:
system('sleep 500');
No need to load Test::More or any other libs. Did you happen to also upgrade perl versions, maybe perl's own process management was fixed from some bad behavior?
Just to be safe I went to check this on a super old perl (5.8.9) with an older prove. Same exact behavior, no Test::More involved at all.
exodist@abydos 5.8.9_no_thr $ cat xxx.t
system('sleep 5000');
exodist@abydos 5.8.9_no_thr $ prove xxx.t
xxx....^C
exodist@abydos 5.8.9_no_thr $ prove -V
prove v2.64, using Test::Harness v2.64 and Perl v5.8.9
Just to be safe I tried on that same perl+prove but with an older Test::More (pre test2) installed and put your use Test::More
back. It still had the issue, so there was nothing fancy in the old Test::* that prevented this behavior.
I made a Perl version change from 5.24.3 to 5.24.4 but I didn't think that would carry a big difference. I'll see if I can track down any other changes. It was certainly working (apparently) ok a few weeks ago.
The test case I provided was the minimum to demonstrate the problem. We actually use system() to launch a set of Net::Daemon-based daemons which I think are correctly detaching from the parent process, etc.
@exodist I would note there is a difference between system('sleep 5000')
(your one) and system('sleep 5000 &')
(the OP). The first one is obviously synchronous. The second one I'd expect to fork twice (spawning the shell, the shell spawns the "sleep" and doesn't wait), thereby leaving no parent for the second one.
You said:
I do not believe the sigchld will be sent by the zombie
Zombies don't do anything; they are simply entries in the process table, waiting to have their exit status collected by a parent. My reading of the original point is that there is an expectation that a double fork (as I understand it) should not leave zombies. At least can we ensure that you're not talking about cross-purposes?
(at cross-purposes, sorry)
oof, good catch, I did forget the '&'. I will have to re-run my checks.
Ok, I updated my script and re-ran all the checks. Old perl, new perl, old prove, new prove, old Test::More, new Test::More, no Test::More, yath, etc. Same results for all of them, the '&' made no difference.
Example: with prove this hangs until the sleep ends but exits immediately when run with perl
Try perl foo.pl | cat
. That will hang as well because the pipe doesn't close until the sleep is done…
Sorry to be wasting your time - it seems that the non-exit of prove occurs under all versions when a test launches a sub-process via system, exactly the same as the perl foo.pl|cat
example from @Leont
The naive attempt to launch a set of daemons from system isn't going to work so we'll rework it to use Proc::Daemon or similar.
Sorry to be wasting your time - it seems that the non-exit of prove occurs under all versions when a test launches a sub-process via system, exactly the same as the perl foo.pl|cat example from @Leont
What happens if your forking daemon closes STDIN/STDOUT/STDERR? (as forking daemons should)
prove exits normally if the daemon is launched with Proc::Daemon which does all the right things
Test::Harness version 3.36_01, Test::More version 1.302177
prove'ing a test which forks a daemon process doesn't exit until the daemon is killed. This is new since updating Test:: libs to all latest versions. Previously the test would exit.
This is a bit of a problem as I use a function in our test suite to restart some daemons if code has changed. This is no longer usable since the first test never exits.
Example: with
prove
this hangs until the sleep ends but exits immediately when run withperl
use Test::More; system('sleep 5000 &');
Also posted to Test::More: https://github.com/Test-More/test-more/issues/868#issue-679853617