atoomic / perl

a repo to show what could be p7

t/op/threads.t: Test can hang after only 9 out of 30 unit tests run #264

Open jkeenan opened 4 years ago

jkeenan commented 4 years ago

t/op/threads.t appears to be vulnerable to hangs which cause the file to be graded as FAIL.

See this smoke-test report of our simulation in Perl 5: http://perl5.test-smoke.org/report/118070.

I went to the NetBSD VM that generated this smoke testing report. Using the same perl that generated the second of the two FAILS in that report, I first called:

$ ./perl -Ilib -V:config_args
config_args='-des -Dusedevel -Duseithreads -DDEBUGGING';
[perl-reporter-08] $ ./perl -Ilib t/op/threads.t
1..30
ok 1 - delete() under threads
ok 2 - weaken ref under threads
ok 3 - weaken ref \#2 under threads
 # parent 19460: continue
 # kid 1 before sort
 # parent 19460: continue
 # kid 2 before sort
 # parent 19460: waiting for join
 # kid 1 after sort, sleeping 1
 # kid 2 after sort, sleeping 1
 # kid 1 exit
 # parent 19460: thread exited
 # parent 19460: waiting for join
 # kid 2 exit
 # parent 19460: thread exited
ok 4
ok 5 - cloning constant subs
ok 6 - Ensure PL_linestr can be cloned
ok 7 - threads in CHECK block
ok 8 - threads in INIT block
ok 9 - Bug \#41138
ok 10 - [perl \#45053]
ok 11
ok 12
ok 13 - clone seen-evals
ok 14 - undefing a typeglob doesn't cause a crash during cloning
ok 15 - No del_backref panic [perl \#70748]
ok 16 - No del_backref panic [perl \#70748] (2)
ok 17 - returning a closure
ok 18 - Test for 34394ecd06e704e9
ok 19 - RT \#73046
ok 20 - 0 refcnt neither on tmps stack nor in @_
ok 21 - RT \#73086 - clone used to clone active pads
ok 22 - Just special casing lexicals in ?{ ... }
ok 23 - 0 refcnt during CLONE
ok 24 - avoid peephole recursion
ok 25 - Pipes shared between threads do not block when closed
ok 26 - globs cloned and joined are not recloned
ok 27 - no crash when deleting $::{INC} in thread
ok 28 - no crash modifying extended array element
ok 29 - RT \#36664: Strange behavior of shared array
ok 30 - RT \#41121 binmode(STDOUT,":encoding(utf8) does not crash

So far so good. But then I called:

[perl-reporter-08] $ cd t ; ./perl harness -v op/threads.t; cd -

ok 1 - delete() under threads
ok 2 - weaken ref under threads
ok 3 - weaken ref \#2 under threads
ok 4
ok 5 - cloning constant subs
ok 6 - Ensure PL_linestr can be cloned
ok 7 - threads in CHECK block
ok 8 - threads in INIT block
ok 9 - Bug \#41138

The program hung there for several minutes, during which I began typing this report. Then ...

 # Test process timed out - terminating
Failed 21/30 subtests 

Test Summary Report
-------------------
op/threads.t (Wstat: 139 Tests: 9 Failed: 0)
  Non-zero wait status: 139
  Parse errors: Bad plan.  You planned 30 tests but ran 9.
Files=1, Tests=9, 183 wallclock secs ( 0.02 usr  0.05 sys +  2.36 cusr  0.57 csys =  3.00 CPU)
Result: FAIL
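
The `Wstat: 139` in the summary can be unpacked by hand. Assuming the harness reports the raw wait status in the usual Unix layout (the same layout as Perl's `$?`), 139 = 128 + 11: the process died from signal 11 with the core-dump flag set, and the exit code is 0. Signal 11 is SIGSEGV on most Unix-like systems. The helper name `decode_wstat` below is made up for illustration; this is only a sketch of the arithmetic.

```shell
# Hypothetical helper: split a raw wait status into its three parts,
# assuming the conventional layout (low 7 bits = signal, bit 7 = core flag,
# high byte = exit code).
decode_wstat() {
    wstat=$1
    sig=$(( wstat & 127 ))          # low 7 bits: terminating signal
    core=$(( (wstat >> 7) & 1 ))    # bit 7: core-dump flag
    code=$(( wstat >> 8 ))          # high byte: exit code
    echo "signal=$sig core=$core exit=$code"
}

decode_wstat 139
```

For 139 this prints `signal=11 core=1 exit=0`.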

ISTR this problem has appeared previously. Granted, this failure occurred in the smoke-me/jkeenan/cumberland-blues branch in Perl 5 -- not in tag alpha-02-MC-4 in our repository. But I know this is not the first time I've seen a hang at this point.

Thank you very much. Jim Keenan

jkeenan commented 4 years ago

ISTR this problem has appeared previously. Granted, this failure occurred in the smoke-me/jkeenan/cumberland-blues branch in Perl 5 -- not in tag `alpha-02-MC-4` in our repository. But I know this is not the first time I've seen a hang at this point.

Indeed, we got it a lot when we ran alpha-01 thru smoke-testing. See http://perl5.test-smoke.org/submatrix?test=../t/op/threads.t&pversion=7.0.0

jkeenan commented 4 years ago

Relevant code in t/op/threads.t:

# [perl #45053] Memory corruption with heavy module loading in threads
#
# run-time usage of newCONSTSUB (as done by the IO boot code) wasn't
# thread-safe - got occasional coredumps or malloc corruption
watchdog(180, "process");
{
    local $SIG{__WARN__} = sub {};   # Ignore any thread creation failure warnings
    my @t;
    for (1..10) {
        my $thr = threads->create( sub { require IO });
        last if !defined($thr);      # Probably ran out of memory
        push(@t, $thr);
    }
    $_->join for @t;
    ok(1, '[perl #45053]');
}

sub matchit {
    is (ref $_[1], "Regexp");
    like ($_[0], $_[1]);
}

threads->new(\&matchit, "Pie", qr/pie/i)->join();

# tests in threads don't get counted, so
curr_test(curr_test() + 2);
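
The 183 wallclock seconds in the failing run are consistent with the `watchdog(180, "process")` call above firing after roughly 180 seconds. As a rough illustration of the idea only (this is not how t/test.pl implements its watchdog), here is a shell sketch of a process-level watchdog. The function name `run_with_watchdog` is hypothetical.

```shell
# Hypothetical helper: run a command, killing it if it exceeds a time limit.
# Usage: run_with_watchdog SECONDS COMMAND [ARGS...]
run_with_watchdog() {
    limit=$1
    shift
    "$@" &                              # start the command in the background
    cmd_pid=$!
    # Watchdog: after $limit seconds, kill the command if it is still running.
    ( sleep "$limit"; kill -KILL "$cmd_pid" ) >/dev/null 2>&1 &
    dog_pid=$!
    status=0
    wait "$cmd_pid" || status=$?        # 128+signal if the command was killed
    # The command is done (or dead), so cancel the watchdog.
    kill "$dog_pid" >/dev/null 2>&1 || true
    wait "$dog_pid" 2>/dev/null || true
    return "$status"
}
```

Something like `run_with_watchdog 180 ./perl harness -v op/threads.t` would then return the command's own status, or 137 (128 + SIGKILL) if the limit was hit. One known weakness of this sketch: the watchdog's `sleep` keeps running in the background until it expires, even after the watchdog itself has been cancelled.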

jkeenan commented 4 years ago

@atoomic, @toddr

This ticket is the last one that I think we need to resolve before merging alpha-dev-02-strict into alpha and deeming Objective 2 achieved.

I think we need to rule out the possibility that these (admittedly intermittent) test failures were caused by changes we made after beginning work on strict-by-default.

If, instead, the failures are due to a poor interaction between the unit test and a memory-constrained environment, then we will need to take up the issue with P5P.

Can you take a look?

Thank you very much. Jim Keenan

atoomic commented 4 years ago

@jkeenan do you know whether this issue is specific to this branch, or does blead have the same problem?

jkeenan commented 4 years ago

@jkeenan do you know whether this issue is specific to this branch, or does blead have the same problem?

Unfortunately, with the apparent demise of perl5.test-smoke.org, I am no longer able to answer that question aside from what I've already posted in this ticket. :-(

atoomic commented 4 years ago

I recompiled Perl on FreeBSD (using the NYC perlmonger server) using alpha-dev-02-strict@a9a5af8e53

> git clean -dxf; ./Configure -Dcc="ccache gcc" -Dusedevel -Duseithreads -DDEBUGGING -des
> TEST_JOBS=4 make -j4 test_harness
> ./perl -Ilib -V:config_args
config_args='-Dcc=ccache gcc -Dusedevel -Duseithreads -DDEBUGGING -des';

Then I ran the test 100 times in a row; all 100 runs passed.

> cd t; for i in $(seq 100); do echo "====== $i"; ./perl harness -v op/threads.t; done; cd -
...
...
====== 100
op/threads.t ..
1..30
ok 1 - delete() under threads
ok 2 - weaken ref under threads
ok 3 - weaken ref \#2 under threads
# parent 83305: continue
# kid 1 before sort
# parent 83305: continue
# parent 83305: waiting for join
# kid 2 before sort
# kid 1 after sort, sleeping 1
# kid 2 after sort, sleeping 1
# kid 1 exit
# parent 83305: thread exited
# parent 83305: waiting for join
# kid 2 exit
# parent 83305: thread exited
ok 4
ok 5 - cloning constant subs
ok 6 - Ensure PL_linestr can be cloned
ok 7 - threads in CHECK block
ok 8 - threads in INIT block
ok 9 - Bug \#41138
ok 10 - [perl \#45053]
ok 11
ok 12
ok 13 - clone seen-evals
ok 14 - undefing a typeglob doesn't cause a crash during cloning
ok 15 - No del_backref panic [perl \#70748]
ok 16 - No del_backref panic [perl \#70748] (2)
ok 17 - returning a closure
ok 18 - Test for 34394ecd06e704e9
ok 19 - RT \#73046
ok 20 - 0 refcnt neither on tmps stack nor in @_
ok 21 - RT \#73086 - clone used to clone active pads
ok 22 - Just special casing lexicals in ?{ ... }
ok 23 - 0 refcnt during CLONE
ok 24 - avoid peephole recursion
ok 25 - Pipes shared between threads do not block when closed
ok 26 - globs cloned and joined are not recloned
ok 27 - no crash when deleting $::{INC} in thread
ok 28 - no crash modifying extended array element
ok 29 - RT \#36664: Strange behavior of shared array
ok 30 - RT \#41121 binmode(STDOUT,":encoding(utf8) does not crash
ok
All tests successful.
Files=1, Tests=30,  2 wallclock secs ( 0.02 usr  0.00 sys +  2.03 cusr  0.09 csys =  2.14 CPU)
Result: PASS
~/perl7
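
With a loop like the one above, a rare failure has to be spotted by eye in the scrollback. A minimal sketch of a variant that counts failing runs instead, using a hypothetical helper name `count_failures`, could look like this:

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Discard the command's own output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$(( failures + 1 ))
            echo "run $i failed" >&2
        fi
        i=$(( i + 1 ))
    done
    echo "$failures"
}
```

Run from the `t` directory, for example as `count_failures 100 ./perl harness op/threads.t`, it prints the number of failing runs on stdout and notes each failing run number on stderr.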

I cannot reproduce the described issue.

I do not say it does not exist, but this seems to be an uncommon issue.

If there is such an issue, I would also doubt that it is related to strict. It is more likely a problem with the interaction between threads/processes.

I do not think this issue should block us from moving forward. If it turns out to be a common pattern later, we can tackle it then.

jkeenan commented 4 years ago

[snip]

I cannot reproduce the described issue.

I do not say it does not exist, but this seems to be an uncommon issue.

If there is such an issue, I would also doubt that it is related to strict. It is more likely a problem with the interaction between threads/processes.

I do not think this issue should block us from moving forward. If it turns out to be a common pattern later, we can tackle it then.

Okay, thanks for investigating this. I will close this issue and prepare a Merge Candidate tag.

jkeenan commented 4 years ago

I'm going to re-open this issue, though not as a blocker to the completion of Objective 2.

Here is an additional case where this test failed in our simulation branch in Perl 5: http://perl.develop-help.com/raw/?id=257242

NetBSD 9.0, 7 out of 8 configurations PASS; FAIL on -Duseithreads -Duse64bitall without debugging.

jimk