Closed p5pRT closed 10 years ago
The tests for perl 5.20.0 sometimes take a very long time to run on OpenBSD. The problem appears to be with lib/Benchmark.t. I see there was an earlier report http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg208168.html but that doesn't appear to be sufficient.
I can reliable reproduce the failure by running lib/Benchmark.t but I can also reproduce it with pretty much any use of Benchmark:
perl -MBenchmark -e 'timethis(1\, sub { })'
OpenBSD5.5$ time ktrace -tc /tmp/perl-v5.20.0/bin/perl5.20.0 -MBenchmark -e 'timethis(1\, sub { })' timethis 1: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (warning: too few iterations for a reliable count) 0m57.36s real 0m0.02s user 0m55.09s system
$ kdump | grep 'CALL gettimeofday' | wc -l 2073647 $ kdump | grep 'CALL getrusage' | wc -l 2073646
Compare that to a linux box (also 5.20.0\, configured with defaults).
DebianJessie$ time strace -c perl -MBenchmark -e 'timethis(1\, sub { })' timethis 1: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (warning: too few iterations for a reliable count) % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000001 0 10051 times ...snip\, nothing else of consequence...
real 0m0.095s user 0m0.036s sys 0m0.052s
I tried to track down where the time was going using some debug prints and it looks like the culprit is the busy wait in runloops ("Wait for the user timer to tick..."). The combination of the while loop condition and empty loop body uses almost no user cpu time relative to the system cpu time used for the getrusage/gettimeofday calls so this can end up using quite a bit of wall clock before a single user clock tick.
This doesn't show up on linux because gettimeofday is implemented in user space via vdso (see man vdso) so the while loop uses more user cpu time per iteration there.
It looks like if that while loop in Benchmark.pm needs to stay then it should do something non-trivial in the loop body similar to the fix made in "5900049 avoid ultra-lightweight code".
On Sat May 31 18:13:26 2014\, jeremy.devenport@gmail.com wrote:
This is a bug report for perl from jeremy.devenport@gmail.com\, generated with the help of perlbug 1.40 running under perl 5.20.0.
----------------------------------------------------------------- [Please describe your issue here]
The tests for perl 5.20.0 sometimes take a very long time to run on OpenBSD. The problem appears to be with lib/Benchmark.t. I see there was an earlier report http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg208168.html but that doesn't appear to be sufficient.
I can reliable reproduce the failure by running lib/Benchmark.t but I can also reproduce it with pretty much any use of Benchmark:
perl -MBenchmark -e 'timethis(1\, sub { })'
OpenBSD5.5$ time ktrace -tc /tmp/perl-v5.20.0/bin/perl5.20.0 -MBenchmark -e 'timethis(1\, sub { })' timethis 1: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (warning: too few iterations for a reliable count) 0m57.36s real 0m0.02s user 0m55.09s system
$ kdump | grep 'CALL gettimeofday' | wc -l 2073647 $ kdump | grep 'CALL getrusage' | wc -l 2073646
Compare that to a linux box (also 5.20.0\, configured with defaults).
DebianJessie$ time strace -c perl -MBenchmark -e 'timethis(1\, sub { })' timethis 1: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (warning: too few iterations for a reliable count) % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000001 0 10051 times ...snip\, nothing else of consequence...
real 0m0.095s user 0m0.036s sys 0m0.052s
I tried to track down where the time was going using some debug prints and it looks like the culprit is the busy wait in runloops ("Wait for the user timer to tick..."). The combination of the while loop condition and empty loop body uses almost no user cpu time relative to the system cpu time used for the getrusage/gettimeofday calls so this can end up using quite a bit of wall clock before a single user clock tick.
This doesn't show up on linux because gettimeofday is implemented in user space via vdso (see man vdso) so the while loop uses more user cpu time per iteration there.
It looks like if that while loop in Benchmark.pm needs to stay then it should do something non-trivial in the loop body similar to the fix made in "5900049 avoid ultra-lightweight code".
Porters: Investigation and resolution of this ticket would benefit from having smoke reports run on OpenBSD. Can anyone configure this and send results to perl.develop-help.org?
Thank you very much. Jim Keenan
The RT System itself - Status changed from 'new' to 'open'
On Sat\, May 31\, 2014 at 06:13:26PM -0700\, Jeremy Devenport wrote:
I tried to track down where the time was going using some debug prints and it looks like the culprit is the busy wait in runloops ("Wait for the user timer to tick..."). The combination of the while loop condition and empty loop body uses almost no user cpu time relative to the system cpu time used for the getrusage/gettimeofday calls so this can end up using quite a bit of wall clock before a single user clock tick.
This doesn't show up on linux because gettimeofday is implemented in user space via vdso (see man vdso) so the while loop uses more user cpu time per iteration there.
can you confirm that on your system\, the following one-liner takes many seconds to execute?:
time perl -e'$t = (times)[0]; while ($t == (times)[0]) {}'
And if so\, can you find a minimum bit of 'make work' to add in the {} block that reduces the wall time to something sensible?
-- Monto Blanco... scorchio!
On Sat\, Jun 07\, 2014 at 03:52:52PM +0100\, Dave Mitchell wrote:
On Sat\, May 31\, 2014 at 06:13:26PM -0700\, Jeremy Devenport wrote:
I tried to track down where the time was going using some debug prints and it looks like the culprit is the busy wait in runloops ("Wait for the user timer to tick..."). The combination of the while loop condition and empty loop body uses almost no user cpu time relative to the system cpu time used for the getrusage/gettimeofday calls so this can end up using quite a bit of wall clock before a single user clock tick.
This doesn't show up on linux because gettimeofday is implemented in user space via vdso (see man vdso) so the while loop uses more user cpu time per iteration there.
can you confirm that on your system\, the following one-liner takes many seconds to execute?:
time perl -e'$t = (times)[0]; while ($t == (times)[0]) {}'
And if so\, can you find a minimum bit of 'make work' to add in the {} block that reduces the wall time to something sensible?
I can't seem to reproduce this with stock perl-5.20.0\, I tried on a couple architectures.
$ sysctl kern.version hw.model kern.version=OpenBSD 5.5-current (GENERIC.MP) #122: Sat May 10 14:16:11 MDT 2014 deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
hw.model=Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz $ LD_LIBRARY_PATH=./ ./perl -Ilib -v | head -2
This is perl 5\, version 20\, subversion 0 (v5.20.0) built for OpenBSD.amd64-openbsd $ LD_LIBRARY_PATH=./ time ./perl -Ilib -e1 0.00 real 0.00 user 0.01 sys $ LD_LIBRARY_PATH=./ time ./perl -Ilib -MBenchmark -e1 0.06 real 0.01 user 0.06 sys $ LD_LIBRARY_PATH=./ time ./perl -Ilib -MBenchmark -e 'timethis(1\, sub { })' timethis 1: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (warning: too few iterations for a reliable count) 0.16 real 0.02 user 0.14 sys
kern.version=OpenBSD 5.5-current (GENERIC) #122: Fri May 23 21:23:59 MDT 2014 deraadt@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
hw.model=Intel(R) Pentium(R) M processor 1.30GHz ("GenuineIntel" 686-class) $ LD_LIBRARY_PATH=./ ./perl -Ilib -v | head -2
This is perl 5\, version 20\, subversion 0 (v5.20.0) built for OpenBSD.i386-openbsd
$ LD_LIBRARY_PATH=./ time ./perl -Ilib -e1
0.00 real 0.00 user 0.00 sys
$ LD_LIBRARY_PATH=./ time ./perl -Ilib -MBenchmark -e1
0.04 real 0.03 user 0.02 sys
$ LD_LIBRARY_PATH=./ time ./perl -Ilib -e'$t = (times)[0]; while ($t == (times)[0]) {}'
0.01 real 0.01 user 0.01 sys
$ LD_LIBRARY_PATH=./ time ./perl -Ilib -MBenchmark -e 'timethis(1\, sub { })'
timethis 1: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
(warning: too few iterations for a reliable count)
0.06 real 0.06 user 0.00 sys
I also tried on a much slower machine\, but I only have 5.20.0 plus OpenBSD's local patches available and didn't have a couple days to compile a stock version of perl. It didn't manifest here either\, although loading Benchmark is obviously more noticeably slower.
$ sysctl kern.version hw.model
kern.version=OpenBSD 5.5-current (GENERIC) #39: Thu May 8 02:11:39 MDT 2014
deraadt@alpha.openbsd.org:/usr/src/sys/arch/alpha/compile/GENERIC
hw.model=AlphaStation 200 4/166 $ LD_LIBRARY_PATH=./ ./perl -Ilib -v | head -2
This is perl 5\, version 20\, subversion 0 (v5.20.0) built for alpha-openbsd $ LD_LIBRARY_PATH=./ time ./perl -Ilib -e1 0.52 real 0.13 user 0.32 sys $ LD_LIBRARY_PATH=./ time ./perl -Ilib -MBenchmark -e1 3.08 real 1.97 user 0.95 sys $ LD_LIBRARY_PATH=./ time ./perl -Ilib -e'$t = (times)[0]; while ($t == (times)[0]) {}' 0.54 real 0.13 user 0.34 sys $ LD_LIBRARY_PATH=./ time ./perl -Ilib -MBenchmark -e 'timethis(1\, sub { })' timethis 1: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (warning: too few iterations for a reliable count) 3.15 real 2.00 user 0.99 sys
can you confirm that on your system\, the following one-liner takes many seconds to execute?:
time perl -e'$t = (times)[0]; while ($t == (times)[0]) {}'
Yes\, that one-liner demonstrates the same problem of high system/user cpu usage: 0m6.14s real 0m0.01s user 0m6.10s system
It also seems to be relevant that I'm running OpenBSD under VirtualBox. I did a few more experiments with other OSs under VirtualBox and the same OpenBSD version under VMware Player with both a simple use of timethis() and with your one-liner. (I've reformatted the timing output to be consistent across platforms)
$ time ./perl -Ilib -MBenchmark -e 'timethis(1\, sub { }) for 1..10' VirtualBox OpenBSD 5.5 1m59.00s real 0m0.21s user 1m58.19s system FreeBSD 10 0m4.80s real 0m0.16s user 0m4.64s system Ubuntu Jessie 0m0.26s real 0m0.20s user 0m0.04s system VMware Player OpenBSD 5.5 0m0.39s real 0m0.20s user 0m0.20s system
$ time ./perl -e'$t = (times)[0]; while ($t == (times)[0]) {}' VirtualBox OpenBSD 5.5 0m6.14s real 0m0.01s user 0m6.10s system FreeBSD 10 0m1.18s real 0m0.00s user 0m1.18s system Ubuntu Jessie 0m0.03s real 0m0.01s user 0m0.02s system
VMware Player OpenBSD 5.5 0m0.00s real 0m0.01s user 0m0.00s system
It looks like FreeBSD under VirtualBox also takes longer than it should (4 seconds to do 10 no-op benchmarks) but is probably fast enough to be acceptable. And the same version of OpenBSD using VMware works just fine. So this seems to be a problem specifically when running under VirtualBox.
Adding somewhere between 1000 and 10\,000 floating point divisions into the body of the while loop in Benchmark.pm seems to be enough to get it to behave.
With this change:
- while ( ( $t0 = Benchmark->new(0) )->[1] == $tbase ) {} ; + while ( ( $t0 = Benchmark->new(0) )->[1] == $tbase ) { + for (my $i=0; $i \< 10_000; $i++) { $i / 1.5 } + }
$ time ./perl -Ilib -MBenchmark -e 'timethis(1\, sub { }) for 1..10' VirtualBox OpenBSD 5.5 0m0.23s real 0m0.21s user 0m0.03s system
On Sun\, Jun 08\, 2014 at 08:55:56AM -0400\, Jeremy Devenport wrote:
can you confirm that on your system\, the following one-liner takes many seconds to execute?:
time perl -e'$t = (times)[0]; while ($t == (times)[0]) {}'
Yes\, that one-liner demonstrates the same problem of high system/user cpu usage: 0m6.14s real 0m0.01s user 0m6.10s system
I've pushed the following fix to blead:
commit 78462dd3f2a26a5a5f9ec040277bad147e6c5d52 Author: David Mitchell \davem@​iabyn\.com AuthorDate: Mon Jun 9 11:41:22 2014 +0100 Commit: David Mitchell \davem@​iabyn\.com CommitDate: Mon Jun 9 11:51:18 2014 +0100
Benchmark.pm: avoid long walltime on calibration
RT #122003
runloop() in Benchmark.pm does the equivalent of the following:
$t = (times)[0];
while ($t == (times)[0]) {}
so that it finishes in a position where the user CPU counter has just
incremented.
This is all well and good\, but on some platforms (specifically OpenBSD
running under VirtualBox)\, the system CPU burned each time round the loop
to get times() is far greater than the user CPU burned. This can cause the
loop to run for minutes consuming system CPU until enough user CPU has
been burned to tick the user CPU counter.
The fix in this commit is to replace the empty body of the while loop with
something that does a gradually increasing amount of busy work.
Affected files ...
M lib/Benchmark.pm
Differences ...
A walk of a thousand miles begins with a single step... then continues for another 1\,999\,999 or so.
On Mon\, Jun 09\, 2014 at 11:53:59AM +0100\, Dave Mitchell wrote:
On Sun\, Jun 08\, 2014 at 08:55:56AM -0400\, Jeremy Devenport wrote:
can you confirm that on your system\, the following one-liner takes many seconds to execute?:
time perl -e'$t = (times)[0]; while ($t == (times)[0]) {}'
Yes\, that one-liner demonstrates the same problem of high system/user cpu usage: 0m6.14s real 0m0.01s user 0m6.10s system
I've pushed the following fix to blead:
commit 78462dd3f2a26a5a5f9ec040277bad147e6c5d52 Author: David Mitchell \davem@​iabyn\.com AuthorDate: Mon Jun 9 11:41:22 2014 +0100 Commit: David Mitchell \davem@​iabyn\.com CommitDate: Mon Jun 9 11:51:18 2014 +0100
Benchmark\.pm​: avoid long walltime on calibration RT \#122003 runloop\(\) in Benchmark\.pm does the equivalent of the following​: $t = \(times\)\[0\]; while \($t == \(times\)\[0\]\) \{\} so that it finishes in a position where the user CPU counter has just incremented\. This is all well and good\, but on some platforms \(specifically OpenBSD running under VirtualBox\)\, the system CPU burned each time round the loop to get times\(\) is far greater than the user CPU burned\. This can cause the loop to run for minutes consuming system CPU until enough user CPU has been burned to tick the user CPU counter\. The fix in this commit is to replace the empty body of the while loop with something that does a gradually increasing amount of busy work\.
This has fixed the tardiness of this particular test on OpenBSD\, NetBSD\, MidnightBSD and DragonflyBSD that I noticed when running on VirtualBox.
Many thanks!
@iabyn - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#122003 (status was 'resolved')
Searchable as RT122003$