Open p5pRT opened 19 years ago
I would expect
grep {EXPR} LIST
to be somewhat slower than
grep EXPR\, LIST
For instance\, the following benchmark shows the block version to be 5% slower:
use Benchmark qw /cmpthese/;
our @array = 101 .. 200;
cmpthese -2 => { grep_eq_expr => '$::a1 = grep $_ == 50\, @::array'\, grep_eq_blck => '$::a2 = grep {$_ == 50} @::array'\, };
__END__ Rate grep_eq_blck grep_eq_expr grep_eq_blck 28902/s -- -5% grep_eq_expr 30282/s 5% --
If\, however\, the expression is a regexp\, things change drastically:
use Benchmark qw /cmpthese/;
our @array = 101 .. 200;
cmpthese -2 => { grep_re_expr => '$::b1 = grep /^50$/\, @::array'\, grep_re_blck => '$::b2 = grep {/^50$/} @::array'\, };
__END__ Rate grep_re_blck grep_re_expr grep_re_blck 6455/s -- -69% grep_re_expr 20848/s 223% --
This difference in speed surprises me.
It's not the fact that it's a regexp that's run inside a block\, as a similar for loop doesn't show the huge difference:
use Benchmark qw /cmpthese/;
our @array = 101 .. 200;
cmpthese -2 => {
for_eq_expr => '$::c1 = 0; $_ == 50 and $::c1 ++ for @::array'\,
for_eq_blck => '$::c2 = 0; for (@::array) {$_ == 50 and $::c2 ++}'\,
};
__END__ Rate for_eq_blck for_eq_expr for_eq_blck 23640/s -- -5% for_eq_expr 24992/s 6% --
use Benchmark qw /cmpthese/;
our @array = 101 .. 200;
cmpthese -2 => {
for_re_expr => '$::d1 = 0; /^50$/ and $::d1 ++ for @::array'\,
for_re_blck => '$::d2 = 0; for (@::array) {/^50$/ and $::d2 ++}'\,
};
__END__ Rate for_re_blck for_re_expr for_re_blck 19091/s -- -2% for_re_expr 19545/s 2% --
In both cases\, the block version is slower\, but by a small margin.
So\, I suspect a bug somewhere.
I'll be leaving for my honeymoon and YAPC::AU tomorrow morning\, so it might be a while before I'll be able to reply to any questions.
Abigail
On Thu\, Nov 04\, 2004 at 11:55:31PM +0000 abigail@abigail.nl (via RT) wrote:
# New Ticket Created by abigail@abigail.nl # Please include the string: [perl #32331] # in the subject line of all future correspondence about this issue. # \<URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=32331 >
This is a bug report for perl from abigail@abigail.nl\, generated with the help of perlbug 1.35 running under perl v5.8.5.
----------------------------------------------------------------- [Please enter your report here]
I would expect
grep \{EXPR\} LIST
to be somewhat slower than
grep EXPR\, LIST
For instance\, the following benchmark shows the block version to be 5% slower:
use Benchmark qw /cmpthese/; our @​array = 101 \.\. 200; cmpthese \-2 => \{ grep\_eq\_expr => '$​::a1 = grep $\_ == 50\, @​​::array'\, grep\_eq\_blck => '$​::a2 = grep \{$\_ == 50\} @​​::array'\, \}; \_\_END\_\_ Rate grep\_eq\_blck grep\_eq\_expr grep\_eq\_blck 28902/s \-\- \-5% grep\_eq\_expr 30282/s 5% \-\-
If\, however\, the expression is a regexp\, things change drastically:
use Benchmark qw /cmpthese/; our @​array = 101 \.\. 200; cmpthese \-2 => \{ grep\_re\_expr => '$​::b1 = grep /^50$/\, @​​::array'\, grep\_re\_blck => '$​::b2 = grep \{/^50$/\} @​​::array'\, \}; \_\_END\_\_ Rate grep\_re\_blck grep\_re\_expr grep\_re\_blck 6455/s \-\- \-69% grep\_re\_expr 20848/s 223% \-\-
This difference in speed surprises me.
[...]
Just for the record: I can confirm this. The problem is still in blead (tested on #23466).
Tassilo -- $_=q#"\,}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({ pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#; $_=reverse\,s+(?\<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
The RT System itself - Status changed from 'new' to 'open'
I've tried this with 5.6.1\, 5.8.3 and 5.8.5\, all with the same result. I modified the program several ways. It appears to be that the regexp is being re-parsed for every invocation of the block. I have no idea why\, but I see\, for example\, that pre-parsing the regexp results in slightly better performance. That is:
Sorry\, I submitted before I was done. My final version of the test:
use Benchmark qw /cmpthese/;
our @array = 101 .. 20000; our $b1; our $b2;
cmpthese -2 => { grep_re_expr => '$::b1 = grep /50/\, @::array'\, grep_re_blck => '$::b1 = grep {/50/} @::array'\, grep_re_subr => '$::b2 = grep {is50($_)} @::array'\, };
die unless $b1 == $b2;
sub is50 { return ($_[0] =~ /50/); }
I also tried:
our $p = qr{^50$};
and then something like:
grep_re_blck => '$::b1 = grep {/$p/} @::array'\,
which was very slightly faster (relative to the expr version with the pre-compiled regexp)\, so this makes me suspect repeated re-parsing of the regexp.
On Fri Dec 03 11:22:39 2004\, ajs wrote:
Sorry\, I submitted before I was done. My final version of the test:
use Benchmark qw /cmpthese/; our @​array = 101 \.\. 20000; our $b1; our $b2; cmpthese \-2 => \{ grep\_re\_expr => '$​::b1 = grep /50/\, @​​::array'\, grep\_re\_blck => '$​::b1 = grep \{/50/\} @​​::array'\, grep\_re\_subr => '$​::b2 = grep \{is50\($\_\)\} @​​::array'\, \}; die unless $b1 == $b2;
sub is50 { return ($_[0] =~ /50/); }
I also tried:
our $p = qr{^50$};
and then something like:
grep_re_blck => '$::b1 = grep {/$p/} @::array'\,
which was very slightly faster (relative to the expr version with the pre-compiled regexp)\, so this makes me suspect repeated re-parsing of the regexp.
This is still present in blead; But as far as I can tell\, the slowdown is because the block version needs to create and enter/leave a scope\, while the expression version has no such penalty; So unless I'm wrong\, I think this has to be marked as WontFix.
On Sun Apr 29 18:49:32 2012\, Hugmeir wrote:
On Fri Dec 03 11:22:39 2004\, ajs wrote:
Sorry\, I submitted before I was done. My final version of the test:
use Benchmark qw /cmpthese/; our @​array = 101 \.\. 20000; our $b1; our $b2; cmpthese \-2 => \{ grep\_re\_expr => '$​::b1 = grep /50/\, @​​::array'\, grep\_re\_blck => '$​::b1 = grep \{/50/\} @​​::array'\, grep\_re\_subr => '$​::b2 = grep \{is50\($\_\)\} @​​::array'\, \}; die unless $b1 == $b2;
sub is50 { return ($_[0] =~ /50/); }
I also tried:
our $p = qr{^50$};
and then something like:
grep_re_blck => '$::b1 = grep {/$p/} @::array'\,
which was very slightly faster (relative to the expr version with the pre-compiled regexp)\, so this makes me suspect repeated re-parsing of the regexp.
This is still present in blead; But as far as I can tell\, the slowdown is because the block version needs to create and enter/leave a scope\, while the expression version has no such penalty; So unless I'm wrong\, I think this has to be marked as WontFix.
But the enter/leave shouldn’t make this much difference:
The first script\, which uses $_==50:
$ pbpaste|perl5.15.9 Rate grep_eq_blck grep_eq_expr grep_eq_blck 42400/s -- -4% grep_eq_expr 44245/s 4% --
The second script\, which uses /^50$/:
$ pbpaste|perl5.15.9 Rate grep_re_blck grep_re_expr grep_re_blck 9481/s -- -69% grep_re_expr 30488/s 222% --
--
Father Chrysostomos
Some say that
grep { /PATTERN } @x
is better style than
grep /PATTERN/\, @x
(Perl Best Practices and perlcritic BuiltinFunctions::RequireBlockGrep)
To keep these people happy it would be good to put in an optimization for the former code to keep it just as fast as the latter.
If speeding it up is a WONTFIX\, then perhaps perlcritic's rules need to change.
-- Ed Avis \eda@​waniasset\.com
On Tue\, May 01\, 2012 at 10:48:53AM +0000\, Ed Avis wrote:
Some say that
grep \{ /PATTERN \} @​x
is better style than
grep /PATTERN/\, @​x
(Perl Best Practices and perlcritic BuiltinFunctions::RequireBlockGrep)
To keep these people happy it would be good to put in an optimization for the former code to keep it just as fast as the latter.
If speeding it up is a WONTFIX\, then perhaps perlcritic's rules need to change.
It is a safe general-case transformation? Is there any behaviour one can put into a pattern that would notice the lack of scope exit after each run? ie the difference at steps 8\, 9 and b in the second case:
$ perl -MO=Concise\,-exec -e 'grep /PATTERN/\, @x' 1 \<0> enter 2 \<;> nextstate(main 1 -e:1) v:{ 3 \<0> pushmark s 4 \<$> gv(*x) s 5 \<1> rv2av[t1] lKM/1 6 \<@> grepstart K 7 \<|> grepwhile(other->8)[t2] vK 8 \</> match(/"PATTERN"/) s/RTIME goto 7 9 \<@> leave[1 ref] vKP/REFC -e syntax OK
$ perl -MO=Concise\,-exec -e 'grep {/PATTERN/} @x' 1 \<0> enter 2 \<;> nextstate(main 2 -e:1) v:{ 3 \<0> pushmark s 4 \<$> gv(*x) s 5 \<1> rv2av[t1] lKM/1 6 \<@> grepstart K* 7 \<|> grepwhile(other->8)[t2] vK 8 \<0> enter s 9 \<;> nextstate(main 1 -e:1) v:{ a \</> match(/"PATTERN"/) s/RTIME b \<@> leave sKP goto 7 c \<@> leave[1 ref] vKP/REFC -e syntax OK
The idea of an optimisation seems nice. But I don't know if it's trivially safe. Or whether it would need to check the pattern for complexity\, and only work on "simple" things.
Nicholas Clark
On Tue\, May 01\, 2012 at 10:48:53AM +0000\, Ed Avis wrote:
Some say that
grep \{ /PATTERN \} @​x
is better style than
grep /PATTERN/\, @​x
(Perl Best Practices and perlcritic BuiltinFunctions::RequireBlockGrep)
To keep these people happy it would be good to put in an optimization for the former code to keep it just as fast as the latter.
If speeding it up is a WONTFIX\, then perhaps perlcritic's rules need to change.
It definitely isn't yet a WONTFIX: there's something specifically about using a regex (as opposed to just == say) that makes it *much* more slower in a block than would be expected.
Until the reason is diagnosed\, we can't really decide what to do about it (but I'm not intending to look at it just yet).
-- Modern art: "That's easy\, I could have done that!" "Ah\, but you didn't!"
On Tue\, May 01\, 2012 at 04:14:12AM -0700\, Dave Mitchell via RT wrote:
On Tue\, May 01\, 2012 at 10:48:53AM +0000\, Ed Avis wrote:
Some say that
grep \{ /PATTERN \} @​x
is better style than
grep /PATTERN/\, @​x
(Perl Best Practices and perlcritic BuiltinFunctions::RequireBlockGrep)
To keep these people happy it would be good to put in an optimization for the former code to keep it just as fast as the latter.
If speeding it up is a WONTFIX\, then perhaps perlcritic's rules need to change.
It definitely isn't yet a WONTFIX: there's something specifically about using a regex (as opposed to just == say) that makes it *much* more slower in a block than would be expected.
Note that the slowdown isn't a "regexp in a block". It's a "regexp in a grep block". As the original bug report shows\, the difference between for-expr and for-block are much less (with 5.15.9\, the difference between for-expr and for-block is less than 10% on my machine).
With map\, I get an inbetween value (using 5.15.9):
use Benchmark qw /cmpthese/;
our @array = 1 .. 200;
cmpthese -2 => { map_re_blck => '$::d1 = map {/^50$/ ? $_ : ()} @::array'\, map_re_expr => '$::d2 = map /^50$/ ? $_ : ()\, @::array'\, };
__END__ Rate map_re_blck map_re_expr map_re_blck 8756/s -- -39% map_re_expr 14422/s 65% --
Also\, if the matches are failing\, the difference between grep-block and grep-expr is bigger (using 5.15.9):
use Benchmark qw /cmpthese/;
our @array1 = 5000 .. 5099; our @array2 = 4000 .. 4099;
cmpthese -2 => { grep_re_blck1 => '$::d1 = grep {/^50/} @::array1'\, grep_re_expr1 => '$::d2 = grep /^50/\, @::array1'\, grep_re_blck2 => '$::d1 = grep {/^50/} @::array2'\, grep_re_expr2 => '$::d2 = grep /^50/\, @::array2'\, };
__END__ Rate grep_re_blck1 grep_re_blck2 grep_re_expr1 grep_re_expr2 grep_re_blck1 5879/s -- -32% -51% -81% grep_re_blck2 8635/s 47% -- -28% -72% grep_re_expr1 12041/s 105% 39% -- -61% grep_re_expr2 30629/s 421% 255% 154% --
Until the reason is diagnosed\, we can't really decide what to do about it (but I'm not intending to look at it just yet).
The bug is quite old (I've known about it for longer than the bug report is old -- it is as least as old as 5.005)\, so there's no urgency. OTOH\, "grep {/PATTERN/}" is a often used idiom.
Abigail
As @nwc observed, with $::b1 = grep /^50$/, @::array
, the interesting part of the op tree seems to be just:
l <|> grepwhile(other->m)[t12] sK ->n
...
m </> match(/"^50$"/) s ->l
With $::b1 = grep {/^50$/} @::array
, a lot more goes on in each iteration:
l <|> grepwhile(other->m)[t12] sK ->q
...
p <@> leave sKP ->l
m <0> enter s ->n
n <;> nextstate(main 6 -e:1) v:{ ->o
o </> match(/"^50$"/) s ->p
Using the wrapper of our @array = 101 .. 200; for (1..1_000_000) { $::b1 = grep EXPR-or-BLOCK @::array }
, perf record
and perf report
on blead show >1% of run time contributors as the following for the first form:
20.66% miniperl miniperl [.] Perl_pp_grepwhile
18.93% miniperl miniperl [.] Perl_pp_match
18.48% miniperl miniperl [.] Perl_re_intuit_start
18.21% miniperl miniperl [.] Perl_leave_scope
8.71% miniperl miniperl [.] Perl_regexec_flags
4.93% miniperl miniperl [.] Perl_save_vptr
3.36% miniperl miniperl [.] Perl_sv_2pv_flags
3.27% miniperl miniperl [.] Perl_pop_scope
and this for the second:
12.70% miniperl miniperl [.] Perl_pp_match
12.26% miniperl miniperl [.] Perl_sv_setsv_flags
11.22% miniperl miniperl [.] Perl_pp_grepwhile
10.96% miniperl miniperl [.] Perl_sv_upgrade
8.39% miniperl miniperl [.] Perl_re_intuit_start
7.46% miniperl miniperl [.] Perl_leave_scope
5.27% miniperl miniperl [.] Perl_pp_enter
5.15% miniperl miniperl [.] Perl_leave_adjust_stacks
5.14% miniperl miniperl [.] Perl_pp_leave
4.85% miniperl miniperl [.] Perl_sv_clear
3.54% miniperl miniperl [.] Perl_regexec_flags
3.23% miniperl miniperl [.] Perl_sv_free2
2.98% miniperl miniperl [.] Perl_pp_nextstate
1.53% miniperl miniperl [.] Perl_save_vptr
1.39% miniperl miniperl [.] Perl_sv_2pv_flags
1.08% miniperl miniperl [.] Perl_pop_scope
A good chunk of the extra runtime seems to be due to Perl_leave_adjust_stacks
creating, upgrading and assigning to a new mortal SV on every iteration and something (maybe the additional nextstate
) clearing and freeing that SV.
Neither of $::b2 = grep {/^50$/} @::array
or $::b1 = grep $_ == 50, @::array
do this. The relevant bits of the op tree are basically identical:
l <|> grepwhile(other->m)[t13] sK ->p
...
o <2> eq sK/2 ->l
- <1> ex-rv2sv sK/1 ->n
m <#> gvsv[*_] s ->n
n <$> const[IV 50] s ->o
vs:
l <|> grepwhile(other->m)[t13] sK ->p
...
o <2> eq sK/2 ->l
- <1> ex-rv2sv sK/1 ->n
m <#> gvsv[*_] s ->n
n <$> const[IV 50] s ->o
A good chunk of the extra runtime seems to be due to
Perl_leave_adjust_stacks
creating, upgrading and assigning to a new mortal SV on every iteration and something (maybe the additionalnextstate
) clearing and freeing that SV.
The SV for which the mortal copy is being made is &PL_sv_no
. It seems like it shouldn't be necessary to make a mortal copy of this?
Using Abigail's benchmark against blead, things look tighter than they used to be:
Rate grep_re_blck1 grep_re_blck2 grep_re_expr1 grep_re_expr2
grep_re_blck1 118606/s -- -6% -53% -62%
grep_re_blck2 126029/s 6% -- -50% -59%
grep_re_expr1 250863/s 112% 99% -- -19%
grep_re_expr2 310597/s 162% 146% 24% --
Rate grep_re_blck1 grep_re_blck2 grep_re_expr1 grep_re_expr2
grep_re_blck1 170393/s -- -14% -32% -45%
grep_re_blck2 198189/s 16% -- -21% -36%
grep_re_expr1 252057/s 48% 27% -- -19%
grep_re_expr2 312076/s 83% 57% 24% --
The remaining overhead then seems down to the enter
, leave
, extra nextstate
, and lesser action in Perl_leave_adjust_stacks
.
grep {/^50$/}
gets an ENTER
+LEAVE
pair while grep { $ == 50}
gets a SCOPE
in _Perl_opscope, with the difference being whether or not the LINESEQ
has the PARENS
flag.
1 lineseq LISTOP(0x55c8a0130718) ===> [0x0]
PARENT ===> [0x0]
FLAGS = (UNKNOWN,KIDS,PARENS,SLABBED)
|
2 +--nextstate COP(0x55c8a01300d0) ===> [SELF]
| FLAGS = (VOID,SLABBED,MORESIB)
| LINE = 1
| PACKAGE = "main"
| HINTS = 00000100
| SEQ = 3
|
3 +--match PMOP(0x55c8a0130758) ===> [0x0]
FLAGS = (UNKNOWN,SLABBED)
PMf_PRE /^50$/
PMFLAGS = ()
vs
1 lineseq LISTOP(0x561034011630) ===> [0x0]
PARENT ===> [0x0]
FLAGS = (UNKNOWN,KIDS,SLABBED)
|
2 +--nextstate COP(0x561034011670) ===> [SELF]
| FLAGS = (VOID,SLABBED,MORESIB)
| LINE = 1
| PACKAGE = "main"
| SEQ = 3
|
3 +--eq BINOP(0x5610340116d0) ===> 4 [gv 0x5610340110f8]
FLAGS = (SCALAR,KIDS,SLABBED)
PRIVATE = (0x2)
|
5 +--rv2sv UNOP(0x561034011748) ===> 6 [const 0x561034011710]
| FLAGS = (SCALAR,KIDS,SLABBED,MORESIB)
| PRIVATE = (0x1)
| |
4 | +--gv SVOP(0x5610340110f8) ===> 5 [rv2sv 0x561034011748]
| FLAGS = (SCALAR,SLABBED)
| GV = main::_ (0x561033fe6738)
|
6 +--const SVOP(0x561034011710) ===> 3 [eq 0x5610340116d0]
FLAGS = (SCALAR,SLABBED)
SV = IV(50)
I don't know why the difference in the LINESEQs...
I don't know why the difference in the LINESEQs...
OPf_PARENS
is applied to grep {/^50$/}
by _Svoidnonfinal because (PL_hints & HINT_BLOCK_SCOPE)
is true, whereas it is not true for grep { $_ == 0}
.
(Updated from original comment which suggested that it was set by the parser.)
The flag seems to be set by Perl_pmruntime
in the following snippet, but the rationale for doing so is not outlined in any comments:
PL_hints |= HINT_BLOCK_SCOPE;
pm = cPMOPo;
assert(floor==0 || (pm->op_pmflags & PMf_HAS_CV));
if (is_compiletime) {
Commenting out that line does allow miniperl to produce a simpler OP tree for grep {^50$/}
:
+--scope LISTOP(0x556104efd018) ===> 16 [null 0x556104efcfb0]
FLAGS = (SCALAR,KIDS,SLABBED)
|
+--null (ex-nextstate) COP(0x556104efc9d0) ===> 12 [match 0x556104efd058]
| FLAGS = (VOID,SLABBED,MORESIB)
| LINE = 1
| PACKAGE = "main"
| SEQ = 4294967248
|
+--match PMOP(0x556104efd058) ===> 11 [grepwhile 0x556104efcec8]
FLAGS = (SCALAR,SLABBED)
PMf_PRE /^50$/
PMFLAGS = ()
but perl then won't build:
$ make perl
./miniperl -Ilib make_patchnum.pl
Updating 'git_version.h' and 'lib/Config_git.pl'
./miniperl -Ilib configpm
written lib/Config.pod
updated lib/Config.pm
updated lib/Config_heavy.pl
./miniperl -Ilib make_ext.pl ext/ExtUtils-Miniperl/pm_to_blib MAKE="make" LIBPERL_A=libperl.a
Running pm_to_blib for ext/ExtUtils-Miniperl directly
miniperl: sv.c:3757: S_glob_assign_glob: Assertion `isGV_with_GP(_gvgp)' failed.
Aborted
make: *** [makefile:603: ext/ExtUtils-Miniperl/pm_to_blib] Error 134
Maybe someone more familiar with the regex engine would be able to figure out if there are circumstances in which Perl_pmruntime
does not need to set HINT_BLOCK_SCOPE
?
Migrated from rt.perl.org#32331 (status was 'open')
Searchable as RT32331$