Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.93k stars 552 forks source link

Slowdown in m//g on COW strings of certain lengths #15266

Open p5pRT opened 8 years ago

p5pRT commented 8 years ago

Migrated from rt.perl.org#127855 (status was 'open')

Searchable as RT127855$

p5pRT commented 8 years ago

From sschoeling@linet-services.de

Created by sschoeling@linet-services.de

The following oneliner creates a large string\, creates a COW view with offset 1 and m//g walks through it​:

  perl -e 'my $f = "0" x (4e6+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256. It's not noticable for strings with 100k bytes or less\, but grows quadratically from there.

Some times from this machine​: 128000​: 0.687 128001​: 0.016

256000​: 3.819 256001​: 0.028

512000​: 15.614 512001​: 0.055

perl versions​: 5.18.2 seems to be good 5.19.11 is the earliest buggy version I have available every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Perl Info ``` Flags: category=core severity=low Site configuration information for perl 5.22.0: Configured by sschoeling at Tue Jun 2 10:19:46 CEST 2015. Summary of my perl5 (revision 5 version 22 subversion 0) configuration: Commit id: 70f63a4c7dba89e8e48b44de7978faae4319e693 Platform: osname=linux, osvers=3.13.0-53-generic, archname=x86_64-linux uname='linux plum-chiew 3.13.0-53-generic #89-ubuntu smp wed may 20 10:34:39 utc 2015 x86_64 x86_64 x86_64 gnulinux ' config_args='' hint=recommended, useposix=true, d_sigaction=define useithreads=undef, usemultiplicity=undef use64bitint=define, use64bitall=define, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.8.2', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.19' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector' @INC for perl 5.22.0: /home/sschoeling/lib/perl5/site_perl/5.22.0/x86_64-linux /home/sschoeling/lib/perl5/site_perl/5.22.0 /home/sschoeling/lib/perl5/5.22.0/x86_64-linux /home/sschoeling/lib/perl5/5.22.0 /home/sschoeling/lib/perl5/site_perl/5.18.0 /home/sschoeling/lib/perl5/site_perl . Environment for perl 5.22.0: HOME=/home/sschoeling LANG=en_US.UTF-8 LANGUAGE=en_US LC_ADDRESS=de_DE.UTF-8 LC_IDENTIFICATION=de_DE.UTF-8 LC_MEASUREMENT=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8 LC_NAME=de_DE.UTF-8 LC_NUMERIC=de_DE.UTF-8 LC_PAPER=de_DE.UTF-8 LC_TELEPHONE=de_DE.UTF-8 LC_TIME=de_DE.UTF-8 LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/opt/sschoeling/perlbrew/bin:/home/sschoeling/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games PERLBREW_BASHRC_VERSION=0.66 PERLBREW_HOME=/home/sschoeling/.perlbrew PERLBREW_ROOT=/opt/sschoeling/perlbrew PERL_BADLANG (unset) SHELL=/bin/bash ```
p5pRT commented 8 years ago

From @jkeenan

On Thu Apr 07 12​:20​:45 2016\, sschoeling@​linet-services.de wrote​:

This is a bug report for perl from sschoeling@​linet-services.de\, generated with the help of perlbug 1.40 running under perl 5.22.0.

----------------------------------------------------------------- [Please describe your issue here]

The following oneliner creates a large string\, creates a COW view with offset 1 and m//g walks through it​:

perl -e 'my $f = "0" x (4e6+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256. It's not noticable for strings with 100k bytes or less\, but grows quadratically from there.

Some times from this machine​: 128000​: 0.687 128001​: 0.016

256000​: 3.819 256001​: 0.028

512000​: 15.614 512001​: 0.055

perl versions​: 5.18.2 seems to be good 5.19.11 is the earliest buggy version I have available every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Confirmed on blead.

##### $ ./perl -v | head -2 | tail -1 This is perl 5\, version 23\, subversion 10 (v5.23.10 (v5.23.9-56-g87c118b)) built for x86_64-linux

[perl] 6 $ time ./perl -e 'my $f = "0" x (4e6+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.302s user 0m0.298s sys 0m0.004s [perl] 7 $ time ./perl -e 'my $f = "0" x (4e6); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m14.689s user 0m14.688s sys 0m0.008s [perl] 8 $ time ./perl -e 'my $f = "0" x (4e6-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.294s user 0m0.290s sys 0m0.004s [perl] 9 $ time ./perl -e 'my $f = "0" x (4e6+256+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.293s user 0m0.290s sys 0m0.004s [perl] 10 $ time ./perl -e 'my $f = "0" x (4e6+256-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.301s user 0m0.297s sys 0m0.004s [perl] 11 $ time ./perl -e 'my $f = "0" x (4e6+256); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m7.211s user 0m7.194s sys 0m0.016s [perl] 12 $ time ./perl -e 'my $f = "0" x (4e6-256-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.287s user 0m0.283s sys 0m0.004s [perl] 13 $ time ./perl -e 'my $f = "0" x (4e6-256+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.294s user 0m0.287s sys 0m0.008s [perl] 14 $ time ./perl -e 'my $f = "0" x (4e6-256); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m6.437s user 0m6.432s sys 0m0.008s

#####

-- James E Keenan (jkeenan@​cpan.org)

p5pRT commented 8 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 8 years ago

From @demerphq

On 8 April 2016 at 12​:41\, James E Keenan via RT \perlbug\-followup@​perl\.org wrote​:

On Thu Apr 07 12​:20​:45 2016\, sschoeling@​linet-services.de wrote​:

This is a bug report for perl from sschoeling@​linet-services.de\, generated with the help of perlbug 1.40 running under perl 5.22.0.

----------------------------------------------------------------- [Please describe your issue here]

The following oneliner creates a large string\, creates a COW view with offset 1 and m//g walks through it​:

perl -e 'my $f = "0" x (4e6+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256. It's not noticable for strings with 100k bytes or less\, but grows quadratically from there.

Some times from this machine​: 128000​: 0.687 128001​: 0.016

256000​: 3.819 256001​: 0.028

512000​: 15.614 512001​: 0.055

perl versions​: 5.18.2 seems to be good 5.19.11 is the earliest buggy version I have available every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Confirmed on blead.

##### $ ./perl -v | head -2 | tail -1 This is perl 5\, version 23\, subversion 10 (v5.23.10 (v5.23.9-56-g87c118b)) built for x86_64-linux

[perl] 6 $ time ./perl -e 'my $f = "0" x (4e6+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.302s user 0m0.298s sys 0m0.004s [perl] 7 $ time ./perl -e 'my $f = "0" x (4e6); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m14.689s user 0m14.688s sys 0m0.008s [perl] 8 $ time ./perl -e 'my $f = "0" x (4e6-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.294s user 0m0.290s sys 0m0.004s [perl] 9 $ time ./perl -e 'my $f = "0" x (4e6+256+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.293s user 0m0.290s sys 0m0.004s [perl] 10 $ time ./perl -e 'my $f = "0" x (4e6+256-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.301s user 0m0.297s sys 0m0.004s [perl] 11 $ time ./perl -e 'my $f = "0" x (4e6+256); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m7.211s user 0m7.194s sys 0m0.016s [perl] 12 $ time ./perl -e 'my $f = "0" x (4e6-256-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.287s user 0m0.283s sys 0m0.004s [perl] 13 $ time ./perl -e 'my $f = "0" x (4e6-256+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.294s user 0m0.287s sys 0m0.008s [perl] 14 $ time ./perl -e 'my $f = "0" x (4e6-256); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m6.437s user 0m6.432s sys 0m0.008s

#####

I bet this is related to making $& work properly\, and possibly how temps are freed.

My bet is in the slow cases we are copying the string every character.

We used to not copy the subject of a //g in scalar context to make things like this fast. When COW was introduced we switched to using it to using COW to copy.

This fixed bugs like this​:

$foo="bar"; $ok= $foo=~/(...)/g; print $1; # prints "bar" $foo="bah"; print $1; # prints "bah"

My initial thought is somewhere this is going wrong. That it happens on strings whose length is a multiple of 256 suggests it even more. We can only have 255 references to a string before it has to be copied\, AND we store the refcount in the *last* char in the string.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 8 years ago

From @demerphq

On 8 April 2016 at 12​:53\, demerphq \demerphq@​gmail\.com wrote​:

On 8 April 2016 at 12​:41\, James E Keenan via RT \perlbug\-followup@​perl\.org wrote​:

On Thu Apr 07 12​:20​:45 2016\, sschoeling@​linet-services.de wrote​:

This is a bug report for perl from sschoeling@​linet-services.de\, generated with the help of perlbug 1.40 running under perl 5.22.0.

----------------------------------------------------------------- [Please describe your issue here]

The following oneliner creates a large string\, creates a COW view with offset 1 and m//g walks through it​:

perl -e 'my $f = "0" x (4e6+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256. It's not noticable for strings with 100k bytes or less\, but grows quadratically from there.

Some times from this machine​: 128000​: 0.687 128001​: 0.016

256000​: 3.819 256001​: 0.028

512000​: 15.614 512001​: 0.055

perl versions​: 5.18.2 seems to be good 5.19.11 is the earliest buggy version I have available every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Confirmed on blead.

##### $ ./perl -v | head -2 | tail -1 This is perl 5\, version 23\, subversion 10 (v5.23.10 (v5.23.9-56-g87c118b)) built for x86_64-linux

[perl] 6 $ time ./perl -e 'my $f = "0" x (4e6+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.302s user 0m0.298s sys 0m0.004s [perl] 7 $ time ./perl -e 'my $f = "0" x (4e6); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m14.689s user 0m14.688s sys 0m0.008s [perl] 8 $ time ./perl -e 'my $f = "0" x (4e6-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.294s user 0m0.290s sys 0m0.004s [perl] 9 $ time ./perl -e 'my $f = "0" x (4e6+256+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.293s user 0m0.290s sys 0m0.004s [perl] 10 $ time ./perl -e 'my $f = "0" x (4e6+256-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.301s user 0m0.297s sys 0m0.004s [perl] 11 $ time ./perl -e 'my $f = "0" x (4e6+256); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m7.211s user 0m7.194s sys 0m0.016s [perl] 12 $ time ./perl -e 'my $f = "0" x (4e6-256-1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.287s user 0m0.283s sys 0m0.004s [perl] 13 $ time ./perl -e 'my $f = "0" x (4e6-256+1); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.294s user 0m0.287s sys 0m0.008s [perl] 14 $ time ./perl -e 'my $f = "0" x (4e6-256); $g = substr($f\, 1); 1 while $g =~ m/0/g;' ^C

real 0m6.437s user 0m6.432s sys 0m0.008s

#####

I bet this is related to making $& work properly\, and possibly how temps are freed.

My bet is in the slow cases we are copying the string every character.

We used to not copy the subject of a //g in scalar context to make things like this fast. When COW was introduced we switched to using it to using COW to copy.

This fixed bugs like this​:

$foo="bar"; $ok= $foo=~/(...)/g; print $1; # prints "bar" $foo="bah"; print $1; # prints "bah"

My initial thought is somewhere this is going wrong. That it happens on strings whose length is a multiple of 256 suggests it even more. We can only have 255 references to a string before it has to be copied\, AND we store the refcount in the *last* char in the string.

We are copying the subject string every 0 in the string. I dont know why yet. Possibly because substr doesnt overallocate.

norole​:yorton@​Styx​:blead​:/git_tree/perl$ time ./perl -e 'my $f = ("0" x (100)) . ("1" x (4e6 -100)); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.075s user 0m0.063s sys 0m0.012s norole​:yorton@​Styx​:blead​:/git_tree/perl$ time ./perl -e 'my $n=1000; my $f = ("0" x ($n)) . ("1" x (4e6 - $n)); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m0.579s user 0m0.570s sys 0m0.012s norole​:yorton@​Styx​:blead​:/git_tree/perl$ time ./perl -e 'my $n=10000; my $f = ("0" x ($n)) . ("1" x (4e6 - $n)); $g = substr($f\, 1); 1 while $g =~ m/0/g;'

real 0m5.647s user 0m5.638s sys 0m0.008s

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 8 years ago

From @demerphq

On 8 April 2016 at 13​:06\, demerphq \demerphq@​gmail\.com wrote​:

We are copying the subject string every 0 in the string. I dont know why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte (for the null)\, and not 2 bytes which could hold the refcount.

I have pushed three patches for this\, the last changes SvGROW() to overallocate by 2 always under COW. IMO this should fix a lot of things\, and theory should not break anything. If it turns out practice and theory differ we can revert bcc9f606 and go back to the more specific fix. Assuming we dont revert in a future commit we can probably cleannup a lot of calls to SvGROW\, which almost always add 1 or 2 to a base value\, which would not longer be needed.

commit bcc9f606509ad2fad50e16f081103451b7dc49e1 Author​: Yves Orton \demerphq@​gmail\.com Date​: Fri Apr 8 21​:25​:20 2016 +0200

  More generalized fix for #127855\, overallocate in SvGROW and not just sv_grow()

  If we overallocate in SvGROW() and not just sv_grow() we can ensure   we have more cases where we can COW. We need to ensure we always   have room for a null and a reference count.

commit e19cb11142087974d956f263d24e146b968025d5 Author​: Yves Orton \demerphq@​gmail\.com Date​: Fri Apr 8 20​:46​:43 2016 +0200

  fix #127855\, in Perl_sv_setpvn() we have to overallocate to enable COW

  We need to overallocate by 2 to do COW strings. One for the null\,   one for the refcount.

commit d14d5855c45a17996ec1e16e2f3cfa8ecef24809 Author​: Yves Orton \demerphq@​gmail\.com Date​: Fri Apr 8 21​:17​:34 2016 +0200

  test for #127855 - Slowdown in m//g on COW strings of certain lengths

Yves

p5pRT commented 8 years ago

From @jkeenan

On Fri Apr 08 12​:43​:02 2016\, demerphq wrote​:

On 8 April 2016 at 13​:06\, demerphq \demerphq@​gmail\.com wrote​:

We are copying the subject string every 0 in the string. I dont know why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte (for the null)\, and not 2 bytes which could hold the refcount.

I have pushed three patches for this\, the last changes SvGROW() to overallocate by 2 always under COW. IMO this should fix a lot of things\, and theory should not break anything. If it turns out practice and theory differ we can revert bcc9f606 and go back to the more specific fix. Assuming we dont revert in a future commit we can probably cleannup a lot of calls to SvGROW\, which almost always add 1 or 2 to a base value\, which would not longer be needed.

commit bcc9f606509ad2fad50e16f081103451b7dc49e1 Author​: Yves Orton \demerphq@​gmail\.com Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

commit e19cb11142087974d956f263d24e146b968025d5 Author​: Yves Orton \demerphq@​gmail\.com Date​: Fri Apr 8 20​:46​:43 2016 +0200

fix \#127855\, in Perl\_sv\_setpvn\(\) we have to overallocate to enable COW

We need to overallocate by 2 to do COW strings\. One for the null\,
one for the refcount\.

commit d14d5855c45a17996ec1e16e2f3cfa8ecef24809 Author​: Yves Orton \demerphq@​gmail\.com Date​: Fri Apr 8 21​:17​:34 2016 +0200

test for \#127855 \- Slowdown in m//g on COW strings of certain lengths

Yves

These patches worked for me on both Linux/x86_64 and older Darwin/PPC. -- James E Keenan (jkeenan@​cpan.org)

p5pRT commented 8 years ago

From @iabyn

On Fri\, Apr 08\, 2016 at 09​:42​:22PM +0200\, demerphq wrote​:

commit d14d5855c45a17996ec1e16e2f3cfa8ecef24809 Author​: Yves Orton \demerphq@​gmail\.com Date​: Fri Apr 8 21​:17​:34 2016 +0200

test for \#127855 \- Slowdown in m//g on COW strings of certain lengths

I took the libery of pushing a couple of updates to that​:

commit 31f8924d9e997070af9ca909e3a78848c4d1972b Author​: David Mitchell \davem@​iabyn\.com AuthorDate​: Sat Apr 9 13​:17​:21 2016 +0100 Commit​: David Mitchell \davem@​iabyn\.com CommitDate​: Sat Apr 9 13​:37​:56 2016 +0100

  new perf test in pat.t​: avoid timing failure  
  A new performance test in re/pat.t added by v5.23.9-58-gd14d585 could   occasionally fails to a timing issue. It was checking that the the   test took less than 1 sec to run; but since the clock usually has a   granularity of 1 sec\, it could fail if the test ran over a tick boundary.  
  Change the condition to \<= 1 sec\, and increase the time the test takes on   a bad perl - it was taking 4sec on my system; it now takes about 14sec\,   so there's less chance of a bad perl passing.

M t/re/pat.t

commit 68b940afc96546256736bc5d8185075ddc12b205 Author​: David Mitchell \davem@&#8203;iabyn\.com AuthorDate​: Sat Apr 9 16​:43​:20 2016 +0100 Commit​: David Mitchell \davem@&#8203;iabyn\.com CommitDate​: Sat Apr 9 16​:43​:20 2016 +0100

  move perf test from re/pat.t to re/speed.t  
  These days we generally put re tests which rely on timing in a separate   file\, t/re/speed.t; move a recently-added test there.

M t/re/pat.t M t/re/speed.t

-- Indomitable in retreat\, invincible in advance\, insufferable in victory   -- Churchill on Montgomery

p5pRT commented 8 years ago

From @demerphq

On 9 April 2016 at 17​:50\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Fri\, Apr 08\, 2016 at 09​:42​:22PM +0200\, demerphq wrote​:

commit d14d5855c45a17996ec1e16e2f3cfa8ecef24809 Author​: Yves Orton \demerphq@&#8203;gmail\.com Date​: Fri Apr 8 21​:17​:34 2016 +0200

test for \#127855 \- Slowdown in m//g on COW strings of certain lengths

I took the libery of pushing a couple of updates to that​:

Sounds good. Sorry I was not aware of/failed to notice re/speed.t

commit 31f8924d9e997070af9ca909e3a78848c4d1972b Author​: David Mitchell \davem@&#8203;iabyn\.com AuthorDate​: Sat Apr 9 13​:17​:21 2016 +0100 Commit​: David Mitchell \davem@&#8203;iabyn\.com CommitDate​: Sat Apr 9 13​:37​:56 2016 +0100

new perf test in pat\.t&#8203;: avoid timing failure

A new performance test in re/pat\.t added by v5\.23\.9\-58\-gd14d585 could
occasionally fails to a timing issue\. It was checking that the the
test took less than 1 sec to run; but since the clock usually has a
granularity of 1 sec\, it could fail if the test ran over a tick boundary\.

Change the condition to \<= 1 sec\, and increase the time the test takes on
a bad perl \- it was taking 4sec on my system; it now takes about 14sec\,
so there's less chance of a bad perl passing\.

M t/re/pat.t

commit 68b940afc96546256736bc5d8185075ddc12b205 Author​: David Mitchell \davem@&#8203;iabyn\.com AuthorDate​: Sat Apr 9 16​:43​:20 2016 +0100 Commit​: David Mitchell \davem@&#8203;iabyn\.com CommitDate​: Sat Apr 9 16​:43​:20 2016 +0100

move perf test from re/pat\.t to re/speed\.t

These days we generally put re tests which rely on timing in a separate
file\, t/re/speed\.t; move a recently\-added test there\.

M t/re/pat.t M t/re/speed.t

-- Indomitable in retreat\, invincible in advance\, insufferable in victory -- Churchill on Montgomery

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 8 years ago

From @iabyn

On Fri\, Apr 08\, 2016 at 09​:42​:22PM +0200\, demerphq wrote​:

On 8 April 2016 at 13​:06\, demerphq \demerphq@&#8203;gmail\.com wrote​:

We are copying the subject string every 0 in the string. I dont know why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte (for the null)\, and not 2 bytes which could hold the refcount.

I have pushed three patches for this\, the last changes SvGROW() to overallocate by 2 always under COW. IMO this should fix a lot of things\, and theory should not break anything. If it turns out practice and theory differ we can revert bcc9f606 and go back to the more specific fix. Assuming we dont revert in a future commit we can probably cleannup a lot of calls to SvGROW\, which almost always add 1 or 2 to a base value\, which would not longer be needed.

commit bcc9f606509ad2fad50e16f081103451b7dc49e1 Author​: Yves Orton \demerphq@&#8203;gmail\.com Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in   [perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv\,len) include a calculation involving (len+2)\, the calculation can wrap for large len's\, resulting in a small or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit\, and readdress the issue more thoroughly afterwards. I already have SV growing on my Big List of Things to Look At Properly Something.

-- "I do not resent criticism\, even when\, for the sake of emphasis\, it parts for the time with reality".   -- Winston Churchill\, House of Commons\, 22nd Jan 1941.

p5pRT commented 8 years ago

From @demerphq

On 28 April 2016 at 11​:20\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Fri\, Apr 08\, 2016 at 09​:42​:22PM +0200\, demerphq wrote​:

On 8 April 2016 at 13​:06\, demerphq \demerphq@&#8203;gmail\.com wrote​:

We are copying the subject string every 0 in the string. I dont know why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte (for the null)\, and not 2 bytes which could hold the refcount.

I have pushed three patches for this\, the last changes SvGROW() to overallocate by 2 always under COW. IMO this should fix a lot of things\, and theory should not break anything. If it turns out practice and theory differ we can revert bcc9f606 and go back to the more specific fix. Assuming we dont revert in a future commit we can probably cleannup a lot of calls to SvGROW\, which almost always add 1 or 2 to a base value\, which would not longer be needed.

commit bcc9f606509ad2fad50e16f081103451b7dc49e1 Author​: Yves Orton \demerphq@&#8203;gmail\.com Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in [perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv\,len) include a calculation involving (len+2)\, the calculation can wrap for large len's\, resulting in a small or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit\, and readdress the issue more thoroughly afterwards. I already have SV growing on my Big List of Things to Look At Properly Something.

That will leave the original bug fixed right? Will we have this problem in sv_grow() as well?

Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 8 years ago

From @iabyn

On Thu\, Apr 28\, 2016 at 11​:30​:17AM +0200\, demerphq wrote​:

On 28 April 2016 at 11​:20\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Fri\, Apr 08\, 2016 at 09​:42​:22PM +0200\, demerphq wrote​:

On 8 April 2016 at 13​:06\, demerphq \demerphq@&#8203;gmail\.com wrote​:

We are copying the subject string every 0 in the string. I dont know why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte (for the null)\, and not 2 bytes which could hold the refcount.

I have pushed three patches for this\, the last changes SvGROW() to overallocate by 2 always under COW. IMO this should fix a lot of things\, and theory should not break anything. If it turns out practice and theory differ we can revert bcc9f606 and go back to the more specific fix. Assuming we dont revert in a future commit we can probably cleannup a lot of calls to SvGROW\, which almost always add 1 or 2 to a base value\, which would not longer be needed.

commit bcc9f606509ad2fad50e16f081103451b7dc49e1 Author​: Yves Orton \demerphq@&#8203;gmail\.com Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in [perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv\,len) include a calculation involving (len+2)\, the calculation can wrap for large len's\, resulting in a small or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit\, and readdress the issue more thoroughly afterwards. I already have SV growing on my Big List of Things to Look At Properly Something.

That will leave the original bug fixed right? Will we have this problem in sv_grow() as well?

Turns out that reverting just the one commit introduced a subtle bug with

  my $s = some_string;   $s = "$s";

Where pp_stringify() and sv_copypv() (which it calls) don't check whether the target SV is the same as the src. Normally this doesn't matter\, apart from a wasted copy onto itself. However\, once SvGROW starts over-allocating\, the SV;s buffer gets realloced\, and then the copy is done from a freed buffer. This doesn't seem to occur wiuth both commits present\, since the general over-allocation in SvGROW() means that $s already has plenty of head room when sv_setpvn() calls SvGROW(sv\, len+1) and so doen't get realloced. With just the sv_setpvn() fix\, it *can* get re-alloced.

However\, I now think both commits are wrong. There is already a mechanism in sv_grow() to allocate +1 for COW\, it just wasn't being triggered when SvLEN's lower 8 bits were all zeros\, as explained by the following commit\, which is part of the branch smoke-me/davem/cow1. This branch​:

* reverts your two commits * adds a test for $s = "$s" * adds the commit below.

Rik\, I think our two options at this stage are​:

a) Just revert the two commits for 5.24.0. This means that a slowdown   which appeared in 5.20.0 on /.../g for certain long strings will still   be present in 5.24.0. b) merge my branch\, which fixes that slowdown\, but of course introduces   the risk of further breakage (unlikely as I would of course think that   is).

-- A major Starfleet emergency breaks out near the Enterprise\, but fortunately some other ships in the area are able to deal with it to everyone's satisfaction.   -- Things That Never Happen in "Star Trek" #13

p5pRT commented 8 years ago

From @demerphq

On 2 May 2016 at 16​:26\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Thu\, Apr 28\, 2016 at 11​:30​:17AM +0200\, demerphq wrote​:

On 28 April 2016 at 11​:20\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Fri\, Apr 08\, 2016 at 09​:42​:22PM +0200\, demerphq wrote​:

On 8 April 2016 at 13​:06\, demerphq \demerphq@&#8203;gmail\.com wrote​:

We are copying the subject string every 0 in the string. I dont know why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte (for the null)\, and not 2 bytes which could hold the refcount.

I have pushed three patches for this\, the last changes SvGROW() to overallocate by 2 always under COW. IMO this should fix a lot of things\, and theory should not break anything. If it turns out practice and theory differ we can revert bcc9f606 and go back to the more specific fix. Assuming we dont revert in a future commit we can probably cleannup a lot of calls to SvGROW\, which almost always add 1 or 2 to a base value\, which would not longer be needed.

commit bcc9f606509ad2fad50e16f081103451b7dc49e1 Author​: Yves Orton \demerphq@&#8203;gmail\.com Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in [perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv\,len) include a calculation involving (len+2)\, the calculation can wrap for large len's\, resulting in a small or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit\, and readdress the issue more thoroughly afterwards. I already have SV growing on my Big List of Things to Look At Properly Something.

That will leave the original bug fixed right? Will we have this problem in sv_grow() as well?

Turns out that reverting just the one commit introduced a subtle bug with

my $s = some\_string;
$s = "$s";

Where pp_stringify() and sv_copypv() (which it calls) don't check whether the target SV is the same as the src. Normally this doesn't matter\, apart from a wasted copy onto itself. However\, once SvGROW starts over-allocating\, the SV;s buffer gets realloced\, and then the copy is done from a freed buffer. This doesn't seem to occur wiuth both commits present\, since the general over-allocation in SvGROW() means that $s already has plenty of head room when sv_setpvn() calls SvGROW(sv\, len+1) and so doen't get realloced. With just the sv_setpvn() fix\, it *can* get re-alloced.

However\, I now think both commits are wrong. There is already a mechanism in sv_grow() to allocate +1 for COW\, it just wasn't being triggered when SvLEN's lower 8 bits were all zeros\, as explained by the following commit\, which is part of the branch smoke-me/davem/cow1. This branch​:

* reverts your two commits * adds a test for $s = "$s" * adds the commit below.

While I am fine with whatever you think is best for right now I have some concerns that I really feel do need to be addressed.

Reverting would return us to the following definition​:

# define SvGROW(sv\,len) (SvLEN(sv) \< (len) ? sv_grow(sv\,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies\, which means that all code that uses SvGROW(sv\, wanted_len+1) will produce buffers which are not COWable. I think that is a lot of code.

It seems to me that it is a mistake to expect the caller to keep track of the space required for the COW refcount\, and to keep track of the space required for NULL. What happens for instance if we choose to switch the refcount to a different type\, where the required extra-space might be more than one byte? It seems to me that in a perfect world we should be able to overallocate however much we choose and it shouldn't break things.

It also seems odd that logic that causes us to overallocate more often would produce a segfault. Do you have any insight on that?

Rik\, I think our two options at this stage are​: encounter a) Just revert the two commits for 5.24.0. This means that a slowdown which appeared in 5.20.0 on /.../g for certain long strings will still be present in 5.24.0. b) merge my branch\, which fixes that slowdown\, but of course introduces the risk of further breakage (unlikely as I would of course think that is).

I am fine with either.

BTW\, I think its worth noting the general history of this bug. It is very very old. For a long time /g in scalar context did not copy its subject var\, because we considered this bug more important than the bugs that come from not copying the subject var in a match (such as capture vars being wrong\, or even accessing outside of the string). Later on we introduced COW\, and (as I am sure you remember ;-) fixed the /g in scalar context problem by using COW. But now we are vulnerable to the same historic performance issues with any string that is not COWable.

It really seems to me that when COW was added we modified sv_grow() but failed to update SvGROW() correspondingly\, and that possibly SvGROW() has been serving more than one purpose that just happened to be well served by a single piece of logic\, but that post-COW it needs to be split in two.

cheers\, Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 8 years ago

From @iabyn

On Mon\, May 02\, 2016 at 07​:43​:39PM +0200\, demerphq wrote​:

Reverting would return us to the following definition​:

# define SvGROW(sv\,len) (SvLEN(sv) \< (len) ? sv_grow(sv\,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies\, which means that all code that uses SvGROW(sv\, wanted_len+1) will produce buffers which are not COWable. I think that is a lot of code.

Something creating a new string\, "abc"\0 say\, will call SvGROW(sv\, 3+1) which will call sv_grow()\, which will bump the size to (at least) 3+1+1\, giving space for the COW refcount.

Something extending an existing string will cause sv_grow() to be called (and thus space given for the COW refcount)\, unless there is already spare space in the PVX buffer. Only in the case where the extra string content's length exactly matches the spare space in the PVX will there not be room for a future COW refcount.

So I think this is case is (relatively) rare.

It seems to me that it is a mistake to expect the caller to keep track of the space required for the COW refcount.

We're not (in general). sv_grow does it no behalf of the caller.

\, and to keep track of the space required for NULL.

It's clearly documented since 5.004 that SvGROW's caller has to do the +1\, so I think we're ok there.

It also seems odd that logic that causes us to overallocate more often would produce a segfault. Do you have any insight on that?

It's specifically the 'x' operator\, which is where it easy to grow very large strings. If you do ('a' x ~0) then adding extra to the length when its already max causes the value ti wrap and only a small buffer to be allocated. pp_repeat then fills what it thinks is a successfully allocated large buffer with 'a's and SEGVs. In particular the code in pp_repeat expects that itd safe to pass a max int as an arg to SvGROW() without it wrapping and failing. Obviously pp_repeat's expectations could be modified\, or an SvGROW which does +1 could be written more defensively to not wrap.

Rik\, I think our two options at this stage are​: encounter a) Just revert the two commits for 5.24.0. This means that a slowdown which appeared in 5.20.0 on /.../g for certain long strings will still be present in 5.24.0. b) merge my branch\, which fixes that slowdown\, but of course introduces the risk of further breakage (unlikely as I would of course think that is).

I am fine with either.

I vote for (b)\, if that's ok with Rik.

BTW\, I think its worth noting the general history of this bug. It is very very old. For a long time /g in scalar context did not copy its subject var\, because we considered this bug more important than the bugs that come from not copying the subject var in a match (such as capture vars being wrong\, or even accessing outside of the string). Later on we introduced COW\, and (as I am sure you remember ;-) fixed the /g in scalar context problem by using COW. But now we are vulnerable to the same historic performance issues with any string that is not COWable.

It really seems to me that when COW was added we modified sv_grow() but failed to update SvGROW() correspondingly\, and that possibly SvGROW() has been serving more than one purpose that just happened to be well served by a single piece of logic\, but that post-COW it needs to be split in two.

For post-5.24\, I want to consider more radical options\, for example using a mechanism similar to OOK (or even to hijack OOK) to store the COW refcount at the beginning of the string then make PVX() point to the second byte of the malloced buffer.

-- My get-up-and-go just got up and went.

p5pRT commented 8 years ago

From @cpansprout

On Mon May 02 11​:44​:13 2016\, davem wrote​:

On Mon\, May 02\, 2016 at 07​:43​:39PM +0200\, demerphq wrote\, quoting Dave Mitchell​:

b) merge my branch\, which fixes that slowdown\, but of course introduces the risk of further breakage (unlikely as I would of course think that is).

I am fine with either.

I vote for (b)\, if that's ok with Rik.

I think (b) is a good idea. The patch looks safe to me.

For post-5.24\, I want to consider more radical options\, for example using a mechanism similar to OOK (or even to hijack OOK) to store the COW refcount at the beginning of the string then make PVX() point to the second byte of the malloced buffer.

That was Nicholas Clark’s original proposal\, but others pointed out that on some systems word-aligned string comparison was much faster than non-aligned\, and that on such systems mallocked strings would be aligned. That was why I chose the end of the buffer.

--

Father Chrysostomos

p5pRT commented 8 years ago

From @demerphq

On 2 May 2016 at 20​:43\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Mon\, May 02\, 2016 at 07​:43​:39PM +0200\, demerphq wrote​:

Reverting would return us to the following definition​:

# define SvGROW(sv\,len) (SvLEN(sv) \< (len) ? sv_grow(sv\,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies\, which means that all code that uses SvGROW(sv\, wanted_len+1) will produce buffers which are not COWable. I think that is a lot of code.

Something creating a new string\, "abc"\0 say\, will call SvGROW(sv\, 3+1) which will call sv_grow()\, which will bump the size to (at least) 3+1+1\, giving space for the COW refcount.

Something creating a new string "abc" wont call SvGROW at all. It will call something like newSV().

Something extending an existing string will cause sv_grow() to be called (and thus space given for the COW refcount)\, unless there is already spare space in the PVX buffer. Only in the case where the extra string content's length exactly matches the spare space in the PVX will there not be room for a future COW refcount.

I don't see how you say that. Any time they use either of the last two bytes available the string will become unCOWable.

SvLEN=10\, Requested = 9\, Uncowable. SVLEN=10\, Requested = 10\, Uncowable.

So I think this is case is (relatively) rare.

I want to see data that agrees. I think its a lot less rare than you think.

I think that lots of code does concatenate ops on SV's and will thus occasionally produce unCOWable strings. IMO we should NEVER produce an unCOWable string.

It seems to me that it is a mistake to expect the caller to keep track of the space required for the COW refcount.

We're not (in general). sv_grow does it no behalf of the caller.

No. SvGROW() doesnt know about it\, so it forces the user to know about it.

\, and to keep track of the space required for NULL.

It's clearly documented since 5.004 that SvGROW's caller has to do the +1\, so I think we're ok there.

Just because we documented it doesnt mean its smart or right. I dont think its either.

It also seems odd that logic that causes us to overallocate more often would produce a segfault. Do you have any insight on that?

It's specifically the 'x' operator\, which is where it easy to grow very large strings. If you do ('a' x ~0) then adding extra to the length when its already max causes the value ti wrap and only a small buffer to be allocated.

Ok\, thanks. So then we can fix SvGROW to not do that.

pp_repeat then fills what it thinks is a successfully allocated large buffer with 'a's and SEGVs. In particular the code in pp_repeat expects that itd safe to pass a max int as an arg to SvGROW() without it wrapping and failing. Obviously pp_repeat's expectations could be modified\, or an SvGROW which does +1 could be written more defensively to not wrap.

I think we should do the latter.

It occurs to me that the wrapping problem is a good example of why the caller should not have to account for the trailing null\, as it means they have to manage the wrapping problem *as well*\, and as far as I can tell none of our code that uses SvGROW does.

Rik\, I think our two options at this stage are​: encounter a) Just revert the two commits for 5.24.0. This means that a slowdown which appeared in 5.20.0 on /.../g for certain long strings will still be present in 5.24.0. b) merge my branch\, which fixes that slowdown\, but of course introduces the risk of further breakage (unlikely as I would of course think that is).

I am fine with either.

I vote for (b)\, if that's ok with Rik.

BTW\, I think its worth noting the general history of this bug. It is very very old. For a long time /g in scalar context did not copy its subject var\, because we considered this bug more important than the bugs that come from not copying the subject var in a match (such as capture vars being wrong\, or even accessing outside of the string). Later on we introduced COW\, and (as I am sure you remember ;-) fixed the /g in scalar context problem by using COW. But now we are vulnerable to the same historic performance issues with any string that is not COWable.

It really seems to me that when COW was added we modified sv_grow() but failed to update SvGROW() correspondingly\, and that possibly SvGROW() has been serving more than one purpose that just happened to be well served by a single piece of logic\, but that post-COW it needs to be split in two.

For post-5.24\, I want to consider more radical options\, for example using a mechanism similar to OOK (or even to hijack OOK) to store the COW refcount at the beginning of the string then make PVX() point to the second byte of the malloced buffer.

FWIW\, I am not a fan of this idea. I was looking into making the refcount a varint\, and using as much space at the end as possible for the refcount so that we can reference a string more than 255 times. Consider when we hit the max count\, we could do a realloc to add some bytes for a larger refcount. If the realloc returns the original pointer we are good. If it doesnt then we do what we currently do.

I would be less against the idea if we were to raise the space allocation for the refcount so it could hold say an integer.

I really do not consider the debate about SvGROW settled yet. I really do not think it works right now\, and I feel like we are sweeping it under the rug here.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 8 years ago

From @demerphq

On 3 May 2016 at 08​:11\, demerphq \demerphq@&#8203;gmail\.com wrote​:

I really do not consider the debate about SvGROW settled yet. I really do not think it works right now\, and I feel like we are sweeping it under the rug here.

Er\, I apologise\, that could be interpreted it differently than I intended.

I think its fine to ignore this for now\, and settle it in a future release.

I just feel like we should understand things better before we say that SvGROW is correct in the face of COW\, and the more I think about this bug the more I think we can not say that. I just spent a while reviewing a lot of SvGROW calls\, and the more I review the more uncomfortable I am about whether we are doing things right or not.

Anyway\, for now lets just revert my patches and go with Daves solution to the original bug.

cheers\, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 8 years ago

From @iabyn

On Mon\, May 02\, 2016 at 02​:01​:21PM -0700\, Father Chrysostomos via RT wrote​:

For post-5.24\, I want to consider more radical options\, for example using a mechanism similar to OOK (or even to hijack OOK) to store the COW refcount at the beginning of the string then make PVX() point to the second byte of the malloced buffer.

That was Nicholas Clark’s original proposal\, but others pointed out that on some systems word-aligned string comparison was much faster than non-aligned\, and that on such systems mallocked strings would be aligned. That was why I chose the end of the buffer.

Ah\, that's an interesting point.

-- print+qq&$}$"$/$s$\,$a$d$g$s$@​$.$q$\,$​:$.$q$^$\,$@​$a$~$;$.$q$m&if+map{m\,^\d{0\\,}\,\,${$​::{$'}}=chr($"+=$&||1)}q&10m22\,42}6​:17a2~2.3@​3;^2dg3q/s"&=~m*\d\*.*g

p5pRT commented 8 years ago

From @iabyn

[ to be read in the context of your followup-email\, that we're now just discussing post-5.24 stuff ]

On Tue\, May 03\, 2016 at 08​:11​:04AM +0200\, demerphq wrote​:

On 2 May 2016 at 20​:43\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Mon\, May 02\, 2016 at 07​:43​:39PM +0200\, demerphq wrote​:

Reverting would return us to the following definition​:

# define SvGROW(sv\,len) (SvLEN(sv) \< (len) ? sv_grow(sv\,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies\, which means that all code that uses SvGROW(sv\, wanted_len+1) will produce buffers which are not COWable. I think that is a lot of code.

Something creating a new string\, "abc"\0 say\, will call SvGROW(sv\, 3+1) which will call sv_grow()\, which will bump the size to (at least) 3+1+1\, giving space for the COW refcount.

Something creating a new string "abc" wont call SvGROW at all. It will call something like newSV().

And newSV() calls sv_grow() to allocate an initial buffer. The point I was making is that most "normal" string creations\, automatically get the extra +1 for the cow ref in addition to the +1 NUL.

Something extending an existing string will cause sv_grow() to be called (and thus space given for the COW refcount)\, unless there is already spare space in the PVX buffer. Only in the case where the extra string content's length exactly matches the spare space in the PVX will there not be room for a future COW refcount.

I don't see how you say that. Any time they use either of the last two bytes available the string will become unCOWable.

SvLEN=10\, Requested = 9\, Uncowable. SVLEN=10\, Requested = 10\, Uncowable.

Consider the following PVX buffers\, where X represents an spare byte in the allocated buffer​:

"abc\0"   No spare space in the buffer; any attempt to concatenate 1 or more   bytes will trigger a call to sv_grow()\, which will allocate a spare   byte for COW.

"abc\0X"   Concatenating 1 byte will convert the string to "abcd\0"\, with no   space for COW.   Concatenating 2 or more bytes will trigger a call to sv_grow()\, which   will allocate a spare byte for COW.

"abc\0XX"   Concatenating 1 byte will still leave a spare byte for COW.   Concatenating 2 bytes will convert the string to "abcde\0"\, with no   space for COW.   Concatenating 3 or more bytes will trigger a call to sv_grow()\, which   will allocate a spare byte for COW.

So in general only concatenating a number of bytes exactly equal to the number of spare bytes in the buffer will may the string non-COWable.

For example\, with this code​:

  use Devel​::Peek;   my $s = 'x' x $ARGV[0];   Dump $s;   $s .= $ARGV[1];   Dump $s;

$ perl5238 ~/tmp/p 7 '' 2>&1 | egrep 'CUR|LEN'   CUR = 7   LEN = 10   CUR = 7   LEN = 10 $ perl5238 ~/tmp/p 7 'x' 2>&1 | egrep 'CUR|LEN'   CUR = 7   LEN = 10   CUR = 8   LEN = 10 $ perl5238 ~/tmp/p 7 'xx' 2>&1 | egrep 'CUR|LEN'   CUR = 7   LEN = 10   CUR = 9   LEN = 10 $ perl5238 ~/tmp/p 7 'xxx' 2>&1 | egrep 'CUR|LEN'   CUR = 7   LEN = 10   CUR = 10   LEN = 24

It's only the 'xx' concat that leaves an uncowable SV.

I think that lots of code does concatenate ops on SV's and will thus occasionally produce unCOWable strings. IMO we should NEVER produce an unCOWable string.

Its a worthy goal\, we just need to decide whether the means to achieve it are reasonable.

\, and to keep track of the space required for NULL.

It's clearly documented since 5.004 that SvGROW's caller has to do the +1\, so I think we're ok there.

Just because we documented it doesnt mean its smart or right. I dont think its either.

Well we're kind of stuck with it. We could introduce a new SvGROW0 macro say\, that does its own +1\, then change core to use it\, and define SvGROW in terms of it. At least that way any wrap-arounds could be handled consistently within the macro rather than requiring every caller to check.

FWIW\, I am not a fan of this idea. I was looking into making the refcount a varint\, and using as much space at the end as possible for the refcount so that we can reference a string more than 255 times. Consider when we hit the max count\, we could do a realloc to add some bytes for a larger refcount. If the realloc returns the original pointer we are good. If it doesnt then we do what we currently do.

I would be less against the idea if we were to raise the space allocation for the refcount so it could hold say an integer.

I think the cases where a string buffer is COW-shared more than 255 times are fairly rare\, and the cost is only that COW copies one in 256 times. I'm not sure that the extra complexity of a variable length ref count is worth it.

One thing I definitely want to do post 5.24 is introduce a startup-time probe of the OS's malloc/realloc algorithm; from what I've seen when I first raised this on the list and got people to run a little test program on various platforms\, is that all small allocs are internally of size A + Bi for i=0\,1\,2..\, regardless of the request size\, but where A and B vary across platforms and are determined when the perl executable first starts. For example on my 64-git glibc\, A=24 and B=32 (IIRC). Knowing A and B means that sv_grow()\, av_extend()\, EXTEND() etc can all over-allocate in an intelligent way.

-- All wight. I will give you one more chance. This time\, I want to hear no Wubens. No Weginalds. No Wudolf the wed-nosed weindeers.   -- Life of Brian