Open p5pRT opened 10 years ago
This is a bug report for perl from Niko Tyni \ntyni@​debian\.org\, generated with the help of perlbug 1.39 running under perl 5.18.2.
Trying to fix Devel-Size [rt.cpan.org #95493] for Perl 5.20\, I ran into the following:
% perl5182 -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)' SV = PV(0x25b4a50) at 0x25d5720 REFCNT = 1 FLAGS = (POK\,OOK\,pPOK) OFFSET = 1 PV = 0x25cebe1 ( "\1" . ) "2345"\0 CUR = 4 LEN = 15
% perl5200 -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)' SV = PV(0x1179a90) at 0x11951d8 REFCNT = 1 FLAGS = (POK\,pPOK) PV = 0x118f9e0 "2345"\0 CUR = 4 LEN = 10
This changed with commit v5.19.0-246-g13b0f67\, "re-enable Copy-on-Write by default." Clearly COW stops the OOK hack triggering in this specific case.
The example is from the 'Offsets' section in perlguts\, so at least that needs updating. For the benefit of that\, and also Devel-Size\, is there another simple way to trigger the OOK hack noawadays? I couldn't come up with one yet.
Flags: category=core severity=low
Site configuration information for perl 5.18.2:
Configured by Debian Project at Mon Jul 14 20:39:08 UTC 2014.
Summary of my perl5 (revision 5 version 18 subversion 2) configuration:
Platform:
osname=linux\, osvers=3.14-1-amd64\, archname=x86_64-linux-gnu-thread-multi
uname='linux estella 3.14-1-amd64 #1 smp debian 3.14.10-1 (2014-07-07) x86_64 gnulinux '
config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -fwrapv -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Dldflags= -Wl\,-z\,relro -Dlddlflags=-shared -Wl\,-z\,relro -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.18 -Darchlib=/usr/lib/perl/5.18 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.18.2 -Dsitearch=/usr/local/lib/perl/5.18.2 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -Uversiononly -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.18.2 -des'
hint=recommended\, useposix=true\, d_sigaction=define
useithreads=define\, usemultiplicity=define
useperlio=define\, d_sfio=undef\, uselargefiles=define\, usesocks=undef
use64bitint=define\, use64bitall=define\, uselongdouble=undef
usemymalloc=n\, bincompat5005=undef
Compiler:
cc='cc'\, ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fstack-protector -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'\,
optimize='-O2 -g'\,
cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fstack-protector -fno-strict-aliasing -pipe -I/usr/local/include'
ccversion=''\, gccversion='4.9.0'\, gccosandvers=''
intsize=4\, longsize=8\, ptrsize=8\, doublesize=8\, byteorder=12345678
d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=16
ivtype='long'\, ivsize=8\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=8
alignbytes=8\, prototype=define
Linker and Libraries:
ld='cc'\, ldflags =' -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
perllibs=-ldl -lm -lpthread -lc -lcrypt
libc=\, so=so\, useshrplib=true\, libperl=libperl.so.5.18.2
gnulibc_version='2.19'
Dynamic Linking:
dlsrc=dl_dlopen.xs\, dlext=so\, d_dlsymun=undef\, ccdlflags='-Wl\,-E'
cccdlflags='-fPIC'\, lddlflags='-shared -L/usr/local/lib -fstack-protector'
Locally applied patches: DEBPKG:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN. DEBPKG:debian/db_file_ver - http://bugs.debian.org/340047 Remove overly restrictive DB_File version check. DEBPKG:debian/doc_info - Replace generic man(1) instructions with Debian-specific information. DEBPKG:debian/enc2xs_inc - http://bugs.debian.org/290336 Tweak enc2xs to follow symlinks and ignore missing @INC directories. DEBPKG:debian/errno_ver - http://bugs.debian.org/343351 Remove Errno version check due to upgrade problems with long-running processes. DEBPKG:debian/libperl_embed_doc - http://bugs.debian.org/186778 Note that libperl-dev package is required for embedded linking DEBPKG:fixes/respect_umask - Respect umask during installation DEBPKG:debian/writable_site_dirs - Set umask approproately for site install directories DEBPKG:debian/extutils_set_libperl_path - EU:MM: Set location of libperl.a to /usr/lib DEBPKG:debian/no_packlist_perllocal - Don't install .packlist or perllocal.pod for perl or vendor DEBPKG:debian/prefix_changes - Fiddle with *PREFIX and variables written to the makefile DEBPKG:debian/fakeroot - Postpone LD_LIBRARY_PATH evaluation to the binary targets. DEBPKG:debian/instmodsh_doc - Debian policy doesn't install .packlist files for core or vendor. DEBPKG:debian/ld_run_path - Remove standard libs from LD_RUN_PATH as per Debian policy. DEBPKG:debian/libnet_config_path - Set location of libnet.cfg to /etc/perl/Net as /usr may not be writable. DEBPKG:debian/mod_paths - Tweak @INC ordering for Debian DEBPKG:debian/module_build_man_extensions - http://bugs.debian.org/479460 Adjust Module::Build manual page extensions for the Debian Perl policy DEBPKG:debian/prune_libs - http://bugs.debian.org/128355 Prune the list of libraries wanted to what we actually need. DEBPKG:fixes/net_smtp_docs - [rt.cpan.org #36038] http://bugs.debian.org/100195 Document the Net::SMTP 'Port' option DEBPKG:debian/perlivp - http://bugs.debian.org/510895 Make perlivp skip include directories in /usr/local DEBPKG:debian/cpanplus_definstalldirs - http://bugs.debian.org/533707 Configure CPANPLUS to use the site directories by default. DEBPKG:debian/cpanplus_config_path - Save local versions of CPANPLUS::Config::System into /etc/perl. DEBPKG:debian/deprecate-with-apt - http://bugs.debian.org/702096 Point users to Debian packages of deprecated core modules DEBPKG:debian/squelch-locale-warnings - http://bugs.debian.org/508764 Squelch locale warnings in Debian package maintainer scripts DEBPKG:debian/skip-upstream-git-tests - Skip tests specific to the upstream Git repository DEBPKG:debian/patchlevel - http://bugs.debian.org/567489 List packaged patches for 5.18.2-7 in patchlevel.h DEBPKG:debian/skip-kfreebsd-crash - http://bugs.debian.org/628493 [perl #96272] Skip a crashing test case in t/op/threads.t on GNU/kFreeBSD DEBPKG:fixes/document_makemaker_ccflags - http://bugs.debian.org/628522 [rt.cpan.org #68613] Document that CCFLAGS should include $Config{ccflags} DEBPKG:debian/find_html2text - http://bugs.debian.org/640479 Configure CPAN::Distribution with correct name of html2text DEBPKG:debian/hurd_test_skip_stack - http://bugs.debian.org/650175 Disable failing GNU/Hurd tests dist/threads/t/stack.t DEBPKG:fixes/manpage_name_Test-Harness - http://bugs.debian.org/650451 [rt.cpan.org #73399] cpan/Test-Harness: add NAME headings in modules with POD DEBPKG:debian/makemaker-pasthru - http://bugs.debian.org/660195 [rt.cpan.org #28632] Make EU::MM pass LD through to recursive Makefile.PL invocations DEBPKG:debian/perl5db-x-terminal-emulator.patch - http://bugs.debian.org/668490 Invoke x-terminal-emulator rather than xterm in perl5db.pl DEBPKG:debian/cpan-missing-site-dirs - http://bugs.debian.org/688842 Fix CPAN::FirstTime defaults with nonexisting site dirs if a parent is writable DEBPKG:fixes/memoize_storable_nstore - [rt.cpan.org #77790] http://bugs.debian.org/587650 Memoize::Storable: respect 'nstore' option not respected DEBPKG:fixes/net_ftp_failed_command - [rt.cpan.org #37700] http://bugs.debian.org/491062 Net::FTP: cope gracefully with a failed command DEBPKG:fixes/perlbug-patchlist - [3541c11] http://bugs.debian.org/710842 [perl #118433] Make perlbug look up the list of local patches at run time DEBPKG:fixes/module_metadata_security_doc - [68cdd4b] CVE-2013-1437 documentation fix DEBPKG:fixes/module_metadata_taint_fix - [bff978f] http://bugs.debian.org/722210 [rt.cpan.org #88576] untaint version\, if needed\, in Module::Metadata DEBPKG:fixes/IPC-SysV-spelling - http://bugs.debian.org/730558 [rt.cpan.org #86736] Fix spelling of IPC_CREAT in IPC-SysV documentation DEBPKG:fixes/goto-sub-crash - [bfa371b] http://bugs.debian.org/736187 [perl #119949] Stop undef *_\, goto &sub from crashing DEBPKG:debian/regcomp-mips-optim - http://bugs.debian.org/754054 Downgrade the optimization of regcomp.c on mips due to a gcc-4.9 bug
@INC for perl 5.18.2: /etc/perl /usr/local/lib/perl/5.18.2 /usr/local/share/perl/5.18.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.18 /usr/share/perl/5.18 /usr/local/lib/site_perl .
Environment for perl 5.18.2: HOME=/home/niko LANG=en_US.UTF-8 LANGUAGE (unset) LC_CTYPE=fi_FI.UTF-8 LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/home/niko/bin:/home/niko/bin:/home/niko/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/sbin:/usr/sbin PERL_BADLANG (unset) SHELL=/bin/zsh
On 18 July 2014 08:58\, Niko Tyni \perlbug\-followup@​perl\.org wrote:
# New Ticket Created by Niko Tyni # Please include the string: [perl #122322] # in the subject line of all future correspondence about this issue. # \<URL: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=122322 >
This is a bug report for perl from Niko Tyni \ntyni@​debian\.org\, generated with the help of perlbug 1.39 running under perl 5.18.2.
----------------------------------------------------------------- Trying to fix Devel-Size [rt.cpan.org #95493] for Perl 5.20\, I ran into the following:
% perl5182 -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)' SV = PV(0x25b4a50) at 0x25d5720 REFCNT = 1 FLAGS = (POK\,OOK\,pPOK) OFFSET = 1 PV = 0x25cebe1 ( "\1" . ) "2345"\0 CUR = 4 LEN = 15
% perl5200 -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)' SV = PV(0x1179a90) at 0x11951d8 REFCNT = 1 FLAGS = (POK\,pPOK) PV = 0x118f9e0 "2345"\0 CUR = 4 LEN = 10
This changed with commit v5.19.0-246-g13b0f67\, "re-enable Copy-on-Write by default." Clearly COW stops the OOK hack triggering in this specific case.
The example is from the 'Offsets' section in perlguts\, so at least that needs updating. For the benefit of that\, and also Devel-Size\, is there another simple way to trigger the OOK hack noawadays? I couldn't come up with one yet.
I would assume this is a serious performance failure. COW clashing with other optimisations has been a common theme since COW was introduced.
Yves
The RT System itself - Status changed from 'new' to 'open'
demerphq \demerphq@​gmail\.com wrote:
I would assume this is a serious performance failure.
Yes\, it seems so:
$ for f in 5.18.2 5.20.0; do time perl$f -wE 'say for ""\, $^V; for (1 .. 1000) { my $s = "x" x 1e8; $s =~ s/.// }'; done
v5.18.2
real 0m19.769s user 0m19.677s sys 0m0.082s
v5.20.0
real 0m39.145s user 0m29.205s sys 0m9.933s
-- Aaron Crane ** http://aaroncrane.co.uk/
On Mon\, Jul 21\, 2014 at 03:34:00PM +0100\, Aaron Crane wrote:
demerphq \demerphq@​gmail\.com wrote:
I would assume this is a serious performance failure.
Yes\, it seems so:
$ for f in 5.18.2 5.20.0; do time perl$f -wE 'say for ""\, $^V; for (1 .. 1000) { my $s = "x" x 1e8; $s =~ s/.// }'; done
v5.18.2
real 0m19.769s user 0m19.677s sys 0m0.082s
v5.20.0
real 0m39.145s user 0m29.205s sys 0m9.933s
Note that this is in fact a bug fix that has performance loss as a side-effect:
$ perl5180 -le'$_="ABC"; s/.//; print ord eval q{$&}' 1 $ perl5200 -le'$_="ABC"; s/.//; print ord eval q{$&}' 65
Basically in general\, a pattern match has to make a temporary copy of the string it successfully matched\, so that $&\, $1 etc hold the right values if the string is subsequently modified. As an optimisation\, this copy was skipped if the pattern didn't contain any captures\, and $`/$&/$' hadn't yet been seen. The example above shows that this assumption was a bit dodgy. In 5.20.0\, COW means that the temporary copy can usually be done essentially free of charge\, so the hacky optimisation can be dispensed with.
However\, when the match includes a substitution\, things become more problematic. The original string is now COW (due to the pattern match doing its "cheap" temporary copy)\, and when pp_subst() tries to modify the original string\, it triggers an unCOWing and a complete copy of the original string being made.
Note that s/.// is a bit of special case. Replace it with s/(.)// say\, and you'll find that older perls are just as slow as 5.20.0. (its a bit more complicated than that\, and 5.18.x was a bit better in that regard than the perls before that). Also\, there are many permutations depending on the nature of the replacement string (fixed\, variable\, //e etc)\, and the presence of s///r and so on.
But I do rather wonder whether COW could be improved to still share the same buffer in the presence of just different start/end offsets - so that s/^...// and s/...$// don't trigger a decowing.
But its all very messy and too much for my brain at this time of night.
-- "But Sidley Park is already a picture\, and a most amiable picture too. The slopes are green and gentle. The trees are companionably grouped at intervals that show them to advantage. The rill is a serpentine ribbon unwound from the lake peaceably contained by meadows on which the right amount of sheep are tastefully arranged." -- Lady Croom\, "Arcadia"
On 23 July 2014 23:19\, Dave Mitchell \davem@​iabyn\.com wrote:
On Mon\, Jul 21\, 2014 at 03:34:00PM +0100\, Aaron Crane wrote:
demerphq \demerphq@​gmail\.com wrote:
I would assume this is a serious performance failure.
Yes\, it seems so:
$ for f in 5.18.2 5.20.0; do time perl$f -wE 'say for ""\, $^V; for (1 .. 1000) { my $s = "x" x 1e8; $s =~ s/.// }'; done
v5.18.2
real 0m19.769s user 0m19.677s sys 0m0.082s
v5.20.0
real 0m39.145s user 0m29.205s sys 0m9.933s
Note that this is in fact a bug fix that has performance loss as a side-effect:
$ perl5180 \-le'$\_="ABC"; s/\.//; print ord eval q\{$&\}' 1 $ perl5200 \-le'$\_="ABC"; s/\.//; print ord eval q\{$&\}' 65
Basically in general\, a pattern match has to make a temporary copy of the string it successfully matched\, so that $&\, $1 etc hold the right values if the string is subsequently modified. As an optimisation\, this copy was skipped if the pattern didn't contain any captures\, and $`/$&/$' hadn't yet been seen. The example above shows that this assumption was a bit dodgy. In 5.20.0\, COW means that the temporary copy can usually be done essentially free of charge\, so the hacky optimisation can be dispensed with.
Removing characters from the front or back of a string efficiently is pretty useful. So is inplace modification of the string.
I really think that the long history of "you dont need $&" shows that ensuring the correctness of $& should take a back seed to performance.
On the other hand correctness is pleasing. :-)
I was wondering if we could wriggle out of this conundrum with a new modifier for /s.
I cant help but observe that I already used a modifier to work around the $& issue\, (the /p modifier)\, yet people still want to solve the bugs associated to it\, even if they have ugly performance implications.
I wonder if we would have success if we had the opposite modifier (say /P) that said "I dont want to use $& or fiends\, I want you to be fast"\, which told perl to do the s/// as quick and dirty efficient as possible and ignore the correctness of $& and friends after the match. (Although since IMO most people dont use $& and friends\, it seems improperly huffman encoded to require this.)
Yves
On Thu Jul 24 00:18:47 2014\, demerphq wrote:
I wonder if we would have success if we had the opposite modifier (say /P) that said "I dont want to use $& or fiends\, I want you to be fast"\, which told perl to do the s/// as quick and dirty efficient as possible and ignore the correctness of $& and friends after the match. (Although since IMO most people dont use $& and friends\, it seems improperly huffman encoded to require this.)
Wasnât this suggested as /n or something similar last year? I seem to remember you supported the idea and there were several messages back and forth in which you might have thought I disagreed\, though I did not.
--
Father Chrysostomos
On Wed Jul 23 14:19:47 2014\, davem wrote:
But I do rather wonder whether COW could be improved to still share the same buffer in the presence of just different start/end offsets - so that s/^...// and s/...$// don't trigger a decowing.
But its all very messy and too much for my brain at this time of night.
OOK offsets are stored in the string buffer itself. But perhaps we could bring the old IV system back at least for COW scalars.
--
Father Chrysostomos
On Thu Jul 17 23:58:03 2014\, ntyni@debian.org wrote:
This is a bug report for perl from Niko Tyni \ntyni@​debian\.org\, generated with the help of perlbug 1.39 running under perl 5.18.2.
----------------------------------------------------------------- Trying to fix Devel-Size [rt.cpan.org #95493] for Perl 5.20\, I ran into the following:
% perl5182 -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)' SV = PV(0x25b4a50) at 0x25d5720 REFCNT = 1 FLAGS = (POK\,OOK\,pPOK) OFFSET = 1 PV = 0x25cebe1 ( "\1" . ) "2345"\0 CUR = 4 LEN = 15
% perl5200 -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)' SV = PV(0x1179a90) at 0x11951d8 REFCNT = 1 FLAGS = (POK\,pPOK) PV = 0x118f9e0 "2345"\0 CUR = 4 LEN = 10
This changed with commit v5.19.0-246-g13b0f67\, "re-enable Copy-on- Write by default." Clearly COW stops the OOK hack triggering in this specific case.
The example is from the 'Offsets' section in perlguts\, so at least that needs updating. For the benefit of that\, and also Devel-Size\, is there another simple way to trigger the OOK hack noawadays? I couldn't come up with one yet.
Here is one way:
$ ./perl -Ilib -MDevel::Peek -le '$a=""; $a .= ""; Dump($a)' SV = PV(0x7fd534007098) at 0x7fd53402ff30 REFCNT = 1 FLAGS = (POK\,pPOK) PV = 0x7fd533c067c8 ""\0 CUR = 0 LEN = 10
Note the LEN. Then extend the string to one less than that:
$ ./perl -Ilib -MDevel::Peek -le '$a=""; $a .= "123456789"; Dump($a)' SV = PV(0x7fde84007098) at 0x7fde8402ff30 REFCNT = 1 FLAGS = (POK\,pPOK) PV = 0x7fde83c067c8 "123456789"\0 CUR = 9 LEN = 10
OK\, itâs still 10. Now try substitution:
$ ./perl -Ilib -MDevel::Peek -le '$a=""; $a .= "123456789"; $a=~s/.//; Dump($a)' SV = PV(0x7feae8807098) at 0x7feae882ff30 REFCNT = 1 FLAGS = (POK\,OOK\,pPOK) OFFSET = 1 PV = 0x7feae84067d9 ( "\1" . ) "23456789"\0 CUR = 8 LEN = 9
--
Father Chrysostomos
On Sun\, Sep 07\, 2014 at 07:17:57AM -0700\, Father Chrysostomos via RT wrote:
On Wed Jul 23 14:19:47 2014\, davem wrote:
But I do rather wonder whether COW could be improved to still share the same buffer in the presence of just different start/end offsets - so that s/^...// and s/...$// don't trigger a decowing.
But its all very messy and too much for my brain at this time of night.
OOK offsets are stored in the string buffer itself. But perhaps we could bring the old IV system back at least for COW scalars.
But as I pointed out earlier in this thread\, strings are normally expected to be null-terminated (not having htem so would probably break a lot of stuff)\, so even if we get OOK COW to work\, we wouldn't be able to have shared string buffers where the end point differed.
-- Any [programming] language that doesn't occasionally surprise the novice will pay for it by continually surprising the expert. -- Larry Wall
On Mon Sep 08 04:51:05 2014\, davem wrote:
On Sun\, Sep 07\, 2014 at 07:17:57AM -0700\, Father Chrysostomos via RT wrote:
On Wed Jul 23 14:19:47 2014\, davem wrote:
But I do rather wonder whether COW could be improved to still share the same buffer in the presence of just different start/end offsets - so that s/^...// and s/...$// don't trigger a decowing.
But its all very messy and too much for my brain at this time of night.
OOK offsets are stored in the string buffer itself. But perhaps we could bring the old IV system back at least for COW scalars.
But as I pointed out earlier in this thread\, strings are normally expected to be null-terminated (not having htem so would probably break a lot of stuff)\, so even if we get OOK COW to work\, we wouldn't be able to have shared string buffers where the end point differed.
I have done s/// on just the beginning enough times that I think it would be worth optimising it.
I have updated the docs in commit f942a0dfa92\, but I think this ticket should stay open (but not block 5.22) until the optimisation can be implemented.
--
Father Chrysostomos
Can I just say how much I love the subject of this thread?
D
Migrated from rt.perl.org#122322 (status was 'open')
Searchable as RT122322$