Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Assertion failed in Perl_reg_numbered_buff_fetch, file regcomp.c, line 7459 #14081

Closed p5pRT closed 9 years ago

p5pRT commented 10 years ago

Migrated from rt.perl.org#122747 (status was 'resolved')

Searchable as RT122747$

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

Created by Mark.Martinec@ijs.si

Have been running 5.20.1-RC2 here under FreeBSD 10.0 for a couple of days without a problem. The application is a mail content filter (amavisd-new + SpamAssassin), which means that perl is in heavy use in a complex situation, involving tainted variables and UTF-8 character strings.

Today one of the forked child processes crashed (SIGABRT) due to a failed assertion:

  Assertion failed: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 7459.

Perl is built with gcc 4.8.4 20140828 with debugging enabled, with -fstack-protector-strong and jemalloc memory protections enabled (MALLOC_CONF="abort:true,junk:true,redzone:true").

The extra safeguards were there just in case, to rule out some potential cases of memory corruption, although this crash does not seem to be related to memory corruption.

The coredump shows the following (some names (plain ASCII) were replaced by xxx to preserve privacy; the number of characters was not changed):

# gdb /usr/local/bin/perl /var/coredumps/perl-97654.core
GNU gdb (GDB) 7.8 [GDB v7.8 for FreeBSD]
Copyright [...]
Reading symbols from /usr/local/bin/perl...done.
[New process 101359]
[New Thread 802006800 (LWP 101359)]
Core was generated by `perl'.
Program terminated with signal SIGABRT, Aborted.
#0 0x000000080171026a in thr_kill () from /lib/libc.so.7

(gdb) bt
#0 0x000000080171026a in thr_kill () from /lib/libc.so.7
#1 0x00000008017d7ac9 in abort () from /lib/libc.so.7
#2 0x00000008017bb0b1 in __assert () from /lib/libc.so.7
#3 0x0000000800940240 in Perl_reg_numbered_buff_fetch (r=0x817a602b8, paren=1, sv=0x8173c1180) at regcomp.c:7459
#4 0x000000080099668a in Perl_magic_get (sv=0x8173c1180, mg=0x8174b2ed0) at mg.c:805
#5 0x00000008009943a6 in Perl_mg_get (sv=0x8173c1180) at mg.c:201
#6 0x0000000800a72e74 in Perl_save_scalar (gv=0x8021cd990) at scope.c:219
#7 0x0000000800967e60 in Perl_save_re_context () at regcomp.c:16475
#8 0x0000000800b2278c in Perl__core_swash_init (pkg=0x800bf9546 "utf8", name=0x800bf94ff "ToCf", listsv=0x800e273e0 <PL_sv_undef>, minbits=4, none=0, invlist=0x0, flags_p=0x0) at utf8.c:2583
#9 0x0000000800b20cc2 in Perl_to_utf8_case (p=0x80c8a2f7e "\342\200\234Intelligence without ambition is a bird without wings.\342\200\235 -Salvador Dali Save a tree. Please don't print this e-mail unless it's really necessary\n", ustrp=0x7fffffffd070 " \345\235\a\b", lenp=0x7fffffffcaa8, swashp=0x800e27a68 <PL_utf8_tofold>, normal=0x800bf94ff "ToCf", special=0x800bf9212 "") at utf8.c:2028
#10 0x0000000800b22024 in Perl__to_utf8_fold_flags (p=0x80c8a2f7e "\342\200\234Intelligence without ambition is a bird without wings.\342\200\235 -Salvador Dali Save a tree. Please don't print this e-mail unless it's really necessary\n", ustrp=0x7fffffffd070 " \345\235\a\b", lenp=0x7fffffffcaa8, flags=2 '\002') at utf8.c:2397
#11 0x0000000800b0aa5a in S_regmatch (reginfo=0x7fffffffd350, startpos=0x80c8a2f63 "xxxxx.xxxxxxxx@outlook.com \342\200\234Intelligence without ambition is a bird without wings.\342\200\235 -Salvador Dali Save a tree. Please don't print this e-mail unless it's really necessary\n", prog=0x81811d030) at regexec.c:4207
#12 0x0000000800b05846 in S_regtry (reginfo=0x7fffffffd350, startposp=0x7fffffffd1b8) at regexec.c:3200
#13 0x0000000800b051d5 in Perl_regexec_flags (rx=0x817a602b8, stringarg=0x80c8a2e00 "-- _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Xxxxx Xxxxxxxx University of Ljubljana Faculty of Natural Sciences and Engineering Department of Geology A\302\271ker\303\250eva xx or Xxxxxx xx SI-1000 Ljubljana Slovenia tel.:"..., strend=0x80c8a2f70 "k@outlook.com \342\200\234Intelligence without ambition is a bird without wings.\342\200\235 -Salvador Dali Save a tree. Please don't print this e-mail unless it's really necessary\n", strbeg=0x80c8a2e00 "-- _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Xxxxx Xxxxxxxx University of Ljubljana Faculty of Natural Sciences and Engineering Department of Geology A\302\271ker\303\250eva xx or Xxxxxx xx SI-1000 Ljubljana Slovenia tel.:"..., minend=0, sv=0x8178d82e8, data=0x0, flags=1) at regexec.c:3058
#14 0x00000008009dc8e1 in Perl_pp_subst () at pp_hot.c:2130
#15 0x00000008009801c0 in Perl_runops_debug () at dump.c:2427
#16 0x00000008008938d9 in S_run_body (oldscope=1) at perl.c:2451
#17 0x0000000800892de7 in perl_run (my_perl=0x802020048) at perl.c:2372
#18 0x000000000040100c in main (argc=4, argv=0x7fffffffd858, env=0x7fffffffd880) at perlmain.c:114

This happened during processing of the first MIME part (a rather short plain text part, ISO-8859-2, 8bit) of an otherwise rather large mail message with attachment.

The crash occurs within SpamAssassin code (the last debug log from SpamAssassin was: SA dbg: FreeMail: From address: ...), although I can't reproduce the failure when spamassassin is run from a command line - it only happens (reproducibly) when the SpamAssassin perl module is spawned from amavisd and given this particular mail message.

Perl Info

```
Flags:
    category=core
    severity=medium

Site configuration information for perl 5.20.1:

Configured by mark at Mon Sep 8 18:40:33 CEST 2014.

Summary of my perl5 (revision 5 version 20 subversion 1) configuration:

  Platform:
    osname=freebsd, osvers=10.0-release-p7, archname=amd64-freebsd
    uname='freebsd dorothy.ijs.si 10.0-release-p7 freebsd 10.0-release-p7 #0: tue jul 8 06:37:44 utc 2014 root@amd64-builder.daemonology.net:usrobjusrsrcsysgeneric amd64 '
    config_args='-sde -Dprefix=/usr/local -Darchlib=/usr/local/lib/perl5/5.20/mach -Dprivlib=/usr/local/lib/perl5/5.20 -Dman3dir=/usr/local/lib/perl5/5.20/perl/man/man3 -Dman1dir=/usr/local/man/man1 -Dsitearch=/usr/local/lib/perl5/site_perl/5.20/mach -Dsitelib=/usr/local/lib/perl5/site_perl/5.20 -Dscriptdir=/usr/local/bin -Dsiteman3dir=/usr/local/lib/perl5/5.20/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Ui_malloc -Ui_iconv -Uinstallusrbinperl -Dcc=gcc48 -Duseshrplib -Dinc_version_list=none -Dccflags=-DAPPLLIB_EXP="/usr/local/lib/perl5/5.20/BSDPAN" -Doptimize=-g -fno-omit-frame-pointer -fstack-protector-strong -DDEBUGGING -Ui_gdbm -Duse64bitint -Dusethreads=n -Dusemymalloc=n'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc48', ccflags ='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.20/BSDPAN" -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-g -fno-omit-frame-pointer -fstack-protector-strong',
    cppflags='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.20/BSDPAN" -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.8.4 20140828 (prerelease)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc48', ldflags ='-pthread -Wl,-E -fstack-protector -L/usr/local/lib'
    libpth=/usr/lib /usr/local/lib /usr/local/lib /usr/local/lib/gcc48/gcc/x86_64-portbld-freebsd10.0/4.8.4/include-fixed /usr/lib
    libs=-lgdbm -lm -lcrypt -lutil
    perllibs=-lm -lcrypt -lutil
    libc=, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' -Wl,-R/usr/local/lib/perl5/5.20/mach/CORE'
    cccdlflags='-DPIC -fPIC', lddlflags='-shared -L/usr/local/lib -fstack-protector'

Locally applied patches:
    RC2

@INC for perl 5.20.1:
    /usr/local/lib/perl5/5.20/BSDPAN
    /usr/local/lib/perl5/site_perl/5.20/mach
    /usr/local/lib/perl5/site_perl/5.20
    /usr/local/lib/perl5/5.20/mach
    /usr/local/lib/perl5/5.20
    .

Environment for perl 5.20.1:
    HOME=/root
    LANG (unset)
    LANGUAGE (unset)
    LC_ALL=en_US.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/root/bin:/usr/local/bin:/usr/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin
    PERL_BADLANG (unset)
    SHELL=/usr/local/bin/bash
```

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

This happened during processing of the first MIME part (a rather short plain text part, ISO-8859-2, 8bit) of an otherwise rather large mail message with attachment.

The crash occurs within SpamAssassin code (the last debug log from SpamAssassin was: SA dbg: FreeMail: From address: ...), although I can't reproduce the failure when spamassassin is run from a command line [...]

Made some progress in narrowing this down, can reproduce it now reliably by running spamassassin from a command line.

The crash involves a s/// operator with a horribly complicated regexp (not utf8, not tainted), and a string (utf8, tainted).

Will try to narrow it down further...

p5pRT commented 10 years ago

From @khwilliamson

On 09/10/2014 10:23 AM, Mark Martinec wrote:

This happened during processing of the first MIME part (a rather short plain text part, ISO-8859-2, 8bit) of an otherwise rather large mail message with attachment.

The crash occurs within SpamAssassin code (the last debug log from SpamAssassin was: SA dbg: FreeMail: From address: ...), although I can't reproduce the failure when spamassassin is run from a command line [...]

Made some progress in narrowing this down, can reproduce it now reliably by running spamassassin from a command line.

The crash involves a s/// operator with a horribly complicated regexp (not utf8, not tainted), and a string (utf8, tainted).

Will try to narrow it down further...

That would be helpful. One perhaps easy option is to try it with the string untainted. If that fixes the problem, it will really narrow down the possible causes. (But I kinda doubt that will have an effect.)

Something else is to run it with valgrind. This may well be the result of a wild write or read.

Perhaps this info will aid you in the narrowing. The core dump indicates it is in the middle of a pattern match and is trying to match a string with the Dali quote. The pattern match has been made into a 'trie'. The actual position in the utf8 string where the error occurs is shown in octal in the dump. It resolves to LEFT DOUBLE QUOTATION MARK U+201C. That all looks ok so far. The match is supposed to be case-insensitive, and it is the first time in the program's execution that it has found a caseless match that doesn't have the rules for it coded in. So it has to go out to disk to read in those rules. It saves and restores the state around this fetch, using Perl_save_re_context() to do the save. The assertion fails during the course of the save.

Here is the comment at the beginning of Perl_save_re_context(): /* XXX Here's a total kludge. But we need to re-enter for swash routines. */

That indicates what we're up against. It is failing because of a problem with the capturing buffers in the pattern (the things that parentheses enclose). If nothing else, you could send us the s/// text, to compile here and eyeball for issues.
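
The octal escapes shown in the backtrace can be decoded by hand to confirm the character named above. The following is a minimal sketch in C, assuming a well-formed three-byte UTF-8 sequence; `decode_utf8_3byte` and `dump_bytes` are hypothetical names written for this illustration and are not taken from the Perl source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only (not part of the Perl core).
 * Decodes one 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx) into a
 * code point. The byte shapes are checked with assert() as preconditions. */
static uint32_t decode_utf8_3byte(const unsigned char *p)
{
    assert((p[0] & 0xF0) == 0xE0);   /* lead byte of a 3-byte sequence */
    assert((p[1] & 0xC0) == 0x80);   /* continuation byte */
    assert((p[2] & 0xC0) == 0x80);   /* continuation byte */

    return ((uint32_t)(p[0] & 0x0F) << 12)   /* low 4 bits of the lead byte */
         | ((uint32_t)(p[1] & 0x3F) << 6)    /* 6 payload bits */
         |  (uint32_t)(p[2] & 0x3F);         /* 6 payload bits */
}

/* The bytes gdb printed as "\342\200\234" (octal) at the failing position. */
static const unsigned char dump_bytes[3] = { 0342, 0200, 0234 };
```

Calling `decode_utf8_3byte(dump_bytes)` gives 0x201C, LEFT DOUBLE QUOTATION MARK. The closing quote in the dump, `\342\200\235`, decodes to 0x201D in the same way.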

p5pRT commented 10 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

Thanks Karl for a quick response!

Got it down to a sensible size, I'm sure it can be reduced further, but I left the regexp in its original form for now (from SpamAssassin).

I realized it also fails on perl 5.20.0 built with clang.

The key seems to be in:

  use re 'taint';

...removing that avoids the crash.

Here is now the test program:

#!/usr/bin/perl

use strict;
use re 'taint';

my $tlds = qr/ (?:X(?:N--(?:MGB(?:A(?:(?:3A4F16|YH7GP)A|AM7A8H|B2BD)|ERP4A5D4AR|C0A9AZCG|BH1A71E|X4CD0AB|9AWBF)|F(?:IQ(?:(?:228C5H|S8|Z9)S|64B)|PCRJ9C3D|ZC2C9E2C)|C(?:LCHC0EA0B2G2A9GCD|ZR(?:694B|U2D)|G4BKI|1AVG)|X(?:KC2(?:DL3A5EE0H|AL3HYE2A)|HQ521B)|(?:(?:GEC|H2B)RJ9|Q9JYB4|90A3A)C|80A(?:S(?:EHDB|WG)|DXHKS|O21A)|N(?:QV7F(?:S00EMA)?|GBC5AZD)|3(?:E0B707E|BST00M|DS443G)|KP(?:R(?:W13|Y57)D|UT3I)|Y(?:FRO4I67O|GBI2AMMX)|6(?:QQ986B3XL|FRZ82G)|I(?:1B6B1A6A2E|O0A7I)|L(?:GBBAT1AD8J|1ACC)|(?:D1ACJ3|ZFR164)B|O(?:GBPF8FL|3CW4H)|S(?:9BRJ9C|ES554G)|4(?:5BRJ9C|GBRIM)|J(?:6W193G|1AMH)|55Q(?:W42G|X5D)|P(?:GBS0DH|1AI)|WGB(?:H1C|L6A)|1QQW23A|RHQV96G|UNUP4Y|VHQUV)|XX|YZ)|C(?:[CDFGKMNVWXZ]|O(?:N(?:S(?:TRUCTION|ULTING)|(?:TRACTOR|DO)S)|M(?:P(?:UTER|ANY)|MUNITY)?|(?:L(?:LEG|OGN)|FFE)E|O(?:[LP]|KING)|UNTRY|DES)?|A(?:R(?:E(?:ERS?)?|AVAN|DS)|(?:NCERRESEARC|S)H|P(?:ETOWN|ITAL)|T(?:ERING)?|M(?:ERA|P)|B)?|L(?:(?:EAN|OTH)ING|I(?:NIC|CK)|AIMS|UB)?|R(?:EDIT(?:CARD)?|UISES)?|H(?:RISTMAS|URCH|EAP)?|I(?:T(?:IC|Y))?|E(?:NTER|RN|O)|U(?:ISINELLA)?|Y(?:MRU)?)|S(?:[BDGJKLMNRTVXZ]|U(?:PP(?:L(?:IES|Y)|ORT)|R(?:GERY|F)|ZUKI)?|O(?:L(?:UTIONS|AR)|FTWARE|CIAL|HU|Y)?|C(?:[AB]|H(?:MIDT|ULE)|OT)?|A(?:ARLAND|RL)?|E(?:RVICES|XY)?|H(?:IKSHA|OES)?|P(?:IEGEL|ACE)|I(?:NGLES)?|Y(?:STEMS)?)|B(?:[BDFGHJSTVWY]|U(?:ILD(?:ERS)?|SINESS|ZZ)|A(?:R(?:GAINS)?|YERN)?|L(?:ACK(?:FRIDAY)?|UE)|E(?:RLIN|ER|ST)?|I(?:[DOZ]|KE)?|N(?:PPARIBAS)?|O(?:UTIQUE|O)?|R(?:USSELS)?|MW?|ZH?)|M(?:[CDGHKLMNPQRSTVWXYZ]|O(?:(?:RTGAG)?E|TORCYCLES|NASH|SCOW|BI|DA|V)?|A(?:N(?:AGEMENT|GO)|RKET(?:ING)?|ISON)?|E(?:(?:LBOURN|M)E|DIA|ET|NU)?|I(?:(?:AM|N)I|L)|U(?:SEUM)?)|P(?:[EFGKMNSWY]|R(?:O(?:D(?:UCTIONS)?|PERT(?:IES|Y))?|AXI|ESS)?|H(?:OTO(?:GRAPHY|S)?|YSIO)?|A(?:R(?:T(?:NER)?|I)S)?|I(?:C(?:TURE)?S|ZZA|NK)|L(?:UMBING|ACE)?|(?:OS)?T|UB)|G(?:[DFGHNPQSTWY]|R(?:A(?:PHIC|TI)S|EEN|IPE)?|U(?:I(?:TARS|DE)|RU)?|L(?:OB(?:AL|O)|ASS)?|A(?:L(?:LERY)?)?|I(?:FTS?|VES)?|M(?:AIL|O)?|B(?:IZ)?|E(?:NT)?|O[PV])|A(?:[DFLMNOQWZ]|C(?:T(?:IVE|OR)|COUNTANTS|ADEMY)?|U(?:CTION|DIO|TOS)?|S(?:SOCIATES|IA)?|R(?:CHI|MY|PA)?|I(?:RFORCE)?|T(?:TORNEY)?|G(?:ENCY)?|E(?:RO)?|XA?)|F(?:[JM]|I(?:NANC(?:IAL|E)|SH(?:ING)?|TNESS)?|U(?:RNITURE|TBOL|ND)|L(?:IGHTS|ORIST)|O(?:UNDATION|O)?|R(?:OGANS|L)?|(?:EEDBAC)?K|A(?:IL|RM))|R(?:E(?:P(?:UBLICAN|AIR|ORT)|(?:CIPE|VIEW)S|S(?:TAURAN)?T|N(?:TALS)?|ALTOR|ISEN?|HAB|D)?|O(?:CKS|DEO)?|I(?:CH|O)|S(?:VP)?|U(?:HR)?|YUKYU|W)|D(?:[JKMZ]|I(?:(?:SCOUN|E)T|RECT(?:ORY)?|AMONDS|GITAL)|E(?:NT(?:IST|AL)|MOCRAT|GREE|ALS|SI)?|A(?:[DY]|TING|NCE)|O(?:MAINS)?|URBAN|NP)|T(?:[CDFGHJKLMNPTVWZ]|O(?:(?:OL|Y)S|DAY|KYO|WN|P)?|R(?:A(?:INING|VEL|DE))?|A(?:T(?:TOO|AR)|X)|I(?:ENDA|ROL|PS)|E(?:CHNOLOGY|L))|E(?:[CEGR]|N(?:GINEER(?:ING)?|TERPRISES)|X(?:P(?:OSED|ERT)|CHANGE)|(?:QUIPMEN|A)?T|DU(?:CATION)?|S(?:TATE|Q)?|VENTS|MAIL|US?)|V(?:[CGU]|E(?:(?:NTURE|GA)S|RSICHERUNG|T)?|O(?:T(?:[EO]|ING)|YAGE|DKA)|I(?:(?:AJE|LLA)S|SION)?|(?:LAANDERE)?N|A(?:CATIONS)?)|L(?:[BCKRSVY]|I(?:M(?:ITED|O)|GHTING|FE|NK)?|A(?:CAIXA|WYER|ND)?|O(?:NDON|ANS|TTO)|U(?:X(?:URY|E))?|T(?:DA)?|EASE|GBT)|H(?:[KMNRTU]|O(?:L(?:DINGS|IDAY)|ST(?:ING)?|[RU]SE|MES|W)|E(?:(?:ALTHCA)?RE|LP)|A(?:MBURG|US)|I(?:PHOP|V))|I(?:[DELOQRST]|N(?:[GK]|(?:VESTMENT|DUSTRIE)S|T(?:ERNATIONAL)?|S(?:TITUT|UR)E|FO)?|M(?:MO(?:BILIEN)?)?)|W(?:E(?:B(?:SITE|CAM)|D)|I(?:LLIAMHILL|EN|KI)|A(?:LES|TCH|NG)|(?:ORK)?S|HOSWHO|T[CF]|F)|N(?:[FLOPUZ]|E(?:T(?:WORK)?|USTAR|W)?|A(?:GOYA|ME|VY)?|I(?:NJA)?|R[AW]?|GO?|Y?C|HK)|K(?:[EGHMPWYZ]|I(?:TCHEN|WI|M)?|(?:AUFE|OEL)?N|R(?:E?D)?)|O(?:(?:KINAW|TSUK)A|RG(?:ANIC)?|N[GL]|OO|VH|M)|J(?:[MP]|O(?:B(?:URG|S))?|E(?:TZT)?|UEGOS)|Y(?:[ET]|O(?:KOHAMA|UTUBE)|A(?:CHTS|NDEX))|U(?:[AGKSYZ]|N(?:IVERSITY|O)|OL)|Q(?:UEBEC|PON|A)|Z(?:[AMW]|ONE))/ix;

my $email_regex = qr/
  (?=.{0,64}\@) # limit userpart to 64 chars (and speed up searching?)
  (?<![a-z0-9!#\$%&'*+\/=?^_`{|}~-]) # start boundary
  ( # capture email
  [a-z0-9!#\$%&'*+\/=?^_`{|}~-]+ # no dot in beginning
  (?:\.[a-z0-9!#\$%&'*+\/=?^_`{|}~-]+)* # no consecutive dots, no ending dot
  \@
  (?:[a-z0-9](?:[a-z0-9-]{0,59}[a-z0-9])?\.){1,4} # max 4x61 char parts (should be enough?)
  ${tlds} # ends with valid tld
  )
  (?!(?:[a-z0-9-]|\.[a-z0-9])) # make sure domain ends here
/xi;

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{<?(?<!mailto:)${email_regex}(?:>|\s{1,10}(?!(?:fa(?:x|csi)|tel|phone|e?-?mail))[a-z]{2,11}:)}{ }gi;
}

p5pRT commented 10 years ago

From @khwilliamson

On 09/10/2014 11:51 AM, Mark Martinec wrote:

Thanks Karl for a quick response!

Got it down to a sensible size, I'm sure it can be reduced further, but I left the regexp in its original form for now (from SpamAssassin).

I realized it also fails on perl 5.20.0 built with clang.

This is not a recent regression, as it fails back through at least 5.12. Neither valgrind nor clang asan give any extra information.

I'm hoping someone with more expertise than I currently have in this area will look at this.

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

Got it down to this small test program:

#!/usr/bin/perl

use strict;
use re 'taint';

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
  (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
}

perl 5.20.{0,1} : Assertion failed: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 7455.
Abort trap

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 15:05, Mark Martinec <Mark.Martinec@ijs.si> wrote:

Got it down to this small test program:

#!/usr/bin/perl

use strict;
use re 'taint';

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
  (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
}

perl 5.20.{0,1} : Assertion failed: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 7455.
Abort trap

Are you sure you have the script right? Because that code never fetches from a capture buffer and does not fail on bleadperl.

I added a "print $1" to the script:

$ cat rt122747.t
#!/usr/bin/perl

use strict;
use re 'taint';

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
  (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
  print "matched: >>$1<<\n";
}
__END__

And here is what I get from blead:

$ ./perl -Ilib -T rt122747.t
matched: >>.xxxx@outlook.com<<
matched: >>.xxxx@outlook.com<<

/me confused.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @demerphq

On 10 September 2014 20:55, Karl Williamson <public@khwilliamson.com> wrote:

On 09/10/2014 11:51 AM, Mark Martinec wrote:

Thanks Karl for a quick response!

Got it down to a sensible size, I'm sure it can be reduced further, but I left the regexp in its original form for now (from SpamAssassin).

I realized it also fails on perl 5.20.0 built with clang.

This is not a recent regression, as it fails back through at least 5.12. Neither valgrind nor clang asan give any extra information.

I'm hoping someone with more expertise than I currently have in this area will look at this.

I can't reproduce it with bleadperl at all. Nor can I reproduce it with 5.14.

So I am pretty confused here. What script did you use to determine it is an old regression?

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @Hugmeir

On Thu, Sep 11, 2014 at 4:57 PM, demerphq <demerphq@gmail.com> wrote:

On 11 September 2014 15:05, Mark Martinec <Mark.Martinec@ijs.si> wrote:

Got it down to this small test program:

#!/usr/bin/perl

use strict;
use re 'taint';

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
  (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
}

perl 5.20.{0,1} : Assertion failed: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 7455.
Abort trap

Are you sure you have the script right? Because that code never fetches from a capture buffer and does not fail on bleadperl.

I added a "print $1" to the script:

$ cat rt122747.t
#!/usr/bin/perl

use strict;
use re 'taint';

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
  (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
  print "matched: >>$1<<\n";
}
__END__

And here is what I get from blead:

$ ./perl -Ilib -T rt122747.t
matched: >>.xxxx@outlook.com<<
matched: >>.xxxx@outlook.com<<

/me confused.

Yves

I can reproduce this on 5.10-5.20 but only for debugging builds; maybe you forgot a -DDEBUGGING?

p5pRT commented 10 years ago

From @demerphq

And here is what I get from blead:

$ ./perl -Ilib -T rt122747.t
matched: >>.xxxx@outlook.com<<
matched: >>.xxxx@outlook.com<<

/me confused.

Yves

I can reproduce this on 5.10-5.20 but only for debugging builds; maybe you forgot a -DDEBUGGING?

Bah, thought I was on a DEBUGGING build, but I wasn't. Thanks Brian.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @cpansprout

On Thu Sep 11 06:07:03 2014, mmartinec wrote:

Got it down to this small test program:

#!/usr/bin/perl

use strict;
use re 'taint';

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
  (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
}

perl 5.20.{0,1} : Assertion failed: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 7455.
Abort trap

I think what’s happening is that the kludge to localise $1, etc. is executed when the regexp is in an inconsistent state. rx->subbeg is referring to the string from the previous match ('<mailto:xxxx.xxxx@outlook.com>'), but the offsets for $1 extend beyond the end of the 30-character string:

(gdb) p rx->offs[1]
$8 = {
  start = 12,
  end = 33,
  start_tmp = 12
}
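
The mismatch can be checked with plain arithmetic. Below is a minimal sketch in C of the bounds check that the assertion expresses, assuming the capture start is measured from the beginning of the saved string (no sub-offset). The `capture_pair` and `match_state` types and the `capture_in_bounds` function are simplified, hypothetical stand-ins for illustration; they are not the real structures from the Perl source.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the fields named in the report;
 * the real structures in the Perl source carry much more state. */
struct capture_pair { ptrdiff_t start, end; };

struct match_state {
    size_t sublen;               /* length of the string subbeg points at */
    struct capture_pair offs[2]; /* offs[1] describes $1 */
};

/* Mirrors the shape of the failed assertion,
 *   sublen >= (s - subbeg) + i
 * taking (s - subbeg) == start and i == end - start.
 * Returns 1 when capture n lies inside the saved string, 0 otherwise. */
static int capture_in_bounds(const struct match_state *m, int n)
{
    ptrdiff_t start = m->offs[n].start;
    ptrdiff_t len   = m->offs[n].end - start;

    assert(start >= 0 && len >= 0);  /* this sketch only handles set captures */
    return m->sublen >= (size_t)(start + len);
}

/* Values from the gdb session: a 30-character saved string, $1 at 12..33. */
static const struct match_state stale = { 30, { {0, 0}, {12, 33} } };
```

With these values `capture_in_bounds(&stale, 1)` returns 0, because 12 + 21 = 33 exceeds 30. The same offsets would be in range for a 37-byte string, which is the length of the second input (strend minus strbeg in the backtrace that follows), consistent with offsets from the new match being paired with the string of the old one.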

A watchpoint on rx->offs shows that it gets swapped out here in regexec.c:

2706 swap = prog->offs;
2707 /* do we need a save destructor here for eval dies? */
2708 Newxz(prog->offs, (prog->nparens + 1), regexp_paren_pair);
2709 DEBUG_BUFFERS_r(PerlIO_printf(Perl_debug_log,
2710 "rex=0x%"UVxf" saving offs: orig=0x%"UVxf" new=0x%"UVxf"\n"

when the backtrace is like this:

#0 Perl_regexec_flags (my_perl=0x100803200, rx=0x10082fdf8, stringarg=0x10060b658 "Aškerèeva xxxx.xxxx@outlook.com ”", strend=0x10060b67d "", strbeg=0x10060b658 "Aškerèeva xxxx.xxxx@outlook.com ”", minend=0, sv=0x1008063e8, data=0x0, flags=1) at regexec.c:2709
#1 0x0000000100247f3f in Perl_pp_subst (my_perl=0x100803200) at pp_hot.c:2120
#2 0x00000001001b847c in Perl_runops_debug (my_perl=0x100803200) at dump.c:2231
#3 0x000000010000a8ea in S_run_body (my_perl=0x100803200, oldscope=1) at perl.c:2416
#4 0x0000000100009905 in perl_run (my_perl=0x100803200) at perl.c:2339
#5 0x0000000100072698 in main (argc=3, argv=0x7fff5fbffa78, env=0x7fff5fbffa98) at miniperlmain.c:120

So the ordering of some of this stuff needs to be rethought.

A git bisect points me to this commit:

commit 44a2ac759eaf811ea851bdf9177a51bf9b95b5ce
Author: Yves Orton <demerphq@gmail.com>
Date: Fri Dec 29 22:45:51 2006 +0100

  Re: [PATCH] Change implementation of %+ to use a proper tied hash interface and add support for %-
  Message-ID: <9b18b3110612291245q792fe91cu69422d2b81bb4f0b@mail.gmail.com>

But I think it’s a false positive.

--

Father Chrysostomos

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

I can reproduce this on 5.10-5.20 but only for debugging builds.

Indeed, I'm using a -DDEBUGGING perl.

Are you sure you have the script right?

Yes.

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 17:36, Mark Martinec <Mark.Martinec@ijs.si> wrote:

I can reproduce this on 5.10-5.20 but only for debugging builds.

Indeed, I'm using a -DDEBUGGING perl.

Are you sure you have the script right?

Yes.

Hrm, well even on a DEBUGGING build I cannot replicate in blead.

Can you show me your perl -V and the output of MY version of your script? (Attached)

Brian, if you happen to have your Configure options handy I would appreciate knowing what they are.

And again I find it very odd that a script which never reads a capture buffer dies with this error.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @demerphq

#!/usr/bin/perl

use strict;
use re 'taint';

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
  (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
  print "matched: >>$1<<\n";
}

p5pRT commented 10 years ago

From @demerphq

#!/usr/bin/perl

use strict;
use re 'taint';

my $tlds = qr/ (?:X(?:N--(?:MGB(?:A(?:(?:3A4F16|YH7GP)A|AM7A8H|B2BD)|ERP4A5D4AR|C0A9AZCG|BH1A71E|X4CD0AB|9AWBF)|F(?:IQ(?:(?:228C5H|S8|Z9)S|64B)|PCRJ9C3D|ZC2C9E2C)|C(?:LCHC0EA0B2G2A9GCD|ZR(?:694B|U2D)|G4BKI|1AVG)|X(?:KC2(?:DL3A5EE0H|AL3HYE2A)|HQ521B)|(?:(?:GEC|H2B)RJ9|Q9JYB4|90A3A)C|80A(?:S(?:EHDB|WG)|DXHKS|O21A)|N(?:QV7F(?:S00EMA)?|GBC5AZD)|3(?:E0B707E|BST00M|DS443G)|KP(?:R(?:W13|Y57)D|UT3I)|Y(?:FRO4I67O|GBI2AMMX)|6(?:QQ986B3XL|FRZ82G)|I(?:1B6B1A6A2E|O0A7I)|L(?:GBBAT1AD8J|1ACC)|(?:D1ACJ3|ZFR164)B|O(?:GBPF8FL|3CW4H)|S(?:9BRJ9C|ES554G)|4(?:5BRJ9C|GBRIM)|J(?:6W193G|1AMH)|55Q(?:W42G|X5D)|P(?:GBS0DH|1AI)|WGB(?:H1C|L6A)|1QQW23A|RHQV96G|UNUP4Y|VHQUV)|XX|YZ)|C(?:[CDFGKMNVWXZ]|O(?:N(?:S(?:TRUCTION|ULTING)|(?:TRACTOR|DO)S)|M(?:P(?:UTER|ANY)|MUNITY)?|(?:L(?:LEG|OGN)|FFE)E|O(?:[LP]|KING)|UNTRY|DES)?|A(?:R(?:E(?:ERS?)?|AVAN|DS)|(?:NCERRESEARC|S)H|P(?:ETOWN|ITAL)|T(?:ERING)?|M(?:ERA|P)|B)?|L(?:(?:EAN|OTH)ING|I(?:NIC|CK)|AIMS|UB)?|R(?:EDIT(?:CARD)?|UISES)?|H(?:RISTMAS|URCH|EAP)?|I(?:T(?:IC|Y))?|E(?:NTER|RN|O)|U(?:ISINELLA)?|Y(?:MRU)?)|S(?:[BDGJKLMNRTVXZ]|U(?:PP(?:L(?:IES|Y)|ORT)|R(?:GERY|F)|ZUKI)?|O(?:L(?:UTIONS|AR)|FTWARE|CIAL|HU|Y)?|C(?:[AB]|H(?:MIDT|ULE)|OT)?|A(?:ARLAND|RL)?|E(?:RVICES|XY)?|H(?:IKSHA|OES)?|P(?:IEGEL|ACE)|I(?:NGLES)?|Y(?:STEMS)?)|B(?:[BDFGHJSTVWY]|U(?:ILD(?:ERS)?|SINESS|ZZ)|A(?:R(?:GAINS)?|YERN)?|L(?:ACK(?:FRIDAY)?|UE)|E(?:RLIN|ER|ST)?|I(?:[DOZ]|KE)?|N(?:PPARIBAS)?|O(?:UTIQUE|O)?|R(?:USSELS)?|MW?|ZH?)|M(?:[CDGHKLMNPQRSTVWXYZ]|O(?:(?:RTGAG)?E|TORCYCLES|NASH|SCOW|BI|DA|V)?|A(?:N(?:AGEMENT|GO)|RKET(?:ING)?|ISON)?|E(?:(?:LBOURN|M)E|DIA|ET|NU)?|I(?:(?:AM|N)I|L)|U(?:SEUM)?)|P(?:[EFGKMNSWY]|R(?:O(?:D(?:UCTIONS)?|PERT(?:IES|Y))?|AXI|ESS)?|H(?:OTO(?:GRAPHY|S)?|YSIO)?|A(?:R(?:T(?:NER)?|I)S)?|I(?:C(?:TURE)?S|ZZA|NK)|L(?:UMBING|ACE)?|(?:OS)?T|UB)|G(?:[DFGHNPQSTWY]|R(?:A(?:PHIC|TI)S|EEN|IPE)?|U(?:I(?:TARS|DE)|RU)?|L(?:OB(?:AL|O)|ASS)?|A(?:L(?:LERY)?)?|I(?:FTS?|VES)?|M(?:AIL|O)?|B(?:IZ)?|E(?:NT)?|O[PV])|A(?:[DFLMNOQWZ]|C(?:T(?:IVE|OR)|COUNTANTS|ADEMY)?|U(?:CTION|DIO|TOS)?|S(?:SOCIATES|IA)?|R(?:CHI|MY|PA)?|I(?:RFORCE)?|T(?:TORNEY)?|G(?:ENCY)?|E(?:RO)?|XA?)|F(?:[JM]|I(?:NANC(?:IAL|E)|SH(?:ING)?|TNESS)?|U(?:RNITURE|TBOL|ND)|L(?:IGHTS|ORIST)|O(?:UNDATION|O)?|R(?:OGANS|L)?|(?:EEDBAC)?K|A(?:IL|RM))|R(?:E(?:P(?:UBLICAN|AIR|ORT)|(?:CIPE|VIEW)S|S(?:TAURAN)?T|N(?:TALS)?|ALTOR|ISEN?|HAB|D)?|O(?:CKS|DEO)?|I(?:CH|O)|S(?:VP)?|U(?:HR)?|YUKYU|W)|D(?:[JKMZ]|I(?:(?:SCOUN|E)T|RECT(?:ORY)?|AMONDS|GITAL)|E(?:NT(?:IST|AL)|MOCRAT|GREE|ALS|SI)?|A(?:[DY]|TING|NCE)|O(?:MAINS)?|URBAN|NP)|T(?:[CDFGHJKLMNPTVWZ]|O(?:(?:OL|Y)S|DAY|KYO|WN|P)?|R(?:A(?:INING|VEL|DE))?|A(?:T(?:TOO|AR)|X)|I(?:ENDA|ROL|PS)|E(?:CHNOLOGY|L))|E(?:[CEGR]|N(?:GINEER(?:ING)?|TERPRISES)|X(?:P(?:OSED|ERT)|CHANGE)|(?:QUIPMEN|A)?T|DU(?:CATION)?|S(?:TATE|Q)?|VENTS|MAIL|US?)|V(?:[CGU]|E(?:(?:NTURE|GA)S|RSICHERUNG|T)?|O(?:T(?:[EO]|ING)|YAGE|DKA)|I(?:(?:AJE|LLA)S|SION)?|(?:LAANDERE)?N|A(?:CATIONS)?)|L(?:[BCKRSVY]|I(?:M(?:ITED|O)|GHTING|FE|NK)?|A(?:CAIXA|WYER|ND)?|O(?:NDON|ANS|TTO)|U(?:X(?:URY|E))?|T(?:DA)?|EASE|GBT)|H(?:[KMNRTU]|O(?:L(?:DINGS|IDAY)|ST(?:ING)?|[RU]SE|MES|W)|E(?:(?:ALTHCA)?RE|LP)|A(?:MBURG|US)|I(?:PHOP|V))|I(?:[DELOQRST]|N(?:[GK]|(?:VESTMENT|DUSTRIE)S|T(?:ERNATIONAL)?|S(?:TITUT|UR)E|FO)?|M(?:MO(?:BILIEN)?)?)|W(?:E(?:B(?:SITE|CAM)|D)|I(?:LLIAMHILL|EN|KI)|A(?:LES|TCH|NG)|(?:ORK)?S|HOSWHO|T[CF]|F)|N(?:[FLOPUZ]|E(?:T(?:WORK)?|USTAR|W)?|A(?:GOYA|ME|VY)?|I(?:NJA)?|R[AW]?|GO?|Y?C|HK)|K(?:[EGHMPWYZ]|I(?:TCHEN|WI|M)?|(?:AUFE|OEL)?N|R(?:E?D)?)|O(?:(?:KINAW|TSUK)A|RG(?:ANIC)?|N[GL]|OO|VH|M)|J(?:[MP]|O(?:B(?:URG|S))?|E(?:TZT)?|UEGOS)|Y(?:[ET]|O(?:KOHAMA|UTUBE)|A(?:CHTS|NDEX))|U(?:[AGKSYZ]|N(?:IVERSITY|O)|OL)|Q(?:UEBEC|PON|A)|Z(?:[AMW]|ONE))/ix;

my $email_regex = qr/
  (?=.{0,64}\@) # limit userpart to 64 chars (and speed up searching?)
  (?<![a-z0-9!#\$%&'*+\/=?^_`{|}~-]) # start boundary
  ( # capture email
  [a-z0-9!#\$%&'*+\/=?^_`{|}~-]+ # no dot in beginning
  (?:\.[a-z0-9!#\$%&'*+\/=?^_`{|}~-]+)* # no consecutive dots, no ending dot
  \@
  (?:[a-z0-9](?:[a-z0-9-]{0,59}[a-z0-9])?\.){1,4} # max 4x61 char parts (should be enough?)
  ${tlds} # ends with valid tld
  )
  (?!(?:[a-z0-9-]|\.[a-z0-9])) # make sure domain ends here
/xi;

my(@body) = (
  "<mailto:xxxx.xxxx\@outlook.com>",
  "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
);

for (@body) {
  s{<?(?<!mailto:)${email_regex}(?:>|\s{1,10}(?!(?:fa(?:x|csi)|tel|phone|e?-?mail))[a-z]{2,11}:)}{ }gi;
}

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

Hrm, well even on a DEBUGGING build I cannot replicate in blead.

Can you show me your perl -V and the output of MY version of your script? (Attached)

$ ./rt122747.t
matched: >>.xxxx@outlook.com<<
Assertion failed: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 7455.
Abort trap

Happens on two hosts, one has 5.20.1-RC2 as documented at the top of this PR (gcc48, FreeBSD10.0, -DDEBUGGING),

the other host is a 5.20.0, built with a clang 3.4.1 compiler, also on a FreeBSD 10.0, as follows:

$ perl -V
Summary of my perl5 (revision 5 version 20 subversion 0) configuration:

  Platform:
    osname=freebsd, osvers=10.0-release, archname=amd64-freebsd-thread-multi
    uname='freebsd 10amd64-ws-default-job-03 10.0-release freebsd 10.0-release amd64 '
    config_args='-sde -Dprefix=/usr/local -Darchlib=/usr/local/lib/perl5/5.20/mach -Dprivlib=/usr/local/lib/perl5/5.20 -Dman3dir=/usr/local/lib/perl5/5.20/perl/man/man3 -Dman1dir=/usr/local/man/man1 -Dsitearch=/usr/local/lib/perl5/site_perl/5.20/mach -Dsitelib=/usr/local/lib/perl5/site_perl/5.20 -Dscriptdir=/usr/local/bin -Dsiteman3dir=/usr/local/lib/perl5/5.20/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Ui_malloc -Ui_iconv -Uinstallusrbinperl -Dcc=cc -Duseshrplib -Dinc_version_list=none -Dccflags=-DAPPLLIB_EXP="/usr/local/lib/perl5/5.20/BSDPAN" -Doptimize=-g -DDEBUGGING -Ui_gdbm -Duse64bitint -Dusethreads=y -Dusemymalloc=n'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.20/BSDPAN" -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-g',
    cppflags='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.20/BSDPAN" -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.2.1 Compatible FreeBSD Clang 3.3 (tags/RELEASE_33/final 183502)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-pthread -Wl,-E -fstack-protector -L/usr/local/lib'
    libpth=/usr/lib /usr/local/lib /usr/include/clang/3.3 /usr/lib
    libs=-lm -lcrypt -lutil
    perllibs=-lm -lcrypt -lutil
    libc=, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' -Wl,-R/usr/local/lib/perl5/5.20/mach/CORE'
    cccdlflags='-DPIC -fPIC', lddlflags='-shared -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING HAS_TIMES MULTIPLICITY PERLIO_LAYERS
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
    PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
    PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV
    PERL_TRACK_MEMPOOL USE_64_BIT_ALL USE_64_BIT_INT
    USE_ITHREADS USE_LARGE_FILES USE_LOCALE
    USE_LOCALE_COLLATE USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
    USE_REENTRANT_API
  Built under freebsd
  Compiled at Jun 16 2014 15:12:36
  @INC:
    /usr/local/lib/perl5/5.20/BSDPAN
    /usr/local/lib/perl5/site_perl/5.20/mach
    /usr/local/lib/perl5/site_perl/5.20
    /usr/local/lib/perl5/5.20/mach
    /usr/local/lib/perl5/5.20
    .

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 17​:30\, Father Chrysostomos via RT \< perlbug-followup@​perl.org> wrote​:

On Thu Sep 11 06​:07​:03 2014\, mmartinec wrote​:

Got it down to this small test program​:

#!/usr/bin/perl

use strict; use re 'taint';

my(@body) = ( "\<mailto:xxxx.xxxx\@outlook.com>"\, "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}"\, );

for (@​body) { s{ \<? (?\<!mailto​:) \b ( [a-z0-9.]+ \@​ \S+ ) \b (?​: > | \s{1\,10} (?!phone) [a-z]{2\,11} : ) }{ }xgi; }

perl 5.20.{0\,1} : Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7455. Abort trap

I think what’s happening is that the kludge to localise $1\, etc. is executed when the regexp is in an inconsistent state. rx->subbeg is referring to the string from the previous match ('\<mailto​: xxxx.xxxx@​outlook.com>')\, but the offsets for $1 extend beyond the end of the 30-character string​:

(gdb) p rx->offs[1] $8 = { start = 12\, end = 33\, start_tmp = 12 }

A watchpoint on rx->offs shows that it gets swapped out here in regexec.c​:

2706 swap = prog->offs; 2707 /* do we need a save destructor here for eval dies? */ 2708 Newxz(prog->offs\, (prog->nparens + 1)\, regexp_paren_pair); 2709 DEBUG_BUFFERS_r(PerlIO_printf(Perl_debug_log\, 2710 "rex=0x%"UVxf" saving offs​: orig=0x%"UVxf" new=0x%"UVxf"\n"

when the backtrace is like this​:

#0 Perl_regexec_flags (my_perl=0x100803200\, rx=0x10082fdf8\, stringarg=0x10060b658 "Aškerèeva xxxx.xxxx@outlook.com ”"\, strend=0x10060b67d ""\, strbeg=0x10060b658 "Aškerèeva xxxx.xxxx@outlook.com ”"\, minend=0\, sv=0x1008063e8\, data=0x0\, flags=1) at regexec.c:2709
#1 0x0000000100247f3f in Perl_pp_subst (my_perl=0x100803200) at pp_hot.c:2120
#2 0x00000001001b847c in Perl_runops_debug (my_perl=0x100803200) at dump.c:2231
#3 0x000000010000a8ea in S_run_body (my_perl=0x100803200\, oldscope=1) at perl.c:2416
#4 0x0000000100009905 in perl_run (my_perl=0x100803200) at perl.c:2339
#5 0x0000000100072698 in main (argc=3\, argv=0x7fff5fbffa78\, env=0x7fff5fbffa98) at miniperlmain.c:120

So the ordering of some of this stuff needs to be rethought.
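The mismatch described above (a 30-character saved string, but a `$1` that ends at offset 33) can be checked with a small model. This is only a sketch: the struct and function names are hypothetical, the real regexp struct has many more fields, and it assumes that in the asserted expression `s` points at the start of the capture and `i` is its length, so that `(s - rx->subbeg) + i` is simply the capture's end offset.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one capture group's offsets. */
typedef struct {
    ptrdiff_t start;   /* offset of the capture's first byte, or -1 if unset */
    ptrdiff_t end;     /* offset one past the capture's last byte            */
} paren_pair;

/* Hypothetical, simplified stand-in for the fields named in the report. */
typedef struct {
    const char *subbeg;  /* saved copy of the string that was last matched */
    size_t      sublen;  /* length of that saved copy, in bytes            */
    paren_pair *offs;    /* one offset pair per capture group              */
} model_rx;

/* Returns true when capture group `paren` lies entirely inside the saved
 * string. This mirrors the condition in the failing assertion under the
 * assumption stated above (start + length == end). */
static bool capture_in_bounds(const model_rx *rx, size_t paren)
{
    const paren_pair *p = &rx->offs[paren];

    if (p->start < 0 || p->end < p->start)
        return false;                     /* group is unset or malformed */
    return rx->sublen >= (size_t)p->end;  /* end must not pass the saved string */
}
```

With the values from the gdb session (`sublen` of 30, `start = 12`, `end = 33`) the check returns false, which is the state the assertion catches.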

A git bisect points me to this commit​:

commit 44a2ac759eaf811ea851bdf9177a51bf9b95b5ce
Author: Yves Orton \<demerphq@gmail.com>
Date: Fri Dec 29 22:45:51 2006 +0100

Re: \[PATCH\] Change implementation of %\+ to use a proper tied hash interface and add support for %-

Message-ID: \<9b18b3110612291245q792fe91cu69422d2b81bb4f0b@mail.gmail.com>

But I think it’s a false positive.

Yes it almost definitely is. That is the patch where Perl_reg_numbered_buff_fetch() was added so it could be reused. Prior to that I bet there was no assert.

I still do not see where this function is called. Can you show me the backtrace from where the assert fires? I am as yet unable to replicate on bleadperl.

Yves

p5pRT commented 10 years ago

From @Hugmeir

On Thu\, Sep 11\, 2014 at 5:51 PM\, demerphq \<demerphq@gmail.com> wrote:

On 11 September 2014 17:36\, Mark Martinec \<Mark.Martinec@ijs.si> wrote:

I can reproduce this on 5.10-5.20 but only for debugging builds.

Indeed\, I'm using a -DDEBUGGING perl.

Are you sure you have the script right?

Yes.

Hrm\, well even on a DEBUGGING build I cannot replicate in blead.

Can you show me your perl -V and the output of MY version of your script? (Attached)

Brian\, if you happen to have your Configure options handy\, I would appreciate knowing what they are.

And again I find it very odd that a script which never reads a capture buffer dies with this error.

ml99299​:perl-blead brfraser$ ./perl -Ilib ~/Downloads/rt122747.t matched​: >>.xxxx@​outlook.com\<\< Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7552. Abort trap​: 6

ml99299​:perl-blead brfraser$ ./perl -Ilib ~/Downloads/rt122747_2.t Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7552. Abort trap​: 6

$ ./perl -Ilib -V Summary of my perl5 (revision 5 version 21 subversion 4) configuration​:   Local Commit​: ff5975c78387030c95c7f997ee4755c6256d8360   Ancestor​: 2febb45ac8fe9a31602934af3d9c14587543a3d9   Platform​:   osname=darwin\, osvers=13.3.0\, archname=darwin-2level   uname='darwin ml99299 13.3.0 darwin kernel version 13.3.0​: tue jun 3 21​:27​:35 pdt 2014; root​:xnu-2422.110.17~1release_x86_64 x86_64 '   config_args='-des -Dusedevel -DDEBUGGING'   hint=recommended\, useposix=true\, d_sigaction=define   useithreads=undef\, usemultiplicity=undef   use64bitint=define\, use64bitall=define\, uselongdouble=undef   usemymalloc=n\, bincompat5005=undef   Compiler​:   cc='cc'\, ccflags ='-fno-common -DPERL_DARWIN -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector'\,   optimize='-O3 -g'\,   cppflags='-fno-common -DPERL_DARWIN -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector'   ccversion=''\, gccversion='4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)'\, gccosandvers=''   intsize=4\, longsize=8\, ptrsize=8\, doublesize=8\, byteorder=12345678   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=16\, longdblkind=3   ivtype='long'\, ivsize=8\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=8   alignbytes=8\, prototype=define   Linker and Libraries​:   ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc'\, ldflags =' -fstack-protector'   libpth=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/5.1/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib   libs=-ldbm -ldl -lm -lutil -lc   perllibs=-ldl -lm -lutil -lc   libc=\, so=dylib\, useshrplib=false\, libperl=libperl.a   gnulibc_version=''   Dynamic Linking​:   dlsrc=dl_dlopen.xs\, dlext=bundle\, d_dlsymun=undef\, ccdlflags=' '   cccdlflags=' '\, lddlflags=' -bundle -undefined dynamic_lookup -fstack-protector'

Characteristics of this binary (from libperl)​:   Compile-time options​: DEBUGGING HAS_TIMES PERLIO_LAYERS   PERL_DONT_CREATE_GVSV   PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP   PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV   PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT   USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE   USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME   USE_PERLIO USE_PERL_ATOF   Locally applied patches​: ff5975c78387030c95c7f997ee4755c6256d8360   Built under darwin   Compiled at Sep 11 2014 18​:10​:30   %ENV​:   PERL5LIB="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all/lib/perl5​:/Volumes/git_tree/main/lib"   PERLBREW_BASHRC_VERSION="0.67"   PERLBREW_HOME="/Users/brfraser/.perlbrew"   PERLBREW_LIB="all"   PERLBREW_MANPATH="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all/man​:/Users/brfraser/perl5/perlbrew/perls/perl-5.18.2/man"   PERLBREW_PATH="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all/bin​:/Users/brfraser/perl5/perlbrew/bin​:/Users/brfraser/perl5/perlbrew/perls/perl-5.18.2/bin"   PERLBREW_PERL="perl-5.18.2"   PERLBREW_ROOT="/Users/brfraser/perl5/perlbrew"   PERLBREW_VERSION="0.67"   PERL_LOCAL_LIB_ROOT="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all"   PERL_MB_OPT="--install_base /Users/brfraser/.perlbrew/libs/perl-5.18.2@​all"   PERL_MM_OPT="INSTALL_BASE=/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all"   @​INC​:   lib   /Users/brfraser/.perlbrew/libs/perl-5.18.2@​all/lib/perl5/darwin-2level   /Users/brfraser/.perlbrew/libs/perl-5.18.2@​all/lib/perl5   /Volumes/git_tree/main/lib   /usr/local/lib/perl5/site_perl/5.21.4/darwin-2level   /usr/local/lib/perl5/site_perl/5.21.4   /usr/local/lib/perl5/5.21.4/darwin-2level   /usr/local/lib/perl5/5.21.4   .

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 18:13\, Brian Fraser \<fraserbn@gmail.com> wrote:

On Thu\, Sep 11\, 2014 at 5:51 PM\, demerphq \<demerphq@gmail.com> wrote:

On 11 September 2014 17:36\, Mark Martinec \<Mark.Martinec@ijs.si> wrote:

I can reproduce this on 5.10-5.20 but only for debugging builds.

Indeed\, I'm using a -DDEBUGGING perl.

Are you sure you have the script right?

Yes.

Hrm\, well even on a DEBUGGING build I cannot replicate in blead.

Can you show me your perl -V and the output of MY version of your script? (Attached)

Brian\, if you happen to have your Configure options handy\, I would appreciate knowing what they are.

And again I find it very odd that a script which never reads a capture buffer dies with this error.

ml99299​:perl-blead brfraser$ ./perl -Ilib ~/Downloads/rt122747.t matched​: >>.xxxx@​outlook.com\<\< Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7552. Abort trap​: 6

ml99299​:perl-blead brfraser$ ./perl -Ilib ~/Downloads/rt122747_2.t Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7552. Abort trap​: 6

Well\, this doesn't make any sense to me.

Can you show me the exact Configure you used\, because so far every one I have tried does not fail?

For instance I used this​:

./Configure -Doptimize=-g -d -Dusedevel -Dusethreads -Dcc=ccache\ gcc -Dld=gcc -DDEBUGGING -Accflags="-msse2 -mssse3 -maes"

and this​:

./Configure -Doptimize=-g -d -Dusedevel -Dcc=ccache\ gcc -Dld=gcc -DDEBUGGING -Accflags="-msse2 -mssse3 -maes"

and neither produces the failure you describe.

I would very much like to help fix this\, but if I can't replicate it I can't help.

$ ./perl -Ilib -V Summary of my perl5 (revision 5 version 21 subversion 4) configuration​: Local Commit​: ff5975c78387030c95c7f997ee4755c6256d8360 Ancestor​: 2febb45ac8fe9a31602934af3d9c14587543a3d9 Platform​: osname=darwin\, osvers=13.3.0\, archname=darwin-2level uname='darwin ml99299 13.3.0 darwin kernel version 13.3.0​: tue jun 3 21​:27​:35 pdt 2014; root​:xnu-2422.110.17~1release_x86_64 x86_64 ' config_args='-des -Dusedevel -DDEBUGGING' hint=recommended\, useposix=true\, d_sigaction=define useithreads=undef\, usemultiplicity=undef use64bitint=define\, use64bitall=define\, uselongdouble=undef usemymalloc=n\, bincompat5005=undef Compiler​: cc='cc'\, ccflags ='-fno-common -DPERL_DARWIN -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector'\, optimize='-O3 -g'\, cppflags='-fno-common -DPERL_DARWIN -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector' ccversion=''\, gccversion='4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)'\, gccosandvers='' intsize=4\, longsize=8\, ptrsize=8\, doublesize=8\, byteorder=12345678 d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=16\, longdblkind=3 ivtype='long'\, ivsize=8\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=8 alignbytes=8\, prototype=define Linker and Libraries​: ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc'\, ldflags =' -fstack-protector'

libpth=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/5.1/lib

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib libs=-ldbm -ldl -lm -lutil -lc perllibs=-ldl -lm -lutil -lc libc=\, so=dylib\, useshrplib=false\, libperl=libperl.a gnulibc_version='' Dynamic Linking​: dlsrc=dl_dlopen.xs\, dlext=bundle\, d_dlsymun=undef\, ccdlflags=' ' cccdlflags=' '\, lddlflags=' -bundle -undefined dynamic_lookup -fstack-protector'

Characteristics of this binary (from libperl)​: Compile-time options​: DEBUGGING HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_PERLIO USE_PERL_ATOF Locally applied patches​: ff5975c78387030c95c7f997ee4755c6256d8360 Built under darwin Compiled at Sep 11 2014 18​:10​:30 %ENV​: PERL5LIB="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all /lib/perl5​:/Volumes/git_tree/main/lib" PERLBREW_BASHRC_VERSION="0.67" PERLBREW_HOME="/Users/brfraser/.perlbrew" PERLBREW_LIB="all" PERLBREW_MANPATH="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all /man​:/Users/brfraser/perl5/perlbrew/perls/perl-5.18.2/man" PERLBREW_PATH="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all /bin​:/Users/brfraser/perl5/perlbrew/bin​:/Users/brfraser/perl5/perlbrew/perls/perl-5.18.2/bin" PERLBREW_PERL="perl-5.18.2" PERLBREW_ROOT="/Users/brfraser/perl5/perlbrew" PERLBREW_VERSION="0.67" PERL_LOCAL_LIB_ROOT="/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all" PERL_MB_OPT="--install_base /Users/brfraser/.perlbrew/libs/perl-5.18.2@​all"

PERL_MM_OPT="INSTALL_BASE=/Users/brfraser/.perlbrew/libs/perl-5.18.2@​all"

You have a lot of environment references to Perl 5.18.2. I wonder if that is relevant?

@​INC​: lib /Users/brfraser/.perlbrew/libs/perl-5.18.2@​all/lib/perl5/darwin-2level /Users/brfraser/.perlbrew/libs/perl-5.18.2@​all/lib/perl5 /Volumes/git_tree/main/lib /usr/local/lib/perl5/site_perl/5.21.4/darwin-2level /usr/local/lib/perl5/site_perl/5.21.4 /usr/local/lib/perl5/5.21.4/darwin-2level /usr/local/lib/perl5/5.21.4 .

Here is my perl -V​:

$ ./perl -Ilib -V Summary of my perl5 (revision 5 version 21 subversion 4) configuration​:   Derived from​: d6f85a58fb1c9ba755fae72f750f2968c3a0cd7f   uname='linux shire 3.8.0-19-generic #30-ubuntu smp wed may 1 16​:35​:23 utc 2013 x86_64 x86_64 x86_64 gnulinux '   config_args='-Doptimize=-g -d -Dusedevel -Dusethreads -Dcc=ccache gcc -Dld=gcc -DDEBUGGING -Accflags=-msse2 -mssse3 -maes'   hint=previous\, useposix=true\, d_sigaction=define   useithreads=undef\, usemultiplicity=undef   use64bitint=define\, use64bitall=define\, uselongdouble=undef   usemymalloc=n\, bincompat5005=undef   Compiler​:   cc='ccache gcc'\, ccflags ='-msse2 -mssse3 -msse4 -maes -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -msse2 -mssse3 -maes -msse2 -mssse3 -maes'\,   optimize='-O2 -g'\,   cppflags='-msse2 -mssse3 -msse4 -maes -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -msse2 -mssse3 -msse4 -maes -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -msse2 -mssse3 -maes -msse2 -mssse3 -msse4 -maes -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -msse2 -mssse3 -maes -msse2 -mssse3 -maes'   ccversion=''\, gccversion='4.7.3'\, gccosandvers=''   intsize=4\, longsize=8\, ptrsize=8\, doublesize=8\, byteorder=12345678   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=16\, longdblkind=3   ivtype='long'\, ivsize=8\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=8   alignbytes=8\, prototype=define   Linker and Libraries​:   ld='gcc'\, ldflags =' -fstack-protector -L/usr/local/lib'   libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /lib64 /usr/lib64 /usr/local/lib 
/usr/lib/gcc/x86_64-linux-gnu/4.7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib   libs=-lnsl -ldl -lm -lcrypt -lutil -lc   perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc   libc=libc-2.17.so\, so=so\, useshrplib=false\, libperl=libperl.a   gnulibc_version='2.17'   Dynamic Linking​:   dlsrc=dl_dlopen.xs\, dlext=so\, d_dlsymun=undef\, ccdlflags='-Wl\,-E'   cccdlflags='-fPIC'\, lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl)​:   Compile-time options​: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV   PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP   PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV   PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT   USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE   USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME   USE_PERLIO USE_PERL_ATOF   Locally applied patches​: uncommitted-changes   Built under linux   Compiled at Sep 11 2014 18​:11​:23   %ENV​:   PERLBREW_BASHRC_VERSION="0.67"   PERLBREW_CONFIGURE_FLAGS="-de -Dcc=ccache\ gcc -Dld=gcc"   PERLBREW_HOME="/home/yorton/.perlbrew"   PERLBREW_MANPATH=""   PERLBREW_PATH="/home/yorton/perl5/perlbrew/bin"   PERLBREW_ROOT="/home/yorton/perl5/perlbrew"   PERLBREW_VERSION="0.67"   @​INC​:   lib   /usr/local/lib/perl5/site_perl/5.21.4/x86_64-linux   /usr/local/lib/perl5/site_perl/5.21.4   /usr/local/lib/perl5/5.21.4/x86_64-linux   /usr/local/lib/perl5/5.21.4

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

I have repeated the exercise with a fresh install of perl-blead under perlbrew - with same results.

$ perlbrew install blead -DDEBUGGING

$ perlbrew use perl-blead

$ perl -V Summary of my perl5 (revision 5 version 21 subversion 4) configuration​:   Snapshot of​: 2febb45ac8fe9a31602934af3d9c14587543a3d9   Platform​:   osname=freebsd\, osvers=10.0-stable\, archname=amd64-freebsd   uname='freebsd neli.ijs.si 10.0-stable freebsd 10.0-stable #1 r269624m​: wed aug 6 15​:31​:56 cest 2014 mark@​neli.ijs.si​:usrobjusrsrcsysneli amd64 '   config_args='-de -Dprefix=/home/mark/perl5/perlbrew/perls/perl-blead -DDEBUGGING -Dusedevel'   hint=recommended\, useposix=true\, d_sigaction=define   useithreads=undef\, usemultiplicity=undef   use64bitint=define\, use64bitall=define\, uselongdouble=undef   usemymalloc=n\, bincompat5005=undef   Compiler​:   cc='cc'\, ccflags ='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_FORTIFY_SOURCE=2'\,   optimize='-O -g'\,   cppflags='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'   ccversion=''\, gccversion='4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final 208032)'\, gccosandvers=''   intsize=4\, longsize=8\, ptrsize=8\, doublesize=8\, byteorder=12345678   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=16\, longdblkind=3   ivtype='long'\, ivsize=8\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=8   alignbytes=8\, prototype=define   Linker and Libraries​:   ld='cc'\, ldflags ='-Wl\,-E -fstack-protector -L/usr/local/lib'   libpth=/usr/lib /usr/local/lib /usr/include/clang/3.4.1 /usr/lib   libs=-lgdbm -lm -lcrypt -lutil -lc   perllibs=-lm -lcrypt -lutil -lc   libc=\, so=so\, useshrplib=false\, libperl=libperl.a   gnulibc_version=''   Dynamic Linking​:   dlsrc=dl_dlopen.xs\, dlext=so\, d_dlsymun=undef\, ccdlflags=' '   cccdlflags='-DPIC -fPIC'\, lddlflags='-shared -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl)​:   Compile-time options​: DEBUGGING HAS_TIMES PERLIO_LAYERS   PERL_DONT_CREATE_GVSV   PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP   PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV   PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT   USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE   USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME   USE_PERLIO USE_PERL_ATOF   Built under freebsd   Compiled at Sep 11 2014 19​:40​:55   %ENV​:   PERLBREW_BASHRC_VERSION="0.59"   PERLBREW_HOME="/home/mark/.perlbrew"   PERLBREW_MANPATH="/home/mark/perl5/perlbrew/perls/perl-blead/man"  
PERLBREW_PATH="/home/mark/perl5/perlbrew/bin​:/home/mark/perl5/perlbrew/perls/perl-blead/bin"   PERLBREW_PERL="perl-blead"   PERLBREW_ROOT="/home/mark/perl5/perlbrew"   PERLBREW_VERSION="0.59"   @​INC​:  
/home/mark/perl5/perlbrew/perls/perl-blead/lib/site_perl/5.21.4/amd64-freebsd   /home/mark/perl5/perlbrew/perls/perl-blead/lib/site_perl/5.21.4   /home/mark/perl5/perlbrew/perls/perl-blead/lib/5.21.4/amd64-freebsd   /home/mark/perl5/perlbrew/perls/perl-blead/lib/5.21.4   .

$ type perl perl is hashed (/home/mark/perl5/perlbrew/perls/perl-blead/bin/perl)

$ perl ~/rt122747.t matched​: >>.xxxx@​outlook.com\<\< Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7552. Abort trap $

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 20:09\, Mark Martinec \<Mark.Martinec@ijs.si> wrote:

I have repeated the exercise with a fresh install of perl-blead under perlbrew - with same results.

$ perlbrew install blead -DDEBUGGING

$ perlbrew use perl-blead


$ type perl perl is hashed (/home/mark/perl5/perlbrew/perls/perl-blead/bin/perl)

$ perl ~/rt122747.t matched​: >>.xxxx@​outlook.com\<\< Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7552. Abort trap $

Thanks. I am still working out why I have not been able to get Perl to build such that assert traps do anything. I think it is something stupid on my part.

I changed the code to not use an assert but rather a simple if and I am able to trigger the bug.
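The workaround mentioned here, swapping the assert for a plain `if`, might look roughly like the sketch below. The function name and message are hypothetical and this is not the change that was actually made. The point is only that standard C `assert()` is compiled out when `NDEBUG` is defined, whereas an explicit test stays active however the binary was configured.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical always-on replacement for the failing assertion.
 * Returns 0 when the capture fits inside the saved string, and -1
 * (after logging to stderr) when it would run past the end. */
static int check_capture_bounds(size_t sublen, size_t start, size_t len)
{
    if (sublen < start + len) {
        fprintf(stderr,
                "capture out of bounds: sublen=%zu start=%zu len=%zu\n",
                sublen, start, len);
        return -1;
    }
    return 0;
}
```

Calling `check_capture_bounds(30, 12, 21)` with the numbers from the earlier gdb session reports the problem instead of aborting.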

I am going to try to get to the bottom of this today.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 20:23\, demerphq \<demerphq@gmail.com> wrote:

On 11 September 2014 20:09\, Mark Martinec \<Mark.Martinec@ijs.si> wrote:

I have repeated the exercise with a fresh install of perl-blead under perlbrew - with same results.

$ perlbrew install blead -DDEBUGGING

$ perlbrew use perl-blead


$ type perl perl is hashed (/home/mark/perl5/perlbrew/perls/perl-blead/bin/perl)

$ perl ~/rt122747.t matched​: >>.xxxx@​outlook.com\<\< Assertion failed​: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i))\, function Perl_reg_numbered_buff_fetch\, file regcomp.c\, line 7552. Abort trap $

Thanks. I am still working out why I have not been able to get Perl to build such that assert traps do anything. I think it is something stupid on my part.

Which was that you need to do a git clean -dfX to wipe some things created by Configure in a previous run.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 17:51\, demerphq \<demerphq@gmail.com> wrote:

On 11 September 2014 17:36\, Mark Martinec \<Mark.Martinec@ijs.si> wrote:

I can reproduce this on 5.10-5.20 but only for debugging builds.

Indeed\, I'm using a -DDEBUGGING perl.

Are you sure you have the script right?

Yes.

Hrm\, well even on a DEBUGGING build I cannot replicate in blead.

Now I can. Configure being overly "helpful" meant I was not building with DEBUGGING even though I thought I was.

Now that I can replicate it\, I have determined that the use re 'taint'; is apparently unnecessary; the script I posted\, which prints $1\, triggers it as well. Reattached in this mail...

Since the re taint is not necessary this means the relation to the utf8 loading and save_re_context and things like that is irrelevant.

Yves

p5pRT commented 10 years ago

From @demerphq

#!/usr/bin/perl

use strict;

my(@body) = (   "\<mailto:xxxx.xxxx\@outlook.com>"\,   "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}"\, );

for (@​body) {   s{ \<? (?\<!mailto​:) \b ( [a-z0-9.]+ \@​ \S+ ) \b   (?​: > | \s{1\,10} (?!phone) [a-z]{2\,11} : ) }{ }xgi;   print "matched​: >>$1\<\<\n"; }

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 20:41\, demerphq \<demerphq@gmail.com> wrote:

On 11 September 2014 17:51\, demerphq \<demerphq@gmail.com> wrote:

On 11 September 2014 17:36\, Mark Martinec \<Mark.Martinec@ijs.si> wrote:

I can reproduce this on 5.10-5.20 but only for debugging builds.

Indeed\, I'm using a -DDEBUGGING perl.

Are you sure you have the script right?

Yes.

Hrm\, well even on a DEBUGGING build I cannot replicate in blead.

Now I can. Configure being overly "helpful" meant I was not building with DEBUGGING even though I thought I was.

Now that I can replicate it\, I have determined that the use re 'taint'; is apparently unnecessary; the script I posted\, which prints $1\, triggers it as well. Reattached in this mail...

Since the re taint is not necessary this means the relation to the utf8 loading and save_re_context and things like that is irrelevant.

I was misreading things. In fact this is relevant​:

#0 0x00007ffff70e9037 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff70ec698 in __GI_abort () at abort.c:90
#2 0x00007ffff70e1e03 in __assert_fail_base (fmt=0x7ffff7239158 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7ce6b8 "(STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)", file=file@entry=0x7cb018 "regcomp.c", line=line@entry=7552, function=function@entry=0x7d5430 <__PRETTY_FUNCTION__.17671> "Perl_reg_numbered_buff_fetch") at assert.c:92
#3 0x00007ffff70e1eb2 in __GI___assert_fail (assertion=0x7ce6b8 "(STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + i)", file=0x7cb018 "regcomp.c", line=7552, function=0x7d5430 <__PRETTY_FUNCTION__.17671> "Perl_reg_numbered_buff_fetch") at assert.c:101
#4 0x000000000051689b in Perl_reg_numbered_buff_fetch (my_perl=0xa95010, r=0xac4b68, paren=1, sv=0xacd388) at regcomp.c:7552
#5 0x00000000005721e6 in Perl_magic_get (my_perl=0xa95010, sv=0xacd388, mg=0xad36f8) at mg.c:789
#6 0x000000000056fc1f in Perl_mg_get (my_perl=0xa95010, sv=0xacd388) at mg.c:199
#7 0x000000000066352c in Perl_save_scalar (my_perl=0xa95010, gv=0xacd370) at scope.c:206
#8 0x0000000000542044 in Perl_save_re_context (my_perl=0xa95010) at regcomp.c:16814
#9 0x00000000007183cb in Perl__core_swash_init (my_perl=0xa95010, pkg=0x862a4e "utf8", name=0x862a09 "ToCf", listsv=0xa95138, minbits=4, none=0, invlist=0x0, flags_p=0x0) at utf8.c:2346
#10 0x0000000000716945 in Perl_to_utf8_case (my_perl=0xa95010, p=0xabf0ba "”", ustrp=0x7fffffffd390 "\f", lenp=0x7fffffffd338, swashp=0xa958c8, normal=0x862a09 "ToCf", special=0x86272a "") at utf8.c:1800
#11 0x0000000000717c24 in Perl__to_utf8_fold_flags (my_perl=0xa95010, p=0xabf0ba "”", ustrp=0x7fffffffd390 "\f", lenp=0x7fffffffd338, flags=2 '\002') at utf8.c:2161
#12 0x0000000000721be7 in Perl_foldEQ_utf8_flags (my_perl=0xa95010, s1=0xacc974 "phone", pe1=0x0, l1=5, u1=false, s2=0xabf0ba "”", pe2=0x7fffffffd500, l2=0, u2=true, flags=0) at utf8.c:4044
#13 0x0000000000701990 in S_regmatch (my_perl=0xa95010, reginfo=0x7fffffffddf0, startpos=0xabf0a4 "xxxx.xxxx@outlook.com ”", prog=0xacc8c8) at regexec.c:4561
#14 0x00000000006fb104 in S_regtry (my_perl=0xa95010, reginfo=0x7fffffffddf0, startposp=0x7fffffffdc50) at regexec.c:3231
#15 0x00000000006fa9fc in Perl_regexec_flags (my_perl=0xa95010, rx=0xac4b68, stringarg=0xabf098 "Aškerèeva xxxx.xxxx@outlook.com ”", strend=0xabf0ac "x@outlook.com ”", strbeg=0xabf098 "Aškerèeva xxxx.xxxx@outlook.com ”", minend=0, sv=0xa98078, data=0x0, flags=1) at regexec.c:3090
#16 0x00000000005bd269 in Perl_pp_subst (my_perl=0xa95010) at pp_hot.c:2120
#17 0x000000000055ad69 in Perl_runops_debug (my_perl=0xa95010) at dump.c:2353
#18 0x000000000045e9a2 in S_run_body (my_perl=0xa95010, oldscope=1) at perl.c:2416
#19 0x000000000045dd66 in perl_run (my_perl=0xa95010) at perl.c:2339
#20 0x000000000041b35d in main (argc=3, argv=0x7fffffffe3f8, env=0x7fffffffe418) at perlmain.c:114
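For readers following the backtrace, the failing assertion can be read as a bounds check. The sketch below is a simplified model, not perl's source: `toy_regexp` and `toy_capture_in_bounds` are hypothetical names, and it assumes that `s` points at the start of the capture inside the saved buffer and that `i` is the capture's length in bytes (the surrounding code in regcomp.c is not shown in this thread).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the two fields the assertion names. */
typedef struct {
    const char *subbeg; /* start of the saved copy of the matched string */
    size_t sublen;      /* number of bytes in that saved copy */
} toy_regexp;

/* Mirrors the shape of the assertion text:
 *   sublen >= (s - subbeg) + i
 * Returns 1 when the span [s, s + i) fits inside the saved buffer, else 0.
 * Assumption: s is the capture start and i its length. */
static int toy_capture_in_bounds(const toy_regexp *rx, const char *s, size_t i)
{
    assert(rx != NULL && s >= rx->subbeg);
    return rx->sublen >= (size_t)(s - rx->subbeg) + i;
}
```

In this model, capture offsets that belong to one string but are checked against the saved copy of a shorter one make the function return 0, which is the kind of mismatch a half-finished outer match could produce.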

Do we really need to use the regex engine for swash init? Wouldn't the sanest way to solve this class of bugs be to change how we store and represent swashes?

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 21:32, demerphq <demerphq@gmail.com> wrote:

On 11 September 2014 20:41, demerphq <demerphq@gmail.com> wrote:

On 11 September 2014 17:51, demerphq <demerphq@gmail.com> wrote:

On 11 September 2014 17:36, Mark Martinec <Mark.Martinec@ijs.si> wrote:

I can reproduce this on 5.10-5.20 but only for debugging builds.

Indeed, I'm using a -DDEBUGGING perl.

Are you sure you have the script right?

Yes.

Hrm, well even on a DEBUGGING build I cannot replicate in blead.

Now I can. Configure being overly "helpful" meant I was not building with DEBUGGING even though I thought I was.

Now that I can replicate it, I have determined that the use re 'taint'; is apparently unnecessary; the script I posted which prints $1 will trigger it as well. Reattached in this mail...

Since the re taint is not necessary, this means the relation to the utf8 loading and save_re_context and things like that is irrelevant.

I was misreading things. In fact this is relevant​:

[backtrace snipped; identical to the one posted above]

Do we really need to use the regex engine for swash init? Wouldn't the sanest way to solve this class of bugs be to change how we store and represent swashes?

Or maybe trigger the swash_init() during regex *compile*.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @cpansprout

On Thu Sep 11 12:32:57 2014, demerphq wrote:

On 11 September 2014 20:41, demerphq <demerphq@gmail.com> wrote:

Now that I can replicate it, I have determined that the use re 'taint'; is apparently unnecessary; the script I posted which prints $1 will trigger it as well. Reattached in this mail...

Since the re taint is not necessary, this means the relation to the utf8 loading and save_re_context and things like that is irrelevant.

I was misreading things. In fact this is relevant​:

[backtrace snipped; identical to the one posted above]

So you’ve beaten me to the backtrace.

Do we really need to use the regex engine for swash init? Wouldn't the sanest way to solve this class of bugs be to change how we store and represent swashes?

Short of that, could we just stop the init code from using regexps itself?

That said, is it even necessary at present to localise $1 et al.? As long as the swash init code does not access $1 after a failed match, would it matter that the current (outer) regexp is in an inconsistent state? Or could we make PL_curpm null during the swash init?
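The idea of hiding the current match while a helper runs can be sketched in plain C. This is a toy model under stated assumptions, not perl's code: `toy_match`, `toy_curpm`, `toy_fetch_capture` and `toy_init_table` are hypothetical names standing in for the regexp state, PL_curpm, the capture fetch, and the swash init respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a regexp's capture bookkeeping. */
typedef struct {
    size_t sublen;      /* length of the saved subject string */
    size_t start, end;  /* byte offsets of capture group 1 */
} toy_match;

/* Stand-in for PL_curpm: the match that a read of $1 would consult. */
static toy_match *toy_curpm = NULL;

/* Returns the capture length, -1 when there is no current match,
 * or -2 when the offsets are inconsistent with the saved string
 * (the situation the real assertion aborts on). */
static long toy_fetch_capture(void)
{
    const toy_match *m = toy_curpm;
    if (m == NULL)
        return -1;
    if (m->end < m->start || m->end > m->sublen)
        return -2;
    return (long)(m->end - m->start);
}

/* Helper that runs in the middle of an outer match and happens to
 * read the capture. With hide_current_match set, the outer match is
 * made invisible for the duration and restored afterwards. */
static long toy_init_table(int hide_current_match)
{
    toy_match *saved = toy_curpm;
    long seen;

    if (hide_current_match)
        toy_curpm = NULL;
    seen = toy_fetch_capture();
    toy_curpm = saved;              /* restore on the way out */
    assert(toy_curpm == saved);
    return seen;
}
```

With a half-updated outer match installed, `toy_init_table(0)` observes the inconsistent state, while `toy_init_table(1)` sees no match at all and leaves the outer pointer untouched once it returns.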

--

Father Chrysostomos

p5pRT commented 10 years ago

From @cpansprout

On Thu Sep 11 11:26:03 2014, demerphq wrote:

Which was that you need to do a git clean -dfX to wipe some things created by Configure in a previous run.

‘rm config.sh Policy.sh’ usually works for me, but I may be missing something. At least I don’t have to rebuild everything, though.

--

Father Chrysostomos

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 21:40, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

On Thu Sep 11 12:32:57 2014, demerphq wrote:

On 11 September 2014 20:41, demerphq <demerphq@gmail.com> wrote:

Now that I can replicate it, I have determined that the use re 'taint'; is apparently unnecessary; the script I posted which prints $1 will trigger it as well. Reattached in this mail...

Since the re taint is not necessary, this means the relation to the utf8 loading and save_re_context and things like that is irrelevant.

I was misreading things. In fact this is relevant​:

[backtrace snipped; identical to the one posted above]

So you’ve beaten me to the backtrace.

Yes, sorry about that. Configure was driving me bonkers.

Do we really need to use the regex engine for swash init? Wouldn't the sanest way to solve this class of bugs be to change how we store and represent swashes?

Short of that, could we just stop the init code from using regexps itself?

That is what I meant.

That said, is it even necessary at present to localise $1 et al.? As long as the swash init code does not access $1 after a failed match, would it matter that the current (outer) regexp is in an inconsistent state? Or could we make PL_curpm null during the swash init?

The latter is an amazingly good idea and I am testing a patch that does exactly that as I type.

Yves

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 21:53, demerphq <demerphq@gmail.com> wrote:

On 11 September 2014 21:40, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

On Thu Sep 11 12:32:57 2014, demerphq wrote:

On 11 September 2014 20:41, demerphq <demerphq@gmail.com> wrote:

Now that I can replicate it, I have determined that the use re 'taint'; is apparently unnecessary; the script I posted which prints $1 will trigger it as well. Reattached in this mail...

Since the re taint is not necessary, this means the relation to the utf8 loading and save_re_context and things like that is irrelevant.

I was misreading things. In fact this is relevant​:

[backtrace snipped; identical to the one posted above]

So you’ve beaten me to the backtrace.

Yes, sorry about that. Configure was driving me bonkers.

Do we really need to use the regex engine for swash init? Wouldn't the sanest way to solve this class of bugs be to change how we store and represent swashes?

Short of that, could we just stop the init code from using regexps itself?

That is what I meant.

That said, is it even necessary at present to localise $1 et al.? As long as the swash init code does not access $1 after a failed match, would it matter that the current (outer) regexp is in an inconsistent state? Or could we make PL_curpm null during the swash init?

The latter is an amazingly good idea and I am testing a patch that does exactly that as I type.

It passed basic regex tests. I am running a full test now, but at the same time I pushed smoke-me/rt_122747 which includes:

commit 55b10d6c41f252b3256cf96b3bdf903eb6a7fb57
Author: Yves Orton <demerphq@gmail.com>
Date: Thu Sep 11 21:55:08 2014 +0200

  perl #122747: localize PL_curpm to null in _core_swash_init

  This is a naive patch to set PL_curpm to null before we do any
  swash intialization. This "hides" the current regop from the swash
  code, with the intent of prevent weird reentrancy bugs.

  Thanks to FC for the suggestion!

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 22:00, demerphq <demerphq@gmail.com> wrote:

On 11 September 2014 21:53, demerphq <demerphq@gmail.com> wrote:

On 11 September 2014 21:40, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

On Thu Sep 11 12:32:57 2014, demerphq wrote:

On 11 September 2014 20:41, demerphq <demerphq@gmail.com> wrote:

Now that I can replicate it, I have determined that the use re 'taint'; is apparently unnecessary; the script I posted which prints $1 will trigger it as well. Reattached in this mail...

Since the re taint is not necessary, this means the relation to the utf8 loading and save_re_context and things like that is irrelevant.

I was misreading things. In fact this is relevant​:

[backtrace snipped; identical to the one posted above]

So you’ve beaten me to the backtrace.

Yes, sorry about that. Configure was driving me bonkers.

Do we really need to use the regex engine for swash init? Wouldn't the sanest way to solve this class of bugs be to change how we store and represent swashes?

Short of that, could we just stop the init code from using regexps itself?

That is what I meant.

That said, is it even necessary at present to localise $1 et al.? As long as the swash init code does not access $1 after a failed match, would it matter that the current (outer) regexp is in an inconsistent state? Or could we make PL_curpm null during the swash init?

The latter is an amazingly good idea and I am testing a patch that does exactly that as I type.

It passed basic regex tests. I am running a full test now, but at the same time I pushed smoke-me/rt_122747 which includes:

commit 55b10d6c41f252b3256cf96b3bdf903eb6a7fb57
Author: Yves Orton <demerphq@gmail.com>
Date: Thu Sep 11 21:55:08 2014 +0200

  perl #122747: localize PL_curpm to null in _core_swash_init

  This is a naive patch to set PL_curpm to null before we do any
  swash intialization. This "hides" the current regop from the swash
  code, with the intent of prevent weird reentrancy bugs.

  Thanks to FC for the suggestion!

And now I just pushed to blead:

commit 2c1f00b9036a7987c714a407662651ef7da99495
Author: Yves Orton <demerphq@gmail.com>
Date: Thu Sep 11 21:55:08 2014 +0200

  perl #122747: localize PL_curpm to null in _core_swash_init

  Set PL_curpm to null before we do any swash intialization
  in _core_swash_init(). This "hides" the current regop from the
  swash code, with the intent of prevent weird reentrancy bugs
  when the swashes are initialized.

  Long term you could argue that we should just not use the regex
  engine to initialize a swash, and then this would be unnecessary.

  Thanks to FC for the suggestion!

I believe that this ticket can be closed.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @cpansprout

On Thu Sep 11 13:49:31 2014, demerphq wrote:

And now I just pushed to blead:

commit 2c1f00b9036a7987c714a407662651ef7da99495
...
I believe that this ticket can be closed.

Wait. Two things:

• Let’s commit a test.
• Can we remove the $1 localisation?

That localisation doesn’t make much sense to me, even without your PL_curpm change. Saving and restoring the value of something that is just a proxy for a value stored elsewhere is weird. Can we just delete save_re_context?

(I guess the latter issue doesn’t need to keep the ticket open.)

--

Father Chrysostomos

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 23:42, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

On Thu Sep 11 13:49:31 2014, demerphq wrote:

And now I just pushed to blead:

commit 2c1f00b9036a7987c714a407662651ef7da99495
...
I believe that this ticket can be closed.

Wait. Two things:

• Let’s commit a test.

Ah, ahem. Good catch. :-)

• Can we remove the $1 localisation?

That localisation doesn’t make much sense to me, even without your PL_curpm change. Saving and restoring the value of something that is just a proxy for a value stored elsewhere is weird. Can we just delete save_re_context?

(I guess the latter issue doesn’t need to keep the ticket open.)

Agreed. Let's open a different ticket for that.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

I believe that this ticket can be closed.

Thank you all for the fast resolution!

Will this be able to make it into 5.20.1-RC3?

p5pRT commented 10 years ago

From @cpansprout

On Thu Sep 11 14:59:25 2014, mmartinec wrote:

I believe that this ticket can be closed.

Thank you all for the fast resolution!

Will this be able to make it into 5.20.1-RC3?

Seeing that this is a crashing bug, it meets the policy. It gets my vote. (2c1f00b90, that is, which alone is sufficient for maint.)

--

Father Chrysostomos

p5pRT commented 10 years ago

From @iabyn

On Thu, Sep 11, 2014 at 03:12:12PM -0700, Father Chrysostomos via RT wrote:

On Thu Sep 11 14:59:25 2014, mmartinec wrote:

I believe that this ticket can be closed.

Thank you all for the fast resolution!

Will this be able to make it into 5.20.1-RC3?

Seeing that this is a crashing bug, it meets the policy. It gets my vote. (2c1f00b90, that is, which alone is sufficient for maint.)

On the other hand, it's a long-standing (and clearly rare) issue, and squeezing it in at the very last gasp into RC3 when it's had no time to settle or be BBCed seems like a really good way to inadvertently break 5.20.1. There's always 5.20.2.

-- A walk of a thousand miles begins with a single step... then continues for another 1,999,999 or so.

p5pRT commented 10 years ago

From @jkeenan

On 09/12/2014 06:27 AM, Dave Mitchell wrote:

On Thu, Sep 11, 2014 at 03:12:12PM -0700, Father Chrysostomos via RT wrote:

On Thu Sep 11 14:59:25 2014, mmartinec wrote:

I believe that this ticket can be closed.

Thank you all for the fast resolution!

Will this be able to make it into 5.20.1-RC3?

Seeing that this is a crashing bug, it meets the policy. It gets my vote. (2c1f00b90, that is, which alone is sufficient for maint.)

On the other hand, it's a long-standing (and clearly rare) issue, and squeezing it in at the very last gasp into RC3 when it's had no time to settle or be BBCed seems like a really good way to inadvertently break 5.20.1. There's always 5.20.2.

What-Dave-said++

p5pRT commented 10 years ago

From @demerphq

On 12 September 2014 13:08, James E Keenan <jkeen@verizon.net> wrote:

On 09/12/2014 06:27 AM, Dave Mitchell wrote:

On Thu, Sep 11, 2014 at 03:12:12PM -0700, Father Chrysostomos via RT wrote:

On Thu Sep 11 14:59:25 2014, mmartinec wrote:

I believe that this ticket can be closed.

Thank you all for the fast resolution!

Will this be able to make it into 5.20.1-RC3?

Seeing that this is a crashing bug, it meets the policy. It gets my vote. (2c1f00b90, that is, which alone is sufficient for maint.)

On the other hand, it's a long-standing (and clearly rare) issue, and squeezing it in at the very last gasp into RC3 when it's had no time to settle or be BBCed seems like a really good way to inadvertently break 5.20.1. There's always 5.20.2.

What-Dave-said++

FWIW, I think delaying for a minor release is a good idea. If Mark really needs this he can cherry-pick the patch and build a custom perl.

Sorry for any inconvenience Mark, but this patch has characteristics (such as being almost *too* easy) which make me think a bit of time to cook in blead is a good idea.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From Mark.Martinec@ijs.si

Seeing that this is a crashing bug, it meets the policy. It gets my vote. (2c1f00b90, that is, which alone is sufficient for maint.)

On the other hand, it's a long-standing (and clearly rare) issue, and squeezing it in at the very last gasp into RC3 when it's had no time to settle or be BBCed seems like a really good way to inadvertently break 5.20.1. There's always 5.20.2.

What-Dave-said++

FWIW, I think delaying for a minor release is a good idea. If Mark really needs this he can cherry-pick the patch and build a custom perl.

Sorry for any inconvenience Mark, but this patch has characteristics (such as being almost *too* easy) which make me think a bit of time to cook in blead is a good idea.

Sure, understood. It is indeed uncomfortably close to a 5.20.1 release.

On the other hand, it's a long-standing (and clearly rare) issue

If the issue is otherwise harmless (it does not, for example, cause memory corruption) and the assert failure can be safely ignored, perhaps there can just be a warning in perldelta that -DDEBUGGING must not be used in regular use.

The event is not so rare (e.g. one such case per several days of mail filtering), but it goes unnoticed because the system-installed perl is usually not built with debugging. I had debugging enabled because I was trying out a fresh release candidate.

p5pRT commented 10 years ago

From @demerphq

On 12 September 2014 17​:46\, Mark Martinec \Mark\.Martinec@&#8203;ijs\.si wrote​:

Sure, understood. It is indeed uncomfortably close to a 5.20.1 release.

On the other hand, it's a long-standing (and clearly rare) issue

If the issue is otherwise not harmful (like causing memory corruption) and the assert failure can be safely ignored, perhaps there can just be a warning in perldelta that -DDEBUGGING must not be used in regular use.

I think that is correct. I would need to audit to be sure.

The event is not so rare (e.g. one such case per several days of mail filtering),

Hrm. That is annoying.

but goes by unnoticed as the system-installed perl is usually not built with debugging. I had debugging enabled because of trying out a fresh release candidate.

I see.

I don't know if that changes the balance of issues enough to justify it going into 5.20.1. I will leave that to others to decide.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @demerphq

On 11 September 2014 23:42, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

On Thu Sep 11 13:49:31 2014, demerphq wrote:

And now I just pushed to blead:

commit 2c1f00b9036a7987c714a407662651ef7da99495 ... I believe that this ticket can be closed.

Wait. Two things:

• Let’s commit a test.

Done in: 409c6472cedc6771a158a61dbbf8154d0246dc5b

• Can we remove the $1 localisation?

That localisation doesn’t make much sense to me, even without your PL_curpm change. Saving and restoring the value of something that is just a proxy for a value stored elsewhere is weird. Can we just delete save_re_context?

Did you already follow up on this?

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @cpansprout

On Sun Sep 14 09:57:14 2014, demerphq wrote:

On 11 September 2014 23:42, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

• Can we remove the $1 localisation?

That localisation doesn’t make much sense to me, even without your PL_curpm change. Saving and restoring the value of something that is just a proxy for a value stored elsewhere is weird. Can we just delete save_re_context?

Did you already follow up on this?

Yes, in these commits (in reverse order):

2018906 pp_ctl.c: Remove junk from #endif
0ddd4a5 Mathomise save_re_context
e32ff4e pp_ctl.c: Remove PL_curcop assignment
1a419e6 utf8.c: Move an #ifndef for clarity
1ca1bae Remove obsolete comment from utf8.c
d28a925 Don’t call save_re_context
b4fa55d Gut Perl_save_re_context

--

Father Chrysostomos

p5pRT commented 10 years ago

From @demerphq

On 14 September 2014 21:49, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

Yes, in these commits (in reverse order):

2018906 pp_ctl.c: Remove junk from #endif
0ddd4a5 Mathomise save_re_context
e32ff4e pp_ctl.c: Remove PL_curcop assignment
1a419e6 utf8.c: Move an #ifndef for clarity
1ca1bae Remove obsolete comment from utf8.c
d28a925 Don’t call save_re_context
b4fa55d Gut Perl_save_re_context

Great. Thanks. BTW, did you dig into the history of the function to see why it was added in the first place? Did it ever make sense?

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

From @cpansprout

On Sun Sep 14 14:15:18 2014, demerphq wrote:

Great. Thanks. BTW, did you dig into the history of the function to see why it was added in the first place?

7d75537e explains the history.

Did it ever make sense?

Yes, originally save_re_context saved a whole list of global variables used by the regexp engine (which no longer exist).

The localisation of $1 etc. was added later. I’m not sure that part ever made sense. I’m not sure either that it’s worth trying to figure it out. It was commit ada6e8a992 that did it, to fix bug #18107.

--

Father Chrysostomos

p5pRT commented 10 years ago

From @demerphq

On 14 September 2014 23:30, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

Great. Thanks. BTW, did you dig into the history of the function to see why it was added in the first place?

7d75537e explains the history.

Did it ever make sense?

Yes, originally save_re_context saved a whole list of global variables used by the regexp engine (which no longer exist).

The localisation of $1 etc. was added later. I’m not sure that part ever made sense. I’m not sure either that it’s worth trying to figure it out. It was commit ada6e8a992 that did it, to fix bug #18107.

Cool, thanks, that was very educational.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 10 years ago

@cpansprout - Status changed from 'open' to 'pending release'

p5pRT commented 10 years ago

From @iabyn

On Sun, Sep 14, 2014 at 02:30:41PM -0700, Father Chrysostomos via RT wrote:

The localisation of $1 etc. was added later. I’m not sure that part ever made sense. I’m not sure either that it’s worth trying to figure it out. It was commit ada6e8a992 that did it, to fix bug #18107.

I've always had the strong suspicion that the $1 localization was an incorrect bug-fix, and it's been on my list of things to look at. Thanks for sorting it out.

-- The optimist believes that he lives in the best of all possible worlds. As does the pessimist.

p5pRT commented 9 years ago

From @khwilliamson

Thanks for submitting this ticket.

The issue should be resolved with the release today of Perl v5.22. If you find that the problem persists, feel free to reopen this ticket.

-- Karl Williamson for the Perl 5 porters team

p5pRT commented 9 years ago

@khwilliamson - Status changed from 'pending release' to 'resolved'