Raku / old-issue-tracker

Tickets from RT
https://github.com/Raku/old-issue-tracker/issues
2 stars 1 forks source link

no protection against potentially infinite quantification on zero-width assertion #1806

Open p6rt opened 14 years ago

p6rt commented 14 years ago

Migrated from rt.perl.org#75586 (status was 'open')

Searchable as RT75586$

p6rt commented 15 years ago

From @mathw

Run this​:

perl6 -e "my $x = 'wibble'; $x.subst(/\+ $$/, '');"

Watch your processor usage go through the roof. It doesn't seem to terminate (as far as I can tell without solving the halting problem), but memory usage remains stable.

p6rt commented 15 years ago

From @mathw

Turns out that the loop is actually caused by the \+, not the subst, as I had the same inside an ordinary match. jnthn tells me that the + is a bit redundant, but it shouldn't cause a loop.

\* also succumbs.

p6rt commented 14 years ago

From @cognominal

To reproduce :

say 'a' ~~ m/''*/

and wait until the end of times.

-- cognominal stef

p6rt commented 14 years ago

From @bbkr

[16​:25] \ rakudo​: grammar X { token TOP { \+ } }; X.parse(" "); # why it goes into infinite loop (despite the fact that + is not necessary) ? [16​:25] \ rakudo a54677​: ( no output ) [16​:25] \ known bug? [16​:26] \ haven't seen it before, no. [16​:26] \ Me either. [16​:26] * bbkr reports [16​:26] \ I guess since \ can match nothing, it can also match nothing a lot of times. ;-) [16​:27] \ that might explain it. [16​:27] \ so maybe it's expected behavior. [16​:27] \ maybe even something that merits a warning from the compiler. [16​:27] \ if someone does \+ with the standard \ rule, they get a "you probably don't mean that" warning. [16​:27] \ rakudo​: grammar X { token TOP { ""+ } }; X.parse(" "); # looks like jnthn i right [16​:28] \ rakudo a54677​: ( no output ) [16​:28] \<-- jaldhar_ has left this server (Remote host closed the connection). [16​:28] \ rakudo​: " " ~~ / [ '' ]+ /; say 'alive' [16​:28] \ rakudo a54677​: ( no output ) [16​:28] \ Ah, *any* zero-width match seems to exhibit it. [16​:29] \ That probably is a bug. [16​:29] \ no, it's not. [16​:29] \ think about it. it's expected behavior. [16​:29] \ rakudo​: " " ~~ / [ x? ]+ /; say 'alive' [16​:29] \ regexes are *expected* to do this. [16​:29] \ rakudo a54677​: ( no output ) [16​:29] \ I have a plan for letting GGE detect these things, though. [16​:29] \ it tends to bite people. [16​:30] \ I also seem to recall that with a Thompson engine, you can't even think that kind of wrong thought. ;) [16​:30] \ masak​: It's inconsistent with Perl 5. [16​:30] \ C​:\>perl -e "' ' =~ /(x?)+/; print 42;" [16​:30] \ 42 [16​:30] \ oh? [16​:30] \ then it probably merits pmichaud's and TimToady's attention.

p6rt commented 14 years ago

From @bbkr

another part of discussion

[16​:36] \ mathw​: I'm trying to work out why it might not be so simple as "check if we advanced by any characters" [16​:37] \ jnthn​: it'd be nice if it is that simple [16​:39] \ rakudo​: for 1..5 { my $x = 0; " " ~~ /(\<?{ $x++; rand()

0.5 }> x?)+/; say $x; } [16​:39] \ rakudo a54677​: OUTPUT«===SORRY!===␤Unsupported use of rand(); in Perl 6 please use rand at line 11, near "() > 0.5 }"␤» [16​:40] \ rakudo​: for 1..5 { my $x = 0; " " ~~ /(\<?{ $x++; rand > 0.5 }> x?)+/; say $x; } [16​:40] \ rakudo a54677​: OUTPUT«2␤2␤2␤5␤4␤» [16​:40] \ Well, we'd break that. ;-) [16​:40] \ oh boo [16​:40] \ I'm conflicted [16​:40] \ on the one hand is the side of me which says "people who rely on that kind of trick can figure out another way" [16​:41] \ it also says "we shouldn't make it trivial to introduce infinite loops into regexps via a common mistake" [16​:41] \ however the other side of me says "Perl gives you enough rope to hang yourself" [16​:41] \ but I can imagine being in here for years answering about why \+ causes infinite loops [16​:44] \ \+ causing infinite loop is not DWIM-style, i think most people understan it as "\s+". beside​: i cannot imagine where this "bug" can be used for purpose

p6rt commented 14 years ago

From @bbkr

same as http://rt.perl.org/rt3/Ticket/Display.html?id=75586

p6rt commented 12 years ago

From @diakopter

On Wed Jun 09 07​:49​:39 2010, bbkr wrote​:

another part of discussion

[16​:36] \ mathw​: I'm trying to work out why it might not be so simple as "check if we advanced by any characters" [16​:37] \ jnthn​: it'd be nice if it is that simple

14​:21 \< diakopter> (that's what I did in the javascript peg for MGrammar - check if it advanced; seemed to work great)

p6rt commented 12 years ago

The RT System itself - Status changed from 'new' to 'open'

p6rt commented 9 years ago

From @drforr

OS​: Ubuntu 14.04 LTS on VirtualBox Host​: Windows 8, dual Core i5

Rakudo version​: current as of 3/25/2015

This edge case invokes the OOM killer on my test machine. It requires at least one level of nesting, and is probably analogous to the exponential backtracking in the '(a(a(a(...)*)*)*)' regular expression, although that expression will terminate. I don't think this one does :) It is a fairly extreme edge case, but if I did this by accident I'm sure someone else will. It also feels like something not resetting pos() after backtracking, but I don't claim to know the new regex's internals.

The code is here​: --cut here-- grammar Bug {   token blank { \s* }   token TOP { \* } } Bug.parse(''); --cut here--

p6rt commented 7 years ago

From tomentiruran@gmail.com

This is a simple two-regex grammar. It is as if the ‘?’ modifier is set on the wildcards, and matching nothing doesn’t stop the parser. -- Andrew Buchanan tomentiruran@​gmail.com \mailto&#8203;:tomentiruran@&#8203;gmail\.com

p6rt commented 7 years ago

From tomentiruran@gmail.com

use v6;

# This Grammar goes into an infinite loop grammar G {   regex Body { \+ }   # With this, only one character is chosen at a time, except when there is a \v   regex Text { \V*​: % \v+ }   # With this, the input is found, and then it goes into an infinite loop   #regex Text { \V* }   # This works   # regex Text { \V+ } } class A {   method FALLBACK($fn, $match) {   say "\<$fn> matched '$match'";   } }

my $g = G.new; my $actions = A.new; my $rule = 'Body'; say "Calling parse"; $g.parse("Line 1", :$rule, :$actions);

=finish moar​::dcbrule=@​​: moar​::ar=ar moar​::mastdir=/Users/andrewbuchanan/.rakudobrew/moar-nom/install/share/nqp/lib/MAST moar​::tomrule=$(AR) $(ARFLAGS) $@​ 3rdparty/libtommath/*.o moar​::shaobjects=3rdparty/sha1/sha1.o moar​::ccshared= moar​::be=0 moar​::dynasmlua=./3rdparty/dynasm/dynasm.lua moar​::ld=clang moar​::mtlib=3rdparty/tinymt/libtinymt.a moar​::mtobjects=3rdparty/tinymt/tinymt64.o moar​::noreturnspecifier= moar​::perl=perl moar​::shaclean=$(RM) 3rdparty/sha1/libsha1.a 3rdparty/sha1/*.o moar​::static_inline=static __inline__ moar​::ccdebugflags=-g3 moar​::auxclean=@​​: moar​::thirdpartylibs=3rdparty/dyncall/dyncall/libdyncall_s.a 3rdparty/dyncall/dyncallback/libdyncallback_s.a 3rdparty/dyncall/dynload/libdynload_s.a 3rdparty/libatomic_ops/src/libatomic_ops.a 3rdparty/tinymt/libtinymt.a 3rdparty/sha1/libsha1.a 3rdparty/libtommath/libtommath.a 3rdparty/libuv/libuv.a moar​::config=--optimize --prefix=/Users/andrewbuchanan/.rakudobrew/moar-nom/install --make-install moar​::ccinc=-I moar​::booltype=_Bool moar​::ldout=-o moar​::moarlib=libmoar.a moar​::mtrule=$(AR) $(ARFLAGS) $@​ 3rdparty/tinymt/*.o moar​::dcrule=cd 3rdparty/dyncall && ./configure && CC='$(CC)' CFLAGS='$(CFLAGS)' $(MAKE) -f Makefile moar​::dcbobjects= moar​::cflags=-fno-omit-frame-pointer -fno-optimize-sibling-calls -O3 -DNDEBUG -Wno-logical-op-parentheses -D_DARWIN_USE_64_BIT_INODE=1 moar​::laolib=3rdparty/libatomic_ops/src/libatomic_ops.a moar​::libdir=/Users/andrewbuchanan/.rakudobrew/moar-nom/install/lib moar​::nul=/dev/null moar​::osvers=15.0 moar​::translate_newline_output=0 moar​::dlrule=@​​: moar​::cat=cat moar​::can_unaligned_int32=1 moar​::ldimp= moar​::pkgconfig=/usr/bin/pkg-config moar​::ccdef=-D moar​::lib=lib%s.a moar​::tomclean=$(RM) 3rdparty/libtommath/libtommath.a 3rdparty/libtommath/*.o moar​::dlllocal=__attribute__ ((visibility ("hidden"))) moar​::dcbclean=$(RM) 3rdparty/dyncall/dyncallback/libdyncallback_s.a moar​::noreturnattribute=__attribute__((noreturn)) moar​::ptr_size=8 moar​::dllimport=__attribute__ ((visibility ("default"))) moar​::ccout=-o moar​::can_unaligned_num64=1 moar​::has_pthread_yield=0 moar​::ldrpath=-Wl,-rpath,"//Users/andrewbuchanan/.rakudobrew/moar-nom/install/lib" -Wl,-rpath,"/Users/andrewbuchanan/.rakudobrew/moar-nom/install/share/perl6/site/lib" moar​::ldusr=-l%s moar​::mainflags=-DMVM_SHARED moar​::name=moar moar​::dll=lib%s.dylib moar​::cincludes= -I3rdparty/libuv/include -I3rdparty/libuv/src -I3rdparty/libatomic_ops/src -I3rdparty/libtommath -I3rdparty/dynasm -I3rdparty/dyncall/dynload -I3rdparty/dyncall/dyncall -I3rdparty/dyncall/dyncallback moar​::uvrule=$(AR) $(ARFLAGS) $@​ $(UV_DARWIN) moar​::tomlib=3rdparty/libtommath/libtommath.a moar​::staticlib= moar​::dlclean=$(RM) 3rdparty/dyncall/dynload/libdynload_s.a moar​::make=make moar​::ccinstflags=-fsanitize=address moar​::ldflags= -O3 -DNDEBUG -Wl,-rpath,"//Users/andrewbuchanan/.rakudobrew/moar-nom/install/lib" -Wl,-rpath,"/Users/andrewbuchanan/.rakudobrew/moar-nom/install/share/perl6/site/lib" moar​::crossconf= moar​::obj=.o moar​::usrlibs[0]=pthread moar​::dlobjects= moar​::defs[0]=_DARWIN_USE_64_BIT_INODE=1 moar​::cppout=> moar​::ccoptiflags=-O3 -DNDEBUG moar​::asmswitch=-S moar​::ldoptiflags=-O3 -DNDEBUG moar​::mkflags= moar​::moar=libmoar.dylib moar​::version=2016.09 moar​::dcclean=cd 3rdparty/dyncall && $(MAKE) -f Makefile clean moar​::moarshared=-install_name "/Users/andrewbuchanan/.rakudobrew/moar-nom/install/lib/libmoar.dylib" moar​::sharule=$(AR) $(ARFLAGS) $@​ 3rdparty/sha1/*.o moar​::ccwarnflags=-Wno-logical-op-parentheses moar​::jit=$(JIT_POSIX_X64) moar​::moardll=libmoar.dylib moar​::asm=.s moar​::mknoisy=ifneq ($(NOISY), 1)MSG = @​echoCMD = @​NOOUT = > /dev/nullNOERR = 2> /dev/nullendif moar​::dclib=3rdparty/dyncall/dyncall/libdyncall_s.a moar​::ccdefflags=-D_DARWIN_USE_64_BIT_INODE=1 moar​::cc=clang moar​::exe= moar​::lddir=-L moar​::uvlib=3rdparty/libuv/libuv.a moar​::dllib=3rdparty/dyncall/dynload/libdynload_s.a moar​::shaincludedir=3rdparty/sha1 moar​::versionmajor=2016 moar​::cppswitch=-E moar​::can_unaligned_int64=1 moar​::lua=./3rdparty/dynasm/minilua moar​::dcobjects= moar​::cancgoto=1 moar​::ldlibs=-lpthread moar​::mainlibs=-L. -lmoar moar​::nativecall_backend=dyncall moar​::objflags=-DMVM_BUILD_SHARED moar​::versionpatch=0 moar​::laoclean=cd 3rdparty/libatomic_ops/src && $(MAKE) distclean moar​::versionminor=09 moar​::ccmiscflags=-fno-omit-frame-pointer -fno-optimize-sibling-calls moar​::install= $(MKPATH) $(DESTDIR)$(PREFIX)/include/libuv $(CP) 3rdparty/libuv/include/*.h $(DESTDIR)$(PREFIX)/include/libuv $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/armcc $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/gcc $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/hpc $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/ibmc $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/icc $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/loadstore $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/msftc $(MKPATH) $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/sunc $(CP) 3rdparty/libatomic_ops/src/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops $(CP) 3rdparty/libatomic_ops/src/atomic_ops/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/armcc/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/armcc $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/gcc/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/gcc $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/hpc/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/hpc $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/ibmc/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/ibmc $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/icc/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/icc $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/loadstore/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/loadstore $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/msftc/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/msftc $(CP) 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/sunc/*.h $(DESTDIR)$(PREFIX)/include/libatomic_ops/atomic_ops/sysdeps/sunc $(MKPATH) $(DESTDIR)$(PREFIX)/include/libtommath $(CP) 3rdparty/libtommath/*.h $(DESTDIR)$(PREFIX)/include/libtommath $(MKPATH) $(DESTDIR)$(PREFIX)/include/dyncall $(CP) 3rdparty/dyncall/dynload/*.h $(DESTDIR)$(PREFIX)/include/dyncall $(CP) 3rdparty/dyncall/dyncall/*.h $(DESTDIR)$(PREFIX)/include/dyncall $(CP) 3rdparty/dyncall/dyncallback/*.h $(DESTDIR)$(PREFIX)/include/dyncall moar​::laoobjects= moar​::mtclean=$(RM) 3rdparty/tinymt/libtinymt.a 3rdparty/tinymt/*.o moar​::rm=rm -f moar​::uvclean=$(RM) 3rdparty/libuv/libuv.a $(UV_DARWIN) moar​::sharedlib=libmoar.dylib moar​::dllexport=__attribute__ ((visibility ("default"))) moar​::shalib=3rdparty/sha1/libsha1.a moar​::arout= moar​::lddebugflags=-g3 moar​::osname=darwin moar​::prefix=/Users/andrewbuchanan/.rakudobrew/moar-nom/install moar​::uvobjects=$(UV_DARWIN) moar​::dcblib=3rdparty/dyncall/dyncallback/libdyncallback_s.a moar​::bindir=/Users/andrewbuchanan/.rakudobrew/moar-nom/install/bin moar​::asmout=-o moar​::havebooltype=1 moar​::laorule=cd 3rdparty/libatomic_ops && CC='$(CC)' CFLAGS='$(CFLAGS)' ./configure && cd src && $(MAKE) && cd .. moar​::os=darwin moar​::sh=/bin/sh moar​::arflags=rcs moar​::ldshared=-dynamiclib moar​::impinst=libmoar.dylib moar​::ldinstflags=-fsanitize=address moar​::ldmiscflags= moar​::ldsys=-l%s moar​::platform=$(PLATFORM_POSIX) moar​::ccswitch=-c moar​::formatattribute=__attribute__((format(X, Y, Z))) moar​::tomobjects=3rdparty/libtommath/bn_error.o 3rdparty/libtommath/bn_fast_mp_invmod.o 3rdparty/libtommath/bn_fast_mp_montgomery_reduce.o 3rdparty/libtommath/bn_fast_s_mp_mul_digs.o 3rdparty/libtommath/bn_fast_s_mp_mul_high_digs.o 3rdparty/libtommath/bn_fast_s_mp_sqr.o 3rdparty/libtommath/bn_mp_2expt.o 3rdparty/libtommath/bn_mp_abs.o 3rdparty/libtommath/bn_mp_add.o 3rdparty/libtommath/bn_mp_add_d.o 3rdparty/libtommath/bn_mp_addmod.o 3rdparty/libtommath/bn_mp_and.o 3rdparty/libtommath/bn_mp_clamp.o 3rdparty/libtommath/bn_mp_clear.o 3rdparty/libtommath/bn_mp_clear_multi.o 3rdparty/libtommath/bn_mp_cmp.o 3rdparty/libtommath/bn_mp_cmp_d.o 3rdparty/libtommath/bn_mp_cmp_mag.o 3rdparty/libtommath/bn_mp_cnt_lsb.o 3rdparty/libtommath/bn_mp_copy.o 3rdparty/libtommath/bn_mp_count_bits.o 3rdparty/libtommath/bn_mp_div.o 3rdparty/libtommath/bn_mp_div_2.o 3rdparty/libtommath/bn_mp_div_2d.o 3rdparty/libtommath/bn_mp_div_3.o 3rdparty/libtommath/bn_mp_div_d.o 3rdparty/libtommath/bn_mp_dr_is_modulus.o 3rdparty/libtommath/bn_mp_dr_reduce.o 3rdparty/libtommath/bn_mp_dr_setup.o 3rdparty/libtommath/bn_mp_exch.o 3rdparty/libtommath/bn_mp_expt_d.o 3rdparty/libtommath/bn_mp_exptmod.o 3rdparty/libtommath/bn_mp_exptmod_fast.o 3rdparty/libtommath/bn_mp_exteuclid.o 3rdparty/libtommath/bn_mp_fread.o 3rdparty/libtommath/bn_mp_fwrite.o 3rdparty/libtommath/bn_mp_gcd.o 3rdparty/libtommath/bn_mp_get_int.o 3rdparty/libtommath/bn_mp_get_long.o 3rdparty/libtommath/bn_mp_grow.o 3rdparty/libtommath/bn_mp_init.o 3rdparty/libtommath/bn_mp_init_copy.o 3rdparty/libtommath/bn_mp_init_multi.o 3rdparty/libtommath/bn_mp_init_set.o 3rdparty/libtommath/bn_mp_init_set_int.o 3rdparty/libtommath/bn_mp_init_size.o 3rdparty/libtommath/bn_mp_invmod.o 3rdparty/libtommath/bn_mp_invmod_slow.o 3rdparty/libtommath/bn_mp_is_square.o 3rdparty/libtommath/bn_mp_jacobi.o 3rdparty/libtommath/bn_mp_karatsuba_mul.o 3rdparty/libtommath/bn_mp_karatsuba_sqr.o 3rdparty/libtommath/bn_mp_lcm.o 3rdparty/libtommath/bn_mp_lshd.o 3rdparty/libtommath/bn_mp_mod.o 3rdparty/libtommath/bn_mp_mod_2d.o 3rdparty/libtommath/bn_mp_mod_d.o 3rdparty/libtommath/bn_mp_montgomery_calc_normalization.o 3rdparty/libtommath/bn_mp_montgomery_reduce.o 3rdparty/libtommath/bn_mp_montgomery_setup.o 3rdparty/libtommath/bn_mp_mul.o 3rdparty/libtommath/bn_mp_mul_2.o 3rdparty/libtommath/bn_mp_mul_2d.o 3rdparty/libtommath/bn_mp_mul_d.o 3rdparty/libtommath/bn_mp_mulmod.o 3rdparty/libtommath/bn_mp_n_root.o 3rdparty/libtommath/bn_mp_neg.o 3rdparty/libtommath/bn_mp_or.o 3rdparty/libtommath/bn_mp_prime_fermat.o 3rdparty/libtommath/bn_mp_prime_is_divisible.o 3rdparty/libtommath/bn_mp_prime_is_prime.o 3rdparty/libtommath/bn_mp_prime_miller_rabin.o 3rdparty/libtommath/bn_mp_prime_next_prime.o 3rdparty/libtommath/bn_mp_prime_rabin_miller_trials.o 3rdparty/libtommath/bn_mp_prime_random_ex.o 3rdparty/libtommath/bn_mp_radix_size.o 3rdparty/libtommath/bn_mp_radix_smap.o 3rdparty/libtommath/bn_mp_rand.o 3rdparty/libtommath/bn_mp_read_radix.o 3rdparty/libtommath/bn_mp_read_signed_bin.o 3rdparty/libtommath/bn_mp_read_unsigned_bin.o 3rdparty/libtommath/bn_mp_reduce.o 3rdparty/libtommath/bn_mp_reduce_2k.o 3rdparty/libtommath/bn_mp_reduce_2k_l.o 3rdparty/libtommath/bn_mp_reduce_2k_setup.o 3rdparty/libtommath/bn_mp_reduce_2k_setup_l.o 3rdparty/libtommath/bn_mp_reduce_is_2k.o 3rdparty/libtommath/bn_mp_reduce_is_2k_l.o 3rdparty/libtommath/bn_mp_reduce_setup.o 3rdparty/libtommath/bn_mp_rshd.o 3rdparty/libtommath/bn_mp_set.o 3rdparty/libtommath/bn_mp_set_int.o 3rdparty/libtommath/bn_mp_set_long.o 3rdparty/libtommath/bn_mp_shrink.o 3rdparty/libtommath/bn_mp_signed_bin_size.o 3rdparty/libtommath/bn_mp_sqr.o 3rdparty/libtommath/bn_mp_sqrmod.o 3rdparty/libtommath/bn_mp_sqrt.o 3rdparty/libtommath/bn_mp_sub.o 3rdparty/libtommath/bn_mp_sub_d.o 3rdparty/libtommath/bn_mp_submod.o 3rdparty/libtommath/bn_mp_to_signed_bin.o 3rdparty/libtommath/bn_mp_to_signed_bin_n.o 3rdparty/libtommath/bn_mp_to_unsigned_bin.o 3rdparty/libtommath/bn_mp_to_unsigned_bin_n.o 3rdparty/libtommath/bn_mp_toom_mul.o 3rdparty/libtommath/bn_mp_toom_sqr.o 3rdparty/libtommath/bn_mp_toradix.o 3rdparty/libtommath/bn_mp_toradix_n.o 3rdparty/libtommath/bn_mp_unsigned_bin_size.o 3rdparty/libtommath/bn_mp_xor.o 3rdparty/libtommath/bn_mp_zero.o 3rdparty/libtommath/bn_prime_tab.o 3rdparty/libtommath/bn_reverse.o 3rdparty/libtommath/bn_s_mp_add.o 3rdparty/libtommath/bn_s_mp_exptmod.o 3rdparty/libtommath/bn_s_mp_mul_digs.o 3rdparty/libtommath/bn_s_mp_mul_high_digs.o 3rdparty/libtommath/bn_s_mp_sqr.o 3rdparty/libtommath/bn_s_mp_sub.o 3rdparty/libtommath/bncore.o perl6​::language_version=6.c perl6​::codename= perl6​::release-number= perl6​::build-date=2016-09-19T13​:43​:11Z perl6​::version=2016.09-15-g4b1864b perl6​::implementation=Rakudo

p6rt commented 7 years ago

From @zoffixznet

I'm not seeing the bug here, to be honest.

The `Body` is asking for one or more tokens `Text`, *nothing* is a valid match for those tokens, so after matching the provided text, your grammar continues to match nothing infinite number of times.

p6rt commented 7 years ago

The RT System itself - Status changed from 'new' to 'open'

p6rt commented 7 years ago

From @zoffixznet

Here's a much shorter way to reproduce it​:

  perl6 -e '"foo" ~~ /(.*)+/' # hangs

While my previous explanation for why this occurs makes sense, it's worth noting this behaviour is not observed in Perl 5, for example​:

  perl -e '"foo" =~ /(.*)+/' # does not hang

p6rt commented 7 years ago

From @coke

On Tue Sep 20 06​:06​:59 2016, cpan@​zoffix.com wrote​:

Here's a much shorter way to reproduce it​:

perl6 -e '"foo" ~~ /(.*)+/' # hangs

While my previous explanation for why this occurs makes sense, it's worth noting this behaviour is not observed in Perl 5, for example​:

perl -e '"foo" =~ /(.*)+/' # does not hang

This sounds like a dupe of #​75586

-- Will "Coke" Coleda

p6rt commented 7 years ago

From tomentiruran@gmail.com

Is this also technically correct, even though it clearly shouldn't match?

perl6 -e '"foo" ~~ /(.*)+\​:/' # hangs

In either case, going into an infinite loop is not exactly DWIM.

On 20 Sep 2016, at 9​:12 PM, Will Coleda via RT \perl6\-bugs\-followup@&#8203;perl\.org wrote​:

On Tue Sep 20 06​:06​:59 2016, cpan@​zoffix.com wrote​:

Here's a much shorter way to reproduce it​:

perl6 -e '"foo" ~~ /(.*)+/' # hangs

While my previous explanation for why this occurs makes sense, it's worth noting this behaviour is not observed in Perl 5, for example​:

perl -e '"foo" =~ /(.*)+/' # does not hang

This sounds like a dupe of #​75586

-- Will "Coke" Coleda

p6rt commented 7 years ago

From @smls

Is this also technically correct, even though it clearly shouldn't match?

perl6 -e '"foo" ~~ /(.*)+\​:/' # hangs

Looks right to me. You're asking for "Capture zero or more characters, as often as possible". So you get​:

1) three characters 2) zero characters 3) zero characters 4) zero characters ...and so on.

In either case, going into an infinite loop is not exactly DWIM.

Regexes are code, and when you write an infinite loop, it hangs - just like when you write an infinite loop in normal Perl 6 code.

Perl 5 has a special fallback mechanism to force-advance the regex cursor by one character when it sees that it has not moved for two iterations of the same quantifier, and that makes some things "DWIM", but also causes its fair share of confusion.

In Perl 6, where regex code and normal code can be intermixed easily and safely, Perl 5's special rule could mess up people's expectations for the execution order of embedded code blocks, and it's not even that crazy to imagine situations where you *want* an alternation to keep looping on the spot (e.g. until an embedded code block sets some condition to move one).

Not to mention that Perl 6 doubled down on the whole "regexes are code, not magic strings" principle.

So I'm not surprised that infinite loops behave exactly as they would in normal code, and the auto-advance mechanism from Perl 5 was not ported over.

p6rt commented 7 years ago

From @smls

For the record the problems the auto-advance feature causes even in Perl 5 (where embedded code blocks are experimental and rarely used), are twofold​:

1) Writing an infinite loop can be indicative of a programmer mistake, and the auto-advance feature hides it by making the regex "do something" (which may or may not be what the programmer intended) instead of hanging (which would have caused the programmer to re-write the regex).

2) It can cause mysterious-seeming double captures. When there is a capture group in a quantified zero-width assertion, it will capture the same thing twice, because auto-advance kicks in when the cursor has not advanced after *two* iterations. Now, Perl 5 lets repeated captures of the same capture group overwrite each other, so this does not normally show up other than as a performance degradation (whereas in Perl 6 it would cause the same sub-matches to appear twice in $/). But it can show up in even Perl 5 with `while(m//g) { }` global matching. I remember a discussion on perlmonks a few years ago, where even experienced Perl hackers were tearing their hair out trying to figure out why such a while loop iterated over the same match twice.

p6rt commented 7 years ago

From tomentiruran@gmail.com

Ok, that clarifies things. Now that I understand what is happening, it is straightforward to recognise and fix the problem. A sentence in the documentation might help other perl 5 transitioners from getting bitten, perhaps at the explanation of the * quantifier.

On 21 Sep 2016, at 6​:35 PM, Sam S. via RT \perl6\-bugs\-followup@&#8203;perl\.org wrote​:

For the record the problems the auto-advance feature causes even in Perl 5 (where embedded code blocks are experimental and rarely used), are twofold​:

1) Writing an infinite loop can be indicative of a programmer mistake, and the auto-advance feature hides it by making the regex "do something" (which may or may not be what the programmer intended) instead of hanging (which would have caused the programmer to re-write the regex).

2) It can cause mysterious-seeming double captures. When there is a capture group in a quantified zero-width assertion, it will capture the same thing twice, because auto-advance kicks in when the cursor has not advanced after *two* iterations. Now, Perl 5 lets repeated captures of the same capture group overwrite each other, so this does not normally show up other than as a performance degradation (whereas in Perl 6 it would cause the same sub-matches to appear twice in $/). But it can show up in even Perl 5 with `while(m//g) { }` global matching. I remember a discussion on perlmonks a few years ago, where even experienced Perl hackers were tearing their hair out trying to figure out why such a while loop iterated over the same match twice.

p6rt commented 6 years ago

From @AlexDaniel

This reminds me of https://rt-archive.perl.org/perl6/Ticket/Display.html?id=132004

On 2015-03-27 15​:01​:38, drforr@​pobox.com wrote​:

OS​: Ubuntu 14.04 LTS on VirtualBox Host​: Windows 8, dual Core i5

Rakudo version​: current as of 3/25/2015

This edge case invokes the OOM killer on my test machine. It requires at least one level of nesting, and is probably analogous to the exponential backtracking in the '(a(a(a(...)*)*)*)' regular expression, although that expression will terminate. I don't think this one does :) It is a fairly extreme edge case, but if I did this by accident I'm sure someone else will. It also feels like something not resetting pos() after backtracking, but I don't claim to know the new regex's internals.

The code is here​: --cut here-- grammar Bug { token blank { \s* } token TOP { \* } } Bug.parse(''); --cut here--

p6rt commented 6 years ago

The RT System itself - Status changed from 'new' to 'open'