Test case #0: perl -e '$eol = qr/$/m; "foo\nbar\n" =~ /$eol/; print $-[0], "\n"'
Test case #1: perl -e '$eol = qr/$/m; "foo\nbar\n" =~ /$eol(?:)/; print $-[0], "\n"'
I'm getting the answer 7 from case #0 and 3 from case #1. The correct answer is 3. (The /$/m pattern should match at the embedded newline at position 3.)
In case #1, putting anything at all that I've tried into the regexp in addition to $eol makes it give the correct answer. The semantically null /(?:)/ is what I'm using as a workaround in the application where I ran into this. There are other workarounds too.
Putting the /m modifier on where $eol is used also makes it give the right answer, but for the wrong reason. Here's a related case that misbehaves:
Test case #2: perl -e '$eol = qr/$/; "foo\nbar\n" =~ /$eol/m; print $-[0], "\n"'
#2 is the converse of #1; it outputs 3 where it should output 7.
The envelope of this bug is quite revealing.
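For convenience, the three one-liners above can be combined into a single script. This is only a sketch for re-running the report: the variable names (`$eol_m`, `%got`, `%want`) are made up here, and the "correct" values are the ones stated in this ticket (3, 3 and 7), not something the script derives.

```perl
use strict;
use warnings;

my $str = "foo\nbar\n";

my $eol_m = qr/$/m;   # /m compiled into the qr object
my $eol   = qr/$/;    # no /m on the qr object

# Each case records the start offset of the first match, as $-[0] does
# in the original one-liners.
my %got;
$got{'#0'} = $str =~ /$eol_m/     ? $-[0] : undef;   # qr//m used alone
$got{'#1'} = $str =~ /$eol_m(?:)/ ? $-[0] : undef;   # qr//m plus a null group
$got{'#2'} = $str =~ /$eol/m      ? $-[0] : undef;   # plain qr// inside a /m match

# Values this ticket describes as correct.
my %want = ('#0' => 3, '#1' => 3, '#2' => 7);

for my $case (sort keys %got) {
    printf "case %s: got %s, ticket says correct is %d\n",
        $case, $got{$case} // 'no match', $want{$case};
}
```

On an affected perl, case #0 would be expected to print 7 and case #2 to print 3, as described above.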
Zefram (via RT) <perlbug-followup@perl.org> wrote:
:Test case #0:
: perl -e '$eol = qr/$/m; "foo\nbar\n" =~ /$eol/; print $-[0], "\n"'
:
:Test case #1:
: perl -e '$eol = qr/$/m; "foo\nbar\n" =~ /$eol(?:)/; print $-[0], "\n"'
:
:I'm getting the answer 7 from case #0 and 3 from case #1. The correct
:answer is 3. (The /$/m pattern should match at the embedded newline at
:position 3.)
[...]
:Test case #2:
: perl -e '$eol = qr/$/; "foo\nbar\n" =~ /$eol/m; print $-[0], "\n"'
:
:#2 is the converse of #1; it outputs 3 where it should output 7.
This is the same bug as #7781, reported way back in October 2001.
For some reason my comments from then aren't attached in RT; you can find them here: http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2001-10/msg00552.html .. and in the additional followup to that.
I think the right answer for perl-5.10.0 is to remove support for $* entirely, and fixup the regexp engine to remove references to PL_multiline and instead use the current flags throughout, and attach the relevant flags to any cached optimiser substrings for passing to fbm_instr(). That's likely to be a largish job, and difficult to make suitable for any maintenance branch.
For maintenance versions it may be possible to identify a reasonable subset of cases in which the SvTAIL optimisation should be suppressed - for example, any regexp that mixes +m and -m flag settings. If that is an avenue worth pursuing, it probably makes sense to develop that in bleadperl before embarking on the excision of PL_multiline.
As a workaround, you could replace the definition of $eol in your code with something like qr/(?=\n|\z)/.
Hugo
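A minimal sketch of that workaround, using the subject string from the test cases. The variable names are illustrative only. The lookahead carries no `$` anchor and needs no /m flag, so the match position should not depend on how multiline state is propagated.

```perl
use strict;
use warnings;

# Workaround from the message above: match just before any newline,
# or at the very end of the string.
my $eol_workaround = qr/(?=\n|\z)/;

my $text = "foo\nbar\n";

# Start offset of the first match, used alone as in test case #0.
my $first = $text =~ /$eol_workaround/ ? $-[0] : undef;

printf "first end-of-line position: %s\n", $first // 'no match';
```

Here the first match lands before the embedded newline at offset 3, which is the answer the ticket gives as correct for qr/$/m.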
The RT System itself - Status changed from 'new' to 'open'
hv@crypt.org wrote:
I think the right answer for perl-5.10.0 is to remove support for $* entirely, and fixup the regexp engine to remove references to PL_multiline and instead use the current flags throughout, and attach the relevant flags to any cached optimiser substrings for passing to fbm_instr(). That's likely to be a largish job, and difficult to make suitable for any maintenance branch.
$* has already been removed from bleadperl (since before 5.9.0, actually).
[zefram@fysh.org - Mon Feb 23 15:11:06 2004]:
Test case #0: perl -e '$eol = qr/$/m; "foo\nbar\n" =~ /$eol/; print $-[0], "\n"'
Test case #1: perl -e '$eol = qr/$/m; "foo\nbar\n" =~ /$eol(?:)/; print $-[0], "\n"'
I'm getting the answer 7 from case #0 and 3 from case #1. The correct answer is 3. (The /$/m pattern should match at the embedded newline at position 3.)
bleadperl@25129 reports 3 for both cases.
Test case #2: perl -e '$eol = qr/$/; "foo\nbar\n" =~ /$eol/m; print $-[0], "\n"'
#2 is the converse of #1; it outputs 3 where it should output 7.
bleadperl reports 7.
I believe this bug is fixed but I'd like to see a test added before closing it. The regex tests scare me.
On Thu, Jul 14, 2005 at 03:52:45AM -0700, Michael G Schwern via RT wrote:
I believe this bug is fixed but I'd like to see a test added before closing it. The regex tests scare me.
Here's one way. Line 614 of t/op/re_tests fits the bill. I'm a little disappointed that this patch didn't shake out any more bugs. I guess we're getting close to a bug-free regex engine. ;-)
If you don't like having an extra 900+ tests then I suppose adding a line like:
'$(?:)'m b\na\n y $-[0] 1
or
(?m:$)(?:) b\na\n y $-[0] 1
to t/op/re_tests would suffice. But that doesn't test the embedding of a qr/pattern/m in another pattern.
-- Rick Delaney rick@bort.ca
Rick Delaney <rick@bort.ca> wrote: :I guess we're getting close to a bug-free regex engine. ;-)
:But that doesn't test the embedding of a qr/pattern/m in another pattern.
You can add random tests to op/pat.t for anything that can't be squeezed into re_tests.
Hugo
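A standalone test for the embedded-qr case might look roughly like the sketch below. It is not the test that was actually committed: it uses Test::More for brevity, which is an assumption about style rather than what op/pat.t used at the time, and the test descriptions are invented. The subject string and the expected offset 1 come from the re_tests lines proposed above. The expected offset 3 for the converse case follows from test case #2 (a non-/m `$` should only match before the final newline).

```perl
use strict;
use warnings;
use Test::More;

my $str = "b\na\n";   # subject string from the proposed re_tests lines

my $eol_m = qr/$/m;   # /m on the embedded pattern
my $eol   = qr/$/;    # no /m on the embedded pattern

my $got;

# qr//m used alone: should match before the first newline.
$got = $str =~ /$eol_m/ ? $-[0] : undef;
is($got, 1, 'qr/$/m alone matches at embedded newline');

# qr//m with a null group appended: same position expected.
$got = $str =~ /$eol_m(?:)/ ? $-[0] : undef;
is($got, 1, 'qr/$/m followed by (?:) matches at embedded newline');

# Converse: outer /m must not leak into the embedded non-/m pattern.
$got = $str =~ /$eol/m ? $-[0] : undef;
is($got, 3, 'qr/$/ inside /m match keeps its own flags');

done_testing();
```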
On Thu, Jul 14, 2005 at 10:10:59AM -0400, Rick Delaney wrote:
to t/op/re_tests would suffice. But that doesn't test the embedding of a qr/pattern/m in another pattern.
Sounds like a fine hammer to hit the regex engine with.
-- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern Reality is that which, when you stop believing in it, doesn't go away. -- Phillip K. Dick
On 7/14/05, Rick Delaney <rick@bort.ca> wrote:
Here's one way. Line 614 of t/op/re_tests fits the bill. I'm a little disappointed that this patch didn't shake out any more bugs. I guess we're getting close to a bug-free regex engine. ;-)
Thanks, almost a thousand tests added as change #25166.
@smpeters - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#27028 (status was 'resolved')
Searchable as RT27028$