The following code works:
#!/usr/bin/perl
($_) = "abcdef" =~
/
((?&BB).*)
| (?!)
(?\
The following equivalent code does not:
#!/usr/bin/perl
($_) = "abcdef" =~
/
((?&BB).*)
|
(?!)
(?\
Why is the named pattern being treated as a variable length pattern?
Thanks,
Adrian Hawryluk
On Wed, Nov 20, 2013 at 11:12:15AM -0800, Adrian wrote:
Why is the named pattern being treated as a variable length pattern?
I can reduce the failing code to this:
qr/
(?\
which gives
Variable length lookbehind not implemented
Changing the line with the two &W's to
(?=a)(?<=a)
makes the error go away.
I've had a quick look at study_chunk(), but it's way beyond my understanding. Does anyone who understands that area want to have a go?
-- Overhead, without any fuss, the stars were going out. -- Arthur C Clarke
The RT System itself - Status changed from 'new' to 'open'
On 11/21/2013 08:02 AM, Dave Mitchell wrote:
I've had a quick look at study_chunk(), but it's way beyond my understanding. Does anyone who understands that area want to have a go?
It's my understanding that nobody understands that area
On 21 November 2013 16:02, Dave Mitchell <davem@iabyn.com> wrote:
I can reduce the failing code to this:
qr/ (?\
a) (?\ (?=(?&W))(?<=(?&W)) ) (?&BB) /x;
which gives
Variable length lookbehind not implemented
Changing the line with the two &W's to
(?=a)(?<=a)
makes the error go away.
I've had a quick look at study_chunk(), but it's way beyond my understanding. Does anyone who understands that area want to have a go?
All I can say is that, as implemented, it's not a bug; if anything it's a misfeature. (?&W) is a hair's breadth away from (??{ ... }) and is treated accordingly right now. To fix this we would have to change that.
We would have to analyse the pattern and determine if the thing named (&W) (and there could be more than one) is variable width or not. So we punt and assume it must be variable width.
IMO unless we can very efficiently determine if the sub pattern is fixed width this ticket will end up as a "won't fix".
It definitely isn't a priority for me to investigate this edge case, although I might at some point.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Thu, Nov 21, 2013 at 1:40 PM, demerphq <demerphq@gmail.com> wrote:
All I can say that as implemented its not a bug, if anything a misfeature. (?&W) is a hairs breadth away from (??{ ... }) and is treated accordingly right now. To fix this we would have to change that.
That was my guess too, but that doesn't explain why only one of the following fails:
$ perl -E'qr/(?\
$ perl -E'qr/(?\
(I thought we used -E in error messages instead of -e when -E was used. Do I remember incorrectly, or did that change?)
On Thu, Nov 21, 2013 at 07:40:31PM +0100, demerphq wrote:
All I can say that as implemented its not a bug, if anything a misfeature. (?&W) is a hairs breadth away from (??{ ... }) and is treated accordingly right now. To fix this we would have to change that.
We would have to analyse the pattern and determine if the thing named (&W) (and there could be more than one) is variable width or not. So we punt and assume it must be variable width.
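Except we often don't punt. If you remove *anything* from the reduced test case above, it stops warning. i.e. negative has to be preceded by a positive lookbehind, and wrapped within the \
. Remove either, and it works. If it works sometimes, that seems to lend more weight to it being a bug.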
-- "Procrastination grows to fill the available time" -- Mitchell's corollary to Parkinson's Law
On 21 November 2013 20:30, Dave Mitchell <davem@iabyn.com> wrote:
Except we often don't punt. If you remove *anything* from the reduced test case above, it stops warning. i.e. negative has to be preceded by a positive lookbehind, and wrapped within the \
. Remove either, and it works. If it works sometimes, that seems to lend more weight to it being a bug.
Gah. Ok. I'll investigate a bit and report my findings.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
demerphq wrote:
IMO unless we can very efficiently determine if the sub pattern is fixed width this ticket will end up as a "wont fix".
Runtime efficiency of determining fixedness shouldn't be a concern, because the tricky case only comes up when explicitly invoked. A slow answer is far better than giving up.
-zefram
On 21 November 2013 20:30, Dave Mitchell <davem@iabyn.com> wrote:
Except we often don't punt. If you remove *anything* from the reduced test case above, it stops warning. i.e. negative has to be preceded by a positive lookbehind, and wrapped within the \
. Remove either, and it works. If it works sometimes, that seems to lend more weight to it being a bug.
commit 099ec7dcf9e085a650e6d9010c12ad9649209bf4
Author: Yves Orton <demerphq@gmail.com>
Date: Fri Nov 22 01:08:39 2013 +0100
Fix RT #120600: Variable length lookbehind is not variable
Inside of study_chunk() we have to guard against infinite recursion with recursive subpatterns. The existing logic sort of worked, but didn't address all cases properly.
qr/
(?\
The pattern in the test would fail when the optimizer was expanding (&BB). When it recursed, it created a bitmap for the recursion it performed; it then jumped back to the BB node and eventually made the first (&W) call. At this point the bit for (&W) would be set in the bitmask. When the recursion for the (&W) exited (a fake exit through the study frame logic) the bit was not /unset/. When the parser then entered the (&W) again it was treated as a nested and potentially infinite length pattern.
The fake recursion in study_chunk made it a little less obvious what was going on in the debug output.
By reorganizing the code and adding logic to unset the bitmap when exiting, this bug was fixed. Unfortunately this also revealed another little issue with patterns like this:
qr/x|(?0)/
qr/(x|(?1))/
which forced the creation of a new bitmask for each branch. Effectively study_chunk treats each branch as an independent pattern, so when we are expanding (?1) via the 'x' branch we don't want that to prevent us from detecting the infinite recursion in the (?1) branch. If you were to think of trips through study_chunk as paths, and [] as recursive processing you would get something like:
BRANCH 'x' END
BRANCH (?0) [ 'x' END ]
BRANCH (?0) [ (?0) [ 'x' END ] ]
...
When we want something like:
BRANCH 'x' END
BRANCH (?0) [ 'x' END ]
BRANCH (?0) [ (?0) INFINITE_RECURSION ]
So when we deal with a branch we need to make a new recursion bitmask.
-- perl -Mre=debug -e "/just|another|perl|hacker/"
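The bookkeeping described in the commit message can be illustrated with a toy model. This is only a sketch and not the study_chunk() code: the `%rules` table, the `width` function and the `$clear_on_exit` switch are hypothetical names invented here, and a hash stands in for the "recursed" bitmap. It shows why a subpattern that is merely called twice looks recursive when its "in progress" mark is never cleared on exit.

```perl
use strict;
use warnings;

# Hypothetical toy grammar: a rule is a list of items; an item is either a
# literal string or a call to another rule ({ call => NAME }).
my %rules = (
    W  => [ 'a' ],
    BB => [ { call => 'W' }, { call => 'W' } ],   # W is called twice, not recursively
    R  => [ 'x', { call => 'R' } ],               # genuinely recursive
);

# Returns the fixed width of a rule, or undef when the rule has to be treated
# as variable width. $clear_on_exit selects the fixed (1) or buggy (0) bookkeeping.
sub width {
    my ($name, $in_progress, $clear_on_exit) = @_;
    return undef if $in_progress->{$name};    # already being expanded: assume recursion
    $in_progress->{$name} = 1;                # mark on entry
    my $total = 0;
    for my $item (@{ $rules{$name} }) {
        my $w = ref $item
            ? width($item->{call}, $in_progress, $clear_on_exit)
            : length $item;
        if (!defined $w) { $total = undef; last }
        $total += $w;
    }
    delete $in_progress->{$name} if $clear_on_exit;   # unmark on exit (the fix)
    return $total;
}

my $buggy_bb = width('BB', {}, 0);   # mark for W is left set after the first call
my $fixed_bb = width('BB', {}, 1);
my $fixed_r  = width('R',  {}, 1);

for my $result ([ 'BB, mark never cleared' => $buggy_bb ],
                [ 'BB, mark cleared'       => $fixed_bb ],
                [ 'R,  mark cleared'       => $fixed_r  ]) {
    my ($label, $w) = @$result;
    printf "%-24s %s\n", $label, defined $w ? "fixed width $w" : 'variable width';
}
```

With the mark cleared on exit, the second call to W is measured normally, while the truly recursive rule R is still detected.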
On 22 November 2013 01:32, demerphq <demerphq@gmail.com> wrote:
commit 099ec7dcf9e085a650e6d9010c12ad9649209bf4
Author: Yves Orton <demerphq@gmail.com>
Date: Fri Nov 22 01:08:39 2013 +0100
FWIW, I am not super happy with this implementation. We should have a flag for all forms of recursion, and use that to decide if we need to allocate the "recursed" bitmap. As is, we create it for every branch, which is bad. I expected RExC_seen_recursed to be useful for this, but it stubbornly wasn't, and I didn't have time to dig further.
It fixes the bug however, and if someone doesn't get to it first I will try to improve it over the weekend.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Fri, Nov 22, 2013 at 01:37:14AM +0100, demerphq wrote:
It fixes the bug however, and if someone doesnt get to it first I will try to improve it over the weekend.
Thanks for this. (And no I'm not volunteering to improve it ;-)
-- The Enterprise successfully ferries an alien VIP from one place to another without serious incident. -- Things That Never Happen in "Star Trek" #7
I think this is the same bug, but I can give you a much simpler example of failure.
$ perl -v
This is perl 5, version 18, subversion 2 (v5.18.2) built for x86_64-linux-gnu-thread-multi (with 41 registered patches, see perl -V for more detail)
$ cat try3.pl
#!/usr/bin/perl

my $code = 'SELECT';
$code =~ s/(?<!CLASSSY0B )\bALTER\b/xyz/igs;

$ perl try3.pl
Variable length lookbehind not implemented in regex m/(?<!CLASSSY0B )\bALTER\b/ at try3.pl line 4.
There are obviously no variable parts anywhere in that regex, yet we get a failure. Interestingly, if you remove the "i" qualifier on the end, then the error goes away. Yes, I do need that "i" qualifier. :)
On Tue, May 13, 2014 at 12:10 AM, Kevin Brannen via RT <perlbug-followup@perl.org> wrote:
I think this is the same bug, but I can give you a much simpler example of failure.
I think it's not.
There is obviously no variable parts anywhere in that regex, yet we get a failure. Interestingly, if you remove the "i" qualifier on the end, then the error goes away. Yes, I do need that "i" qualifier. :)
There are no obviously variable parts, but there are variable parts: qr/SS/i is variable length, since it can match 'ß'.
The fix would seem to be adding the "/a" modifier, since you seem to be working on ASCII data, and explicitly don't want Unicode case insensitivity, with its variable-length implications.
Eirik
On 13 May 2014 10:30, Eirik Berg Hanssen <ebhanssen@cpan.org> wrote:
There are no obviously variable parts, but there are variable parts: qr/SS/i is variable length, since it can match 'ß'.
The fix would seem to be adding the "/a" modifier, since you seem to be working on ASCII data, and explicitly don't want Unicode case insensitivity, with its variable-length implications.
Nice. I didn't catch that personally. The error message should explain the problem better, I reckon. Not sure how easy that is; I'll take a look one day maybe.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Tue, May 13, 2014 at 10:30 AM, Eirik Berg Hanssen <ebhanssen@cpan.org> wrote:
The fix would seem to be adding the "/a" modifier, since you seem to be working on ASCII data, and explicitly don't want Unicode case insensitivity, with its variable-length implications.
Err, make that the "/aa" modifier. I'd thought "/a" would suffice, but no:
eirik@greencat[11:11:07]~$ perl -e '/(?<!SS)/i;'
Variable length lookbehind not implemented in regex m/(?<!SS)/ at -e line 1.
eirik@greencat[11:11:11]~$ perl -e '/(?<!SS)/ai;'
Variable length lookbehind not implemented in regex m/(?<!SS)/ at -e line 1.
eirik@greencat[11:11:12]~$ perl -e '/(?<!SS)/aai;'
eirik@greencat[11:11:14]~$
Guess I'll need to reread those docs ...
Eirik
On Tue May 13 02:13:29 2014, ebhanssen@cpan.org wrote:
Err, make that the "/aa" modifier. I'd thought "/a" would suffice,
Thanks Eirik! A new way for unicode to bite me that I wasn't aware of, as if there aren't enough other ways. :) Now that I know what to look for, I see this in the 5.16.0 change notes and I can go educate myself.
On 05/13/2014 03:13 AM, Eirik Berg Hanssen wrote:
Err, make that the "/aa" modifier. I'd thought "/a" would suffice,
Having /a and /aa was a group decision. I wash my hands of it.
/a works on things like \w and [:punct:]
/aa is /a plus case folding /i
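The difference between the modifiers can be checked directly. The following is a small sketch, not taken from the thread: the variable names are made up for illustration, and /u is spelled out so that the result does not depend on how the test string happens to be stored internally.

```perl
use strict;
use warnings;

my $sharp_s = "\x{DF}";   # LATIN SMALL LETTER SHARP S

# Case-insensitive matching of a two-character pattern against one character.
my $unicode_fold = $sharp_s =~ /\Ass\z/iu  ? 1 : 0;   # Unicode folding: sharp s folds to "ss"
my $a_fold       = $sharp_s =~ /\Ass\z/ia  ? 1 : 0;   # /a leaves /i folding alone, per the thread
my $aa_fold      = $sharp_s =~ /\Ass\z/iaa ? 1 : 0;   # /aa forbids ASCII vs non-ASCII folds

# /a does restrict character classes such as \w to ASCII.
my $u_word = $sharp_s =~ /\A\w\z/u ? 1 : 0;
my $a_word = $sharp_s =~ /\A\w\z/a ? 1 : 0;

print "ss matches sharp s under /iu:  $unicode_fold\n";
print "ss matches sharp s under /ia:  $a_fold\n";
print "ss matches sharp s under /iaa: $aa_fold\n";
print "\\w matches sharp s under /u:   $u_word\n";
print "\\w matches sharp s under /a:   $a_word\n";
```

Because "ss" can match either one or two characters unless /aa is in effect, a lookbehind containing it under /i is no longer fixed width.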
* Eirik Berg Hanssen <ebhanssen@cpan.org> [2014-05-13 10:35]:
qr/SS/i is variable length, since it can match 'ß'.
Cf. also this talk from GPW 2014: https://youtu.be/8FIGDgNa_CU
On 05/14/2014 07:18 AM, Aristotle Pagaltzis wrote:
Cf. also this talk from GPW 2014: https://youtu.be/8FIGDgNa_CU
Is there some summary or alternative version of this in English?
On 05/13/2014 09:08 AM, Kevin Brannen via RT wrote:
Thanks Eirik! A new way for unicode to bite me that I wasn't aware of, as if there aren't enough other ways. :) Now that I know what to look for, I see this in the 5.16.0 change notes and I can go educate myself.
You can avoid Unicode issues in regexes by doing a
use re '/aa';
in an outer scope of your code. This causes Perl to behave pretty much like it did before Unicode was introduced.
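As a rough illustration of the pragma's scoping (a sketch only; the test strings and variable names are invented here, and the pattern is the one from the try3.pl report), the following applies /aa to every regex compiled inside one block and leaves code outside it untouched:

```perl
use strict;
use warnings;

my $sharp_s = "\x{DF}";   # LATIN SMALL LETTER SHARP S
my ($re, $inside, $outside);

{
    use re '/aa';   # lexically scoped: affects regexes compiled in this block

    # The lookbehind from the report; under /aa the "SSS" run is fixed width.
    $re = qr/(?<!CLASSSY0B )\bALTER\b/i;

    $inside = $sharp_s =~ /\Ass\z/i ? 1 : 0;
}

# Outside the block the pragma no longer applies; /u asks for Unicode rules.
$outside = $sharp_s =~ /\Ass\z/iu ? 1 : 0;

my $plain   = 'x ALTER TABLE'   =~ $re ? 1 : 0;
my $guarded = 'classsy0b alter' =~ $re ? 1 : 0;

print "ss matches sharp s inside the block:  $inside\n";
print "ss matches sharp s outside the block: $outside\n";
print "plain ALTER matched:   $plain\n";
print "guarded ALTER matched: $guarded\n";
```

Putting the `use re '/aa';` line at file scope, as suggested above, extends the same behaviour to the whole file.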
* Karl Williamson <public@khwilliamson.com> [2014-05-14 18:05]:
Is there some summary or alternative version of this in English?
Oh. Several German speakers gave their talks in English and I seemed to remember this as one of them – I guess the title misled me. There is no transcript or notes that I know of… I’m very sorry for the noise.
Summary: daxim took the Unicode Consortium’s request for comments as an opportunity to ask for the removal of this case folding rule. They referred him to http://www.unicode.org/faq/casemap_charprop.html#11 and said the Consortium does not create such rules but follows official orthography. Unfortunately the bodies connected to that in Germany (the Duden editors, the Council for German Orthography, etc.) are essentially prescriptivist, so it’s not realistic to expect change from there, meaning that this screwiness will be part of Unicode for the foreseeable future. So he went to implement it himself, and that turned out to take just a screenful of thin wrapping around Unicode::Casing and s///, which he called Lingua::DEU::Casing::Sharp_s (I don’t see it on CPAN though).
Regards, -- Aristotle Pagaltzis // <http://plasmasturm.org/>
"Kevin Brannen via RT" <perlbug-followup@perl.org> wrote:
:I think this is the same bug, but I can give you a much simpler example of failure.
I can reduce this to:
% ./perl -ce '/(?<!SS)/i'
Variable length lookbehind not implemented in regex m/(?<!SS)/ at -e line 1.
% ./perl -ce '/(?<!SS)/iaa'
-e syntax OK
%
on blead.
This looks like an intentional change - except in the "superAscii" mode given by /aa, the /ss/i should also match the case-folded Eszett character (http://en.wikipedia.org/wiki/Eszett).
Karl\, can you confirm? Are there other useful workarounds?
Hugo
On 05/17/2014 03:59 PM, hv@crypt.org wrote:
Karl, can you confirm? Are there other useful workarounds?
Hugo
Yes it follows the unicode standard for better or worse. I don't know of any other workarounds, other than what I already mentioned on this thread
use re '/aa';
Karl Williamson <public@khwilliamson.com> wrote:
:Yes it follows the unicode standard for better or worse. I don't know
:of any other workarounds, other than what I already mentioned on this thread
:
:use re '/aa';
Ah sorry\, I'd missed the earlier followups on this ticket due to a full disk.
Hugo
On Sat, May 17, 2014 at 10:59:19PM +0100, hv@crypt.org wrote:
Karl, can you confirm? Are there other useful workarounds?
I guess that /(?<!(?-i:SS|ss))/ is a workaround too, but I'm not going to assess its usefulness.
Abigail
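A quick way to try that suggestion is sketched below. The trailing `x` and the test strings are made up for illustration, and the pattern is built at run time so that a compile failure on some perl would be reported instead of aborting the script.

```perl
use strict;
use warnings;

# Lookbehind with /i switched off locally: both alternatives are two characters.
my $src = '(?<!(?-i:SS|ss))x';
my $re  = eval { qr/$src/i };
my $compiled = defined $re ? 1 : 0;

my %matched;
if ($compiled) {
    # The "x" itself is still matched case-insensitively.
    $matched{$_} = ($_ =~ $re) ? 1 : 0 for qw(ssX SSx abX SsX);
}

print "pattern compiled: $compiled\n";
print "$_ => $matched{$_}\n" for sort keys %matched;
```

One limitation is visible in the last test string: mixed-case "Ss" (and likewise "sS") is not covered by the two listed alternatives, so the lookbehind does not reject it.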
On Sat May 17 18:37:30 2014, public@khwilliamson.com wrote:
Yes it follows the unicode standard for better or worse. I don't know of any other workarounds, other than what I already mentioned on this thread
use re '/aa';
To help others searching for this later, the final solution I came up with after lots of reading was:
use if $] >= 5.016, re => '/aa';
The program that uses this can be run on a number of machines, so the versions of perl can not be totally controlled. I'm sure some of the really old ones wouldn't have that "if" pragma, but it's there even on our old 5.8 servers which should be good enough for most of us and you have to draw the line somewhere.
Kevin
On 05/19/2014 04:17 PM, Kevin Brannen via RT wrote:
To help others searching for this later, the final solution I came up with after lots of reading was:
use if $] >= 5.016, re => '/aa';
Kevin
--- via perlbug: queue: perl5 status: open https://rt-archive.perl.org/perl5/Ticket/Display.html?id=120600
This gave me the idea to add text to perldiag to clue people in about this issue. Also, \K can be used to work around the problem for positive lookbehind assertions. This has now been pushed to blead as d0a29c363d313dc91fc5bfe71f7a5c525acfed03. If you don't like my wording, patches welcome.
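A minimal sketch of the \K approach, assuming the goal is "replace something only when it follows SS, case-insensitively". The sample strings and the replacement text are made up for illustration.

```perl
use strict;
use warnings;

# Instead of s/(?<=ss)-alter/-DROP/i, match "ss" normally and use \K to
# drop it from the matched text, so only what follows is replaced.
my $text = "CLASS-ALTER";
(my $out = $text) =~ s/ss\K-alter/-DROP/i;

# The same substitution under Unicode rules, where a sharp s can stand in for "ss".
my $text_sharp = "Cla\x{DF}-ALTER";
(my $out_sharp = $text_sharp) =~ s/ss\K-alter/-DROP/iu;

print "$text => $out\n";
print "changed with sharp s: ", ($out_sharp ne $text_sharp ? "yes" : "no"), "\n";
```

Unlike a true lookbehind, the text before \K is consumed by the match, so results can differ when matches would overlap or under /g.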
On Thu Nov 21 16:37:48 2013, demerphq wrote:
FWIW, I am not super happy with this implementation. We should have a flag for all forms of recursion, and use that to decide if we need to allocate the "recursed" bitmap. As is we create it for every branch, which is bad. I expected RExC_seen_recursed to be useful for this, but it stubbornly wasnt, and I didnt have time to dig further.
It fixes the bug however, and if someone doesnt get to it first I will try to improve it over the weekend.
This ticket is listed in perl5200delta. Is there still work to be done or can it be closed?
Yes, this ticket can be closed, and I am so doing.
The second part involved the German U+DF that folds to 'ss'. That is covered by [perl #132367] - Karl Williamson
@khwilliamson - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#120600 (status was 'resolved')
Searchable as RT120600$