Depending on the contents of a string and a regexp, matching does not always succeed. Please see the following:
#!/usr/sw/perl/default/bin/perl
my ($s, $re);
$s = chr(4711) . chr(200) . chr(4711) . chr(200);
$re = chr(200) . '.' . chr(200);
if ($s =~ m/$re/) { print "ok\n"; } else { print "fail\n"; }
$re .= chr(4711); chop($re);
if ($s =~ m/$re/) { print "ok\n"; } else { print "fail\n"; }

fail
ok
I think that the regular expression matching code should look at the string comprising the regexp and at the string being matched against and make sure that they are both in the same encoding.
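The difference between the two match attempts appears to come from the internal representation of $re rather than from its characters. As a minimal sketch, assuming a Perl recent enough to provide utf8::is_utf8 (it is not part of the original report, and may not exist on 5.6.1), the flag can be inspected before and after the append-and-chop step:

```perl
use strict;
use warnings;

# chr(200) fits in a single byte, so Perl stores this pattern as a byte string.
my $re = chr(200) . '.' . chr(200);
my $before = utf8::is_utf8($re) ? 1 : 0;

# Appending a character above 255 forces the internal UTF-8 representation.
# chop() removes that character again but does not switch the string back.
$re .= chr(4711);
chop($re);
my $after = utf8::is_utf8($re) ? 1 : 0;

# The visible contents are unchanged: still three characters.
print "before: $before, after: $after, length: ", length($re), "\n";
# prints "before: 0, after: 1, length: 3"
```

The pattern holds the same three characters both times; only the storage format differs, which is why the two matches should not be able to disagree.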
In the meantime, maybe there is a way for me to (efficiently) frob the regular expression and/or string to ensure both are in UTF-8 encoding?
Thanks, Kai
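One possible way to do the "frobbing" asked about above, sketched under the assumption of a newer Perl (utf8::upgrade is, as far as I know, a 5.8-era function and may not be available or effective on 5.6.1), is to upgrade both the string and the pattern to the internal UTF-8 form before matching:

```perl
use strict;
use warnings;

# Same data as in the report.
my $s  = chr(4711) . chr(200) . chr(4711) . chr(200);
my $re = chr(200) . '.' . chr(200);

# utf8::upgrade converts the internal storage to UTF-8 in place.
# It does not change the characters the strings contain.
utf8::upgrade($s);
utf8::upgrade($re);

my $matched = ($s =~ m/$re/) ? 1 : 0;
print $matched ? "ok\n" : "fail\n";
```

On a Perl where the bug is fixed the upgrade should make no difference to the result; it only removes the mismatch in internal encodings that the report describes.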
On Wed, 13 Feb 2002 15:37:11 +0100 (CET), grossjoh@lothlorien.cs.uni-dortmund.de (Kai Grossjohann) said:
> This is a bug report for perl from Kai.Grossjohann@cs.uni-dortmund.de,
> generated with the help of perlbug 1.33 running under perl v5.6.1.
> -----------------------------------------------------------------
> [Please enter your report here]
> Depending on the contents of a string and a regexp, matching does not
> always succeed. Please see the following:
> #!/usr/sw/perl/default/bin/perl
> my ($s, $re);
> $s = chr(4711) . chr(200) . chr(4711) . chr(200);
I suppose you meant
$s = chr(200) . chr(4711) . chr(200) . chr(4711);
otherwise I cannot confirm your bug report for any perl. For the case I presume, however, I can confirm it for 5.6.1.
This particular bug has been fixed in the current development branch of perl, somewhere between patch 8130 and 8375. I have the impression it was not a single patch that fixed the bug. Anyway, this was long before 5.7.1 and 5.7.2 came out. If you can switch to bleadperl (see man perlhack for download locations), please do; it has many UTF-8 bugs fixed and only a few bugs open. Otherwise I fear you should avoid everything related to Unicode in 5.6.1.
-- andreas
andreas.koenig@anima.de (Andreas J. Koenig) writes:
> This particular bug has been fixed in the current development branch of perl, somewhere between patch 8130 and 8375. I have the impression it was not a single patch that fixed the bug. Anyway, this was long before 5.7.1 and 5.7.2 came out. If you can switch to bleadperl (see man perlhack for download locations), please do; it has many UTF-8 bugs fixed and only a few bugs open. Otherwise I fear you should avoid everything related to Unicode in 5.6.1.
Thanks a lot for that hint. I wonder whether I can get the local Perl guru to install that...
But it's not absolutely necessary -- I found a way to encode my data which does not rely on Unicode.
It's always great to hear a bug has been fixed :-)
kai -- ~/.signature is: umop 3p!sdn (Frank Nobis)
This bug was fixed by Perl 5.8.0; marking the ticket as resolved.
@jhi - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#8516 (status was 'resolved')
Searchable as RT8516$