nayakgi / perl-compiler

Automatically exported from code.google.com/p/perl-compiler

Regex search and replace on utf8 characters doesn't work as expected. #333

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
$>perl -e'use encoding "utf8"; my @hiragana =  map {chr} ord("ぁ")..ord("ん"); my @katakana =  map {chr} ord("ァ")..ord("ン"); my $hiragana = join(q{} => @hiragana); my $katakana = join(q{} => @katakana); my %h2k; @h2k{@hiragana} = @katakana; $str = $hiragana; $str =~ s/([ぁ-ん])/$h2k{$1}/go; $str eq $katakana and print "ok\n"; print "$hiragana\n"; print "$katakana\n";'
ok
ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちぢっつづてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろゎわゐゑをん
ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲン

$>perlcc -r -O3 -e'use encoding "utf8"; my @hiragana =  map {chr} ord("ぁ")..ord("ん"); my @katakana =  map {chr} ord("ァ")..ord("ン"); my $hiragana = join(q{} => @hiragana); my $katakana = join(q{} => @katakana); my %h2k; @h2k{@hiragana} = @katakana; $str = $hiragana; $str =~ s/([ぁ-ん])/$h2k{$1}/go; $str eq $katakana and print "ok\n"; print "$hiragana\n"; print "$katakana\n";'
Wide character in print at -e line 1.
ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちぢっつづてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろゎわゐゑをん
Wide character in print at -e line 1.
ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲン

Original issue reported on code.google.com by todd.e.rinaldo on 9 May 2014 at 10:33
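
For readability, the one-liner in the report can be sketched as a standalone script. This is not the reporter's code but a rewrite under stated assumptions: it replaces `use encoding "utf8"` with `use utf8` plus an explicit `binmode` on STDOUT, drops the `/o` flag, and wraps the substitution in `hiragana_to_katakana`, a hypothetical helper name introduced only for illustration. Under the stock interpreter it should print `ok`; how a perlcc-compiled binary behaves is what this issue is about.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                               # assumption: source file is saved as UTF-8
binmode(STDOUT, ':encoding(UTF-8)');    # explicit output layer instead of "use encoding"

# Build both character ranges by code point, as in the report.
my @hiragana = map { chr } ord('ぁ') .. ord('ん');
my @katakana = map { chr } ord('ァ') .. ord('ン');

# Map each hiragana character to the katakana at the same offset.
my %h2k;
@h2k{@hiragana} = @katakana;

# Hypothetical helper: replace every hiragana character via the lookup table.
sub hiragana_to_katakana {
    my ($str) = @_;
    $str =~ s/([ぁ-ん])/$h2k{$1}/g;
    return $str;
}

my $hiragana = join q{}, @hiragana;
my $katakana = join q{}, @katakana;

# Same check as the one-liner: the converted string must equal the katakana string.
print hiragana_to_katakana($hiragana) eq $katakana ? "ok\n" : "not ok\n";
```

Keeping the substitution in a small sub makes it easy to compare the interpreted and compiled results on shorter inputs as well.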

GoogleCodeExporter commented 9 years ago

Original comment by reini.urban on 13 May 2014 at 6:20

GoogleCodeExporter commented 9 years ago
This issue was closed by revision 5b5616d73841.

Original comment by reini.urban on 13 May 2014 at 11:57