lvapeab / m4loc

Automatically exported from code.google.com/p/m4loc
GNU Lesser General Public License v3.0

mod_tokenizer.pl - problem with 'wide' utf-8 characters #25

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
1. cat oo
DdNé_Local Corrections
2. ./modtokenizer < oo
Wide character in print at /home/moses/m4loc/xliff/./mod_tokenizer.pl line 138, <STDIN> line 1.

This was not an issue in the "file-based" approach, so I suppose the problem is 
just in how the string coming to LibXML::Reader is encoded (probably something 
for http://perldoc.perl.org/Encode.html). I'll elaborate on this next week.
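
For illustration only, here is a minimal sketch of the usual way to avoid this warning with the Encode module. It is not the code from mod_tokenizer.pl and not necessarily the eventual fix; the sample string and variable names are made up. The idea is to decode the incoming bytes into characters, then either encode them again before printing or declare the encoding on the output handle.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample input: the UTF-8 bytes for U+010D followed by ASCII text,
# standing in for a line read from STDIN as raw bytes.
my $raw = "\xC4\x8D_Local Corrections";

# Decode the bytes into a Perl character string. Printing this string to a
# handle with no encoding layer is what triggers "Wide character in print".
my $text = decode('UTF-8', $raw);

# Option 1: encode explicitly before printing to a byte-oriented handle.
my $out = encode('UTF-8', $text);
print $out, "\n";

# Option 2 (alternative): declare the encoding on the handle once and
# print the character string directly.
# binmode(STDOUT, ':encoding(UTF-8)');
# print $text, "\n";
```

Which option fits depends on whether the rest of the script expects bytes or characters on STDOUT; mixing both on the same handle would double-encode the output.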

Tomas

Original issue reported on code.google.com by xhu...@gmail.com on 15 Jul 2011 at 12:21

GoogleCodeExporter commented 9 years ago

Original comment by xhu...@gmail.com on 15 Jul 2011 at 12:23

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r88.

Original comment by Achi...@gmail.com on 15 Jul 2011 at 6:29