correij / v8-i18n

Automatically exported from code.google.com/p/v8-i18n

Incorrect word and character count using Intl.v8BreakIterator for oriental languages #36

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Platform : Mac OS X 10.11.3
Chrome Browser Version: 48.0.2564.103 (64-bit)

We used Intl.v8BreakIterator to count the number of words and characters in 
text.
Please see https://jsfiddle.net/aq0znpLz/ for a sample.
The sample contains text in several languages. As you can see, the respective 
locale is set when configuring the break iterator.
We referred to https://code.google.com/p/v8-i18n/wiki/BreakIterator 
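
For readers without access to the fiddle, here is a minimal sketch of the kind of counting described above. It is not the code from the fiddle: the function names countWords and countCharacters are hypothetical, and it assumes a V8 build that exposes the non-standard Intl.v8BreakIterator with the methods listed on the wiki page (adoptText, first, next, breakType).

```javascript
// Sketch only: countWords and countCharacters are hypothetical helper names.
// Assumes the non-standard, V8-only Intl.v8BreakIterator is available.

function countWords(text, locale) {
  const iterator = new Intl.v8BreakIterator([locale], { type: 'word' });
  iterator.adoptText(text);
  iterator.first();
  let count = 0;
  // next() returns -1 once the end of the text has been reached.
  while (iterator.next() !== -1) {
    // breakType() describes the segment ending at the current boundary;
    // 'none' is reported for whitespace and punctuation, so skip those.
    if (iterator.breakType() !== 'none') {
      count += 1;
    }
  }
  return count;
}

function countCharacters(text, locale) {
  const iterator = new Intl.v8BreakIterator([locale], { type: 'character' });
  iterator.adoptText(text);
  iterator.first();
  let count = 0;
  // Each boundary after the first closes one user-perceived character.
  while (iterator.next() !== -1) {
    count += 1;
  }
  return count;
}
```

For languages written without spaces, such as Thai, Chinese and Japanese, the word boundaries come from the segmentation data in the underlying ICU library, so the count can differ from what a human reader would consider a word.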

What is the expected output? 
The word and character counts should be correct once the respective locale is 
set, even for oriental languages such as Chinese, Thai and Japanese.

What do you see instead?
For oriental languages where the separator is NOT a space, the word count and, 
in some cases, even the character count is incorrect. For space-separated 
languages, the counts are correct.

Original issue reported on code.google.com by fameeda....@synerzip.com on 4 Feb 2016 at 7:00