jackxiao / jslibs

Automatically exported from code.google.com/p/jslibs

Implement string.charlength #85

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
For a while I've been considering how to get the actual list of
characters in a string. Because strings are not actually treated as UTF-8,
'♥'.length returns 3 (the byte count, not the character count).
The first thing to note, of course, is that .length should never be redefined
from binary length to character length.

I was browsing some SpiderMonkey information today, and I found out that
SpiderMonkey already has UTF-8 support and a number of Unicode variants of
its internal functions. Theoretically it should be possible to implement a
.charlength which returns the number of Unicode characters in a string.
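
To make the idea concrete, here is a rough sketch of what .charlength could compute, written in plain script rather than inside the engine. It assumes the string is "binary", i.e. each character holds one UTF-8 byte, and that the input is well-formed UTF-8. The name charLength is hypothetical and not part of jslibs or SpiderMonkey.

```javascript
// Hypothetical helper (not a jslibs or SpiderMonkey API): counts UTF-8 code
// points in a binary string where each character holds one byte (0x00-0xFF).
// Assumes well-formed UTF-8; malformed input is not detected.
function charLength(bytes) {
  var count = 0;
  for (var i = 0; i < bytes.length; i++) {
    // Continuation bytes look like 10xxxxxx. Every other byte starts a new
    // code point, so only those are counted.
    if ((bytes.charCodeAt(i) & 0xC0) !== 0x80) {
      count++;
    }
  }
  return count;
}
```

For the UTF-8 bytes of '♥' (E2 99 A5), charLength('\xE2\x99\xA5') gives 1 while .length stays 3. This counts code points, not user-perceived characters, so combining sequences would still count as more than one.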

Original issue reported on code.google.com by nadir.se...@gmail.com on 19 Feb 2009 at 8:18

GoogleCodeExporter commented 9 years ago
I will have a closer look at the UTF-8 implementation in SpiderMonkey, but as far as
I recall, their implementation of UTF-8 is not well maintained.

BTW, I support many encodings through the jsiconv module:
http://code.google.com/p/jslibs/source/browse/trunk/src/jsiconv/iconv.cpp

Original comment by sou...@gmail.com on 20 Feb 2009 at 1:00

GoogleCodeExporter commented 9 years ago
e.g.
  Print( new Iconv('UTF-8', 'ISO-8859-1')('é').length ); // prints 2 ('é' is 1 byte in ISO-8859-1, 2 bytes in UTF-8)

Original comment by sou...@gmail.com on 1 Mar 2009 at 9:39

GoogleCodeExporter commented 9 years ago

Original comment by sou...@gmail.com on 11 Nov 2009 at 2:35