Open GoogleCodeExporter opened 8 years ago
Yeah. The problem is that JavaScript regular expressions don't have Unicode
character classes, so to handle all non-Latin letters/digits and even the Latin
ligatures, I have to list every contiguous code-point range.
I've been loath to do that since it means shipping a lot of code that is
rarely used, but I haven't actually tried to quantify the amount of code that
would be required. I'll look into it.
Original comment by mikesamuel@gmail.com
on 7 Mar 2013 at 1:58
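For readers following the thread, here is a minimal sketch of the range-listing approach described in the comment above. It is not code from the project: the names `LETTER_RANGES`, `rangesToCharClass`, and `isWord` are hypothetical, and the table is a tiny, deliberately incomplete sample restricted to the Basic Multilingual Plane. A complete letter table would be far longer, which is the code-size concern raised above.

```javascript
// Hypothetical, deliberately incomplete sample of [first, last] code-point
// ranges that contain only letters. All entries are in the BMP, so each
// code point is a single UTF-16 code unit and \uXXXX escapes are enough.
var LETTER_RANGES = [
  [0x0041, 0x005A], // A-Z
  [0x0061, 0x007A], // a-z
  [0x00C0, 0x00D6], // Latin-1 letters before the multiplication sign
  [0x00D8, 0x00F6], // Latin-1 letters between the multiplication and division signs
  [0x00F8, 0x00FF], // Latin-1 letters after the division sign
  [0x03B1, 0x03C9], // Greek small alpha through omega
  [0x0410, 0x044F], // Cyrillic capital A through small ya
  [0xFB00, 0xFB06]  // Latin ligatures (ff, fi, fl, ffi, ffl, long s-t, st)
];

// Format a BMP code point as a \uXXXX regex escape.
function toEscape(codePoint) {
  var hex = codePoint.toString(16);
  while (hex.length < 4) { hex = '0' + hex; }
  return '\\u' + hex;
}

// Turn a list of ranges into the source text of a regex character class.
function rangesToCharClass(ranges) {
  var parts = [];
  for (var i = 0; i < ranges.length; i++) {
    var first = ranges[i][0];
    var last = ranges[i][1];
    parts.push(first === last ? toEscape(first)
                              : toEscape(first) + '-' + toEscape(last));
  }
  return '[' + parts.join('') + ']';
}

var LETTER_CLASS = rangesToCharClass(LETTER_RANGES);
var WORD_PATTERN = new RegExp('^' + LETTER_CLASS + '+$');

// True when the whole string consists of letters found in the sample table.
function isWord(text) {
  return WORD_PATTERN.test(text);
}
```

With this sample table, `isWord` accepts words in the listed scripts and rejects digits, but it also rejects letters from any script that is missing from the table, so coverage is only as good as the range list that ships with the code.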
[deleted comment]
There is XRegExp (http://xregexp.com/), which extends JavaScript regexes with
Unicode character classes. You have to use the Unicode Base 1.0.0 addon
(listed under addons) to get the Letter category.
Original comment by xanato...@gmail.com
on 7 Mar 2013 at 2:46
Are there any plans for this issue? This is an important part of translating
code for educational purposes. I'm tempted to replace the regex with a much
more lenient one in my local version.
Original comment by john.gralyan@gmail.com
on 5 Apr 2015 at 9:20
Original issue reported on code.google.com by
nneon...@gmail.com
on 7 Mar 2013 at 7:43