rubychan / coderay

Fast and easy syntax highlighting for selected languages, written in Ruby.
http://coderay.rubychan.de/

JS scanner considers Cyrillic and Chinese variable names as errors #172

Open JasonBarnabe opened 10 years ago

JasonBarnabe commented 10 years ago

This runs with no problem:

var 動 = 1;
var Ы = 2;
alert(動+Ы);

But CodeRay thinks there are errors. (Hey, so does GitHub!)

korny commented 10 years ago

You're right, the JavaScript scanner allows only the standard (English) ASCII alphabet in identifiers. But Firefox and Safari both accept your code. I'm going to change this in 1.2.
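To illustrate the difference between an ASCII-only identifier rule and a Unicode-aware one, here is a minimal sketch in JavaScript. It is not CodeRay's scanner code (which is written in Ruby); the names `asciiIdentifier` and `unicodeIdentifier` are made up for this example. The Unicode pattern assumes an engine with Unicode property escapes, and it ignores reserved words and `\uXXXX` escapes.

```javascript
// ASCII-only pattern: roughly what an ASCII-limited scanner would accept.
const asciiIdentifier = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Unicode-aware pattern built on the ID_Start / ID_Continue properties.
// Hypothetical sketch: no reserved-word check, no \uXXXX escape handling.
const unicodeIdentifier = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200c\u200d]*$/u;

// Names from the report above, plus one plain ASCII name and one invalid name.
const samples = ["foo", "動", "Ы", "1abc"];

const results = samples.map((name) => ({
  name,
  ascii: asciiIdentifier.test(name),
  unicode: unicodeIdentifier.test(name),
}));

console.log(results);
```

With these patterns, `動` and `Ы` are rejected by the ASCII-only rule and accepted by the Unicode-aware one, while `1abc` is rejected by both.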

korny commented 10 years ago

Mozilla writes:

Starting with JavaScript 1.5, you can use ISO 8859-1 or Unicode letters such as å and ü in identifiers. You can also use the \uXXXX Unicode escape sequences as characters in identifiers.

Pygments doesn't agree:

\u1212 = 4
\u1212(\u1212)

Also:

JavaScript 1.5 was introduced back in 1999.

Wow, I missed that :P
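A small sketch of what the Mozilla passage quoted above describes: non-ASCII letters used directly in identifiers, and a `\uXXXX` escape standing in for a character inside an identifier. The variable names are chosen only for illustration.

```javascript
// Non-ASCII letters are allowed directly in identifiers.
var å = 1;
var ü = 2;

// A \uXXXX escape inside an identifier denotes the character itself,
// so this declares a variable named "åb" (U+00E5 followed by "b").
var \u00e5b = 10;

// The escaped and the literal spelling refer to the same variable.
var sum = å + ü + åb;

console.log(sum);
```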

JasonBarnabe commented 10 years ago

FYI, http://mathiasbynens.be/notes/javascript-identifiers#valid-identifier-names

CoolCmd commented 10 years ago

P.S. HTML "id" attributes and CSS class names also allow Unicode characters. Very few people know about that.

JasonBarnabe commented 10 years ago

See https://github.com/rubychan/coderay/pull/174 for a fix.