ratel-rust / ratel-core

High performance JavaScript to JavaScript compiler with a Rust core
Apache License 2.0

[AST] Unicode Support #67

Closed LuoZijun closed 6 years ago

LuoZijun commented 6 years ago

Hi, the parser is not ready to parse Unicode yet, right?

The master branch:

The rewrite branch:

JavaScript Unicode Name Example:


这是一个名称 = "世界 ( World )!";

console.log(这是一个名称.length);
console.log(`Hello, ${这是一个名称}`);

maciejhirsz commented 6 years ago

Hey! You can try the above snippet in http://maciej.codes/ratel-wasm/ and see that it works.

It's not really correct, though: the lexer reads any sequence of non-ASCII bytes as part of an identifier, so technically any run of Unicode characters that is not broken up by ASCII characters disallowed in identifiers will produce an identifier token. To be really compliant with the spec, we would have to make sure that the Unicode characters we accept in identifiers are letter characters (the rewrite branch, as it stands, checks this for the first character only).
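
To make the difference concrete, here is a minimal sketch in JavaScript, not ratel's actual lexer code. The function names `isIdentifierPermissive` and `isIdentifierStrict` are hypothetical, and the permissive version is only my approximation of the "any non-ASCII byte counts" behavior described above. The strict version follows the ECMAScript `IdentifierName` grammar using the `ID_Start` and `ID_Continue` Unicode properties, and leaves out `\u` escape sequences for brevity.

```javascript
// Hypothetical helpers for illustration; not taken from ratel.

// Permissive rule (approximation of the behavior described above):
// ASCII identifier characters plus anything outside the ASCII range.
function isIdentifierPermissive(text) {
  return /^[A-Za-z_$\u0080-\u{10FFFF}][A-Za-z0-9_$\u0080-\u{10FFFF}]*$/u.test(text);
}

// Stricter rule following the ECMAScript IdentifierName grammar:
// start with $, _ or an ID_Start character, then continue with
// $, ZWNJ (U+200C), ZWJ (U+200D) or ID_Continue characters.
// Unicode escape sequences (\uXXXX) are deliberately not handled.
function isIdentifierStrict(text) {
  return /^[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*$/u.test(text);
}

// The identifier from the issue passes both checks.
console.log(isIdentifierPermissive("这是一个名称"), isIdentifierStrict("这是一个名称"));

// An arrow (U+2192) is a math symbol, not a letter: only the permissive rule accepts it.
console.log(isIdentifierPermissive("a→b"), isIdentifierStrict("a→b"));
```

Checking only the first character, as the rewrite branch reportedly does, would sit between these two: it would reject `→b` but still accept `a→b`.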

LuoZijun commented 6 years ago

I'm so sorry about that, I will close this.