oftn-oswg / coca

An implementation of C in JavaScript.

Tokenizer.prototype.codes_to_string needs to do UTF-16 encoding #6

Closed dsamarin closed 13 years ago

dsamarin commented 13 years ago

Some Unicode code points are larger than 0xFFFF, which String.fromCharCode can't handle: it takes UTF-16 code units and truncates each argument to 16 bits. The array must first be traversed, and each code point larger than 0xFFFF must be split into two elements, a high surrogate followed by a low surrogate, with this formula:

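// ch is a code point above 0xFFFF; hi and lo form its surrogate pair
var hi, lo;
hi = Math.floor((ch - 0x10000) / 0x400) + 0xD800; // high surrogate: top 10 bits
lo = ((ch - 0x10000) % 0x400) + 0xDC00;           // low surrogate: bottom 10 bits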
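
To show how the formula could fit into the traversal described above, here is a minimal sketch. The standalone function name `codesToString`, its array-of-code-points input, and the chunk size are assumptions for illustration; the actual `Tokenizer.prototype.codes_to_string` in coca may be structured differently.

```javascript
// Hypothetical helper: converts an array of Unicode code points to a string.
// Not the project's actual implementation, just an illustration of the formula.
function codesToString(codes) {
  var units = [];
  for (var i = 0; i < codes.length; i++) {
    var ch = codes[i];
    if (ch > 0xFFFF) {
      // Split into a surrogate pair using the formula from the issue.
      var offset = ch - 0x10000;
      units.push(Math.floor(offset / 0x400) + 0xD800); // high surrogate
      units.push((offset % 0x400) + 0xDC00);           // low surrogate
    } else {
      units.push(ch);
    }
  }
  // Convert in chunks (size chosen arbitrarily) so that very long inputs
  // do not exceed the engine's limit on the number of call arguments.
  var out = "";
  for (var j = 0; j < units.length; j += 0x2000) {
    out += String.fromCharCode.apply(null, units.slice(j, j + 0x2000));
  }
  return out;
}
```

For example, `codesToString([0x1F600])` yields the two code units 0xD83D and 0xDE00, which together encode U+1F600.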