erkyrath / quixe

A Glulx VM interpreter written in Javascript
http://eblong.com/zarf/glulx/
MIT License

Fix handling of emoji (high-plane Unicode) #17

Open erkyrath opened 8 years ago

erkyrath commented 8 years ago

Comment from glkapi.js:

Some places in the library get confused about Unicode characters beyond 0xFFFF. They are handled correctly by streams, but grid windows will think they occupy two characters rather than one, which will throw off the grid spacing.

Also, the glk_put_jstring() function can't handle them at all. Quixe printing operations that funnel through glk_put_jstring() -- meaning, most native string printing -- will break up three-byte characters into a UTF-16-encoded pair of two-byte characters. This will come out okay in a buffer window, but it will again mess up grid windows, and will also double the write-count in a stream.
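To illustrate the splitting described in that comment, here is a minimal sketch in plain ES5. It does not call any GlkOte or Quixe code; the sample string and variable names are chosen only for illustration. It shows that a JavaScript string holding one character above 0xFFFF reports a length of 2, and that a loop over code units sees a surrogate pair rather than a single character.

```javascript
// U+1F600 written as its UTF-16 surrogate pair, so this also runs on pre-ES6 engines.
var s = '\uD83D\uDE00';

// A loop over code units, as a naive per-character printer would do,
// sees two entries for what the user thinks of as one character.
var units = [];
for (var i = 0; i < s.length; i++) {
    units.push(s.charCodeAt(i));
}
// units now holds a high surrogate (0xD83D) followed by a low surrogate (0xDE00).

// Recombining the pair recovers the single code point.
var codePoint = (units[0] - 0xD800) * 0x400 + (units[1] - 0xDC00) + 0x10000;
```

Anything that counts or positions by `s.length`, such as a grid cell index or a stream write-count, is therefore off by one for each such character.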

curiousdannii commented 8 years ago

https://mathiasbynens.be/notes/javascript-unicode#accounting-for-astral-symbols

Assuming you can't move to ES6 yet, using punycode.js is probably your safest and easiest option. The minified version is only 2.74 KB.

erkyrath commented 8 years ago

Thanks.

I should probably flip the grid-window handling over to work on int arrays, which will solve that problem. glk_put_jstring is the messy case.
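As a rough sketch of the int-array idea, assuming an ES5-only environment: the helper below converts a JavaScript string into an array of code points, one entry per character, so that a grid line of N cells is an array of length N. The name `stringToCodePoints` is hypothetical and is not part of GlkOte or Quixe; it is one possible approach, not the project's actual fix.

```javascript
// Hypothetical helper (not part of GlkOte or Quixe): decode a UTF-16
// JavaScript string into an array of Unicode code points.
function stringToCodePoints(str) {
    var out = [];
    var i = 0;
    while (i < str.length) {
        var hi = str.charCodeAt(i++);
        // A high surrogate followed by a low surrogate encodes one code point.
        if (hi >= 0xD800 && hi <= 0xDBFF && i < str.length) {
            var lo = str.charCodeAt(i);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                out.push((hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000);
                i++;
                continue;
            }
        }
        // BMP characters and unpaired surrogates are passed through unchanged.
        out.push(hi);
    }
    return out;
}

// One array entry per character, regardless of plane.
var cells = stringToCodePoints('a\uD83D\uDE00b');
```

With this representation, `cells.length` counts characters rather than UTF-16 code units, which is the property the grid spacing and the stream write-count both depend on.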