jhannemann commented 4 years ago

When set to UNICODE, the output log uses the JS String.fromCharCode(value) function.

This only works correctly for the Basic Multilingual Plane (BMP). It would be cool to be able to output a surrogate pair (i.e. a sequence of two 16-bit words, 4 bytes) to represent emojis such as this one 🙂 (U+1F642, UTF-16 encoding D83D DE42). Alternatively, UTF-8 would be even cooler, as it's more widely used. The benefit would be to demonstrate to students how multi-byte encodings work. Null-Lobur is terribly behind in terms of Unicode (basically saying that UCS-2 is still a thing).

In any case, even if you think this is not worth doing, please make sure that students understand that Unicode in this case means Unicode in the UTF-16 encoding, e.g. by labeling the output mode drop-down as Unicode (UTF-16). See pull request #291.
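To illustrate the surrogate-pair arithmetic described above, here is a minimal sketch of how a sequence of 16-bit output words could be combined into characters. The function name `decodeUtf16Words` is hypothetical and is not taken from MARIE.js or from pull request #291; it only assumes that each output value is one 16-bit UTF-16 code unit.

```javascript
// Hypothetical helper (not part of MARIE.js): turns an array of 16-bit
// output words into a string, combining surrogate pairs into one code point.
function decodeUtf16Words(words) {
  let out = "";
  for (let i = 0; i < words.length; i++) {
    const w = words[i] & 0xFFFF; // keep only the low 16 bits
    const next = i + 1 < words.length ? words[i + 1] & 0xFFFF : -1;
    const isHigh = w >= 0xD800 && w <= 0xDBFF;       // high (lead) surrogate
    const nextIsLow = next >= 0xDC00 && next <= 0xDFFF; // low (trail) surrogate
    if (isHigh && nextIsLow) {
      // Each surrogate carries 10 bits of the code point above U+FFFF.
      const codePoint = 0x10000 + ((w - 0xD800) << 10) + (next - 0xDC00);
      out += String.fromCodePoint(codePoint);
      i++; // the low surrogate has been consumed
    } else {
      // BMP character (or an unpaired surrogate, passed through unchanged)
      out += String.fromCharCode(w);
    }
  }
  return out;
}

console.log(decodeUtf16Words([0xD83D, 0xDE42])); // 🙂 (U+1F642)
console.log(decodeUtf16Words([0x48, 0x69]));     // Hi
```

Buffering the high surrogate until the low one arrives is one possible design; whether the actual output log needs this depends on how it appends characters, which this sketch does not assume.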

auroranil commented 4 years ago

Pull request #291 resolves this issue. I have bumped the version to v1.3.0 and deployed the changes to the gh-pages branch.

jhannemann commented 4 years ago

Thanks for accepting the pull request. Next up, I'll have a look at the input. This is a bit more complex, but I may have an idea of how to handle it.

Let me commend you on the high quality of the code. It was really easy to find the exact point to add my changes to, and there were no unexpected side effects. The whole code review process by you guys was just exemplary and outstanding. I'll definitely use this example in my upcoming computer organization class.

MARIE-js / MARIE.js

Unicode Output (UTF-16) is not handled correctly #288
