martynsmith / node-irc

NodeJS IRC client library
GNU General Public License v3.0

`\u200B` interpreted as `â€‹` even though encoding set to `utf-8` #537

Open anirbanmu opened 5 years ago

anirbanmu commented 5 years ago

When someone copy-pastes a zero-width space (https://en.wikipedia.org/wiki/Zero-width_space) into IRC, the text that reaches my callback contains `â€‹` instead of the actual zero-width space. Maybe this is somehow expected, but my IRC client interprets the text correctly, and the message is sent over the network with the zero-width space preserved.

Any idea what's happening?

EDIT: I played around in the node shell, and I can assign a string containing a zero-width space to a variable. If I call `charCodeAt(i)` at the index of the zero-width space, I get the correct code, 8203. I'd have expected the same to be true for the same string when it arrives from IRC.
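
For reference, here is a minimal sketch of the check described above, plus one possible explanation for the symptom. It uses only Node's built-in `Buffer` and does not touch node-irc, so it is an assumption, not a confirmed diagnosis, that the library's decoding path behaves this way. The variable names are made up for illustration.

```javascript
// Sketch only: reproduces the symptom with Node's built-in Buffer,
// not with node-irc itself.
const zwsp = '\u200B';

// In a JavaScript string the zero-width space is a single UTF-16 code unit.
console.log(zwsp.length, zwsp.charCodeAt(0)); // 1 8203

// UTF-8 encodes it as three bytes.
const bytes = Buffer.from(zwsp, 'utf8');
console.log(bytes.toString('hex')); // e2808b

// Decoding those bytes as UTF-8 round-trips to the original character.
const decodedUtf8 = bytes.toString('utf8');
console.log(decodedUtf8 === zwsp); // true

// Decoding the same bytes with a single-byte encoding yields three
// characters instead of one (assumed here to mimic the mis-decoding).
const decodedLatin1 = bytes.toString('latin1');
console.log(decodedLatin1.length); // 3
console.log([...decodedLatin1].map((c) => c.charCodeAt(0).toString(16)));
// [ 'e2', '80', '8b' ]
```

Under Windows-1252, the bytes `e2 80 8b` display as `â€‹`, which matches what shows up in the callback. If the text received from IRC has three characters at that position and the first has char code 226, that would suggest the bytes were decoded with a single-byte encoding somewhere between the socket and the callback.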