ghcjs / ghcjs-base

base library for GHCJS for JavaScript interaction and marshalling, used by higher level libraries like JSC
MIT License

Data.JSString incompatible with Data.Text for certain code points #126

Open ali-abrar opened 5 years ago

ali-abrar commented 5 years ago

Data.Text replaces code points in the range ["\55296" .. "\57343"] (the surrogate range, U+D800 to U+DFFF) with "\65533" (U+FFFD, the replacement character) during certain operations (see this issue). Data.JSString, which uses JavaScript's built-in fromCodePoint (available as of ECMAScript 6), does not do this replacement.

Here's the behavior of Data.Text:

Prelude T> T.pack ['\xD800']
"\65533"
Prelude T> print '\xD800'
'\55296'
Prelude T> T.unpack (T.pack ['\xD800'])
"\65533"

As you can see, the value doesn't round-trip through Data.Text: packing and then unpacking '\xD800' yields '\65533' instead. By comparison, fromCodePoint and codePointAt in JavaScript do round-trip on this value.

ali-abrar commented 4 years ago

This is the replacement Data.Text does: https://hackage.haskell.org/package/text-1.2.4.0/docs/Data-Text-Internal.html#v:safe
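For readers who don't want to follow the link, here is a minimal sketch of what that replacement amounts to, written against base only. It is not the library's source: the names isSurrogateChar, replaceSurrogate and survivesReplacement are hypothetical, and the range check is just the ["\55296" .. "\57343"] interval described above.

```haskell
-- Sketch only: these helper names are hypothetical, not part of text or ghcjs-base.

-- | True for code points in the surrogate range U+D800..U+DFFF
-- (decimal 55296..57343), the range discussed in this issue.
isSurrogateChar :: Char -> Bool
isSurrogateChar c = c >= '\xD800' && c <= '\xDFFF'

-- | Map surrogate code points to U+FFFD and leave every other Char unchanged.
replaceSurrogate :: Char -> Char
replaceSurrogate c
  | isSurrogateChar c = '\xFFFD'
  | otherwise         = c

-- | A String comes through the replacement unchanged only if it
-- contains no surrogate code points.
survivesReplacement :: String -> Bool
survivesReplacement s = map replaceSurrogate s == s
```

In GHCi, map replaceSurrogate ['\xD800'] evaluates to "\65533", matching the T.pack output above, and survivesReplacement can be used to detect strings that would differ between Data.JSString and Data.Text before converting.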

mightybyte commented 4 years ago

Can confirm that this is an issue that is impacting a production codebase. Would be great if we could get a fix.