melange-re / melange

A mixture of tooling combined to produce JavaScript from OCaml & Reason
https://melange.re

Support regular OCaml unicode string literals #1141

Closed jchavarri closed 5 months ago

jchavarri commented 5 months ago

As I understand it, the way to tell the Melange compiler that a string is a "JavaScript string" (Unicode-encoded) is to use quoted strings with the ids j and js.

When writing universal code, where the preprocessing for these quoted strings is not available, I tried the regular OCaml way of defining Unicode literals, e.g.:


let t = "\u{1F42B}"
let u = String.length "\u{1F42B}"
let () = print_endline t

But this doesn't work as expected: the generated JavaScript contains the raw UTF-8 bytes as hex escapes, and the length is 4:

const t = "\xf0\x9f\x90\xab";

console.log(t);

const u = 4;

Playground
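For comparison, here is a minimal sketch of what native OCaml does with the same literal. OCaml strings are byte sequences, so "\u{1F42B}" is stored as the UTF-8 encoding of U+1F42B, which is where the four escaped bytes and the length of 4 come from. The helper name hex_bytes is made up for this illustration.

```ocaml
(* Native OCaml: a string is a sequence of bytes, and "\u{1F42B}" is
   stored as the UTF-8 encoding of U+1F42B. *)
let t = "\u{1F42B}"

(* Render each byte as two hex digits, separated by spaces.
   (hex_bytes is a hypothetical helper, not part of any library.) *)
let hex_bytes (s : string) : string =
  String.to_seq s
  |> Seq.map (fun c -> Printf.sprintf "%02x" (Char.code c))
  |> List.of_seq
  |> String.concat " "

let () =
  (* prints "length = 4" *)
  Printf.printf "length = %d\n" (String.length t);
  (* prints "bytes  = f0 9f 90 ab" *)
  Printf.printf "bytes  = %s\n" (hex_bytes t)
```

So the emitted "\xf0\x9f\x90\xab" is a byte-for-byte copy of the OCaml value, but JavaScript reads each \x escape as a separate character rather than as one UTF-8 sequence.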

I wonder if this is a fundamental limitation of the way Melange interoperates with JS?

I believe this issue is similar to what was referred to in this comment: https://github.com/rescript-lang/rescript-compiler/issues/802#issuecomment-364916644

jchavarri commented 5 months ago

This seems to have been fixed in ReScript:

Input:

let t = "\u{1F42B}"

Js.log(t)

Output:

var t = "\u{1F42B}";

console.log(t);

playground

jchavarri commented 5 months ago

This won't be possible to fix without breaking compatibility with OCaml, see #1143 for details.
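One way to picture the constraint, as a sketch and not something taken from #1143: in OCaml, String.length counts bytes, so it must keep returning 4 for this literal, whereas a JavaScript string literal "\u{1F42B}" has a .length of 2 (two UTF-16 code units). Universal code that needs a count independent of the encoding can decode the UTF-8 explicitly. The example below assumes native OCaml 4.14 or later for String.get_utf_8_uchar; the function name utf_8_length is made up for illustration.

```ocaml
(* Count the Unicode scalar values in a UTF-8 encoded string.
   Assumes OCaml >= 4.14 (String.get_utf_8_uchar, Uchar.utf_decode_length).
   utf_8_length is a hypothetical helper name. *)
let utf_8_length (s : string) : int =
  let rec go i n =
    if i >= String.length s then n
    else
      (* Decode one character starting at byte offset i. Invalid input
         still consumes at least one byte, so the loop terminates. *)
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (n + 1)
  in
  go 0 0

let () =
  let t = "\u{1F42B}" in
  (* prints "bytes = 4, scalar values = 1" *)
  Printf.printf "bytes = %d, scalar values = %d\n"
    (String.length t) (utf_8_length t)
```

Neither number matches the 2 that JavaScript reports for the same character, which is why emitting a JS unicode literal would change the observable behaviour of existing OCaml string functions.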