oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh

Unicode escape sequence in string literal is decoded incorrectly #15326

Closed: adams85 closed this issue 3 hours ago

adams85 commented 8 hours ago

What version of Bun is running?

1.1.36

What platform is your computer?

Microsoft Windows NT 10.0.19045.0 x64

What steps can reproduce the bug?

repro.js:

const s = "\uFFFF";

console.log("length:", s.length);

for (let i = 0, ch; !isNaN(ch = s.charCodeAt(i)); i++) {
  console.log("\\u" + ch.toString(16).padStart(4, '0'));
}

bun run repro.js

What is the expected behavior?

length: 1
\uffff

What do you see instead?

length: 1
\udbff
\udfff
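A note on the two code units above: read as a UTF-16 surrogate pair, \udbff \udfff encodes U+10FFFF, the highest Unicode code point. The sketch below only does that arithmetic; `decodeSurrogatePair` is a hypothetical helper written for this illustration, not part of Bun. The result suggests the escape was decoded to U+10FFFF instead of U+FFFF, but it does not establish the root cause.

```javascript
// Hypothetical helper: combine a UTF-16 surrogate pair into the code point it encodes.
function decodeSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

// The two code units printed by the affected build.
const codePoint = decodeSurrogatePair(0xdbff, 0xdfff);
console.log("U+" + codePoint.toString(16).toUpperCase()); // prints "U+10FFFF"

// Cross-check against the built-in code point decoding.
const viaBuiltin = String.fromCharCode(0xdbff, 0xdfff).codePointAt(0);
console.log(viaBuiltin === codePoint); // prints "true"
```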

Additional information

According to my tests, 1.1.34 decodes the escape sequence correctly. So this is a regression introduced in 1.1.35 (it's present in 1.1.36 as well).

Both "\uFFFF" and "\u{FFFF}" are decoded incorrectly.
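A minimal self-check for both literal forms, offered as a sketch rather than an official test: it compares each literal against a string built at runtime with `String.fromCharCode`. It assumes the runtime call itself is not affected by the regression; the helper names `codeUnits` and `findMismatches` are made up for this example.

```javascript
// Reference value built at runtime, so it does not depend on how the
// transpiler decodes escape sequences in string literals (an assumption).
const expected = String.fromCharCode(0xffff);

// The two literal forms reported as affected.
const literals = {
  "\\uFFFF": "\uFFFF",
  "\\u{FFFF}": "\u{FFFF}",
};

// Format every UTF-16 code unit of a string as \uXXXX.
function codeUnits(str) {
  return Array.from({ length: str.length }, (_, i) =>
    "\\u" + str.charCodeAt(i).toString(16).padStart(4, "0")
  );
}

// Return a description of each literal that differs from the reference.
function findMismatches(cases, reference) {
  return Object.entries(cases)
    .filter(([, value]) => value !== reference)
    .map(([label, value]) => label + " decoded as " + codeUnits(value).join(" "));
}

const mismatches = findMismatches(literals, expected);
console.log(mismatches.length === 0 ? "ok" : mismatches.join("\n"));
```

On a build that decodes the escapes correctly this should print "ok"; on an affected build each mismatching literal is listed with the code units it actually contains.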

Jarred-Sumner commented 7 hours ago

Transpiler output:

const s = "\uDBFF\uDFFF";
console.log("length:", 1);
for (let i = 0, ch;!isNaN(ch = "\uDBFF\uDFFF".charCodeAt(i)); i++)
  console.log("\\u" + ch.toString(16).padStart(4, "0"));

❯ bun-1.1.34 build --target=bun a.js
// @bun
// a.js
var s = "\uFFFF";
console.log("length:", s.length);
for (let i = 0, ch;!isNaN(ch = s.charCodeAt(i)); i++) {
  console.log("\\u" + ch.toString(16).padStart(4, "0"));
}

❯ bun build --target=bun a.js
// @bun
// a.js
var s = "\uDBFF\uDFFF";
console.log("length:", s.length);
for (let i = 0, ch;!isNaN(ch = s.charCodeAt(i)); i++) {
  console.log("\\u" + ch.toString(16).padStart(4, "0"));
}