Unicode escape sequence in string literal is decoded incorrectly

oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one

What version of Bun is running?

1.1.36

What platform is your computer?

Microsoft Windows NT 10.0.19045.0 x64

What steps can reproduce the bug?

repro.js:

const s = "\uFFFF";

console.log("length:", s.length);

for (let i = 0, ch; !isNaN(ch = s.charCodeAt(i)); i++) {
  console.log("\\u" + ch.toString(16).padStart(4, '0'));
}

bun run repro.js

What is the expected behavior?

length: 1
\uffff

What do you see instead?

length: 2
\udbff
\udfff

Additional information

According to my tests, 1.1.34 decodes the escape sequence correctly, so this looks like a regression introduced in 1.1.35. It is still present in 1.1.36.

Both "\uFFFF" and "\u{FFFF}" are decoded incorrectly.
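A slightly more general check that covers both escape forms could look like the sketch below. The `codeUnits` helper is a name made up for this illustration and is not part of Bun. On an unaffected runtime each line should report a length of 1 and the single code unit `ffff`.

```javascript
// Hypothetical helper (not a Bun API): list the UTF-16 code units of a string as hex.
function codeUnits(str) {
  const units = [];
  for (let i = 0; i < str.length; i++) {
    units.push(str.charCodeAt(i).toString(16).padStart(4, "0"));
  }
  return units;
}

// Both escape forms denote the same single code unit, U+FFFF.
const samples = {
  "\\uFFFF": "\uFFFF",
  "\\u{FFFF}": "\u{FFFF}",
};

for (const [label, value] of Object.entries(samples)) {
  console.log(label, "length:", value.length, "units:", codeUnits(value).join(" "));
}
```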

❯ bun-1.1.34 build --target=bun a.js
// @bun
// a.js
var s = "\uFFFF";
console.log("length:", s.length);
for (let i = 0, ch;!isNaN(ch = s.charCodeAt(i)); i++) {
  console.log("\\u" + ch.toString(16).padStart(4, "0"));
}

❯ bun build --target=bun a.js
// @bun
// a.js
var s = "\uDBFF\uDFFF";
console.log("length:", s.length);
for (let i = 0, ch;!isNaN(ch = s.charCodeAt(i)); i++) {
  console.log("\\u" + ch.toString(16).padStart(4, "0"));
}
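For what it's worth, `\uDBFF\uDFFF` is the surrogate pair for U+10FFFF. It is also what the usual UTF-16 surrogate arithmetic yields if it is applied to 0xFFFF by mistake and each half is masked to 10 bits. The sketch below only illustrates that arithmetic. `toSurrogatePair` is a hypothetical helper, and whether Bun's transpiler actually takes such a path is an assumption on my part that I have not verified in the source.

```javascript
// Hypothetical illustration, not Bun's code: UTF-16 surrogate encoding
// applied unconditionally, with each half masked to 10 bits.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000; // -1 when codePoint is 0xFFFF
  const high = 0xd800 + ((offset >> 10) & 0x3ff);
  const low = 0xdc00 + (offset & 0x3ff);
  return [high, low].map((unit) => unit.toString(16));
}

// 0xFFFF is in the BMP and should never reach this path,
// but if it does, it collides with the pair for U+10FFFF.
console.log(toSurrogatePair(0xffff).join(" "));
console.log(toSurrogatePair(0x10ffff).join(" "));
```

If that guess is right, the likely culprit would be a boundary check that treats 0xFFFF as a supplementary code point, for example `>= 0xFFFF` where `> 0xFFFF` was intended.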
