Open remcohaszing opened 2 weeks ago
Hi! Could you possibly provide some example code to reproduce this? Preferably code that has been compiled into plain JS.
I would absolutely love to. Unfortunately the bug is reproduced by running tsc. Neither I nor the TypeScript team have been able to make a small reproduction.
It was caused by https://github.com/microsoft/TypeScript/pull/53081 and fixed by https://github.com/microsoft/TypeScript/pull/58339 (not yet released).
AFAICT this doesn't seem like an issue with Node.js itself, but rather with a compiler (such as tsc). That said, I'm no expert, and I'd love to get a second opinion.
Sorry, I now see I forgot to provide a critical piece of information. This is a regression in Node.js 22.0.0. It wasn’t a problem before.
Ahh, okay, thank you.
@targos I have a feeling we rushed the V8 upgrades.
cc @nodejs/v8 @RafaelGSS
It's probably related to V8, but I'm not sure waiting would have changed anything? We released v22.0.0 with the version of V8 that's in current Chrome.
Seems specific to Linux or x64 as I cannot reproduce on ARM64 macOS.
We also don't know which version of V8 introduced the bug (assuming it's in V8).
So, it's specific to x64. I can reproduce with node-v22.0.0-darwin-x64 on Rosetta.
I'm going to compile a debug build on one of the Hetzner machines to get a meaningful stack trace.
I’m on macOS and repro it consistently btw
@wooorm ARM or Intel?
It's probably related to V8,
The code that started/stopped crashing in TS had to do with indexing into strings. TS starts/stops crashing depending on whether the string it is looking into contains an emoji.
One TS maintainer potentially saw the crash appear/disappear when adding a console.log somewhere. So it may be related to some optimization routine.
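For readers following along, here is a minimal sketch of why an emoji matters when indexing into a string. An emoji such as 👉 (U+1F449) takes two UTF-16 code units, so charCodeAt and codePointAt can return different values at the same index. The sample string and the codeUnitsAndPoints helper are made up for illustration; they are not taken from TypeScript or @types/mdast.

```javascript
// Hypothetical helper: list what charCodeAt and codePointAt return at every index.
function codeUnitsAndPoints(text) {
  const rows = [];
  for (let i = 0; i < text.length; i++) {
    rows.push({
      index: i,
      charCode: text.charCodeAt(i),   // always a single UTF-16 code unit
      codePoint: text.codePointAt(i), // a full code point if i is at a high surrogate
    });
  }
  return rows;
}

// Illustrative input only: "a", the emoji (two code units), "b".
const sample = "a👉b";

// index 1: charCode 55357 (high surrogate), codePoint 128073 (U+1F449)
// index 2: charCode 56393 and codePoint 56393 (lone low surrogate)
console.table(codeUnitsAndPoints(sample));
```

Note that stepping the index by one, as this sketch does, lands on the low surrogate right after the emoji.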
I’m on a 2.6 GHz 6-Core Intel Core i7, Sonoma 14.4.1 (23E224). I have a bunch of the GNU utils tho. So perhaps there could be something that doesn’t happen on mac normally, but my machine looks more like Linux.
Ignore the repro-exists tag, I didn't mean to add it, and it won't affect anything.
Just to give a side by side using https://github.com/remcohaszing/typescript-bug-58369:
$ grep -A8 'function scanJSDocCommentTextToken' ./node_modules/typescript/lib/tsc.js
function scanJSDocCommentTextToken(inBackticks) {
  fullStartPos = tokenStart = pos;
  tokenFlags = 0 /* None */;
  if (pos >= end) {
    return token = 1 /* EndOfFileToken */;
  }
  for (let ch = text.charCodeAt(pos); pos < end && (!isLineBreak(ch) && ch !== 96 /* backtick */); ch = codePointAt(text, ++pos)) {
    if (!inBackticks) {
      if (ch === 123 /* openBrace */) {
$ node ./node_modules/typescript/lib/tsc.js
[1] 1090384 segmentation fault node ./node_modules/typescript/lib/tsc.js
Now, add debugger to the loop (console.log works too but is loud):
$ grep -A8 'function scanJSDocCommentTextToken' ./node_modules/typescript/lib/tsc.js
function scanJSDocCommentTextToken(inBackticks) {
  fullStartPos = tokenStart = pos;
  tokenFlags = 0 /* None */;
  if (pos >= end) {
    return token = 1 /* EndOfFileToken */;
  }
  for (let ch = text.charCodeAt(pos); pos < end && (!isLineBreak(ch) && ch !== 96 /* backtick */); ch = codePointAt(text, ++pos)) {
    debugger; // ADDED
    if (!inBackticks) {
$ node ./node_modules/typescript/lib/tsc.js
I have not been able to extract out a test which just calls the parser via the public API, nor by extracting this code and giving it the same inputs.
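For anyone who wants to try the extraction themselves, this is roughly the shape such an attempt takes. It is only a sketch under stated assumptions, and per the comment above, attempts like this did not reproduce the crash. scanCommentText is a hypothetical cut-down of the loop quoted above; it uses the built-in String.prototype.codePointAt in place of tsc's own codePointAt helper; the input string and the iteration count are arbitrary choices meant to give V8 a chance to optimize the function.

```javascript
// Same checks as the isLineBreak function in tsc.js quoted in this thread.
function isLineBreak(ch) {
  return ch === 10 /* lineFeed */ || ch === 13 /* carriageReturn */ ||
    ch === 8232 /* lineSeparator */ || ch === 8233 /* paragraphSeparator */;
}

// Hypothetical cut-down of the scanner loop: advance from `start` until a
// line break, a backtick, or the end of `text`, and return the stop position.
function scanCommentText(text, start) {
  let pos = start;
  const end = text.length;
  for (
    let ch = text.charCodeAt(pos);
    pos < end && !isLineBreak(ch) && ch !== 96 /* backtick */;
    ch = text.codePointAt(++pos) // assumption: built-in instead of tsc's helper
  ) {
    // Body intentionally empty; the real scanner does more work here.
  }
  return pos;
}

// Arbitrary input containing an emoji, followed by a line break.
const input = "Points 👉 here\nnext line";

// Call the function many times so the optimizing tiers get a chance to kick in.
let last = -1;
for (let i = 0; i < 100000; i++) {
  last = scanCommentText(input, 0);
}
console.log(last);
```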
Weird, thanks for the information!
(gdb) run node_modules/typescript/lib/tsc.js
Starting program: /home/iojs/tmp-targos/node/out/Debug/node node_modules/typescript/lib/tsc.js
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7a4f640 (LWP 18826)]
[New Thread 0x7ffff724e640 (LWP 18827)]
[New Thread 0x7ffff6a4d640 (LWP 18828)]
[New Thread 0x7ffff624c640 (LWP 18829)]
[New Thread 0x7ffff5a4b640 (LWP 18830)]
[New Thread 0x7ffff51a9640 (LWP 18831)]
Thread 1 "node" received signal SIGSEGV, Segmentation fault.
0x00005554d878345d in ?? ()
(gdb) bt
#0 0x00005554d878345d in ?? ()
#1 0x00001967bd580c69 in ?? ()
#2 0x000000000000200e in ?? ()
#3 0x0000200e00000000 in ?? ()
#4 0x0000000000000000 in ?? ()
Perfect 🙃
Just to note it, you can also add noop(); instead of debugger; if you want a debugger-statement free crasher.
This is due to the Maglev compiler. I confirm that ./configure --v8-disable-maglev fixes it.
Maglev was enabled in https://github.com/nodejs/node/pull/51360
/cc @kvakil
/cc @victorgomes
I'm rebuilding with ASan and GCC stack protection flags to see if it helps to pinpoint the issue
GCC failed to build so I switched to clang. Here's a bit more information:
I don't know what else to do at this point. Happy to run more commands if you have any idea.
--no-maglev-inlining also fixes it.
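One way to compare runs with and without the flag is to spawn Node.js as a child process and look at the exit status and signal. A minimal sketch, assuming the tsc.js path used earlier in this thread; runNode is a hypothetical helper, and the only V8 flag it passes is the one named above.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run the current Node.js binary with the given arguments
// and report how the child process ended.
function runNode(args) {
  const result = spawnSync(process.execPath, args, { encoding: "utf8" });
  // A process killed by a signal has status === null and signal set,
  // for example "SIGSEGV" for a segmentation fault.
  return { status: result.status, signal: result.signal };
}

// Path taken from the reproduction commands in this thread.
const script = "./node_modules/typescript/lib/tsc.js";

console.log("default flags:       ", runNode([script]));
console.log("--no-maglev-inlining:", runNode(["--no-maglev-inlining", script]));
```

If the first line reports a signal and the second reports a normal exit status, that points at the same conclusion as above.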
I also tried the repro with:
They all segfault so I don't think it's a V8 regression, but really a Maglev inlining bug.
I submitted a V8 bug report: https://issues.chromium.org/issues/338535750
It looks very similar to https://issues.chromium.org/issues/42204637
Well, there is a V8 option called --print-opt-source which can print source code of optimized and inlined functions.
The problem occurs when V8 optimizes the isLineBreak js function in tsc.js.
I changed the function to this and the problem is gone.
// tsc.js
function isLineBreak(ch) {
  return ch === 10 /* lineFeed */ || ch === 13 /* carriageReturn */;
}
gdb --args ~/tannalwork/projects/node/node_g --print-opt-source ./node_modules/typescript/lib/tsc.js
--- FUNCTION SOURCE (/home/tannal/tannalwork/projects/node/out/typescript-bug-58369/node_modules/.pnpm/typescript@5.4.5/node_modules/typescript/lib/tsc.js:isLineBreak) id{16,-1} start{744101} ---
(ch) {
return ch === 10 /* lineFeed */ || ch === 13 /* carriageReturn */ || ch === 8232 /* lineSeparator */ || ch === 8233 /* paragraphSeparator */;
}
--- END ---
Thread 1 "node_g" received signal SIGSEGV, Segmentation fault.
0x00005554dbb1115d in ?? ()
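To make the difference between the two isLineBreak variants in this comment explicit, here is a small sketch that compares them over every code point. The function names are made up for this comparison. It shows that the edited version is a diagnostic change rather than an equivalent rewrite, since it no longer treats U+2028 and U+2029 as line breaks.

```javascript
// As printed by --print-opt-source above (the version shipped in tsc.js).
function isLineBreakOriginal(ch) {
  return ch === 10 /* lineFeed */ || ch === 13 /* carriageReturn */ ||
    ch === 8232 /* lineSeparator */ || ch === 8233 /* paragraphSeparator */;
}

// The edited version from this comment.
function isLineBreakSimplified(ch) {
  return ch === 10 /* lineFeed */ || ch === 13 /* carriageReturn */;
}

// Collect every code point where the two versions disagree.
const differing = [];
for (let ch = 0; ch <= 0x10ffff; ch++) {
  if (isLineBreakOriginal(ch) !== isLineBreakSimplified(ch)) {
    differing.push(ch);
  }
}

console.log(differing); // [ 8232, 8233 ]
```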
@targos Could you try to run the test with maglev and with pointer compression enabled?
@victorgomes It also crashes with https://unofficial-builds.nodejs.org/download/release/v22.1.0/node-v22.1.0-linux-x64-pointer-compression.tar.xz
FWIW, I faced this issue on macOS 13 (Intel Mac) as well in my project. My Node.js version is v22.1.0 (installed via Homebrew). Here is a reproduction (sorry I don't have enough time to minimize it):
npm install and then npm run build
> remark-emoji@4.0.1 build
> tsc -p .
[1] 19446 segmentation fault npm run build
@remcohaszing your issue also involved emojis, right?
The original issue is TypeScript processing @types/mdast, which is used by remark plugins such as remark-emoji. So this is the exact same issue.
Version
v22.0.0
Platform
Linux vali 6.8.0-76060800daily20240311-generic #202403110203~1714077665~22.04~4c8e9a0 SMP PREEMPT_DYNAMIC Thu A x86_64 x86_64 x86_64 GNU/Linux
Subsystem
No response
What steps will reproduce the bug?
On Linux using Node.js 22:
See also this failed GitHub action: https://github.com/remcohaszing/typescript-bug-58369/actions/runs/8899456400/job/24438867767
How often does it reproduce? Is there a required condition?
This reproduction fails consistently on Linux, both on my machine and in GitHub Actions.
While troubleshooting by trimming down the content of node_modules/@types/mdast/index.d.ts, I got into a state where it seemed to happen randomly. The major factor is the 👉 emoji in a comment.
The error did not occur on macOS in the GitHub action, but it did happen consistently for @wooorm on their MacBook.
The problem was not reproducible on Windows.
What is the expected behavior? Why is that the expected behavior?
No segmentation fault
What do you see instead?
Additional information
This was originally reported to TypeScript: https://github.com/microsoft/TypeScript/issues/58369. This issue contains more information.
This has coincidentally already been fixed for the upcoming TypeScript 5.5. Still, a segfault should not occur.
We were unable to make a smaller reproduction.