nodejs / node


WASI.start() intermittently causing a crash when a pre-compiled module is reused several times #52402

Open rotemdan opened 5 months ago

rotemdan commented 5 months ago

Version

v20.12.1, v21.7.2

Platform

Microsoft Windows NT 10.0.22631.0 x64

Subsystem

WASI

What steps will reproduce the bug?

Download the project, containing the files: flite.wasm, test.js, package.json:

flite-wasi-test.zip

test.js contains:

import { readFile } from 'fs/promises'

async function startWasiTest(fliteModuleObject) {
    const { WASI } = await import('wasi')

    const outFileName = 'out.wav'

    const preopens = {
        '.': '.',
    }

    const text = `Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.`

    const wasi = new WASI({
        version: 'preview1', // Doesn't seem to matter if set to 'unstable'

        env: {
        },

        args: ['--',
            '-voice', 'slt',

            ` ${text} `,

            outFileName
        ],

        preopens,

        returnOnExit: true
    })

    const importObject = { wasi_snapshot_preview1: wasi.wasiImport }

    const instance = await WebAssembly.instantiate(fliteModuleObject, importObject)

    const exitCode = wasi.start(instance)

    console.log(`Exit code: ${exitCode}`)
}

async function getFliteWasiModuleObject() {
    const fliteWasiPath = './flite.wasm'
    const fliteModuleObject = await WebAssembly.compile(await readFile(fliteWasiPath))

    return fliteModuleObject
}

async function start() {
    const fliteModuleObject = await getFliteWasiModuleObject()

    for (let i = 1; i <= 1000; i++) {
        console.log(`Iteration ${i}:`)

        await startWasiTest(fliteModuleObject)

        console.log(``)
    }
}

start()

Run with node test.js

How often does it reproduce? Is there a required condition?

Intermittent: requires a varying number of iterations on v20.12.1 and v21.7.2. Can range between 3 and 150 iterations at times.

In my own code, it seems to happen much more often, about 25 - 50% of the time, and in far fewer iterations (usually 2).

Does not seem to happen on 18.20.1

What is the expected behavior? Why is that the expected behavior?

No crash

What do you see instead?

On v21.7.2, when using version: 'unstable':

FATAL ERROR: v8::Object::GetAlignedPointerFromInternalField() Internal field out of bounds
----- Native stack trace -----

 1: 00007FF6D4516C2B node::SetCppgcReference+15595
 2: 00007FF6D4481829 node::OnFatalError+265
 3: 00007FF6D4F30DB6 v8::api_internal::AnnotateStrongRetainer+118
 4: 00007FF6D4F57E8E v8::Object::SlowGetAlignedPointerFromInternalField+78
 5: 00007FF6D43842FF v8::CTypeInfoBuilder<unsigned __int64>::Build+3023
 6: 000002EB3527C54D

On v21.7.2, when using version: 'preview1' it abruptly terminates.

On v20.12.1, with either preview1 or unstable, it similarly terminates abruptly.

Additional information

If I change the code to recompile the module at every iteration instead of reusing it, like:

    const instance = await WebAssembly.instantiate(await getFliteWasiModuleObject(), importObject)

The problem doesn't seem to occur (or at least not as frequently as I was able to see it).
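The two setups compared above (compile once and reuse, versus recompile on every iteration) can be put side by side in a minimal sketch. It does not reproduce the crash: it uses an empty WebAssembly module in place of flite.wasm and no WASI imports, and the names `makeModuleProvider` and `instantiateTimes` are hypothetical, chosen only for illustration.

```javascript
// Smallest valid WebAssembly binary: magic number plus version, no sections.
// Assumption: it stands in for flite.wasm so the sketch runs without the project files.
const emptyModuleBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00])

// Hypothetical helper: returns an async function that yields a WebAssembly.Module.
// With reuse = true the module is compiled once and cached (the setup that crashes
// in the report); with reuse = false it is compiled again on every call.
function makeModuleProvider(bytes, reuse) {
    let cached = null
    let compileCount = 0

    async function getModule() {
        if (reuse && cached) {
            return cached
        }

        compileCount++
        cached = await WebAssembly.compile(bytes)

        return cached
    }

    // Exposed so a caller can check how many compilations actually happened.
    getModule.compileCount = () => compileCount

    return getModule
}

// Creates a fresh instance `times` times; only the module may be shared between them.
async function instantiateTimes(getModule, times) {
    for (let i = 0; i < times; i++) {
        await WebAssembly.instantiate(await getModule(), {})
    }
}
```

In test.js, switching between the two behaviors would then come down to the `reuse` flag, which makes it easier to compare crash frequency between otherwise identical runs.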

mertcanaltin commented 5 months ago

Greetings, no crashes occurred when I tried v21.7.2 or lower versions.

I was able to run more than 1000 iterations.

rotemdan commented 5 months ago

It doesn't happen perfectly consistently. Sometimes it doesn't happen for a while even after 1000 iterations.

I'm on Windows 11. There may be particular conditions that cause it to happen. That was a minimal test case based on what I had. My deployed code gets the crashes very consistently.

The goal here was to isolate what is sufficient to cause it at all, not to find a way to reproduce it more reliably on more systems.

You can also try other node versions.

rotemdan commented 5 months ago

I tried again to reproduce on v21.7.2-x64 and it didn't happen after several tests, so I installed v20.12.1-x64 and then it happened consistently, within at most 200 iterations. Here's a video of it failing after 79 iterations:

https://github.com/nodejs/node/assets/8589488/29136e91-6620-4c69-8dad-ed86f116bf9a

Another video showing the randomness of the crashes, with 6 consecutive runs:

https://github.com/nodejs/node/assets/8589488/c46adc2b-62e1-4394-b146-c1d65203a9e7

You can try several times with different versions of Node or different platforms/OS.
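Repeated runs like the ones in the second video can also be scripted. The following is a minimal sketch, assuming test.js from the zip is in the current directory; `lastIteration`, `describeExit`, and `repeatRuns` are hypothetical names introduced for illustration and are not part of the project files.

```javascript
// Returns the highest "Iteration N:" number found in the output of test.js,
// or 0 if none was printed. test.js prints this line before each run starts,
// so this is the last iteration that began, not necessarily one that finished.
function lastIteration(stdout) {
    let last = 0

    for (const match of stdout.matchAll(/^Iteration (\d+):/gm)) {
        last = Math.max(last, Number(match[1]))
    }

    return last
}

// Describes how the child process ended. On POSIX a segmentation fault
// usually shows up as a signal; on Windows it shows up as a non-zero exit code.
function describeExit(status, signal) {
    if (signal) {
        return `killed by ${signal}`
    }

    if (status === 0) {
        return 'clean exit'
    }

    return `exit code ${status}`
}

// Runs the script `runs` times with the current node binary and reports how far each run got.
async function repeatRuns(runs = 6, script = 'test.js') {
    const { spawnSync } = await import('node:child_process')

    const results = []

    for (let run = 1; run <= runs; run++) {
        const child = spawnSync(process.execPath, [script], {
            encoding: 'utf8',
            maxBuffer: 64 * 1024 * 1024,
        })

        const result = {
            run,
            iterations: lastIteration(child.stdout ?? ''),
            exit: describeExit(child.status, child.signal),
        }

        console.log(`Run ${run}: reached iteration ${result.iterations}, ${result.exit}`)

        results.push(result)
    }

    return results
}
```

Calling `repeatRuns(6)` from a separate script prints one summary line per run, which makes it easier to compare crash points across Node versions or platforms.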

mertcanaltin commented 5 months ago

I guess this doesn't happen on macOS, only on Windows.

rotemdan commented 5 months ago

I'm able to repeatedly reproduce this on WSL Linux (Ubuntu 22.04.4 LTS). Node v20.12.1

On Linux it ends with 'segmentation fault' when it crashes:

https://github.com/nodejs/node/assets/8589488/da7cdaff-94eb-4aa1-838c-3a34ab111791

Since this does appear to also happen on POSIX, I wouldn't rule out that it would happen on macOS too. It's possible that it's just "luck", or some aspect of your system, that kept it from reproducing in the runs you tried.

anfibiacreativa commented 5 months ago

Tried with v20.12.1 on macOS 14.4.1 (23E224). Could not reproduce, after multiple runs.

rotemdan commented 5 months ago

By making a simple change to the code, I reproduced it on a clean Ubuntu Server 23.10 Mantic Minotaur VM in VirtualBox, with Node.js v20.12.1, right on the first iteration:

I simply changed the text I pass to the WASI executable to a long, random hex string:

    const text = `5741aefa1dc5d1bbbb064232cae2af80e1b227854a86f6d988985474a90ea626f61fed7d8dc21b5972e6d101ad0d6289fd8e161439e1eccd656b94711a1018d61f5b0f7896c4396fe3d41a7a4de184e5c5df80fba3feb3ba11fd198332634d3ad0a62c25721ae59bbe03dc070697dfc57c1969f0f7dcd06dbeabddd999673142`

https://github.com/nodejs/node/assets/8589488/90a48c58-e948-40e4-80df-f0f6195c6674

Here are the modified project files:

flite-wasi-test.zip

flite is a relatively old speech synthesis utility I compiled to WASI as part of my Echogarden speech toolset.

By passing it a random sequence of characters that's unlike natural language, with no spaces, it isn't clear what exactly should happen, since it's based on old C code that isn't maintained.

The question is why a segmentation fault occurs rather than a WASI exception.


Edit: Some additional information

On Windows I'm also getting the error. On both Windows and Linux, flite seems to generate the full synthesized audio file correctly (for the random string of characters). The crash seems to happen afterwards:

Here's how it looks in a step debugger:

https://github.com/nodejs/node/assets/8589488/68d16ff7-d47a-465b-938f-deb076613255

The debugger was able to get to the next line:

console.log(`Exit code: ${exitCode}`)

This line actually gets highlighted, but then the process abruptly exits.

Every time I run it, it crashes in a slightly different place (iteration 1, iteration 2, slightly after iteration 1, etc.). Sometimes it gives this message:

abort: Invalid bytecode

#
# Fatal error in , line 0
# unreachable code
#
#
#
#FailureMessage Object: 000000A1AEFFE250
----- Native stack trace -----

 1: 00007FF7D135C81B node::SetCppgcReference+17979
 2: 00007FF7D124A97F node::TriggerNodeReport+72127
 3: 00007FF7D2146E82 V8_Fatal+162
 4: 00007FF7D1C0D4F6 v8::base::CPU::has_sse41+154806
 5: 00007FF7D1C08E34 v8::base::CPU::has_sse41+136692
 6: 00007FF7D1C08243 v8::base::CPU::has_sse41+133635
 7: 00007FF7D1BFCBCB v8::base::CPU::has_sse41+86923
 8: 00007FF7D1BFCDD0 v8::base::CPU::has_sse41+87440
 9: 00007FF7D1BFCC84 v8::base::CPU::has_sse41+87108
10: 00007FF7D1850053 v8::internal::Version::GetString+257235
11: 00007FF7D1DF619E v8::PropertyDescriptor::writable+677998
12: 00007FF7D1EBA31C v8::PropertyDescriptor::writable+1481196