oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh
Other
72.27k stars 2.58k forks source link

Random "error: n should be larger than or equal to 0" when streaming to a process #12008

Open oguimbal opened 1 month ago

oguimbal commented 1 month ago

What version of Bun is running?

1.1.15+b23ba1fe1

What platform is your computer?

Darwin 23.1.0 arm64 arm

What steps can reproduce the bug?

We consistently experience the below error when streaming lots of data to a spawned process (although at random times, even when streaming exactly the same data):

2479 | enumerable: !1, 2480 | get() { 2481 | return this._readableState ? this.readableState.endEmitted : !1; 2482 | } 2483 | } 2484 | function fromList(n, state) { ^ error: n should be larger than or equal to 0 at fromList (node:stream:2484:22) at node:stream:2096:39 at flow (node:stream:2312:18) at resume (node:stream:2301:21)

Why this happens is a bit unclear, but it seems to be raised in Zig here.

Because it is random, I could not isolate this error properly in a simple copy/pastable environment with a small archive (it happens after streaming GBs of data), but here is what we do, schematically (where we have thousands of chunks, each 100MB):

const lz4 = spawn("lz4", ["-c", "-d", "-"]);
const tar = spawn("tar", ["-x", "-C", toPath]);
lz4.stdout.pipe(tar.stdin);

for (let i = 0; i < chunks.length; i++) {
            const f = fs.createReadStream(chunks[i]);
            const {resolve, reject, promise} = Promise.withResolvers<void>();
            f.on('end', resolve);
            f.on('error', reject);
            f.pipe(lz4.stdin, { end: i === chunks.length - 1 });
            await chunkProm;
            f.close();
}

What is the expected behavior?

should not throw an error

What do you see instead?

a cumbersome error, show in JS but which actually seems to be thrown from Zig.

Additional information

No response

oguimbal commented 1 month ago

I looked a bit at the involved code.

The only idea that I had is as follows: If n is big enoug, it will fit a javascript number, but not in an int32. Then, maybe this line:

int32_t n = callFrame->argument(0).toInt32(lexicalGlobalObject);

Overflows, and returns a negative integer, thus triggering the error.

I dont know if toInt32() can return an overflowed int in such case, but overflowing an int32 only requires 2GB of data, which is possible given the amount of data we're dealing with here.

If such is the case, beware, there is also the same line in jsBufferListPrototypeFunction_concatBody (same file)