sindresorhus / get-stream

Get a stream as a string, Buffer, ArrayBuffer or array
MIT License

range error thrown when using createWriteStream() #48

Closed FrederickEngelhardt closed 1 year ago

FrederickEngelhardt commented 2 years ago

Hi, I'm running a file queue service that uses 'child_process'.createWriteStream().

For files much larger than Node.js's maximum Buffer size, this library consistently throws the following error:

RangeError [ERR_OUT_OF_RANGE]: The value of "length" is out of range. It must be >= 0 && <= 4294967296. Received 10_100_604_616
at validateOffset (node:buffer:112:3)
at Function.concat (node:buffer:549:5)
at PassThrough.stream.getBufferedValue (/root/servers/file-queue-server/node_modules/get-stream/buffer-stream.js:45:26)
at /root/servers/file-queue-server/node_modules/get-stream/index.js:44:23

Is there a way to prevent this error from throwing?

sindresorhus commented 2 years ago

In

https://github.com/sindresorhus/get-stream/blob/c17901233590aa49675a7fd0f42d70b9bed1c580/buffer-stream.js#L46

The length passed to Buffer.concat is optional and only passed as an optimization. Can you try removing that parameter and see if it still throws? (Whether or not that fixes it depends on how Buffer.concat is implemented internally).
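A minimal sketch of that change, assuming the getBufferedValue implementation in buffer-stream.js at that commit (variable names approximate):

stream.getBufferedValue = () => {
	if (array) {
		return chunks;
	}

	// Omit the precomputed length; Buffer.concat() then derives the
	// total length from the chunks themselves.
	return isBuffer ? Buffer.concat(chunks) : chunks.join('');
};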

However, I'm not sure if that will completely solve your problem. You are trying to create a buffer for ~9 GB, but per the Node.js docs on the maximum Buffer length:

On 64-bit architectures, this value currently is 2^32 (about 4 GB).

marcelklehr commented 2 years ago

I get this with slightly more than 1GB (using this library transitively through 'download'):

It must be >= 0 && <= 1073741823. Received 1097234836
ehmicky commented 1 year ago

The problem is the following: Node.js Buffers have a max size of 4 GB, and strings a max size of ~5e8 characters (about 500 MB if using only ASCII characters; the byte equivalent differs for multi-byte UTF-8 characters). The limits are available at buffer.constants.MAX_LENGTH and buffer.constants.MAX_STRING_LENGTH.
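For reference, these limits can be inspected at runtime. A minimal sketch (exact values vary across Node.js and V8 versions, which is also why the limit reported above was ~1 GB rather than 4 GB):

import { constants } from 'node:buffer'

// Maximum Buffer size in bytes (2^32 on recent 64-bit Node.js).
console.log(constants.MAX_LENGTH)
// Maximum string length in UTF-16 code units (~5e8).
console.log(constants.MAX_STRING_LENGTH)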

This fails because Node.js cannot represent a file larger than 4 GB in a Buffer due to the above limit. The problem is not related to get-stream: it happens when Buffer.concat() tries to create that Buffer, and it is not a bug in the current implementation. I believe those limits might actually come from V8.

You can reproduce the problem like this (if your OS has head and /dev/urandom):

head -c 10000000000 /dev/urandom > big

Then:

import { createReadStream } from 'node:fs'
import { getStreamAsBuffer } from 'get-stream'

await getStreamAsBuffer(createReadStream('./big'))

Which results in:

Uncaught:
RangeError [ERR_OUT_OF_RANGE]: The value of "length" is out of range. It must be >= 0 && <= 4294967296. Received 10_000_000_000
    at __node_internal_captureLargerStackTrace (node:internal/errors:496:5)
    at new NodeError (node:internal/errors:405:5)
    at __node_internal_ (node:internal/validators:99:13)
    at validateOffset (node:buffer:122:3)
    at Function.concat (node:buffer:590:5)
    at getBufferedValue (file:///home/me/get-stream/index.js:68:74)
    at getStream (file:///home/me/get-stream/index.js:37:10)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async REPL15:2:34 {
  code: 'ERR_OUT_OF_RANGE',
}

It is generally bad practice to buffer that much data, as opposed to streaming it, since it makes the process use a large amount of the machine's memory, and in some cases V8 crashes when too much memory is used. Also, when the size of the input is unknown, the maxBuffer option can be used for better error reporting.
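For example, a minimal sketch using maxBuffer (the 1e9 threshold is arbitrary; the MaxBufferError export and maxBuffer option follow get-stream's documented API, and the file name is from the example above):

import { createReadStream } from 'node:fs'
import { getStreamAsBuffer, MaxBufferError } from 'get-stream'

try {
	const buffer = await getStreamAsBuffer(createReadStream('./big'), { maxBuffer: 1e9 })
	console.log(buffer.length)
} catch (error) {
	if (error instanceof MaxBufferError) {
		// Buffering stopped at 1 GB with a descriptive error,
		// instead of the opaque RangeError from Buffer.concat().
		console.error('Input larger than 1 GB, refusing to buffer it')
	} else {
		throw error
	}
}

And when the data only needs to be moved rather than held in memory, streaming keeps memory usage bounded regardless of file size:

import { createReadStream, createWriteStream } from 'node:fs'
import { pipeline } from 'node:stream/promises'

// Copy chunk by chunk; only a small amount of data is in memory at a time.
await pipeline(createReadStream('./big'), createWriteStream('./copy'))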

More importantly, this is not a problem with get-stream, so I believe this issue should be closed?