nodejs / performance

Node.js team focusing on performance

readFile in promises very slow #151

Open xJeneKx opened 3 months ago

xJeneKx commented 3 months ago

Version

v21.7.1

Platform

Darwin evgeniis.local 23.4.0 Darwin Kernel Version 23.4.0: Wed Feb 21 21:44:43 PST 2024; root:xnu-10063.101.15~2/RELEASE_ARM64_T6000 arm64

Subsystem

No response

What steps will reproduce the bug?

slow

const fs = require('fs/promises');

const start = Date.now();
let count = 0;
for (let i = 0; i < 10000; i++) {
    fs.readFile("./text.txt", { encoding: 'utf-8' })
        .then((data) => {
            if (data !== "Hello, world") throw 1;
            count++
            if (count === 10000) {
                console.log('time: ', Date.now() - start);
            }
        })
        .catch((err) => {
            throw 1
        })
}

fast

const fs = require('fs');
const util = require('util');
const readFile = util.promisify(fs.readFile);

const start = Date.now();
let count = 0;
for (let i = 0; i < 10000; i++) {
    readFile("./text.txt", { encoding: 'utf-8' })
        .then((data) => {
            if (data !== "Hello, world") throw 1;
            count++
            if (count === 10000) {
                console.log('time: ', Date.now() - start);
            }
        })
        .catch((err) => {
            throw 1
        })
}

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior? Why is that the expected behavior?

No response

What do you see instead?

The promise version is 2 times slower; my tests showed 200 ms vs 100 ms.

Additional information

No response

AlttiRi commented 3 months ago

The same on Windows.

However, I needed to decrease the count from 10000 to 8189. I get this error when I use 8190 or more:

[Error: EMFILE: too many open files, open 'C:\test\text.txt'] {
  errno: -4066,
  code: 'EMFILE',
  syscall: 'open',
  path: 'C:\\test\\text.txt'
}
Node.js v18.18.1

UlisesGascon commented 3 months ago

I suggest transferring this issue to the performance team repo... what do you think, @anonrig?

lemire commented 3 months ago

You can go faster (maybe 2x faster) by going synchronous.

const fs = require('fs');
const readFileSync = fs.readFileSync;

const start = Date.now();
for (let i = 0; i < 10000; i++) {
    const data = readFileSync("./text.txt", { encoding: 'utf-8' });
}
console.log('time: ', Date.now() - start);

When benchmarking using bun, all three approaches run at the same speed.

Here is bun vs node on the first benchmark:

>  node  test.js
time:  218
>  bun run test.js
time:  53

The cost of promises in Node seems extraordinarily high.

lemire commented 3 months ago

You can basically recover the performance by calling readFileSync within an async function...

const fs = require('fs');
const readFileSync = fs.readFileSync;
async function f(name, options) {
  return readFileSync(name, options);
}

const start = Date.now();
let count = 0;
for (let i = 0; i < 10000; i++) {
        f("./text.txt", { encoding: 'utf-8' })
                .then((data) => {
                        count++
                        if (count === 10000) {
                                console.log('time: ', Date.now() - start);
                        }
                })
                .catch((err) => {
                        throw 1
                })
}

meyfa commented 3 months ago

You can basically recover the performance by calling readFileSync within an async function

Surely that can't be optimal? Maybe it is in this synthetic example, but won't readFileSync just block the whole program until the entire file is read? The await in this case just gives the illusion of being async, since a microtick is introduced after the entire file has already been read synchronously...

meyfa commented 3 months ago

readFileSync has to return the file contents synchronously, so it must read the whole file before returning. And since JS isn't multi-threaded this will block execution. fs.promises.readFile can allow other stuff to run while it fetches the file asynchronously, fulfilling the returned promise when done. The presence of await or the fact that the readFileSync call happens in an async function does not affect its internal behavior.

While readFileSync may speed up the benchmark, I think it is unsuitable for applications that could be doing other work while waiting for I/O to complete.
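
To make the blocking behaviour concrete, here is a minimal sketch (assuming the same ./text.txt as in the repro above; the helper names are made up for illustration): a timer cannot fire while the readFileSync-in-an-async-function loop runs, but it interleaves freely with fs.promises.readFile.

```js
const fs = require('fs');
const fsp = require('fs/promises');

// A timer that should tick roughly every 10 ms whenever the event loop is free.
const interval = setInterval(() => console.log('tick'), 10);

// Looks async, but the read itself still blocks the event loop.
async function readSyncWrapped(path) {
  return fs.readFileSync(path, 'utf-8');
}

async function main() {
  // Phase 1: no 'tick' output – awaiting already-resolved promises keeps
  // everything on the microtask queue, so pending timers cannot run.
  for (let i = 0; i < 10000; i++) {
    await readSyncWrapped('./text.txt');
  }
  // Phase 2: 'tick' output appears – each await yields to the event loop
  // while the file is read asynchronously.
  for (let i = 0; i < 10000; i++) {
    await fsp.readFile('./text.txt', 'utf-8');
  }
  clearInterval(interval);
}

main();
```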

aduh95 commented 3 months ago

You can basically recover the performance by calling readFileSync within an async function

That's going to depend on the size of your file and on whether your program has other things to do – like @meyfa said, using readFileSync might get you the content of the file slightly faster, but the tradeoff is that your program cannot do anything in the meantime.

GeoffreyBooth commented 3 months ago

JS isn’t multi-threaded

https://nodejs.org/api/worker_threads.html

ronag commented 3 months ago

I think we are getting a bit off topic here... I would recommend all mentions of readFileSync should be hidden as off topic.

lemire commented 3 months ago

I ran some benchmarks on a large server. See blog post.

[Screenshot: benchmark results]

benjamingr commented 3 months ago

@linkgoron

benjamingr commented 3 months ago

Yeah, last time we looked at it there were many opportunities to make the slower discrete versions (e.g. readFile rather than createReadStream) much, much faster, so this result isn't too surprising.
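
For context, the streaming counterpart mentioned above looks roughly like this (a sketch only; the file path is a placeholder):

```js
const { createReadStream } = require('fs');

// Streaming read: the file arrives in chunks instead of one discrete buffer.
function readFileStreaming(path) {
  return new Promise((resolve, reject) => {
    let data = '';
    createReadStream(path, { encoding: 'utf-8' })
      .on('data', (chunk) => { data += chunk; })
      .on('end', () => resolve(data))
      .on('error', reject);
  });
}

readFileStreaming('./text.txt').then((data) => console.log(data.length));
```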

lemire commented 3 months ago

I think we are getting a bit off topic here... I would recommend all mentions of readFileSync should be hidden as off topic.

In Bun, fs.readFileSync can be slower than fs.promises, and fs.readFile definitely appears able to be faster than fs.readFileSync. In Node.js, the opposite is true. See my numbers above. Granted, it is one test on one system, and results will vary.

Is that what one expects?

lemire commented 3 months ago

@aduh95

the tradeoff is that your program cannot do anything in the meantime.

In my test (see the blog post link), readFileSync uses 300 ms of CPU time, while fs.promises uses 7 seconds of CPU time (over several cores). And fs.promises returns the file with significantly higher latency.

These are not small differences.

lemire commented 3 months ago

Our results suggest that fs.promises is significantly slower than fs.readFile. There is a related issue from 2021:

Related: https://github.com/nodejs/node/issues/37583

lemire commented 3 months ago

My tests ran with a small file. If you use a large file (e.g., 512kB), then all the Node.js tests except those relying on readFileSync fail/crash with an allocation failure.

Bun does not run out of memory.

mcollina commented 3 months ago

The problem is that readFile allocates 3x the memory of the file size (see the sketch after this list):

  1. to read all the chunks
  2. to concatenate them in a single buffer
  3. to convert said buffer to utf8
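
A simplified userland sketch of that allocation pattern (illustrative only, not the actual lib/fs implementation; the chunk size and helper name are made up):

```js
const { open } = require('fs/promises');

// Roughly where the three allocations come from when reading with an encoding.
async function readFileUtf8Sketch(path) {
  const handle = await open(path, 'r');
  try {
    const chunks = [];
    while (true) {
      const chunk = Buffer.alloc(64 * 1024);                 // 1. memory for each chunk read
      const { bytesRead } = await handle.read(chunk, 0, chunk.length);
      if (bytesRead === 0) break;
      chunks.push(chunk.subarray(0, bytesRead));
    }
    const whole = Buffer.concat(chunks);                     // 2. memory for the concatenated buffer
    return whole.toString('utf-8');                          // 3. memory for the decoded JS string
  } finally {
    await handle.close();
  }
}
```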

The solution for utf-8 files is relatively simple to describe:

  1. move most of the readFile logic to C++
  2. use a streaming utf8 parser and use it to allocate V8/JS strings.
  3. Concatenate said V8 strings (this is cheap)

For binary reads it is a bit trickier.

  1. move most of the readFile logic to C++
  2. allocate a big chunk of memory to fit the file
  3. read it straight there

In this latter case, this is mostly done, but not in C++, which causes a lot of thrashing.


Consider also https://github.com/nodejs/node/pull/17054 and https://github.com/nodejs/node/issues/25741. Note that those fix a real problem, but we might want to make a different tradeoff there, or add an option for the user to specify.

aduh95 commented 3 months ago

For binary reads […] 2. allocate a big chunk of memory to fit the file 3. read it straight there

I guess that's already what we do, although the "big chunk" is maybe not that big (64 KiB) – but it might already be considered too big depending on your use case. Maybe we could add an option to readFile to let the user specify the size of that buffer, a bit like the highWaterMark option for streams.
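
Hypothetically, such an option could look like the sketch below – to be clear, no such option exists today; the name `chunkSize` and the file path are made up purely for illustration:

```js
const { readFile } = require('fs/promises');

async function main() {
  // Hypothetical sketch only: `chunkSize` is NOT a real readFile option today.
  const data = await readFile('./big-file.log', {
    encoding: 'utf-8',
    chunkSize: 1024 * 1024, // e.g. read in 1 MiB chunks instead of the default
  });
  console.log(data.length);
}

main();
```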

mcollina commented 3 months ago

@aduh95 I think you misunderstood me. I wrote "to fit the file", so that if the file is 100MB, we would allocate 100MB of contiguous memory for it and avoid the concatenation.

mcollina commented 3 months ago

Anyhow I think it's more important to optimize the utf8 case.

aduh95 commented 3 months ago

Maybe that's a wrong assumption from my side, but I think it's not possible to know the size of a file ahead of time. So my point was the current buffer is already a "big chunk to fit the file", but that's only true for files smaller than 64kiB – 64K is arbitrary, but unless we can know the size of the file, we will inevitably need some concatenation given a large enough file. Anyway, optimising the UTF-8 case certainly makes sense.

Linkgoron commented 3 months ago

Maybe that's a wrong assumption from my side, but I think it's not possible to know the size of a file ahead of time. So my point was the current buffer is already a "big chunk to fit the file", but that's only true for files smaller than 64kiB – 64K is arbitrary, but unless we can know the size of the file, we will inevitably need some concatenation given a large enough file. Anyway, optimising the UTF-8 case certainly makes sense.

Maybe I'm misunderstanding what you're saying, but today the code uses fstat in order to allocate the returned buffer. If it fails, it allocates 64 KiB chunks for every read and then concatenates them at the end (note that even if fstat succeeds, Node still reads in 512 KiB chunks into the larger buffer).
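
A simplified userland sketch of that strategy (not the actual lib/fs code; the fstat-failure fallback to 64 KiB chunks is omitted):

```js
const { open } = require('fs/promises');

// Size the destination buffer via fstat, then read into it in bounded chunks.
async function readFileBinarySketch(path) {
  const handle = await open(path, 'r');
  try {
    const { size } = await handle.stat();      // fstat to learn the file size up front
    const result = Buffer.allocUnsafe(size);   // one contiguous buffer sized to fit the file
    let offset = 0;
    while (offset < size) {
      // Read straight into the big buffer, at most 512 KiB at a time.
      const { bytesRead } = await handle.read(result, offset, Math.min(512 * 1024, size - offset));
      if (bytesRead === 0) break;
      offset += bytesRead;
    }
    return result.subarray(0, offset);
  } finally {
    await handle.close();
  }
}
```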

mcollina commented 3 months ago

@Linkgoron @aduh95 you are right, this optimization was already done.

The problem remains because there are a lot of C++/JS transitions for large files, and the utf8 parsing still allocates double the memory.

BridgeAR commented 3 months ago

@mcollina the fs.promises.readFile() implementation allocates ~100.5 MB when reading a 100 MB file. See https://github.com/nodejs/node/pull/44295. It still has multiple C++/JS boundary transitions, though. Moving things into C++ would therefore likely provide further gains.

We do have different implementations throughout the codebase and that is obviously not ideal. Maybe we can unify the promise and the callback versions internally?

joyeecheung commented 2 months ago

By the way, you can check the impact when combined with actual JS code execution, for example by running this in the project directory to parse a markdown file:

See code:

```js
'use strict';
const { readFileSync } = require('fs');
const { readFile } = require('fs/promises');
const marked = require('./test/fixtures/snapshot/marked');

let count = 0;
let total = 100;
const file = './README.md'; // 38K
//const file = 'GOVERNANCE.md'; // 7.5K
let result;

function syncParse(name, options) {
  const start = Date.now();
  for (let i = 0; i < total; i++) {
    const data = readFileSync(file, { encoding: 'utf-8' });
    result = marked.parse(data);
  }
  console.log('syncParse: ', Date.now() - start, result.length);
}

async function asyncParse(name, options) {
  const start = Date.now();
  for (let i = 0; i < total; i++) {
    readFile(file, { encoding: 'utf-8' })
      .then((data) => {
        result = marked.parse(data);
        if (++count === total) {
          console.log('asyncParse: ', Date.now() - start, result.length);
        }
      }, (error) => {
        console.log(error);
      });
  }
}

// Ensure it's cached.
readFileSync(file, { encoding: 'utf-8' });
readFileSync(file, { encoding: 'utf-8' });
readFileSync(file, { encoding: 'utf-8' });

//syncParse();
asyncParse();
```

For GOVERNANCE.md, which is 7.5K, the synchronous version is ~20% faster on my Linux server, and for README.md (38K) the two versions perform basically the same. On a MacBook both perform similarly for GOVERNANCE.md, and the asynchronous version is about 1% faster for README.md. For even bigger files (e.g. doc/api/fs.md) the difference is negligible because the actual parsing eats most of the time.

So I guess yes, the async version would allow you to run some code in the meantime, but the overhead of the asynchronous version may already cancel out the benefit you get from not blocking, especially when your files aren't that big and are cached. Also, on a reasonably fast file system, the more JS execution you need to do, the less the speed of fs calls is going to matter, because the JS execution will completely dominate the profile and the file reads, blocking or not, will only be a tiny part of it.

Also, to vary the input a bit, just run marked on all the documents under doc/api:

See code:

```js
'use strict';
const { readdirSync, readFileSync } = require('fs');
const { readFile } = require('fs/promises');
const marked = require('./test/fixtures/snapshot/marked');

const files = readdirSync('./doc/api/').map(i => `./doc/api/${i}`);
let result;

function syncParse() {
  const start = Date.now();
  for (let i = 0; i < files.length; i++) {
    const data = readFileSync(files[i], { encoding: 'utf-8' });
    result = marked.parse(data);
  }
  console.log('syncParse: ', Date.now() - start, result.length);
}

async function asyncParse() {
  const start = Date.now();
  let count = 0;
  for (let i = 0; i < files.length; i++) {
    readFile(files[i], { encoding: 'utf-8' })
      .then((data) => {
        result = marked.parse(data);
        if (++count === files.length) {
          console.log('asyncParse: ', Date.now() - start, result.length);
        }
      }, (error) => {
        console.log(error);
      });
  }
}

//syncParse();
asyncParse();
```

There's no practical difference on either my Linux server (~1.4s) or my MacBook (~0.7s); those documents tend to be big and most of the time is spent on parsing. Also, to prove my point about JS execution speed mattering more when you actually need to do a lot of JS: despite https://github.com/nodejs/performance/issues/151#issuecomment-1991973509 showing that Bun has faster file I/O, running the same script on my Linux server with Bun took ~35s for both the async and the synchronous version. I suppose something is wrong with JSC there, but at this point you probably don't even care about the few milliseconds of file I/O if you are going to spend >35x more time on JS execution...