jprichardson / node-klaw

A Node.js file system walker with a Readable stream interface. Extracted from fs-extra.
MIT License

Memory Usage #14

Closed DomVinyard closed 7 years ago

DomVinyard commented 7 years ago

Does this module scale? If I have a directory with millions (or tens of millions) of files, will it cope gracefully as it iterates, or does it have to read the entire directory into memory?

jprichardson commented 7 years ago

https://github.com/jprichardson/node-fs-extra/issues/329#issuecomment-269915644

DomVinyard commented 7 years ago

self.fs.readdir(pathItem, function (err, pathItems) {

This is the problem: the module loads all of the entries into memory and streams them out one by one, which means that millions of files will still balloon into a memory overload nightmare. readdir = unscalable.

The only module I've seen so far that addresses this is native-readdir, which uses the underlying POSIX readdir(3) call on *nix but unfortunately is not compatible with Windows.
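For readers hitting the same limit, here is a minimal sketch of the incremental approach described above, assuming a Node.js version that ships `fs.opendirSync` (added in 12.12, after this thread was written). The helper name `iterateEntries` is hypothetical and is not part of klaw or fs-extra; it only illustrates reading entries one at a time rather than receiving the whole listing as a single array, as `fs.readdir` does.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of klaw): yields entry names one at a time
// from an open directory handle instead of building one large array.
function* iterateEntries(dirPath) {
  const dir = fs.opendirSync(dirPath); // returns an fs.Dir handle
  try {
    let entry;
    // readSync() returns the next fs.Dirent, or null when the directory is exhausted.
    while ((entry = dir.readSync()) !== null) {
      yield entry.name;
    }
  } finally {
    // Runs on normal completion and when the consumer stops iterating early.
    dir.closeSync();
  }
}

// For contrast: readdirSync returns every name in a single array.
function loadAllEntries(dirPath) {
  return fs.readdirSync(dirPath);
}
```

A caller would consume it with `for (const name of iterateEntries('/some/huge/dir')) { ... }`, so only a small, bounded number of entries is held at any moment. Whether this helps in practice depends on the Node.js version and platform, so treat it as a starting point rather than a drop-in replacement for the walker.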

jprichardson commented 7 years ago

> This is the problem: the module loads all of the entries into memory and streams them out one by one, which means that millions of files will still balloon into a memory overload nightmare. readdir = unscalable.

Sorry, I assumed that when you said a directory has 1M entries, you meant that the directory and its subdirectories contain a total of 1M entries (the entire tree), not that the directory itself has 1M child entries. That is an edge case that very few will encounter, and one that we won't be optimizing for at this time.

DomVinyard commented 7 years ago

Agreed, it's a pretty niche case. Thanks for your input.