paulmillr / chokidar

Minimal and efficient cross-platform file watching library
https://paulmillr.com
MIT License

Network drives on win32: 24x slower #665

Closed greggman closed 3 months ago

greggman commented 6 years ago

I'm trying to use chokidar on Windows Shares and finding it's extremely slow on the initial scan. For a relatively small tree of files (183 entries of directories and files)

 tree /F path/to/share/folder

takes 3.5 seconds

dir /S /A /B path/to/share/folder

which gets most of the same stat data, takes about the same amount of time, whereas

chokidar path/to/share/folder -p --verbose --initial -c "echo"

takes 80 seconds until chokidar emits 'ready', or about 24x as long.

The same shares on Mac (same NAS, also SMB), with the same options, are fast (3.5 seconds).

I'm wondering if there is some way to speed up the initial scan?

Just doing my own recursive fs.readdir with an fs.stat on every file takes only 9.5 seconds, about 8x faster than chokidar (though still about 3x slower than dir/tree).

To see the ready event here is my chokidar test

const fs = require('fs');
const path = require('path');
const chokidar = require('chokidar');

const watcher = chokidar.watch(process.argv[2], {
  usePolling: true,
  alwaysStat: true,
});
watcher.on('add', (fn, stats) => { console.log(fn, "file", stats.size); });
watcher.on('addDir', (fn, status) => { console.log(fn, "dir"); });
watcher.on('ready', () => {
  console.log("---done---");
  watcher.close();
});

and here's my readdir test

const fs = require('fs');
const path = require('path');

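function readDir(dirname) {
  fs.readdir(dirname, (err, fileNames) => {
    // Report unreadable directories instead of crashing on undefined fileNames
    if (err) {
      console.error(dirname, err.message);
      return;
    }
    fileNames.forEach((filename) => {
      const fullpath = path.join(dirname, filename);
      fs.stat(fullpath, (err, stats) => {
        // Entries can vanish or be inaccessible between readdir and stat
        if (err) {
          console.error(fullpath, err.message);
          return;
        }
        console.log(fullpath, stats.isDirectory() ? "dir" : "file", stats.size);
        if (stats.isDirectory()) {
          readDir(fullpath);
        }
      });
    });
  });
}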

readDir(process.argv[2]);

And here's what the test tree looks like. Note: this is not a typical folder I'm watching; it was just something on my NAS that was a reasonable size to test. A typical folder is MUCH deeper, with 1000s of files, and chokidar takes many, many minutes to finish its initial scan.

Z:\SRC
├───bin
│   │   .gitignore
│   │   backup-stuff.py
│   │   chup.sh
│   │   cyg-wrapper.sh
│   │   e
│   │   e.bat
│   │   encode-html-entities.py
│   │   publish-gh-pages
│   │   s
│   │   setup-ssh-agent
│   │   start-chrome-chromium.bat
│   │   start-chrome-google.bat
│   │   start-chrome-home.bat
│   │   start-chrome.bat
│   │   update-web
│   │   v
│   │
│   ├───.git
│   │   │   config
│   │   │   description
│   │   │   HEAD
│   │   │   index
│   │   │   packed-refs
│   │   │
│   │   ├───branches
│   │   ├───hooks
│   │   │       applypatch-msg.sample
│   │   │       commit-msg.sample
│   │   │       post-update.sample
│   │   │       pre-applypatch.sample
│   │   │       pre-commit.sample
│   │   │       pre-push.sample
│   │   │       pre-rebase.sample
│   │   │       pre-receive.sample
│   │   │       prepare-commit-msg.sample
│   │   │       update.sample
│   │   │
│   │   ├───info
│   │   │       exclude
│   │   │
│   │   ├───logs
│   │   │   │   HEAD
│   │   │   │
│   │   │   └───refs
│   │   │       ├───heads
│   │   │       │       master
│   │   │       │
│   │   │       └───remotes
│   │   │           └───origin
│   │   │                   HEAD
│   │   │
│   │   ├───objects
│   │   │   ├───03
│   │   │   │       363d47c3fea3d8c0ec2355f07d901d0c440213
│   │   │   │
│   │   │   ├───07
│   │   │   │       1bd8a26f7e3617f2de32c357895bb5ccfaa085
│   │   │   │
│   │   │   ├───0c
│   │   │   │       5eeed0d70908f44475750d2549168928996b2f
│   │   │   │
│   │   │   ├───11
│   │   │   │       558cfb280cac51bc9cb1788f2735a57468a5c9
│   │   │   │
│   │   │   ├───1a
│   │   │   │       32d2f604ab8fc183b004e8f6fcf18c75f05c47
│   │   │   │
│   │   │   ├───21
│   │   │   │       8bb4889192ef127c7ff066b37dd3af493530ed
│   │   │   │
│   │   │   ├───27
│   │   │   │       84fa693f18a9ad2be3abd520d5ebc740209b66
│   │   │   │
│   │   │   ├───2d
│   │   │   │       b323901c6ee0886260b0bcfb0481fecd60f5f1
│   │   │   │
│   │   │   ├───30
│   │   │   │       b9317a0548d6b3d289d5fe9cc7fd62bdf55b06
│   │   │   │
│   │   │   ├───32
│   │   │   │       def067fb17c7c3e0b865d234dc4552adf543bb
│   │   │   │
│   │   │   ├───3a
│   │   │   │       90791ed76a404d3e8c989435a05aa7eb410c51
│   │   │   │       ec19f6a056dda4caa8ba5e19fca54e464e2b9a
│   │   │   │
│   │   │   ├───3d
│   │   │   │       432b7f63c08364e0535e7cbd26c65c63c648b5
│   │   │   │       50cf1dcd27ae0d85354ba55b020aa4f816ddb6
│   │   │   │
│   │   │   ├───44
│   │   │   │       9103a30b9d7bc5870448d0f6ed324d68884579
│   │   │   │
│   │   │   ├───45
│   │   │   │       41207af798ffa1f235fca585ef77e70daf6a8e
│   │   │   │       8426bc2e14cb3abed7b96f2f1a931779e1195b
│   │   │   │
│   │   │   ├───4d
│   │   │   │       515444344c936b46635887590da438c79c81ec
│   │   │   │
│   │   │   ├───54
│   │   │   │       59cb69a7e536c9437cb4ea4984a2302658a440
│   │   │   │       a8c4aa03c0503537e8b0dc8c65073858abeee9
│   │   │   │
│   │   │   ├───57
│   │   │   │       af4fc2ebda5c5cb77c40b6f4900b2b05bf53a0
│   │   │   │
│   │   │   ├───60
│   │   │   │       9b3e429c28a58ce7039a2bad2ef4a491155314
│   │   │   │
│   │   │   ├───61
│   │   │   │       b5f59b18d27bc950f2e8fe334c573b8274d22f
│   │   │   │
│   │   │   ├───64
│   │   │   │       434719a28165c0b7036829cb5d90dcce8ff36a
│   │   │   │
│   │   │   ├───68
│   │   │   │       52e32d26be4a47de7dc3a87be401ac2815a9cf
│   │   │   │       cc999ab379c511e54f9fb13993730ea0b54296
│   │   │   │
│   │   │   ├───78
│   │   │   │       8ded0c04ff8dbcb7ccdf2aa089d399fcc4f358
│   │   │   │
│   │   │   ├───84
│   │   │   │       d03810e83fa96d8f4d84fc9cca3e2a19ceb871
│   │   │   │
│   │   │   ├───86
│   │   │   │       05bb9f069c960e7cb9952d297045fa7e44aa74
│   │   │   │
│   │   │   ├───8f
│   │   │   │       93558890dd01847ecf569ef5f7ae2761c9e016
│   │   │   │
│   │   │   ├───93
│   │   │   │       f612e1c8fe4adb31f1dc8cb08dc0eb3ae45695
│   │   │   │       f6f78361b97df5c3a174a74d0897d0dcde00b9
│   │   │   │
│   │   │   ├───96
│   │   │   │       ff8ef91d4cf4be4000970bcbaf165e56c119d7
│   │   │   │
│   │   │   ├───97
│   │   │   │       08f19593cd22c53fd8b3b71aedead25ff64793
│   │   │   │
│   │   │   ├───99
│   │   │   │       6cc05654376d7b0425936528e317d784724806
│   │   │   │
│   │   │   ├───9a
│   │   │   │       d913202a47eecb0b909db8362d586f584c956b
│   │   │   │
│   │   │   ├───a0
│   │   │   │       2331f8f11e2794f6f36b25b9fd6f3b36428612
│   │   │   │
│   │   │   ├───a4
│   │   │   │       dc667606fc3197e3bf457697d2453929ef0f1d
│   │   │   │
│   │   │   ├───a6
│   │   │   │       52c68ca1f01698f38ddd55b73f8663f1354913
│   │   │   │
│   │   │   ├───a9
│   │   │   │       578757f41ef789f438da94b9426458853a0de0
│   │   │   │
│   │   │   ├───ad
│   │   │   │       23a6aebabdb8123c87fccfe016003e31c8f967
│   │   │   │       a81e5ba6524043aa18982891c226aac13b92ce
│   │   │   │
│   │   │   ├───af
│   │   │   │       35d02e690981c67371882740a45af58e4b335b
│   │   │   │
│   │   │   ├───b0
│   │   │   │       759cd397e761acfa8d3034868ed719c1527b4b
│   │   │   │
│   │   │   ├───b5
│   │   │   │       8e5a59af4da431bdcd84d26a91f8740dbecc62
│   │   │   │
│   │   │   ├───c1
│   │   │   │       d64e57f97779858b0afdef747a3e50952c2128
│   │   │   │
│   │   │   ├───c8
│   │   │   │       2b4cd298c2b7fb832c3b9d7a71f242a41b9e45
│   │   │   │       2d45953e6818fea94c08f9e526242d4e451412
│   │   │   │
│   │   │   ├───cb
│   │   │   │       9176d8d1cbe3c8bfd9b3d53775398f44274ce1
│   │   │   │
│   │   │   ├───cf
│   │   │   │       3d36971864f54cddcc77cc54c72864d06268a9
│   │   │   │
│   │   │   ├───d3
│   │   │   │       3fb4e61bb4c04b78c47010638b56c4e658b3ce
│   │   │   │       5aa78a875e83ff96f6527a7d6b77ff8a1bfcb6
│   │   │   │
│   │   │   ├───e1
│   │   │   │       72ea793de97424e15325195c53b344a6d855cd
│   │   │   │       b21ea7a63fb33e0fedb2f73bacb5dcd91bd57f
│   │   │   │
│   │   │   ├───e2
│   │   │   │       24d8f3b2c11e15fbd7a9869184fa25c105ca8e
│   │   │   │
│   │   │   ├───e4
│   │   │   │       7f63b4137ab498c42b36b5cb93a6dd8a10accb
│   │   │   │
│   │   │   ├───e6
│   │   │   │       cf29cffb27147e6786dc2971be38bd6e95eb14
│   │   │   │
│   │   │   ├───ee
│   │   │   │       95d7c4a3adcd9a73800e76209b52ea966615a3
│   │   │   │
│   │   │   ├───ef
│   │   │   │       c3ae2b92cf6fec820ca7a3ad9a64b2e02412a3
│   │   │   │
│   │   │   ├───f7
│   │   │   │       09b0af23b0a9758b5d5458e2704a9b003dd49c
│   │   │   │
│   │   │   ├───f9
│   │   │   │       a097df37316f512c534e05ee5c07672ff1e7a0
│   │   │   │
│   │   │   ├───fe
│   │   │   │       ebf96df6259edc101651daa647b1f989ea7435
│   │   │   │
│   │   │   ├───info
│   │   │   └───pack
│   │   │           pack-4b251282c949a8ad1225390be8aa4722ec7187c3.idx
│   │   │           pack-4b251282c949a8ad1225390be8aa4722ec7187c3.pack
│   │   │
│   │   └───refs
│   │       ├───heads
│   │       │       master
│   │       │
│   │       ├───remotes
│   │       │   └───origin
│   │       │           HEAD
│   │       │
│   │       └───tags
│   ├───dotfiles
│   │   │   bash_stuff
│   │   │   git-completion.bash
│   │   │
│   │   └───.subversion
│   │           config
│   │
│   └───platform
│       └───osx
│               set-chrome-window-size.oascript
│
└───slickedit
        SlickEditOptions20.zip

Any ideas on how to speed up chokidar? Should its initial scan do something different? Am I better off doing my own initial scan?

greggman commented 6 years ago

I tested with one of the folders trees I really intended to use. It has 2370 entries.

That means the 24x slowdown is apparently consistent. It doesn't appear to be a function of the number of folders, depth, or entries, for example.

While I can pre-scan with readdir myself, I'm a little concerned that chokidar won't notice changes until 22 minutes into my app. Any ideas what might be causing the issue?

es128 commented 6 years ago

Chokidar isn't just scanning the files, it's setting file watchers as it goes. Stat polling is very CPU-intensive. Trying to watch a huge directory across a network share is the source of your performance concern. If you just need the scan and not the watching then don't use chokidar.
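
To make the cost of stat polling concrete, here is a minimal sketch of a polling cycle written directly against fs.statSync. It is not chokidar's implementation (Node's own stat-polling primitive is fs.watchFile); the helper names snapshot and pollOnce are invented for illustration. The point is only that every cycle costs one stat per watched path, and on a network share each stat is a round trip.

```javascript
const fs = require('fs');

// Record the current modification time of every watched path (one stat each).
function snapshot(paths) {
  const mtimes = new Map();
  for (const p of paths) {
    mtimes.set(p, fs.statSync(p).mtimeMs);
  }
  return mtimes;
}

// One polling cycle: stat every path again and report those whose mtime moved.
// The cost is one stat call per watched path on every cycle, whether or not
// anything changed.
function pollOnce(previous) {
  const changed = [];
  for (const [p, oldMtime] of previous) {
    const newMtime = fs.statSync(p).mtimeMs;
    if (newMtime !== oldMtime) {
      changed.push(p);
      previous.set(p, newMtime); // remember the new state for the next cycle
    }
  }
  return changed;
}
```

A real watcher would call pollOnce on a timer, so the per-cycle cost grows linearly with the number of watched files.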

It is not likely that you'll find a significant performance boost for this situation just by optimizing code, but please send a PR if you do. You could do things like run your program on the system where the files are in order to get off of polling.

greggman commented 6 years ago

It seems kind of sad to me that you closed this ticket. Isn't this an issue people should be aware of, so they can look for solutions?

Checking on Mac against the same share with usePooling: true, useFsEvents: false, alwaysStat: true takes 2 seconds. Since I've also demonstrated that the same situation on Windows can run fast, and since chokidar is not running fast on Windows, there is clearly room for improvement, and this issue seems like it should stay open.

greggman commented 6 years ago

You claimed the issue is that chokidar is CPU-intensive. Watching the CPU during my test above, I see no evidence of this. It never goes above 3%.

es128 commented 6 years ago

There's something missing from the info you're providing. If you actually ran with the typo you showed here (usePooling: true) then you weren't really observing polling, although then across a network share you wouldn't actually be watching anything.
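
A misspelled option key such as usePooling is easy to miss, because nothing complains about it. One way to guard against that is to check the keys before passing them on. The sketch below is a hypothetical helper, not part of chokidar; the list of names is taken from the options documented in the chokidar 2.x/3.x README and may not match other versions.

```javascript
// Option names documented in the chokidar 2.x/3.x README.
// Assumption: illustrative only; check the README of the version you use.
const KNOWN_OPTIONS = new Set([
  'persistent', 'ignored', 'ignoreInitial', 'followSymlinks', 'cwd',
  'disableGlobbing', 'usePolling', 'interval', 'binaryInterval',
  'useFsEvents', 'alwaysStat', 'depth', 'awaitWriteFinish',
  'ignorePermissionErrors', 'atomic',
]);

// Return the option keys that are not in the known list (likely typos).
function findUnknownOptions(options) {
  return Object.keys(options).filter((name) => !KNOWN_OPTIONS.has(name));
}

console.log(findUnknownOptions({ usePooling: true, alwaysStat: true }));
```

Calling it on the options object before chokidar.watch and logging or throwing on a non-empty result would rule out this class of mistake when comparing runs.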

greggman commented 6 years ago

The code in question is exactly as pasted above (the typo was only in my last message, not in any code I've been running). On Mac it's exactly as pasted above except with useFsEvents: false added. Here's the code again:

const fs = require('fs');
const path = require('path');
const chokidar = require('chokidar');

const watcher = chokidar.watch(process.argv[2], {
  useFsEvents: false,
  usePolling: true,
  alwaysStat: true,
});
watcher.on('add', (fn, stats) => { console.log(fn, "file", stats.size); });
watcher.on('addDir', (fn, status) => { console.log(fn, "dir"); });
watcher.on('ready', () => {
  console.log("---done---");
  watcher.close();
});

So to be clear,

chokidar on mac to smb network share - fast
chokidar on windows to same smb network share - extremely slow
readdir on windows to same smb network share - fast

cpu usage low

There are also other tools for Windows that scan and keep track of network changes fast. Since you've already got a custom native plugin for Mac for FsEvents, it seems like something similar for Windows is probably the way forward. Either that or visiting libuv to figure out why node is so slow here.

es128 commented 6 years ago

Ok, I'll go ahead and reopen the issue. Take a look at #410 and #412, although the end result seemed to be a solution to managing system resources during the scan, not an overall time reduction to the ready event.

Can you set up a test for walk-filtered vs readdirp in your environment to see how they fare against each other?

greggman commented 6 years ago

I tried walk-filtered and it took 3-4 seconds in the same setup.

I have a feeling I'm just going to run into all the problems previous adventurers have run into 😅. I compiled libuv last night and set up a test. I see that I get notifications for changes on the share, but the filename info is missing. I'm trying to find other native examples on Windows to see if I can find a way to make it work, but if no one has fixed it previously there's probably a reason 😓. Crossing my fingers 🤞

es128 commented 6 years ago

So if walk-filtered is a huge improvement for your setup, have you tried swapping in the #412 version of chokidar in your original setup? This could help us determine whether the problem you've experienced is in readdirp or somewhere in chokidar.

greggman commented 6 years ago

Sorry just starting to get back into this. What I did notice today is chokidar has issues on Windows even not on shares. I tried to run this small sample code (non-polling) just on one of my local node project folders as in "C:\Users\me\src\myproject" and it failed with stat errors in some of the node_modules subfolders. I haven't dug into why.

const fs = require('fs');
const path = require('path');
const chokidar = require('chokidar');

const watcher = chokidar.watch(process.argv[2], {
  alwaysStat: true,
});
watcher.on('add', (fn, stats) => { console.log(fn, "file", stats.size); });
watcher.on('addDir', (fn, status) => { console.log(fn, "dir"); });
watcher.on('ready', () => {
  console.log("---done---");
  watcher.close();
});

Since chokidar's readme mentions that VSCode uses it, and since VSCode doesn't crash on that folder, I thought I'd go see how they are using chokidar. Turns out they only use chokidar on Linux and Mac, but they use their own solution on Windows.

https://github.com/Microsoft/vscode/tree/2f76c44632b0d47ba97f66fbc158c763628e30b3/src/vs/workbench/services/files/node/watcher

For Windows they wrote a C# program that they spawn; it watches a tree and outputs the changes, which they then read.
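
The general pattern described here (spawn a helper process, read change events from its stdout) can be sketched in Node as follows. This is not VSCode's code. The line format "<type>|<path>" with 0 = changed, 1 = created, 2 = deleted is an assumption made for the example, and the command and arguments are placeholders for whatever helper is actually used.

```javascript
const { spawn } = require('child_process');
const readline = require('readline');

// Assumed (hypothetical) output format: one event per line, "<type>|<path>",
// where type is 0 = changed, 1 = created, 2 = deleted.
const EVENT_NAMES = ['change', 'add', 'unlink'];

// Turn one output line into { event, path }, or null if it does not parse.
function parseEventLine(line) {
  const separator = line.indexOf('|');
  if (separator === -1) return null;
  const type = line.slice(0, separator);
  const filePath = line.slice(separator + 1);
  if (!/^[0-2]$/.test(type) || filePath === '') return null;
  return { event: EVENT_NAMES[Number(type)], path: filePath };
}

// Spawn an external watcher and forward each parsed event to `onEvent`.
// `command` and `args` are placeholders; stderr is passed through unchanged.
function watchWithHelper(command, args, onEvent) {
  const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'inherit'] });
  const lines = readline.createInterface({ input: child.stdout });
  lines.on('line', (line) => {
    const parsed = parseEventLine(line.trim());
    if (parsed) onEvent(parsed);
  });
  return child; // caller can call child.kill() to stop watching
}
```

Keeping the parsing in its own function makes it possible to test the event mapping without running the helper at all.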

kumarharsh commented 6 years ago

I don't think VSCode uses chokidar. It uses vscode-nsfw

greggman commented 6 years ago

VSCode still uses chokidar. You can see it in the package.json. Here's the top-level code for their watcher

This code

https://github.com/Microsoft/vscode/blob/master/src/vs/workbench/services/files/node/fileService.ts#L198

uses vscode-nsfw based on a flag; otherwise it uses chokidar for Linux/Mac and their own custom solution for Windows.

Maybe they'll remove chokidar and their custom solution in the future

paulmillr commented 5 years ago

new readdirp is out now, so this should be improved a lot with chokidar v3

jsantos98 commented 5 years ago

Using chokidar 3.0.0 (and readdirp 3.0.1) the problem still happens.

My test:

console.log(`####readdirp Start: ${new Date()}`);
let fileCount = 0;
readdirp('C:\\Users\\jpsantos\\AppData\\Local\\Temp\\AutomationDriverInstance_1904270036450000001', {fileFilter: '*.*', alwaysStat: true})
.on('data', (entry: any) => { fileCount++; })
.on('error', (error: any) => console.error('fatal error', error))
.on('end', () => console.log(`####readdirp End: ${new Date()}, files=${fileCount}`));

and

console.log(`####chokidar Start: ${new Date()}`);
let fileCount = 0;
const watcher = chokidar.watch(".", {
    cwd: 'C:\\Users\\jpsantos\\AppData\\Local\\Temp\\AutomationDriverInstance_1904270036450000001',
    useFsEvents: false,
    usePolling: true,
    alwaysStat: true,
});
watcher.on('add', (fn: any, stats: any) => { fileCount++; });
watcher.on('ready', () => {
    console.log(`####chokidar End: ${new Date()}, files=${fileCount}`);
    watcher.close();
});

got the following results:

####readdirp Start: Thu May 02 2019 10:51:15 GMT+0100 (GMT+01:00)
####readdirp End: Thu May 02 2019 10:51:16 GMT+0100 (GMT+01:00), files=1056

####chokidar Start: Thu May 02 2019 10:52:46 GMT+0100 (GMT+01:00)
####chokidar End: Thu May 02 2019 10:53:14 GMT+0100 (GMT+01:00), files=1056
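
For reference, the elapsed times implied by these timestamps can be worked out as below: roughly 1 second for readdirp versus 28 seconds for chokidar. Because new Date() is printed with one-second resolution, the readdirp figure in particular is only approximate, so the ratio should be treated as a rough estimate. Recording Date.now() at start and end would give millisecond resolution.

```javascript
// Timestamps copied from the log above, written as ISO 8601 with the +01:00 offset.
const runs = {
  readdirp: ['2019-05-02T10:51:15+01:00', '2019-05-02T10:51:16+01:00'],
  chokidar: ['2019-05-02T10:52:46+01:00', '2019-05-02T10:53:14+01:00'],
};

// Difference between an end and a start timestamp, in seconds.
function elapsedSeconds([start, end]) {
  return (new Date(end) - new Date(start)) / 1000;
}

const readdirpSeconds = elapsedSeconds(runs.readdirp);
const chokidarSeconds = elapsedSeconds(runs.chokidar);
console.log({ readdirpSeconds, chokidarSeconds });
```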

Note: The directory used is a symlink to a network drive on my own computer:

01/05/2019  01:17    <SYMLINKD>     AutomationDriverInstance_1904270036450000001 [\\NB-PF13U9SQ\Network]

Any other thoughts?

paulmillr commented 5 years ago

I think this only happens because you're using a network drive, even though it's locally-bounded.

greggman commented 5 years ago

That is the entire point of the issue: chokidar is 24x slower on a network drive than other techniques. Not sure why you closed it. It seems like 24x slower is something that should be fixed rather than ignored, especially when there is ample evidence that there are faster methods.

I also pointed out that chokidar is just plain buggy on Windows, and that VSCode saw this too and stopped using chokidar on Windows.

ahmedbrandver commented 5 years ago

I am getting the same issue with a NAS folder mounted on Ubuntu. Chokidar doesn't pick up any changes in it until I access that folder through bash.

whyboris commented 4 years ago

@greggman -- what does VS code use now? I see chokidar in its dependencies: https://github.com/microsoft/vscode/blob/master/package.json#L50 🤷

greggman commented 4 years ago

The watchers are here

https://github.com/microsoft/vscode/tree/master/src/vs/platform/files/node/watcher

On Windows it uses a C# app which it runs in a separate process and looks at its output

https://github.com/microsoft/vscode/tree/master/src/vs/platform/files/node/watcher/win32

michaelmorton-vh commented 4 years ago

@greggman have you found any workarounds for this issue?

greggman commented 4 years ago

I ended up making my own solution using chokidar on Mac/Linux and the C# program from VSCode on Windows. Unfortunately my solution is not well tested and IMO poorly designed ATM, but I haven't had time to deal with it.

I would prefer to find a way to add it back into chokidar instead, but IIRC the semantics of what comes out of Windows are different, though I honestly don't recall, as it's been a couple of years.

paulmillr commented 3 months ago

The backlog is going to start from zero in preparation for the v4 release. v4 will bring a massive rewrite and drop most dependencies. All issues are being closed as part of that preparation.

In the future, only issues with enough community support would be considered.

See issue 1195 for more info. Thank you.