vitejs / vite

Next generation frontend tooling. It's fast!
http://vitejs.dev
MIT License
66.18k stars 5.92k forks source link

Support @parcel/watcher #13593

Open rtsao opened 1 year ago

rtsao commented 1 year ago

Description

Chokidar is slow and known to have issues with file descriptor limits when working with large projects. Parcel's watcher is much faster out of the box and additionally supports watchman as a backend, which improves support for Windows and minimizes overhead for projects already using watchman.

Nuxt recently added support for @parcel/watcher: https://github.com/nuxt/nuxt/pull/20179

Suggested solution

Add option to use @parcel/watcher

Alternative

No response

Additional context

Related issues:

Twitter discussion https://twitter.com/patak_dev/status/1649060364431028226

Validations

bluwy commented 1 year ago

We've discussed this before (Discord thread) and are open to trying it in Vite 5. The main blocker for me is that @parcel/watcher isn't publishing precompiled binaries like lightningcss and esbuild, so it brings a larger published size by default.

silverwind commented 1 year ago

We've discussed this before (Discord thread) and are open to trying it in Vite 5. The main blocker for me is that @parcel/watcher isn't publishing precompiled binaries like lightningcss and esbuild, so it brings a larger published size by default.

Vite should not depend on a C++ compiler being present as that is a major pain point for users.

bluwy commented 1 year ago

How so? Vite is already using esbuild. And has ecosystem support for SWC and lightningcss.

sapphi-red commented 1 year ago

@parcel/watcher isn't publishing precompiled binaries like lightningcss and esbuild

@silverwind I think you misunderstood this comment. I guess you understood that comment as "@parcel/watcher isn't publishing precompiled binaries" and "@parcel/watcher uses node-gyp / C++ compiler". But I believe @bluwy meant to say "@parcel/watcher isn't publishing precompiled binaries for each OS/arch in a separate package like lightningcss and esbuild does".

silverwind commented 1 year ago

Ah, so it is one big package of precompiled binaries with all os and archs in the package. Not ideal for install performance, but at least requires no install-time compilation, so it may be acceptable.

gajus commented 1 year ago

Just for my curiosity, any reason for not considering Turbowatch?

Mentioned in the original thread https://github.com/vitejs/vite/issues/12495#issuecomment-1602091976

It uses fs.watch under the hood by default and otherwise fallsback to other mechanisms, including Watchman.

silverwind commented 1 year ago

Just for my curiosity, any reason for not considering Turbowatch?

Mentioned in the original thread #12495 (comment)

It uses fs.watch under the hood by default and otherwise fallsback to other mechanisms, including Watchman.

From a maintenance standpoint, a pure JS module would be much better than C++ modules, which tend to require constant maintenance to not break because of ABI changes etc. It you check the issue tracker of parcel-bundler, there are a number of open issues about non-working platforms.

gajus commented 1 year ago

That + it is very easy to add custom backends. Here is an example chokidar backend.

The default backend basically checks if fs.watch can be used, and if it can, then it uses it, or otherwise fallsback to chokidar.

The other thing worth mentioning is that Turbowatch implements file hashing on top of file change triggers provided by the backend. In practice, we've found that this dramatically reduces the number of unnecessary re-renders as compared to mtime based systems.

devongovett commented 1 year ago

Releasing separate npm packages per platform has been on my list for @parcel/watcher for a while, just haven't had time yet. If someone wants to contribute that, it would be helpful, otherwise I will get to it at some point soon hopefully.

which tend to require constant maintenance to not break because of ABI changes etc

This particular change is very rare actually. Node-API is ABI stable, so should not break between versions of node. There are a few very obscure platforms that we haven't implemented backends for yet, or we need to provide pre-builds for so you don't need a compiler to install (once we have separate packages per platform this will be much easier), but in general it is very stable. It's exceedingly rare that an issue comes up that is caused by the watcher itself. Implementing low level stuff like this in compiled languages with deep integration into the OS provides a lot of benefits, especially in regards to performance (e.g. watching and debouncing occurs on a background thread).

@parcel/watcher is extremely widely used at this point (3M+ downloads per week on npm alone), including by VSCode, Nuxt, Nx, Parcel, etc. If there are things Vite needs in order to adopt it, I'm happy to help where I can.

devongovett commented 1 year ago

Update: I've released @parcel/watcher v2.2.0 which publishes individual packages per platform/architecture, and includes prebuilt binaries for a bunch more as well. See the release notes for a full list.

bluwy commented 1 year ago

I made two branches to test it out:

Made fs.watch mainly because we need a wrapper for @parcel/watcher to work, and that wrapper can be easily applied to fs.watch too. Both branches are not optimized and failing in tests now though (pnpm test-serve).

For @parcel/watcher, I'm getting FATAL ERROR: HandleScope::HandleScope Entering the V8 API without proper locking in place. For fs.watch, the tests are hanging somewhere. Both are likely because they are creating many watcher instances to fit Vite needs, where we need an API to watch out-of-project-root paths.

devongovett commented 1 year ago

The Entering the V8 API without proper locking in place error appears to be due to the tests running inside worker threads. If you set the vitest threads configuration to false it does not occur. This hasn't been something we've needed yet, but I can look into it. Sorry about that!

devongovett commented 1 year ago

I did some work to support multiple threads over in https://github.com/parcel-bundler/watcher/pull/146. There is an alpha version published if you want to try it out: v2.2.1-alpha.0. I ran the vite tests from your branch with this version and it seemed to work better. The remaining failures seem related an API change for the on method in your Watcher class that aren't accounted for in @vitejs/plugin-vue for example (causing "eventHandler is not a function" errors later). Let me know if this works better for you. I'll be doing a bit more testing on my side as well to ensure this is stable before releasing.

bluwy commented 1 year ago

Thanks! I forgot to give an update that the threads: false tip helped. I've fixed the plugin-vue issue and pushed it to the branch. I'm not getting the V8 errors now, but there are existing HMR test fails which I haven't figure out yet.

I'm also curious on your thoughts if the Watcher class and how it calls subscribe is performant. I do wish @parcel/watcher could provide that Watcher interface directly instead (similar to chokidar), but maybe that's out of scope of the project.

devongovett commented 1 year ago

Yeah it's probably not the most performant. On macOS it's probably fine, but on Linux there is a pretty low inotify watcher limit and watchers are not recursive by default so we have to traverse the directory hierarchy and watch each sub-directory manually. I noticed that you're calling subscribe on a lot of overlapping directories inside the project, rather than watching once from the root of the project. This will currently result in common directories being traversed and watched more than once.

If possible I'd recommend watching from the root of the project instead of watching each directory. You could also potentially use a similar interface to what you have but keep track of all the directories that were watched and find the minimal set of common root(s) between them to actually watch. We could potentially do something like that in @parcel/watcher itself eventually, but I might even implement that as a JS wrapper anyway - not sure if there is much benefit to doing it in C++.

bluwy commented 1 year ago

It currently also adds the project root here

https://github.com/vitejs/vite/blob/f385f2025c63f8804795ee359e53e4e76fcc27d1/packages/vite/src/node/server/index.ts#L356

So the subscribe shouldn't be called too much as there's an overlap guard, unless it's out of root. In Rollup/Vite, we have a this.addWatchFile API for plugins to watch and depend on certain files, so we need a Watcher.add function to dynamically handle those. A potentially not-so performant part with this is that we can only watch directories, so addWatchFile can only watch the file's directory and indirectly watch more than it's needed.

@ArnaudBarre also had an idea that we only watch what's needed. For example, at the start we don't watch anything, but as we start to serve the app and crawl the transformed files, we add each of them to be watched. With the current Watcher class, this might not be great for perf.

devongovett commented 10 months ago

Apologies for the delay! I have now released @parcel/watcher v2.3.0, which includes support for worker threads, a new kqueue backend to support freebsd, and a wasm backend to support stackblitz. https://github.com/parcel-bundler/watcher/releases/tag/v2.3.0

bluwy commented 8 months ago

I've took another stab to use @parcel/watcher (https://github.com/vitejs/vite/pull/14731). We need to keep compatibility with chokidar's API for now to not break the ecosystem, which I implemented in the PR, but it's becoming a larger undertaking than expected.

The wrapper to use @parcel/watcher is fairly big that it might be worth creating a separate library too, so for now we'll be moving this out of the Vite 5 milestone. We can definitely experiment this more in future minors (e.g. optional peer deps to enable it) and then create a migration path for the ecosystem to move to this new watcher API.

gajus commented 8 months ago

@bluwy If compatibility with chokidar is desirable, what was the reason for not considering Turbowatch?

We've been using it now for ~7 months without an issue.

bluwy commented 8 months ago

Personally the package size is too big (though Vite bundle dependencies and I have not measured what the size would be). But it also depends on chokidar, and in that case it would be simpler if we use chokidar directly? It looks certainly robust but we're not going to use all of its features, and we usually try to balance the cost of functionality per package size in Vite.

gajus commented 8 months ago

Interesting. Was not even aware of the package size.

If zx gets updated (or if it is made a peer dependency), the size drops to 10MB.

Removing of the other dependencies though would be challenging without compromising the developer experience.

Could maybe separate the core from the binary. Happy to explore if there is interest.

devongovett commented 8 months ago

Please let me know what would help. We can probably put at least some of what you need in @parcel/watcher itself. I've already put in quite a bit of work to support Vite, so it would be awesome to see it through! 🙂

bluwy commented 8 months ago

@devongovett Appreciate the work as always! I think what we need is an easy migration path to use it, since it's usually an implementation detail so we don't want it to be too breaking. A couple ideas:

  1. Create a chokidar fork with the same API but uses @parcel/watcher.
  2. Or create a new (and better) watcher interface, and Vite can easily create a chokidar compat layer.

The packages could be done by anyone though. We can then support as an optional peer dep and push it. However I'm a bit tied on bandwidth at the moment 😬


What would be nice in regards to @parcel/watcher is:

  1. https://github.com/parcel-bundler/watcher/issues/92
  2. A way to identify if create or delete is caused by a file or directory (chokidar has this, though I don't have a large usecase for it)

Additional notes:

  1. We initially thought that @parcel/watcher was similar to chokidar, but it seems to be closer to fs.watch so it's not the anticipated amount of work we expected for the milestone. Though I tried my best attempting it.
  2. An ideal case would be to have Rollup adopt @parcel/watcher too, and I think the requirements would be the same as us.
ArnaudBarre commented 8 months ago

From what I know of Vite HMR, I don't think 1. and 2. are needed. Plus I think that this change would be the occasion to remove the default watch of the whole working dir that then need some edge cases. This is something that I like to work on, but didn't get to it during v5 beta

bluwy commented 8 months ago

1 and 2 is needed because we expose server.watcher. I've also thought about the "watch lazily" plan but ultimately I dont think it's feasible because:

ghiscoding commented 8 months ago

Chokidar also supports glob pattern for file watch and ignored option as well but I don't see any glob support by Parcel Watcher (at least not from the docs on the readme).

To be fair though, it seems that the Chokidar author wanted to remove globs in this open Chokidar version 4 PR but I doubt it would be merge anytime soon since the PR is 1.5 year old at this point... which reminds me globs feature was removed from rimraf not long ago but quickly added back because too many users begged for the glob support :). So anyway, does Parcel Watcher support globs and if not then how would Vite go about it?

devongovett commented 8 months ago

Yes, glob ignores are supported. They're pretty efficient too - they get compiled to a regex and matched in the C++ layer (off main thread). https://github.com/parcel-bundler/watcher#options

benmccann commented 1 month ago

Yeah it's probably not the most performant. On macOS it's probably fine, but on Linux there is a pretty low inotify watcher limit and watchers are not recursive by default so we have to traverse the directory hierarchy and watch each sub-directory manually.

Node just made native recursive watching available on Linux. It was originally added in Node 20, but had some bugs, so you couldn't use it just yet. It seems to be working now as of Node 20.13/22.1+. Perhaps @parcel/watchers could adopt it now?

devongovett commented 1 month ago

We have already used the same OS APIs that node is using for years. Their implementation would be subject to the same OS limits.

benmccann commented 1 month ago

Ah, yeah. I guess you would need to implement fanotify to address this (https://github.com/parcel-bundler/watcher/issues/87)

benmccann commented 1 month ago

We need to keep compatibility with chokidar's API for now to not break the ecosystem

This feels like probably too large of a barrier to me. Vite exposes chokidar's options, which means you would need to re-implement every single option that chokidar has. And it's also a bit constraining too - what if @parcel/watcher has an option that chokidar doesn't? Then how do users access it? If we're releasing a new major then I think we can break the API and have folks migrate. Especially since most users won't be setting the server.watch option anyway.

If we told people they needed to migrate from chokidar to @parcel/watch then there are some things that don't seem possible to migrate today, which is the larger concern to me. Probably the largest one I noticed is the followSymlinks option that chokidar has : https://github.com/search?q=followSymlinks+path%3Avite.config&type=code

What would be nice in regards to @parcel/watcher is: Support a non-recursive mode parcel-bundler/watcher#92

Does chokidar support this? I didn't see that it did, but might have missed it

A way to identify if create or delete is caused by a file or directory (chokidar has this, though I don't have a large usecase for it)

I guess this refers to having add|addDir|unlink|unlinkDir events.

I found only one example of listening only to directory events: https://github.com/avisek/frontend-mentor-solutions/blob/8deb2852d58228a4dffe06d4653c8b8ceab904d0/vite.config.ts#L174

Listening only to file events happens more often, but I'm not sure if adding in directory events would harm anything in these use cases or not.

E.g. with vite-plugin-watch-and-run: https://github.com/patsissons/movies.place/blob/822051126dc30e077bb381528f62c7e7fe9fe429/vite.config.ts#L26

And vite-plugin-virtual-mpa: https://github.com/PhoenixIllusion/babylonjs-jolt-physics-plugin/blob/9e8980804bbea735cdadc141553b2eeb7f32cf6e/basic-examples/vite.config.ts#L29

ghiscoding commented 1 month ago

Does chokidar support this? I didn't see that it did, but might have missed it

I assume you made a typo and meant @parcel/watcher instead of chokidar

benmccann commented 1 month ago

No, I actually meant chokidar. Because if chokidar doesn't support it, then we're no worse off if we switch to another library that also doesn't support it

devongovett commented 1 month ago

Just so it's on the record: I am very open to adding new features to @parcel/watcher to support Vite. We just need to define what those features are, and someone needs to have time to implement them.

benmccann commented 1 month ago

it seems that the Chokidar author wanted to remove globs in this open https://github.com/paulmillr/chokidar/pull/1195 PR but I doubt it would be merge anytime soon since the PR is 1.5 year old at this point...

Someone else picked up that PR in https://github.com/paulmillr/chokidar/pull/1304, which is being actively worked on, so it does look like glob support will probably be removed from chokidar soon

benmccann commented 1 month ago

I wonder in which ways @parcel/watcher is better than chokidar? It sounds like it might primarily be better for Windows users:

Parcel's watcher is much faster out of the box and additionally supports watchman as a backend, which improves support for Windows and minimizes overhead for projects already using watchman.

But is it worth adding all these native dependencies for Mac and Linux users? Why not just use the Node.js APIs on those platforms instead of needing a native distributable?

devongovett commented 1 month ago

Why not just use the Node.js APIs on those platforms instead of needing a native distributable?

same reasons you'd use chokidar: https://github.com/paulmillr/chokidar?tab=readme-ov-file#why

I wonder in which ways @parcel/watcher is better than chokidar?

Generally more efficient: It is implemented natively, runs in a background thread, avoids crawling the file system on startup (see tweet linked above from when Nuxt switched), etc. See also: this issue on vscode which led to them switching away from chokidar. https://github.com/microsoft/vscode/issues/3998

bluwy commented 1 month ago

@benmccann

Re chokidar options, I mean it's exposed API instead, under server.watcher where plugins could call .add() to watch additional files etc. The options can be handled like how lightningcss is integrated today, so we don't have to match the options.

Re non-recursive watching, chokidar supports this when you watch a single file, and only its directory is watched. Any recursive directories within it is not watched. This is commonly used for this.addWatchFile.


Also, (personally) @parcel/watcher's scope now is more comparable to fs.watch than chokidar. We need a higher-level interface to manage watched files. (The relationship is like svelte and sveltekit)

And with chokidar v4 coming along (could propose changes), plus node constantly improving fs.watch, maybe it still make sense to stick with chokidar?

benmccann commented 1 month ago

@devongovett what I meant is why doesn't @parcel/watcher use the Node.js APIs instead of needing a native distributable? Is there some API that Node doesn't expose that @parcel/watcher requires or something like that?

I just looked through the two open issues linked to from the issue description and they don't really relate to what file watching library we use. One is about which files we should watch and the other is about the dev server hanging mostly related to circular dependencies or maybe too many open files (but not too many watched files).

I'm wondering if anyone is even having issues with chokidar now at all or what the driving motivation for a switch would be. It would be helpful if there were some very large project we could test on where you could clearly see the difference between libraries. While some other projects have chosen @parcel/watcher, they have different usage patterns and make use of different APIs and so I don't know if people are seeing the same types of issues while using Vite. I don't really see issues clearly attributable to chokidar in the issue tracker here.

devongovett commented 1 month ago

And with chokidar v4 coming along (could propose changes), plus node constantly improving fs.watch, maybe it still make sense to stick with chokidar?

Is there info on this somewhere? The v4 branch on the repo is over 2 years old.

why doesn't @parcel/watcher use the Node.js APIs instead of needing a native distributable?

It uses low level C APIs provided by the OS that Node doesn't use or expose. That's why it's more efficient.

What's your motivation for commenting on this issue? You don't like native modules? FWIW, chokidar uses a native dependency too. Seems like you're interested in this space but if you're not having issues with chokidar then why get involved here?

I'm wondering if anyone is even having issues with chokidar now at all or what the driving motivation for a switch would be.

There are clear issues that led to this being opened in the first place, and led others like Nuxt to switch. I don't use Vite so I'm sure others can speak to this better than me. I'm mainly here to help Vite adopt if they choose to. I have already put in a significant amount of work to help with this, including implementing several new features, as you can see in the thread. So far it has been deferred, but I remain available to help if needed.

benmccann commented 1 month ago

Is there info on this somewhere? The v4 branch on the repo is over 2 years old.

There was active discussion today about releasing chokidar v4 in the next month: https://github.com/paulmillr/chokidar/pull/1304

It uses low level C APIs provided by the OS that Node doesn't use or expose. That's why it's more efficient.

Any details you can share? I'm curious which situations we'd see the performance differences

What's your motivation for commenting on this issue? Seems like you're interested in this space but if you're not having issues with chokidar then why get involved here?

I'm a maintainer and want to ensure we're doing our due diligence in choosing whether to introduce breaking changes, native dependencies, etc.

You don't like native modules? FWIW, chokidar uses a native dependency too.

I think that they can at times introduce an extra layer of complications. I know I've hit issues with sharp's on numerous occasions, but esbuild's have never caused much of an issue. I'm not really sure why they've caused me headaches in one situation and not the other, but it does seem like it would be nice to avoid if possible. We do use native dependencies already, so I'm not saying we can't use native dependencies, but I was interested to understand why they help since it seems like it's pretty much the same underlying API used (e.g. inotify) in either case

FWIW, chokidar uses a native dependency too.

I guess you're referring to fsevents? It's an optional dependency only used on MacOS and has been removed entirely in chokidar v4 now that updates in the latest versions of Node.js make it obsolete

I have already put in a significant amount of work to help with this, including implementing several new features

I definitely appreciate the willingness to help!! And to be clear I'm just one of many maintainers and not a core maintainer, so wouldn't want you implementing anything based off anything I've raised here. I think the Vite team as a whole should discuss whether we'd want to migrate and what the blockers would be so that we don't cause you any unnecessary work.

devongovett commented 1 month ago

Ok, here are the main things I am aware of in regards to performance:

  1. Recursive file watching is supported on all platforms natively. If crawling needs to be done (e.g. on Linux where inotify is not recursive at an OS level), it is done on a background thread. This used to not be supported by Node on Linux at all, but they recently added support. However, like chokidar before, it is implemented in JavaScript on the main thread, and does not support ignore paths, so it is unlikely to be much more performant than chokidar. This likely has a huge impact on startup times on these platforms.
  2. Events are batched in a background thread and delivered to the JS thread all at once rather than per file/event. This is especially useful for large changes like switching git branches or running npm install. In these scenarios, there are often hundreds of thousands of events emitted by the operating system, some of which are irrelevant or temporary. For example, there will be a separate event for the creation of each file, then writing to it, updating the metadata (e.g. mtime, size), etc. Sometimes temporary files are created and then renamed or deleted as part of these processes. @parcel/watcher will batch all of these events together and only report the final state to the JS thread, and in addition batch together events from multiple files into one callback. This means the JS thread only needs to worry about the changes that actually matter rather than intermediary states, and doesn't get overwhelmed with unnecessary events that invalidate the build many times. fs.watch always calls the event callback immediately for each individual event of each individual file.
  3. Ignore lists are processed in the native layer. These can be directories to ignore like .git, or globs like **/dist/**. This is especially efficient on some operating systems that provide a native way to ignore paths, in which case events for irrelevant files are never even delivered to the process. In other cases, @parcel/watcher can efficiently process the ignores in its background thread, and the JS thread doesn't need to worry about these. fs.watch doesn't have an option for ignore paths or globs at all.
  4. On OSes where @parcel/watcher needs to crawl the file system to create watchers (e.g. Linux/BSD), it can avoid even creating watchers for ignored directories entirely. This helps with common problems like running out of file descriptors, as well as startup performance. fs.watch and anything built on it like chokidar, has no way of natively ignoring files like this. It is forced to handle all events and throw them out after, resulting in worse startup performance and file descriptor errors.
  5. Multiple watchers for the same directory are deduplicated and only result in a single native watcher, reducing startup cost and consumed file descriptors.
  6. Watchman is used if installed. This improves performance especially on linux because watchman runs as a daemon in the background, so it doesn't need to re-crawl the file system every time your process restarts.

In terms of reliability, @parcel/watcher fixes many of the caveats in the node docs for fs.watch, similar to chokidar. It ensures that filenames are always reported, supports recursive watchers on all platforms, attempts to determine the real type of change that occurred when files are renamed, etc.

I am aware that there have been some improvements to fs.watch lately and that's great, but it is quite a bit lower level (basically just delivers OS events directly) so a library on top of it like chokidar or a replacement like @parcel/watcher is probably still needed to get good reliability and consistency across platforms. @parcel/watcher has the additional benefit of using OS APIs directly rather than being built on top of fs.watch, so it can achieve better performance for the reasons I outlined above as well.

benmccann commented 1 month ago

Thank you @devongovett!! This is really great information! (You should add it to the @parcel/watcher readme or docs as it's really great advertising :smile:)

paulmillr commented 1 month ago

has the additional benefit of using OS APIs

with a disadvantage of being a native module which doesn't run everywhere

devongovett commented 1 month ago

We've worked pretty hard to support as many platforms as possible. Is there one missing that you need?

paulmillr commented 1 month ago

not really

it’s that I maintain fsevents and through the years and nodejs updates some bug (incl compilation bug) somewhere will arise. so it needs to be constantly updated and supported.

shellscape commented 1 week ago

Just for my curiosity, any reason for not considering Turbowatch?

The response to the issue for the Windows user would be my # 1. The complete lack of maintenance for the last year would be my # 2.