web-infra-dev / rspack

The fast Rust-based web bundler with webpack-compatible API 🦀️
https://rspack.dev
MIT License
9.2k stars 537 forks source link

[Bug]: watcher blocks CPU after each rebuild #7490

Open vkurchatkin-miro opened 1 month ago

vkurchatkin-miro commented 1 month ago

System Info

ystem: OS: macOS 14.5 CPU: (10) arm64 Apple M1 Pro Memory: 167.59 MB / 32.00 GB Shell: 5.9 - /bin/zsh Binaries: Node: 20.11.0 - ~/.volta/tools/image/node/20.11.0/bin/node Yarn: 4.1.0 - ~/.volta/tools/image/yarn/1.22.10/bin/yarn npm: 10.2.4 - ~/.volta/tools/image/node/20.11.0/bin/npm Browsers: Chrome: 127.0.6533.99 Safari: 17.5 npmPackages: @rspack/cli: 1.0.0-beta.1 => 1.0.0-beta.1 @rspack/core: 1.0.0-beta.1 => 1.0.0-beta.1 @rspack/plugin-react-refresh: 1.0.0-beta.1 => 1.0.0-beta.1

Details

In our setup after each interactive rebuild the watcher is restarted and blocks the CPU for ~2 seconds. Subsequently HMR or live reloading is delayed for this amount of time.

It looks like watching is restarted from scratch after every rebuild here: https://github.com/web-infra-dev/rspack/blob/2bd905d47e731772365637188c929f5371e60267/packages/rspack/src/node/NodeWatchFileSystem.ts#L66

while it is not necessary and the watcher instance can be reused (watchpack supports watch to be called again so that it can only create watchers that don't exists).

Reproduce link

No response

Reproduce Steps

N/A

vkurchatkin-miro commented 1 month ago

This fixed the issue for me:

index c78273a3f..906abb084 100644
--- a/packages/rspack/src/node/NodeWatchFileSystem.ts
+++ b/packages/rspack/src/node/NodeWatchFileSystem.ts
@@ -62,8 +62,9 @@ export default class NodeWatchFileSystem implements WatchFileSystem {
                if (typeof callbackUndelayed !== "function" && callbackUndelayed) {
                        throw new Error("Invalid arguments: 'callbackUndelayed'");
                }
-               const oldWatcher = this.watcher;
-               this.watcher = new Watchpack(options);
+               if (!this.watcher) {
+                       this.watcher =  new Watchpack(options);
+               }

                if (callbackUndelayed) {
                        this.watcher.once("change", callbackUndelayed);
@@ -107,9 +108,6 @@ export default class NodeWatchFileSystem implements WatchFileSystem {

                this.watcher.watch({ files, directories, missing, startTime });

-               if (oldWatcher) {
-                       oldWatcher.close();
-               }
                return {
                        close: () => {
                                if (this.watcher) {
hardfist commented 1 month ago

while it is not necessary and the watcher instance can be reused (watchpack supports watch to be called again so that it can only create watchers that don't exists).

I test in webpack and it did recreate watcher every time, but it seems Rspack can reuse the old watcher since we use patch algorithm which is different than webpack now

hardfist commented 1 month ago

@vkurchatkin-miro it seems create watcher itself shouldn't cost too much but new watcher need to rewatch the old files that will cost time, can you help check whether the 1-2s is spent on

@vkurchatkin-miro can you share cpu-profile of before | after reuse watcher

vkurchatkin-miro commented 1 month ago

Here are profiles before, one is with some ignored entries, one without.

CPU-20240807T155610.cpuprofile CPU-20240807T162720.cpuprofile

vkurchatkin-miro commented 1 month ago

Here is a profile after CPU-20240808T164334.cpuprofile

escaton commented 1 month ago

@hardfist the time is spent here https://github.com/web-infra-dev/rspack/blob/2bd905d47e731772365637188c929f5371e60267/packages/rspack/src/node/NodeWatchFileSystem.ts#L108

But I doubt the straightforward solution https://github.com/web-infra-dev/rspack/issues/7490#issuecomment-2273955633 will work in cases when you add new import and add new modules to the build graph.

I'm thinking of comparing files, directories and missing with their previous values and only recreating Watchpack if they are different.

escaton commented 1 month ago

I think the ideal solution will be if Watchpack has an incremental implementation for watch that would only add new handlers for new files.

hardfist commented 1 month ago

But I doubt the straightforward solution https://github.com/web-infra-dev/rspack/issues/7490#issuecomment-2273955633 will work in cases when you add new import and add new modules to the build graph.

actually it works for add module but not idea for remove module since reuse the watcher will not remove the delete module

I think the ideal solution will be if Watchpack has an incremental implementation for watch that would only add new handlers for new files.

yeah the ideal solution would be this, but watchpack don't have incremental api now, and we consider a rust implementation of watcher in the future, so we plan to reuse watcher when no deletion happens, if deletion happens create a new watcher

vkurchatkin-miro commented 1 month ago

It actually looks like watchpack is incremental already. The implementation starts here: https://github.com/webpack/watchpack/blob/main/lib/watchpack.js#L166

It only make changes that are required since the previous watch call:

https://github.com/webpack/watchpack/blob/main/lib/watchpack.js#L253-L264

                // Close unneeded old watchers
        // and update existing watchers
        for (const [key, w] of fileWatchers) {
            const needed = fileWatchersNeeded.get(key);
            if (needed === undefined) {
                w.close();
                fileWatchers.delete(key);
            } else {
                w.update(needed);
                fileWatchersNeeded.delete(key);
            }
        }
escaton commented 1 month ago

@vkurchatkin-miro indeed, but it's a bit late, as it already runs all files through filter() right before.

hardfist commented 1 month ago

@vkurchatkin-miro indeed, but it's a bit late, as it already runs all files through filter() right before.

yes, and a more incremental friendly api would be incremental_watch(added_filed,removed_files) which let users deal with diff logic, so the computation would be minimal since users may already have this info. the current implementation would still require watch do tons of extra work before diff internal watchers

hardfist commented 1 month ago

This fixed the issue for me:

index c78273a3f..906abb084 100644
--- a/packages/rspack/src/node/NodeWatchFileSystem.ts
+++ b/packages/rspack/src/node/NodeWatchFileSystem.ts
@@ -62,8 +62,9 @@ export default class NodeWatchFileSystem implements WatchFileSystem {
                if (typeof callbackUndelayed !== "function" && callbackUndelayed) {
                        throw new Error("Invalid arguments: 'callbackUndelayed'");
                }
-               const oldWatcher = this.watcher;
-               this.watcher = new Watchpack(options);
+               if (!this.watcher) {
+                       this.watcher =  new Watchpack(options);
+               }

                if (callbackUndelayed) {
                        this.watcher.once("change", callbackUndelayed);
@@ -107,9 +108,6 @@ export default class NodeWatchFileSystem implements WatchFileSystem {

                this.watcher.watch({ files, directories, missing, startTime });

-               if (oldWatcher) {
-                       oldWatcher.close();
-               }
                return {
                        close: () => {
                                if (this.watcher) {

It's weird that this shouldn't fix the performance problem for you, if your performance bottleneck is here https://github.com/web-infra-dev/rspack/issues/7503#issuecomment-2275795127, since even we reuse the old watchers it will still reexecute the same times filter function(since the this.watchers.watch still reexecute), @escaton @vkurchatkin-miro can you help double check this?

escaton commented 1 month ago

Agree, it shouldn't fix the issue, as there is still https://github.com/web-infra-dev/rspack/blob/2bd905d47e731772365637188c929f5371e60267/packages/rspack/src/node/NodeWatchFileSystem.ts#L108 running every time.

actually it works for add module but not idea for remove module since reuse the watcher will not remove the delete module

@hardfist can you clarify it? If I add create a new file and import it, how watcher will start watching it?

hardfist commented 1 month ago

@escaton the new import file and its dependencies will be insert into the new files set in the done hook, so the watcher can watch it

escaton commented 1 month ago

But in order for watcher to start watching it, the Watchpack::watch needs to be called.

hardfist commented 1 month ago

But in order for watcher to start watching it, the Watchpack::watch needs to be called.

yes, https://github.com/web-infra-dev/rspack/issues/7490#issuecomment-2273955633 did call watch

escaton commented 1 month ago

I'm in doubt about where to add the early bailout:

If instead, it will be a Set, or better, there will be a flag if the array has been changed since the last build or not, that will be perfect.

hardfist commented 1 month ago

we do have dirty flag in compilation and we can check whether dependencies change before call watch

escaton commented 1 month ago

Can you point where to look?

hardfist commented 1 month ago

Can you point where to look?

I check the code, and not record update dependencies info, we will add related api soon

vkurchatkin-miro commented 1 month ago

It's weird that this shouldn't fix the performance problem for you, if your performance bottleneck is here https://github.com/web-infra-dev/rspack/issues/7503#issuecomment-2275795127, since even we reuse the old watchers it will still reexecute the same times filter function

That's fair, it fixes like 50% of the issue, another 50% are fixed by removing all the ingnores from the config

escaton commented 1 month ago

@jerrykingxyz @hardfist Since the added/removed compilation dependencies are now exposed, I can contribute the fix for the watcher, if you haven't started working on it yet.

hardfist commented 1 month ago

@jerrykingxyz @hardfist Since the added/removed compilation dependencies are now exposed, I can contribute the fix for the watcher, if you haven't started working on it yet.

@escaton we haven't started yet, you can give it a try, the general idea is to provide a rewatch api for watchpack(you can extends Watchpack class to provide this method) which use the incremental fileDependencies info

jerrykingxyz commented 1 month ago

@jerrykingxyz @hardfist Since the added/removed compilation dependencies are now exposed, I can contribute the fix for the watcher, if you haven't started working on it yet.

@escaton You need to pay attention to the contents of added/removed fields

  1. compilation.#inner.dependencies().addedFileDependencies includes files added during the make stage and FileDependencies added by plugins. The dependencies added by plugins will always be included. Don't worry though, The number of dependencies added by plugins is usually not very large and will not have a significant impact on performance.
  2. compilation.#inner.dependencies().removedFileDependencies only include removed file in make stage.

Although there are defects in these fields, it will not affect your work. I will fix them after rspack stable.