ojkelly / yarn.build

Build 🛠 and Bundle 📦 your local workspaces. Like Bazel, Buck, Pants and Please but for Yarn Berry. Build any language, mix javascript, typescript, golang and more in one polyglot repo. Ship your bundles to AWS Lambda, Docker, or any nodejs runtime.
https://yarn.BUILD
MIT License

Always rebuilds in a monorepository although all workspaces are up-to-date #231

Closed: jeffrson closed this issue 2 years ago

jeffrson commented 2 years ago

Describe the bug: I have a monorepository with about 35 workspaces. Running yarn build from the workspaces root, or with -A, rebuilds most (not all!) workspaces. However, when I go into any of the workspaces that are being rebuilt (-d) and execute yarn build there, it does not rebuild, but says that all projects before it in the tree are up-to-date.

To Reproduce: Not really possible. It may be a local problem, but there is no way to find out why :-(

Expected behavior: It should not rebuild when there are no changes.


What can I do to find out why it is rebuilding? Maybe yarn build could log why it is not using cached builds.

Edit: It seems there is a project that is rebuilt at the latest branch of the dependency tree. That is probably because the "build" script in its package.json copies prebuilt files into it with newer timestamps. That's a "local" thing, yes. However, I cannot find a reason why yarn.build is rebuilding the whole dependency tree starting from its root.

jeffrson commented 2 years ago

So I tested a bit (using yarn build -d) to see what causes rebuilds for me in this case. Here are my findings, some of which may be separate issues...

I can understand that it's not feasible to read all files completely in order to detect changes, and that a new timestamp is therefore a cause for rebuilding, but a change in size should probably be one as well. In any case, additional, missing or renamed files (e.g. in an assets subfolder) should trigger a rebuild.
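To make the suggestion above concrete, here is a minimal sketch of a metadata-only comparison that treats added, removed, renamed and resized files as changes, and also reports why. All names (`FileMeta`, `ChangeReason`, `diffSnapshots`) are hypothetical and only illustrate the idea; this is not how yarn.build is implemented.

```typescript
// Hypothetical types and names; not part of yarn.build's actual API.
interface FileMeta {
  path: string; // path relative to the input folder
  size: number; // size in bytes
  mtimeMs: number; // last-modified time in milliseconds
}

interface ChangeReason {
  kind: "added" | "removed" | "size" | "mtime";
  path: string;
}

// Compare two snapshots of an input folder and list every reason
// the folder counts as changed. An empty result means "up to date".
function diffSnapshots(before: FileMeta[], after: FileMeta[]): ChangeReason[] {
  const reasons: ChangeReason[] = [];
  const previous = new Map<string, FileMeta>();
  for (const f of before) previous.set(f.path, f);

  const seen = new Set<string>();
  for (const f of after) {
    seen.add(f.path);
    const prev = previous.get(f.path);
    if (!prev) reasons.push({ kind: "added", path: f.path });
    else if (prev.size !== f.size) reasons.push({ kind: "size", path: f.path });
    else if (prev.mtimeMs !== f.mtimeMs) reasons.push({ kind: "mtime", path: f.path });
  }
  // Anything in the old snapshot that no longer exists was removed
  // (a rename therefore shows up as one "removed" plus one "added").
  for (const f of before) {
    if (!seen.has(f.path)) reasons.push({ kind: "removed", path: f.path });
  }
  return reasons;
}
```

Returning the reasons rather than a plain boolean would also give a dry run something to print when it decides not to use a cached build.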

cmark1302 commented 2 years ago

Have you tried specifying this in each package.json?

  "yarn.build": {
    "input": "src",
    "output": "dist"
  },

zinserjan commented 2 years ago

I have the same problem. I debugged the whole morning to find the reason for that behavior.

In my case I identified three problems:

  1. Output will be determined automatically, even if you override it via the yarn.build.config within your package.json (not really a problem, but good to know):
     1.1. Get output config from package.json or use default (build)
     1.2. Enhances already provided output with fields from package.json (bin, files, main)
  2. Cache status determination: All collected output paths must be an existing directory, otherwise rebuild is triggered.
  3. If one project in your monorepo needs a rebuild, every project that depends on it needs a rebuild. This is correct and expected, but unfortunately every dependency of the dependent projects will be rebuilt again, regardless of its cache status. With my project structure and dependency tree, this rebuilds the whole workspace most of the time.
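The behaviour expected in point 3 can be sketched as follows: a change propagates upwards to dependents only, and dependencies of those dependents keep their cached build unless they changed themselves. The graph shape and the function name `workspacesToRebuild` are assumptions made for illustration, not yarn.build's actual data structures.

```typescript
// Hypothetical model: workspace name -> names of the workspaces it depends on.
type DependencyGraph = Record<string, string[]>;

// Return the workspaces to rebuild: the changed ones plus everything that
// depends on them, directly or transitively. Dependencies of those
// dependents are NOT added unless they are themselves changed.
function workspacesToRebuild(graph: DependencyGraph, changed: string[]): Set<string> {
  // Invert the edges so we can walk from a workspace to its dependents.
  const dependents = new Map<string, string[]>();
  for (const name of Object.keys(graph)) {
    for (const dep of graph[name] ?? []) {
      const list = dependents.get(dep) ?? [];
      list.push(name);
      dependents.set(dep, list);
    }
  }

  const dirty = new Set<string>();
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.pop() as string;
    if (dirty.has(current)) continue; // already handled
    dirty.add(current);
    for (const dependent of dependents.get(current) ?? []) queue.push(dependent);
  }
  return dirty;
}
```

For example, if `app` depends on `ui` and `utils`, and `ui` depends on `utils`, then a change in `ui` marks only `ui` and `app`; `utils` keeps its cached build even though `app` depends on it.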

At the moment I'm preparing a PR that will address 2. and 3.

ojkelly commented 2 years ago

Thanks for the details and troubleshooting on this issue. I've had quite a busy month, and so haven't been able to look into it yet.

What can I do to find out why it is rebuilding? Maybe yarn build could log why it is not using cached builds.

I've been working on a rewrite of the build graph and scheduler with the primary goal of making it fully explain itself.

copying a file locally in a src folder, thereby preserving timestamp, does not trigger rebuild
changing timestamp of an existing file (without other modification) results in rebuild

Yep, the change detection is relatively naïve at the moment. I've been exploring hashing the contents of the directory (which is a step to enabling a remote build cache).
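As a rough illustration of what hashing the contents of a directory could look like, here is a minimal sketch using Node's built-in `crypto` and `fs` modules. The helper name `hashDirectory` and the exact byte layout fed into the hash are assumptions for this example; the thread does not say how yarn.build would do it.

```typescript
import { createHash } from "crypto";
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper; not part of yarn.build's actual API.
// Feeds every file's relative path, length and contents into one SHA-256
// digest, visiting entries in sorted order so the result is deterministic.
function hashDirectory(root: string): string {
  const hash = createHash("sha256");

  const walk = (dir: string): void => {
    const entries = fs.readdirSync(dir, { withFileTypes: true });
    entries.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
    for (const entry of entries) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        const data = fs.readFileSync(full);
        // The relative path is included so that renames change the digest;
        // timestamps are deliberately left out.
        hash.update(path.relative(root, full));
        hash.update("\0");
        hash.update(String(data.length));
        hash.update("\0");
        hash.update(data);
      }
    }
  };

  walk(root);
  return hash.digest("hex");
}
```

With this approach a timestamp-only change leaves the digest untouched, while edited, added, removed or renamed files change it. The trade-off is that every file has to be read on each check.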

I have the same problem. I debugged the whole morning to find the reason for that behavior.

In my case I identified three problems:

  1. Output will be determined automatically, even if you override it via the yarn.build.config within your package.json (not really a problem, but good to know):

This is a bug.

1.1. Get output config from package.json or use default (build)
1.2. Enhances already provided output with fields from package.json (bin, files, main)

  2. Cache status determination: All collected output paths must be an existing directory, otherwise rebuild is triggered.
  3. If one project in your monorepo needs a rebuild, every project that depends on it needs a rebuild. This is correct and expected, but unfortunately every dependency of the dependent projects will be rebuilt again, regardless of its cache status. With my project structure and dependency tree, this rebuilds the whole workspace most of the time.

Yep this is most definitely a bug.

At the moment I'm preparing a PR that will address 2. and 3.

Thanks, a fix will be much appreciated.

zinserjan commented 2 years ago

Thanks, a fix will be much appreciated.

I started with a simple fix, but I found more and more issues while testing it... I'll submit a PR after work. One error that I know of is still open, but is definitely fixable.