Open UebelAndre opened 9 months ago
You can also get the same warnings, on latest (1.34.1
) and bazel 7.0.0
, by using npm_link_all_packages
.
That'll give you a warning per package.
Yup, I see that. Every package warns about this.
I believe Bazel 7 made this existing warning more noisy for some reason. It would be really useful if someone could bisect where this was introduced in Bazel, maybe that commit message will give us more clues about why it does this.
Adding this to .bazelrc
makes the warning go away:
# Allow the Bazel server to check directory sources for changes. Ensures that the Bazel server
# notices when a directory changes, if you have a directory listed in the srcs of some target.
# Recommended when using
# [copy_directory](https://github.com/aspect-build/bazel-lib/blob/main/docs/copy_directory.md) and
# [rules_js](https://github.com/aspect-build/rules_js) since npm package are source directories
# inputs to copy_directory actions.
# Docs: https://bazel.build/reference/command-line-reference#flag--host_jvm_args
startup --host_jvm_args=-DBAZEL_TRACK_SOURCE_DIRECTORIES=1
(from https://docs.aspect.build/guides/bazelrc#correctness-options )
Why not use globs instead? I’d rather not be relying on unsound behavior
Hey,
There is an issue opened on this topic: https://github.com/bazelbuild/bazel/issues/18646
I've tried with build --incompatible_disallow_unsound_directory_outputs=false
but no success
The issue i have with this on my side, is that the cache is not hit anymore :-(
FYI i am using rules_js only to provide a macro to release component in my monorepo through semantic-release I confirm that this behaviour was not present with bazel 6.4.0
@UebelAndre We don't want globs since that produces thousands of input files. We just want one input directory. It's Bazel's fault :) @opicaud I think that's a different issue about directory outputs. This one is about directory inputs.
@UebelAndre We don't want globs since that produces thousands of input files. We just want one input directory. It's Bazel's fault :)
I just don't want this warning and don't want to be required to use some experimental behavior to make them go away :(
It would be a massive performance regression to hand Bazel an input file for every file in the node_modules directory, so I don't think we'd consider turning that on for everyone. If you have a specific proposal of how to change it, I guess someone can make a PR on a fork of rules_js to bypass the warning.
@alexeagle Have you considered downloading the archive with http_file
/rctx.download
and then doing the untar in a build action and consuming the untarred directory in downstream actions? So it stays as one input to downstream actions but isn't a source directory.
The relevant commit to Bazel appears to be https://github.com/bazelbuild/bazel/commit/2fe450fd7f8888e619e38aa0b359073e801f1120, referencing https://github.com/bazelbuild/bazel/issues/18579
This does indeed appear to be also related to outputting directories.
Rules outputting directories (which is not sane, but unfortunately happens)
This implies that the copy_directory
rule is itself not sane, as it is a rule that outputs a directory. The general pattern is to use archives (e.g. tar or jar) instead of directories, and extract as-needed downstream.
@alexeagle Alternatively to avoid doing an untar in a build action, you could glob(["**/*"])
up all the inputs and use copy_to_directory
to turn them into a single directory and expose that as the directory in node_modules
.
Rules outputting directories (which is not sane, but unfortunately happens)
@SanjayVas I believe this is referring to rules which do ctx.actions.declare_file
during analysis but then create a directory during execution, such as the genrule in the example on that issue. See the later comments.
A directory properly declared with ctx.actions.declare_directory
(as copy_to_directory
does here) is well supported.
@alexeagle Alternatively to avoid doing an untar in a build action, you could
glob(["**/*"])
up all the inputs and usecopy_to_directory
to turn them into a single directory and expose that as the directory innode_modules
.
If a performance hit is observed with this change, odds are they can be reclaimed by making changes to Bazel.
Analysis time should perform well (hashing of TreeArtifact
instances should be cheaper vs. a plain list of files, it effectively caches a part of the computation).
"Preparation" tasks before execution such as runfiles creation might take a hit if Bazel is symlinking TreeArtifact
files instead of the directory. If such a problem exists, it'll likely need an --incompatible_
change in Bazel.
"Preparation" tasks before execution such as runfiles creation might take a hit if Bazel is symlinking TreeArtifact files instead of the directory. If such a problem exists, it'll likely need an --incompatible_ change in Bazel.
I tested this with an implementation of the suggested change. Bazel symlinks the TreeArtifact
directory, so runtime performance seems to check out. Analysis should but will require benchmarking to confirm.
I'll get a PR for this.
@jesses-canva that's a clever idea - but currently we need access to some files inside the package earlier (e.g. repository rule needs to read package.json) so it's not enough to delay extraction until an action runs. Perhaps we can extract just some of the files in a repository rule and ALSO extract them again in an action later.
Thanks for tracking down that change @SanjayVas - I'll reach out to @tjgq about our options here to keep the same performance but not trigger the warning.
Perhaps we can extract just some of the files in a repository rule and ALSO extract them again in an action later.
Patches are applied before package.json
is read (enables codegen for bin
entries), so extracting just package.json
wouldn't really work.
However it is possible to be more frugal. pnpm-lock.yaml
will tell us if a package has a bin
field.
# ...
packages:
# ...
/@aspect-test/a@5.0.2:
resolution: {integrity: sha512-bURS+F0+tS2XPxUPbrqsTZxIre1U5ZglwzDqcOCrU7MbxuRrkO24hesgTMGJldCglwL/tiEGRlvdMndlPgRdNw==}
hasBin: true
# ^^^^^^^^^^
dependencies:
'@aspect-test/b': 5.0.2
'@aspect-test/c': 2.0.2
'@aspect-test/d': 2.0.0(@aspect-test/c@2.0.2)
dev: true
This means for a typical scenario (out of all dependencies, few have hasBin: true
).
bin
will need to be extracted twice, and have patches applied twice.On paper performance should be close with potential for savings seen in certain scenarios (since extraction can potentially be skipped).
Stats for perspective (pulled from Rules JS pnpm-lock.yaml
).
hasBin: true
: 77If desiring to removal all extraction from repository generation, it may be doable with an API change (breaking change).
The generated lambdas and other symbols do not appear to need much information about the package. It may be possible to generalise them such that the user is responsible for specifying which binary to use (information which must already be known by users).
# From Rules JS, //e2e/bzlmod:BUILD.bazel
load("@npm//:jasmine/package_json.bzl", jasmine_bin = "bin")
jasmine_bin.test(
name = "jasmine_test",
bin = "jasmine",
# ^^^^^^^^^^^^^^
args = ["*.spec.js"],
data = ["test.spec.js"],
# jasmine doesn't know to run without runfiles
target_compatible_with = not_windows,
)
With the user provided bin
attribute (normally present in the macro name, e.g. jasmine_bin.jasmine_test
) it can be used to construct the same values currently eagerly computed.
bin_name
- No longer needed, is a santized version of the bin name that can be used for things like Starlark macro names.bin_mnemonic
- Capitalising each word, splitting on _
.bin_path
- The value of package.json#bin[bin]
, normalised into a dict
. No reason this cannot be resolved at runtime.This has been "mostly" mitigated in the latest rules_js release now that for most npm package the input to the graph is the .tgz
file and it is extracted into the package store.
See the PR summary of https://github.com/aspect-build/rules_js/pull/1538 and this comment on #1412 for more info.
Since there are still cases where a source directory input is required for packages with lifecycle hooks and/or patches this issue should stay open. The lifecycle and patches cases will be fixed in the future.
Is there any ETA on the lifecycle hooks/patches part of this? In CockroachDB, the newer version of rules_js
quashes a lot of these warnings, but there are still quite a lot, mostly in the various esbuild-
packages.
What happened?
I have a dependencies on
prettier@2.8.8
and after updating to the latest1.34.1
and Bazel 7.0.0 I get the following errorVersion
Development (host) and target OS/architectures:
Output of
bazel --version
: 7.0.0How to reproduce
Any other information?
No response