Closed achew22 closed 1 year ago
cc: @jayconrod
I really don't know where that would belong. Do gazelle plugins go with a gazelle contrib folder or where they seem used?
Then we have the question of how to bundle libraries for the optimal stardoc generation. One bzl_library per file is probably not wrong. If startdoc is going to do something for integrated doc sets, that could be structured at the stardoc rule level, rather than having a big library.
Ideally I think Gazelle extensions should live in the same repos where the rules are defined. That way, the extension can be coupled with the rules being used. Rule authors are also in a much better position to maintain those extensions than I am.
From the not-following-my-own-advice department, the Go extension lives in the Gazelle repository (Gazelle was originally Go-only). There have been some issues with Gazelle and rules_go versions drifting apart.
In order to push the discussion along, I decided to implement a strawman proposal. Please see #251 for that proposal. I have a few questions in there for both of you. If you would be so kind as to review and respond to them I would greatly appreciate it.
To be honest, I am not going to have the time to look at this for a while. Nor can I commit to approving, because I am not directly a skylib maintainer. @laurentlb @c-parsons Can either of you comment on the overall concept of doing this.
Friendly ping
Hi @achew22, I'm many months late to the party, but I wanted to ask if it's possible to move this to a separate repo? As you alluded to yourself, the added dependency on rules_go and bazel_gazelle could be problematic. It becomes much more conspicuous in the external deps overhaul (tl;dr doc) since a module needs to declare its direct dependencies and all of those propagate, which means that anyone depending on Skylib would necessarily bring in rules_go and bazel_gazelle. (In fact, that's why I'm only just asking now -- I'm trying to kickstart the Bazel central registry described in the overhaul proposal.)
@jayconrod re "Gazelle extensions should live in the same repos where the rules are defined", I would agree except that it brings a dependency on Go/Gazelle to any rules_X module for which such an extension exists. Is it feasible to always have such extensions in a standalone repo (read: standalone module, as described in the overhaul proposal)?
I also wanted to say for the record that I'm not entirely clear on the details yet, so please don't treat this as an eviction notice :)
How exactly does one use the gazelle plugin in this repo? I haven't used Gazelle before and the readme file here doesn't contain instructions. Depending on the answer to that question, we might be able to avoid this move by simply re-introducing the concept of dev-dependencies into the external deps overhaul proposal.
Hi @achew22, I'm many months late to the party,
@Wyverald, it's okay, we're just happy to see someone excited about the project.
but I wanted to ask if it's possible to move this to a separate repo? As you alluded to yourself, the added dependency on rules_go and bazel_gazelle could be problematic.
"Could be problematic" is an interesting turn of phrase. I would expect that by now it would have already become problematic given that 6 months and a number of releases have happened. It's already been integrated into most repos. Maybe you point me to some complaints from the past few months about adding this dependency? Looking at the data, my conclusion has to be that it is not problematic since not a single soul has complained in any forum that I can witness (with the notable exception of you). I'm open to rebuttal on this one.
It becomes much more conspicuous in the external deps overhaul (tl;dr doc) since a module needs to declare its direct dependencies and all of those propagate, which means that anyone depending on Skylib would necessarily bring in rules_go and bazel_gazelle. (In fact, that's why I'm only just asking now -- I'm trying to kickstart the Bazel central registry described in the overhaul proposal.)
Since this is the first instance of someone being concerned about this in the real world. Let's take this as an opportunity to discuss what it is you're doing and why the situation that's been in place for half a year is insufficient.
I also wanted to say for the record that I'm not entirely clear on the details yet, so please don't treat this as an eviction notice :)
Uh, I guess that's reassuring.
How exactly does one use the gazelle plugin in this repo? I haven't used Gazelle before and the readme file here doesn't contain instructions.
A usage example is actually included in this repo:
The languages
attribute points to a go_library
target and is included in a built version of the resulting Gazelle binary. When you invoke that gazelle_binary
, it will walk your filesystem and generate BUILD
files for you. There is 0 overhead if you do not depend on the go_library
.
Depending on the answer to that question, we might be able to avoid this move by simply re-introducing the concept of dev-dependencies into the external deps overhaul proposal.
I didn't realize the external deps overhaul had dropped the idea of dev-dependencies. I am concerned that is going to leave a lot of projects out in the cold. Most projects already depend on the ability to have deps that aren't a part of their distribution. For example:
rules_go
will force everyone to depend on https://github.com/grailbio/bazel-toolchain
, as well as an RBE executor through the bazel-toolchains
dep even though neither of those are necessary to use rules_go
.stardoc
would require bringing in rules_pkg
in its entirety which includes a system for both building and extracting Debian packages.rules_nodejs
will force everyone to depend on stardoc
, rules_go
, angular2
(which weighs 300MB), cypress
(which weighs about 100MB) and an executable version of bazel
itself in 15 different versionsThis means that if you don't support the idea of having a differentiation between the distribution version of a repo and the developer version of a repo, a hello world in TypeScript is going to require you to download 1500MB of data.
As described in the overhaul proposal, modules will specify their dependencies using MODULE.bazel files instead of WORKSPACE files, and deps will be pulled from registries (mostly the Bazel Central Registry or BCR) instead of raw URLs. I'm trying to put some initial modules into BCR, and that entails writing a MODULE.bazel file for them.
Skylib is the first one I tried. It being such a fundamental standard library, I expected it to have minimal dependencies, and was surprised that its WORKSPACE file contained references to rules_go and bazel_gazelle. They obviously need to go into the MODULE.bazel file as dependencies, which means that they'd propagate to basically all Bazel modules. Hence the original question. (see more discussion about dev-deps below)
As such, this is new territory and thus it's largely irrelevant whether anyone has complained in the past few months.
We dropped the idea of dev-dependencies because we couldn't cleanly support it. Imagine your module foo
has a dev-dependency on the module fancy_test
which you only use for testing. In one of your BUILD files (say foo/BUILD
), you could write
load("@fancy_test//:defs.bzl", "fancy_test")
cc_library(name = "lib")
fancy_test(name = "lib_test", deps = [":lib"])
Now, if another module bar
depends on foo
(specifically @foo//:lib
), they would get a build error at the load
statement. This is because fancy_test
is a dev-dep of foo
, so bar
doesn't transitively depend on fancy_test
through foo
. This error happens despite the fact that bar
technically never used any test target in foo
; it's simply because the package load
ed something from the dev-dep. (see also https://github.com/bazelbuild/bazel/issues/12835)
So we ended up removing dev-deps because:
bar
) could get an error out of the blue, and it's hard for the library author (foo
) to notice such potential breakages ahead of time (since they do get dev-deps).load
from dev-deps into separate packages, it can cause quite a bit of churn as it goes against what most people are doing already (ie. put test targets next to the libraries under test). rules_pkg
would never be downloaded even without dev-deps, as long as the user of stardoc
does not refer to any target in the stardoc
package that loads rules_pkg
. Similarly, a hello world in TypeScript does not require you to download 1.5GB of data.With all that said, there's a convincing argument to bring dev-deps back, and that is the de-cluttering of the dependency graph. We essentially have to weigh that against the drawback of potentially tricky errors (bullet point 1 above).
Thanks for providing the usage example. Judging from it, I don't think rules_go is a dev-dep of bazel_skylib (even if we supported dev-deps), because you are expecting a user of bazel_skylib to use the @bazel_skylib//gazelle/bzl:bzl
target, which is a go_library
coming from the rules_go
module. A dev-dep of bazel_skylib would be one that's only used for development of bazel_skylib itself (rules_pkg is a perfect example). Here rules_go is more of an "optional" dependency: if a user of bazel_skylib wants to use this part of the module, they should get this dep; if they don't, they shouldn't. In this case, it's much cleaner to separate out the part of bazel_skylib that brings in an extra dependency into a separate module.
It's worth noting that, we can make bazel_skylib flat out depend on rules_go, and someone who doesn't use the gazelle part of skylib still won't have to download rules_go. (Just like the no-dev-deps situation above.) The problem, of course, is that it clutters the dependency graph -- we can make skylib depend on 1000 other random modules without any users of skylib having to download anything extra (as long as they don't use those parts), but that doesn't make it justifiable.
cc @meteorcloudy
IIUC, it sounds like the resolution here from skylib's POV is that we likely will want to put the gazelle dir in another workspace by giving it its own WORKSPACE file?
@brandjon,
IIUC, it sounds like the resolution here from skylib's POV is that we likely will want to put the gazelle dir in another workspace by giving it its own WORKSPACE file?
One of the major benefits of having it in the same WORKSPACE
is that when you upgrade your dependency on bazel-skylib
you're implicitly upgrading your gazelle
plugin. If the bazel-skylib
were to add a feature that required a mechanical fix, the next run of gazelle
would fix it right up. As far as I can tell, the only other strategy for helping people upgrade is to have them to use a buildozer
fix. Which, of course, also has a dep on rules_go
.
@Wyverald,
Does this imply that if you bring in bazelbuild/rules_proto
, which I would argue is even more fundamental to the Bazel ecosystem than bazel-skylib
, you're going to be bringing in rules_{cc,python,java}
since it provides rules in each of those languages?
Is it possible there is an additional kind of dependency, and I'm throwing out a straw man here, an "optional" dependency. Let's pick a worse and maybe even pathological case, stackb/rules_proto
which produces android
, closure
, cpp
, csharp
, d
, dart
, go
, java
, node
, objc
, php
, python
, ruby
, rust
, scala
, swift
, gogo
, closure
, commonjs
, and typescript
. If I'm a user of that repo am I required to include 20 dependencies? Possibly more since I'm bad at both transcription and counting. What if a repo could define a dep as "optional" and it's recorded as a possible dep and used in version compatibility calculations, but not automatically added into the deps section by projects that depend on it? This way you can have a repo that potentially provides a great number of resources but you pay only for the things that you need.
The cost of not having dev-deps is, as you said, an explosion of unnecessary deps; however, this does not incur a crazy ton of extra downloads because Bazel doesn't immediately download all dependencies. A repo is only fetched when something is required from it. So things like rules_pkg would never be downloaded even without dev-deps, as long as the user of stardoc does not refer to any target in the stardoc package that loads rules_pkg. Similarly, a hello world in TypeScript does not require you to download 1.5GB of data.
I guess I'm confused by this section. If you're saying that not having an edge upon these dependencies means that you don't actually download them, then what is the major concern? Slowing things down? Polluting the dependency graph? Can you help me understand what this a little better?
Sorry for missing the updates, I really need to learn how to use GitHub notifications.
IIUC, it sounds like the resolution here from skylib's POV is that we likely will want to put the gazelle dir in another workspace by giving it its own WORKSPACE file?
That's not enough, as we're talking about MODULE.bazel files here. The gazelle dir needs to be its own module, which means that it needs its own name, versions, and it needs to be able to be shipped separately (ie. I need to be able to download an archive containing its source, and preferably only its source).
I can see how that is a usability hurdle. Every Bazel module needing to be its own GitHub repo can be quite taxing. Maybe we can provide explicit support for using part of a VCS repo as a Bazel module; I can see how that could be possible (for example, the entire Git repo can be packed into one archive, but we can then use strip_prefix
in source.json to say that only a certain subdirectory is the source for a module). Then, we'd only need a MODULE.bazel file under the gazelle
directory, and it can comfortably be a separate module from the root directory (which would have its own MODULE.bazel file).
Does this imply that if you bring in
bazelbuild/rules_proto
, which I would argue is even more fundamental to the Bazel ecosystem thanbazel-skylib
, you're going to be bringing inrules_{cc,python,java}
since it provides rules in each of those languages?
Yes indeed. Ideally the cc, python, and java parts would be separated out into distinct modules, but again, I can see how that can be annoying (see above for a potential solution).
Is it possible there is an additional kind of dependency, and I'm throwing out a straw man here, an "optional" dependency. Let's pick a worse and maybe even pathological case,
stackb/rules_proto
which producesandroid
,closure
,cpp
,csharp
,d
,dart
,go
,java
,node
,objc
,php
,python
,ruby
,rust
,scala
,swift
,gogo
,closure
,commonjs
, andtypescript
. If I'm a user of that repo am I required to include 20 dependencies? Possibly more since I'm bad at both transcription and counting. What if a repo could define a dep as "optional" and it's recorded as a possible dep and used in version compatibility calculations, but not automatically added into the deps section by projects that depend on it? This way you can have a repo that potentially provides a great number of resources but you pay only for the things that you need.
Thanks for bringing up this example. I would definitely consider stackb/rules_proto
a pathological case.
We have discussed optional dependencies for bzlmod before. There is prior art in Cargo, but it's not quite as simple as just marking certain dependencies as optional. A Cargo package A
can have optional dependencies on B
, C
, and D
, and they are grouped into "features" (say F1
which requires B
and F2
which requires C
and D
), which dependents of A
can then selectively ask for. On top of that, features are exposed as a Rust language concept, so they can enforce that code that uses something from B
must be guarded under the feature F1
.
As you can see this is something that warrants very careful design, in particular because it would need changes in Bazel (right now, there's no way for you to tell me in code that certain targets in your gazelle
directory needs an optional dependency). Plus, it's the type of feature that can be added later on, and there is always the much cleaner solution of using separate modules. For those reasons, we decided to not introduce optional dependencies in v1.
The cost of not having dev-deps is, as you said, an explosion of unnecessary deps; however, this does not incur a crazy ton of extra downloads because Bazel doesn't immediately download all dependencies. A repo is only fetched when something is required from it. So things like rules_pkg would never be downloaded even without dev-deps, as long as the user of stardoc does not refer to any target in the stardoc package that loads rules_pkg. Similarly, a hello world in TypeScript does not require you to download 1.5GB of data.
I guess I'm confused by this section. If you're saying that not having an edge upon these dependencies means that you don't actually download them, then what is the major concern? Slowing things down? Polluting the dependency graph? Can you help me understand what this a little better?
First, to clarify the first sentence in the quoted section: by "the cost of not having dev-deps", I meant "the cost of not having the concept of dev-deps -- i.e. specifying all otherwise dev-only dependencies as normal dependencies". Having a normal dependnecy edge on rules_pkg does not cause you to fetch rules_pkg until something is needed from it. The downside of not having the dev-deps concept is then exactly as you named them: the pollution of the dependency graph and, consequently, the slowdown.
Slightly off topic, I was trying to use the gazelle skylib extension using the following rule
load("@bazel_skylib//:bzl_library.bzl", "bzl_library")
load("@bazel_gazelle//:def.bzl", "DEFAULT_LANGUAGES", "gazelle", "gazelle_binary")
gazelle_binary(
name = "gazelle-skylib",
languages = DEFAULT_LANGUAGES + [
"@bazel_skylib//gazelle/bzl:bzl",
],
visibility = ["//:__pkg__"],
)
gazelle(
name = "gazelle",
gazelle = ":gazelle-skylib",
)
I get this error when I run bazel run //:gazelle
,
no such package '@bazel_skylib//gazelle/bzl': BUILD file not found in directory 'gazelle/bzl' of external repository @bazel_skylib. Add a BUILD file to a directory to mark it as a package. and referenced by '//:gazelle-skylib'
I am using skylib version 1.1.1
. Anything I am doing wrong?
I'm not aware of any reason why that would not work. Rather than hijacking this thread, can you make a small repo that demonstrates the issue and file a new bug which includes a link to it? That will make it a lot easier to understand what's gone wrong. Thanks!
@srikrsna-buf is just hitting the same issue being discussed in this thread, which is that the distribution of bazel-skylib ships without the gazelle/ directory. (see target //:distribution
in this repo)
As a workaround you can fetch bazel-skylib from source instead of using the distribution, but it doesn't work under bzlmod where you must trust the central registry (and MODULE.bazel
happens before WORKSPACE
so you cannot override).
@Wyverald any progress thinking through the problem of "optional deps"? As it stands, the rules authors SIG is recommending @achew22 's gazelle extension for new rules repos https://github.com/bazel-contrib/rules-template/blob/main/BUILD.bazel#L6 so this is making a bit of a mess for rules authors when they want to introduce bzlmod.
An approach I don't see on the thread above is to have the gazelle/
folder be a nested workspace and a separate Bazel module, but have it be released together with bazel-skylib. This is a pretty common approach in npm for example, where projects often release a bunch of packages together.
Granted, nothing enforces that users take the same version of bazel_skylib
as bazel_skylib_gazelle
(picking a name for the module for the sake of discussion)
But they could extract a constant in either WORKSPACE
or MODULE.bazel
to get that assurance
versions = {"skylib": "1.1.1"}
bazel_dep(name = "bazel_skylib", version = versions["skylib"])
bazel_dep(name = "bazel_skylib_gazelle", version = versions["skylib"])
and we could document this as the suggested way to install.
Since it's a nested workspace, we still get the atomicity of CI covering the two together, avoiding drift and ensuring test coverage.
I'm adding a gazelle extension to our aspect-build/rules_ts right now and am planning to try this approach.
The approach advocated by @alexeagle seems sensible to me. In contrib_rules_jvm we use the Gazelle plugin to help us keep our build files up to date. Because the plugin isn't available in the modularised skylib, we can't fully transition the project to bzlmod
yet.
Also, I note that the gazelle plugin is already in its own directory. That's what we do in contrib_rules_jvm
, since we also have our own gazelle plugin (for generating build file entries from java source). To handle that case, we provide a different macro for loading all our dependencies. That means that in the common case, people don't need to pull in the dependency on rules_go
, but when they want to use Gazelle it's an option.
I'm hoping that bzlmod
will be smart enough to only load the repos that are actually used in a build, rather than everything.
I'm hoping that bzlmod will be smart enough to only load the repos that are actually used in a build, rather than everything.
That's already the case, if a repo is not required to build your target, it won't be fetched. Just like a repo defined in the WORKSPACE file. Just watch out this issue to avoid accidentally making rules_go a hard dependency.
Alright. So it sounds like a module with a dependency on rules_go
is fine. In which case, I think the "extra deps macro" provides the path of least resistance to getting the gazelle plugin in skylib shipped. Anyone fancy a PR?
So it sounds like a module with a dependency on rules_go is fine
Yes, but it does introduce the transitive dependencies of rules_go and affects the final resolved dependency graph though. But all dependencies are lazily fetched.
I've opened #400 with the macro exposed in the distribution and the changes required to use it added to the README.
Closing; the gazelle
subdirectory has been available as a separate module since 1.4.0.
Hey @aiuto and @laurentlb (who appear to be OWNERS on this repo),
First, thanks so much for all you're doing in the Bazel ecosystem. I hugely appreciate your efforts. One of the things that I have been thinking about creating for a while was a bazel-gazelle plugin for
skylib
that generatedskylark_library
targets. I think this would be generally useful. In my repo's I've found it helpful to haveskylark_library
targets to feed intostardoc
and manually creating the targets is kind of a drag.Fortunately, Gazelle has a really nice plugin system that can be used to add generation for languages beyond golang. If I were to write a skylib plugin for Gazelle, would this repo be a good place to host it? Since the repo is packaged by
rules_pkg
I could iterate on it and not have it released as part of the distribution bundle until it was deemed to be good enough. Additionally I would be happy to update the distribution scripts to support both a gazelle version and a non-gazelle version to keep down dependencies if that's a concern.Please let me know what you think. Thanks so much!