bazelbuild / bazel-gazelle

Gazelle is a Bazel build file generator for Bazel projects. It natively supports Go and protobuf, and it may be extended to support new languages and custom rule sets.
Apache License 2.0
1.18k stars 373 forks source link

repository_macro generated in external repository causes gazelle to have slow performance #752

Open grahamjenson opened 4 years ago

grahamjenson commented 4 years ago
rules_go = 0.22.1
gazelle = 0.20.0
bazel = 2.0.0

Does this issue reproduce with the latest releases of all the above?

yes

What operating system and processor architecture are you using?

latest osx, 3.3 GHz Dual-Core Intel Core i7

What did you do?

Generated go_repository rules from go.mod in an external WORKSPACE so that I don't have to run update-repos and commit a large unorganized .bzl file to source. This caused a performance degradation in gazelle running bazel run gazelle as it could not locate the repository_marco to parse the existing imports.

The complete description with proposed solution is in #748

What did you expect to see?

It to be as fast as if the bzl repository macro was in my local workspace.

What did you see instead?

It was very slow.

jayconrod commented 4 years ago

So the issue, as I understand, is that Gazelle needs to be configured to use the same set of external repositories that Bazel is using. This is needed in both the main workspace and in go_repository rules. Without that configuration, Gazelle goes out to the network to try and resolve each external import to a module, which is slow.

Normally, when Gazelle starts in the main workspace, it reads the WORKSPACE file and looks for go_repository rules, # gazelle:repository directives, and # gazelle:repository_macro directives that point to other .bzl files. The set of repository rules it finds are "known", so it won't go out to the network to resolve imports that have one of those repositories as a prefix.

go_repository does something similar. gazelle_dependencies declares a workspace named bazel_gazelle_go_repository_config that generates a configuration file with that information. By default, that file is derived from WORKSPACE, but the go_repository_default_config parameter may be set to a label of another WORKSPACE-like file. That was added so that unrelated changes in WORKSPACE won't invalidate go_repository rules.

There's no way to set a configuration file for the gazelle rule in the main workspace though. #748 partially adds that, but it doesn't generate it the same way bazel_gazelle_go_repository_config does. They need to be consistent.

It may also make sense to be able to generate a configuration file from a .bzl file, not only a WORKSPACE file. That way, an external workspace could provide a dependency macro, and the same file could be used with Gazelle.

grahamjenson commented 4 years ago

Would setting the configuration file/method by having # gazelle:repository_macro take a label work? This looks half implemented in the gazelle_dependencies method. Something like:

# gazelle:repository_macro @go_modules//repositories.bzl%go_repositories

This might be difficult because it will still not be a dependency on the BUILD rule, which means it might not be available to its run files when scanning.

jayconrod commented 4 years ago

This might be difficult because it will still not be a dependency on the BUILD rule, which means it might not be available to its run files when scanning.

That's the key issue. All of those files have to be available when Gazelle runs. Adding them to the data attribute might work, but there would need to be a good error message when they're not there.

grahamjenson commented 4 years ago

I thought about adding them to the data attribute, but then you would still need to tell it where to look for that file.

e.g.

gazelle(
    name = "gazelle",
    extra_args = ["--repo-config=external/go_modules/repositories.bzl"],
    data = ["@go_modules//:repositories.bzl"]
)

But this seems error prone given we would need to map it to the runfiles path.

jayconrod commented 4 years ago

It should already know where to look for the files from # gazelle:repository_macro directives. That should be enough to find the runfile.

grahamjenson commented 4 years ago

I don't think it does. This was the reason for the changes in the bash script in the #748 PR.

Could you give me an example of what you think should work and I will go and try it out and report back.

jayconrod commented 4 years ago

Given a directive like:

# gazelle:repository_macro @go_modules//repositories.bzl%go_repositories

Gazelle should be able to find that file, if it exists. Currently, any label outside the main workspace is rejected so that would need to change.

Gazelle will not be able to ensure this file exists and is up to date on its own though. That could be done with either go_repository_config (when Gazelle is invoked through go_repository) or the gazelle rule (when Gazelle is invoked in the main workspace). go_repository_config could do this without any change to its interface, since repository rules can resolve labels dynamically. The gazelle rule would need files to be specified with the data attribute. But no additional flags should be needed to Gazelle itself, other than perhaps the existing -repo_config flag, which would keep its current meaning.

chrismgrayftsinc commented 1 year ago

@jayconrod the above repository_macro directive did not work for me. I get

Error in fail: generate_repo_config: generate_repo_config: failed to load /home/path/to/code/@go_modules/repositories.bzl in repoRoot /home/path/to/code: open /home/path/to/code/@go_modules/repositories.bzl: no such file or directory

I have been able to hack around the problem by using build_config, but actual labels in repository_macro would be highly appreciated.

norbjd commented 1 year ago

Hello @chrismgrayftsinc, I'm trying to do the same thing (like # gazelle:repository_macro @go_modules//repositories.bzl%go_repositories), how do you have solved this with build_config?

Thanks!

chrismgrayftsinc commented 1 year ago

Since I wrote the comment I've forgotten how the hack worked. :)

The way it looks is I have two files. The first called foo_deps.bzl that has a macro

load("@bazel_gazelle//:deps.bzl", "go_repository")

def foo_deps():
    go_repository(
        name = "com_github_containerd_containerd",
        build_file_proto_mode = "disable_global",
        importpath = "github.com/containerd/containerd",
        sum = "h1:rQyoYtj4KddB3bxG6SAqd4+08gePNyJjRqvOIfV3rkM=",
        version = "v1.5.7",
        build_config = "@repository//:foo_dependencies.bzl",
    )
    ...

Then foo_dependencies.bzl has repository directives and go_repositorys defined again.

# gazelle:repository go_repository name=org_golang_x_text importpath=golang.org/x/text
# gazelle:repository go_repository name=org_golang_x_tools importpath=golang.org/x/tools
# gazelle:repository go_repository name=org_golang_x_xerrors importpath=golang.org/x/xerrors
...

go_repository(
    name = "com_github_containerd_containerd",
    build_file_proto_mode = "disable_global",
    importpath = "github.com/containerd/containerd",
    sum = "h1:rQyoYtj4KddB3bxG6SAqd4+08gePNyJjRqvOIfV3rkM=",
    version = "v1.5.7",
)
...

Then @repository//:foo_deps.bzl can be used in other repositories.

I hope that makes sense. @norbjd

norbjd commented 1 year ago

Ooh I see, that's clever, thanks! :eyes:

Do you remember if you were able to generate those files automatically with Gazelle? foo_deps.bzl is similar to the one generated by default with gazelle update-repos (except the build_config), but what about foo_dependencies.bzl?

Obviously we can copy every go_repository manually in that file but it looks like something that would quickly become outdated in the future (like everything manually managed :sweat_smile:).

Thanks again!

chrismgrayftsinc commented 1 year ago

I wasn't able to generate the files completely automatically. The amount of editing is fairly low using something like multi-cursor mode in emacs.

I completely agree that they will become outdated. Having actual labels in repository_macro would help.