bazelbuild / bazel-central-registry

The central registry of Bazel modules for the Bzlmod external dependency system.
https://registry.bazel.build
Apache License 2.0
237 stars 268 forks source link

[Idea] Support Starlark overlays without patches #1566

Open aaronmondal opened 4 months ago

aaronmondal commented 4 months ago

At the moment all modules are overlaid with patches and those patches are hashed and tracked in source.json.

Many modules probably just need a MODULE.bazel file and some buildfiles. I think it would be nice UX improvement if those files could be put into the BCR directly and have some mechanism to "automatically" overlay these Bazel-specific files.

Since we already have MODULE.bazel in the BCR without patches, it might be viable to have some sort of overlay mechanism, maybe even an overlays directory that adds Starlark files directly on top of the external repository.

Such an approach would be somewhat similar to how it feels like to work in nixpkgs: Have the build overlays as regular source code and only use patches for things that are "true" functionality-changing patches.

I believe this could make it nicer to browse the BCR and get inspiration. It would also make it easier to spot functional changes and would potentially reduce the friction of having to generate patch files all the time.

alexeagle commented 3 months ago

+1 it's really painful to have to encode new files as patches

fmeum commented 2 months ago

We could add a remote_files attribute to http_archive that takes a dict from repo-relative paths to URLs and downloads the files from these URLs to the given path (for simplicity, assuming that the original file didn't exist). We would need an additional remote_files_integrity dict to store the hashes.

Then registries could start populating this attribute in their source.json files, but they would only be compatible with Bazel versions with support for the new attributes. I need to check what an old Bazel version would do with such a source.json - hopefully it would fail rather than silently dropping the attributes. But we could always have a bazel_compatibility requirement for such modules.

@meteorcloudy @Wyverald @SalmaSamy What do you think?

meteorcloudy commented 2 months ago

SGTM, PR welcome. And we could backport the change to 6.x and 7.x.

fzakaria commented 2 months ago

I am interested in working on this. I think it would be a big QoL improvement.

@fmeum if could point me to some source files or maybe a PR that did something similar in changes to sources.json to mimic I can try that.

fzakaria commented 2 months ago

I want to add not only is it annoying to submit patch files but the BUILD files aren't searchable on GitHub search using the starlark language filter. That would help expose more samples.

fzakaria commented 2 months ago

Okay here is my research:

  1. Augment https://github.com/bazelbuild/bazel/blob/master/tools/build_defs/repo/http.bzl#L235 to add a new attribute for an overlay directory (auto) or the dict files (manual)
  2. Augment https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/bazel/bzlmod/ArchiveRepoSpecBuilder.java#L28
  3. Augment https://github.com/bazelbuild/bazel/blob/618c0abbfe518f4e29de523a2e63ca9179050e94/src/main/java/com/google/devtools/build/lib/bazel/bzlmod/IndexRegistry.java#L189 to handle support for the new fields in the source.json

Looks like (1) can be starlark only. (2) and (3) need to be changed for bzlmod but I'm not 100% certain of their innerworkings.

fmeum commented 2 months ago

@fzakaria That's highly appreciated! Your research looks solid, most of this can indeed be done in Starlark and I can also help with the Java parts.

For starters, I would suggest to focus on the Starlark part and augment http_archive with two new attributes:

I think that attribute structure is going to be simpler than a full-blown overlay JSON, with a very similar experience: In the BCR, we could just have a files sibling directory of patches with files neatly laid out according to their relative path. The usual tooling (//tools:update_integrity in the BCR) could then add the integrity values to source.json in whatever format we want.

When implementing the download logic for the remote files, I would suggest to keep the following in mind:

fzakaria commented 2 months ago

@fmeum thank you for the advice -- I was able to start https://github.com/bazelbuild/bazel/pull/22155 It is working. I finished 90% -- I can add remaining tests and finish it but I wanted some early feedback.

fzakaria commented 2 months ago

@fmeum sent me next steps I am capturing here:

You will have to extend this file: https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/bazel/bzlmod/IndexRegistry.java;l=131;drc=a87a8db23885535232ace3f64ac45e595c95d2f5

In particular the getRepoSpec method. You can mostly mimic the logic for patches, I think.

Once that lands, we can update the tooling in the bazel-central-registry repo.

fmeum commented 2 months ago

While @fzakaria is close to getting overlay support into Bazel, I noticed something that could make our lives easier today: It's possible to reuse patches from other module versions, which could be used to simplify reviews of C++ modules. See https://github.com/bazelbuild/bazel-central-registry/pull/2024/files#diff-90fa618704d4146385f96570f2033f70b09f59ae5c630825ea0dde357520ebd5R5 for an example.

fzakaria commented 2 months ago

Final piece is up for PR #2046 2046