oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0
1.53k stars 300 forks source link

Bazel: Support `archive_override` #8809

Open nnobelis opened 1 week ago

nnobelis commented 1 week ago

The user can specify that a dependency should come from an archive file (zip, gzip, etc) at a certain location, instead of from a registry. See https://bazel.build/rules/lib/globals/module#archive_override.

Example in MODULE.bazel:

archive_override(
    module_name = "rules_cuda",
    integrity = "sha256-3B9PcEylbj1e3Zc/mKRfBIfQ8oxonQpXuiNhEhSLGDM=",
    patches = ["//tools/bazelbuild/patches:nvcc.bzl.patch"],
    strip_prefix = "rules_cuda-v0.1.2",
    urls = ["https://github.com/bazel-contrib/rules_cuda/releases/download/v0.1.2/rules_cuda-v0.1.2.tar.gz"],
)

ORT should support this alternate locations for downloading the source code.

@haikoschol FYI

haikoschol commented 1 week ago

These are very common. My understanding was that this is the "old way" of doing things (aka "workspace") and it is at least discouraged for bzlmod. But now I'm not so sure this is accurate. And in practice it is unlikely that people will migrate this to bzlmod if they have to maintain a custom registry in addition.

This concrete example is for a Bazel rules module, which I think is exactly the use case for bzlmod and it most likely is in or will be added to BCR.

But in general ORT might have to support archive_override to be usable in real world projects.

fviernau commented 1 week ago

@haikoschol is it fair to assume that the content of the archive file is identical with the content of the file from the registry? (If not, we'd have an issue with ambiguous ORT package IDs)

haikoschol commented 1 week ago

@haikoschol is it fair to assume that the content of the archive file is identical with the content of the file from the registry? (If not, we'd have an issue with ambiguous ORT package IDs)

The registry does not contain source archives, only metadata about them. For the rules_cuda example, the only available version in BCR is 0.2.1: https://github.com/bazelbuild/bazel-central-registry/blob/main/modules/rules_cuda/0.2.1/source.json

If bzlmod was used instead of archive_override, the MODULE.bazel file would contain something like:

bazel_dep("rules_cuda", version="0.2.1")

Then ORT needs to figure out what registry to query to get the information in source.json, in order to download the source archive for scanning.

nnobelis commented 1 week ago

Thanks for your feedback @haikoschol

The key sentence for me is:

But in general ORT might have to support archive_override to be usable in real world projects.

If this configuration is used, IMO ORT should support it, even if this is the legacy way of doing things.

fviernau commented 1 week ago

The registry does not contain source archives,

In any case, the underlying question still remains IIUC: Is the archive override only used to use an alternative location to obtain the source code from, while the actual obtained sources remain identical. Or is it used to provide a different / alternative implementation for a dependency?

haikoschol commented 1 week ago

The registry does not contain source archives,

In any case, the underlying question still remains IIUC: Is the archive override only used to use an alternative location to obtain the source code from, while the actual obtained sources remain identical. Or is it used to provide a different / alternative implementation for a dependency?

It could be anything and I assume replacing the actual bits is a common use case, especially for local_override in monorepos.

haikoschol commented 1 week ago

I should have tried this sooner, but did just now:

cc$ cat MODULE.bazel
bazel_dep(name = "glog", version = "0.5.0", repo_name = "com_github_google_glog")
bazel_dep(name = "googletest", version = "1.14.0", repo_name = "com_google_googletest", dev_dependency = True)

archive_override(
    module_name = "rules_cuda",
    integrity = "sha256-3B9PcEylbj1e3Zc/mKRfBIfQ8oxonQpXuiNhEhSLGDM=",
    patches = ["//tools/bazelbuild/patches:nvcc.bzl.patch"],
    strip_prefix = "rules_cuda-v0.1.2",
    urls = ["https://github.com/bazel-contrib/rules_cuda/releases/download/v0.1.2/rules_cuda-v0.1.2.tar.gz"],
)

cc$ bazel mod graph
ERROR: the root module specifies overrides on nonexistent module(s): rules_cuda. Type 'bazel help mod' for syntax and help.

cc$ bazel build //main:main
ERROR: Error computing the main repository mapping: the root module specifies overrides on nonexistent module(s): rules_cuda
Computing main repo mapping:

Looks like archive_override is incompatible with bzlmod after all.

edit: It's an override so I guess it only makes sense if there is a bazel_dep invocation for the same package. After doing that bazel mod graph fails with a different error that suggests it might be fixable. But I don't have time to mess around with this further.

haikoschol commented 1 week ago

But in general ORT might have to support archive_override to be usable in real world projects. If this configuration is used, IMO ORT should support it, even if this is the legacy way of doing things.

The scope for the initial implementation was limited to bzlmod. Without that, Bazel can't be considered a dependency manager (and with it IMHO still isn't a package manager).

Of course, those pesky "real world projects" don't care about that. IMHO, the only way to make progress on Bazel support in general is on a case by case basis. Basing the implementation on speculation about how people use the bazillion ways of skinning cats in Bazel doesn't seem promising to me.

fviernau commented 1 week ago

Basing the implementation on speculation about how people use the bazillion ways of skinning cats in Bazel doesn't seem promising to me.

This is exactly the reason I was asking this question. Would support for archive_override introduce such speculations?

nnobelis commented 1 week ago

Looks like archive_override is incompatible with bzlmod after all.

Hum the customer project I am working on has this in its MODULE.bazel:

bazel_dep(name = "rules_cuda", version = "0.1.1")

archive_override(
    module_name = "rules_cuda",
    integrity = "sha256-3B9PcEylbj1e3Zc/mKRfBIfQ8oxonQpXuiNhEhSLGDM=",
    patches = [
        "//tools/bazelbuild/patches/cuda:nvcc.bzl.patch",
        "//tools/bazelbuild/patches/cuda:cuda_helper.patch",
        "//tools/bazelbuild/patches/cuda:cuda_library.patch",
    ],
    strip_prefix = "rules_cuda-v0.1.2",
    urls = ["https://github.com/bazel-contrib/rules_cuda/releases/download/v0.1.2/rules_cuda-v0.1.2.tar.gz"],
)

The first line is bzlmod no ?

haikoschol commented 1 week ago

Looks like archive_override is incompatible with bzlmod after all. The first line is bzlmod no ?

Yes, I spoke too soon. Just updated my comment above.

haikoschol commented 1 week ago

Basing the implementation on speculation about how people use the bazillion ways of skinning cats in Bazel doesn't seem promising to me.

This is exactly the reason I was asking this question. Would support for archive_override introduce such speculations?

Since @nnobelis has a concrete project with this constellation, I'd say no.

haikoschol commented 1 week ago

Hum the customer project I am working on has this in its MODULE.bazel:

bazel_dep(name = "rules_cuda", version = "0.1.1")

archive_override(
    module_name = "rules_cuda",
    integrity = "sha256-3B9PcEylbj1e3Zc/mKRfBIfQ8oxonQpXuiNhEhSLGDM=",
    patches = [
        "//tools/bazelbuild/patches/cuda:nvcc.bzl.patch",
        "//tools/bazelbuild/patches/cuda:cuda_helper.patch",
        "//tools/bazelbuild/patches/cuda:cuda_library.patch",
    ],
    strip_prefix = "rules_cuda-v0.1.2",
    urls = ["https://github.com/bazel-contrib/rules_cuda/releases/download/v0.1.2/rules_cuda-v0.1.2.tar.gz"],
)

So this is replacing version 0.1.1 of rules_cuda with version 0.1.2, plus some patches (which AFAIK, ORT has no concept of currently). @nnobelis what's the output of bazel mod graph?

nnobelis commented 1 week ago

Something quite weird:

<root>
├───rules_cuda@_ 
│   ├───bazel_skylib@1.7.1 (*) 
│   └───platforms@0.0.9 (*) 

No version is outputted for this package ?!

haikoschol commented 1 week ago

Something quite weird:

<root>
├───rules_cuda@_ 
│   ├───bazel_skylib@1.7.1 (*) 
│   └───platforms@0.0.9 (*) 

No version is outputted for this package ?!

Weird indeed and unclear how to handle it in ORT. But better than hiding the fact that 0.1.1 was replaced.

sschuberth commented 1 week ago

plus some patches (which AFAIK, ORT has no concept of currently).

Correct. Though coming up with a solution (like applying patches pn-the-fly before scanning, or creating custom source archives for source code with the patches applied) could become useful in the context of BitBake support as well.