bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Make build without bytes take into account `http_file` #22366

Open ismell opened 3 months ago

ismell commented 3 months ago

Description of the feature request:

The `http_file` repository rule takes a URL, a SHA256, and the name of the output file. Since we "sometimes" have the SHA256 of the file, in theory it should be possible to use this information in the "build without bytes" computation. This would allow users and CI bots to skip downloading these files and directly download the output artifacts if they are already in the CAS.

A workaround is to write our own `http_file` Bazel rule that performs the network access. The disadvantage is that it doesn't interact with the local repository cache's CAS, i.e. `.cache/bazel/_bazel_$USER/cache/repos/v1/content_addressable/sha256`.
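To make the idea concrete, here is a rough Python sketch of "look in a content-addressed cache by SHA256 first, download only on a miss". It is not how Bazel implements anything. The names `cas_path` and `fetch_with_cas` are hypothetical, and the `<digest>/file` leaf under `content_addressable/sha256` is an assumption about the cache layout.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable


def cas_path(cache_root: Path, sha256_hex: str) -> Path:
    # Assumed layout: <cache_root>/content_addressable/sha256/<digest>/file
    return cache_root / "content_addressable" / "sha256" / sha256_hex / "file"


def fetch_with_cas(
    cache_root: Path, sha256_hex: str, download: Callable[[], bytes]
) -> Path:
    """Return the cached file for sha256_hex, calling download() only on a miss."""
    target = cas_path(cache_root, sha256_hex)
    if target.is_file():
        # Cache hit: the digest alone identifies the content, no network needed.
        return target

    data = download()
    actual = hashlib.sha256(data).hexdigest()
    if actual != sha256_hex:
        raise ValueError(f"checksum mismatch: expected {sha256_hex}, got {actual}")

    # Write to a temporary name, then rename, so readers never see a partial file.
    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_name("file.tmp")
    tmp.write_bytes(data)
    os.replace(tmp, target)
    return target
```

A custom rule that downloads inside an action would not populate or consult this kind of local cache unless it did so explicitly, which is the disadvantage described above.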

Which category does this issue belong to?

Core

What underlying problem are you trying to solve with this feature?

Prevent unnecessarily downloading action inputs if their outputs are already in the CAS.

Which operating system are you running Bazel on?

No response

What is the output of bazel info release?

No response

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

haxorz commented 3 months ago

team-Core doesn't own "build without bytes"

tjgq commented 3 months ago

"Building without the bytes" as currently implemented only applies to output artifacts (as opposed to source artifacts). If I understand it correctly, you're asking to extend it to cover source artifacts, so that we can avoid materializing "intermediate sources" (e.g. the result of decompressing a source archive into the source tree) in the same way BwoB avoids downloading "intermediate outputs" today. Is this an accurate restatement?

I think this is a reasonable feature request, but it would be a fairly large change; because Bazel fundamentally treats source/output artifacts and the rules that produce them differently, it's not a simple matter of plumbing repository rules into the existing BwoB infrastructure. If we decide to work on this, we should probably look at the broader applicability of "lazy source artifacts" (e.g. https://github.com/bazelbuild/bazel/issues/16380, while describing a different use case, would likely boil down to the same feature request).

ismell commented 3 months ago

Yep, exactly. I think one way of tackling this would be to implement `http_file` as a Bazel build rule (instead of a repository rule). That's our current plan for dealing with this, but it would be nice if Bazel provided this at some point.

Wyverald commented 3 months ago

See previous design proposal: https://docs.google.com/document/d/1OsEHpsJXXMC9SFAmAh20S42Dbmgdj4cNyYAsFOHMibo/edit

And discussion: https://github.com/bazelbuild/bazel/discussions/20464

(tl;dr: the principled solution to this would take a lot of design work and potentially years to yield a usable result)