Open tbaing opened 1 year ago
Do I understand correctly that you want to filter out existing files that control Bazel from code you vendored so that you can provide your own?
Does "cross-repository" mean:
WORKSPACE.bazel files so that e.g. in a/BUILD.bazel you have glob(["**"]) it matches e.g. a/b/c even if a/WORKSPACE.bazel exists?
@x and @y)?
Would it help if you could customize the names of files Bazel reads (e.g. you could make it so that BUILD.bazel files are called BUILD.kitten in your source)? (somehow)
This issue comes up every once in a while (see https://github.com/bazelbuild/bazel/issues/16707 ) but it has never quite crossed our pain threshold.
Please clarify what cross-repository globbing would mean.
Is it just globbing as normal but not stopping at a WORKSPACE.bazel subdirectory boundary? If so, that seems like odd behavior, given that globbing stops at even BUILD.bazel subdir boundaries.
Maybe there's a bazelignore feature request somewhere in here, like the ability to ignore a sub-WORKSPACE at a known location?
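To make the boundary behavior under discussion concrete, here is a rough Python simulation (not Bazel's implementation). It assumes a directory counts as its own package when it contains a file named BUILD or BUILD.bazel; the names glob_within_package, make_demo_tree and the cross_boundaries switch are hypothetical and exist only for this illustration.

```python
import os
import tempfile

# Assumption: a directory is its own package if it holds one of these files.
BUILD_FILE_NAMES = ("BUILD", "BUILD.bazel")


def glob_within_package(package_dir, cross_boundaries=False):
    """Simulate glob(["**"]) from package_dir; return sorted relative paths.

    With cross_boundaries=False, subdirectories that contain a BUILD file
    are pruned, which mimics globbing stopping at package boundaries.
    """
    matches = []
    for root, dirs, files in os.walk(package_dir):
        if not cross_boundaries:
            # Do not descend into subdirectories that are packages themselves.
            dirs[:] = [
                d for d in dirs
                if not any(
                    os.path.isfile(os.path.join(root, d, name))
                    for name in BUILD_FILE_NAMES
                )
            ]
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), package_dir)
            matches.append(rel.replace(os.sep, "/"))
    return sorted(matches)


def make_demo_tree(root):
    """Create a small tree where a/b is a nested package (hypothetical layout)."""
    for rel in ("a/BUILD.bazel", "a/x.txt", "a/b/BUILD.bazel",
                "a/b/c/y.txt", "a/d/z.txt"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_demo_tree(tmp)
        pkg = os.path.join(tmp, "a")
        print(glob_within_package(pkg))
        print(glob_within_package(pkg, cross_boundaries=True))
```

In this sketch the default call skips everything under a/b because a/b/BUILD.bazel makes it a separate package, while cross_boundaries=True models what the feature request appears to ask for.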
Gentle ping @tbaing
Sorry for the slow responses on this.
My original request was inaccurate because I'd misunderstood the problem we were encountering. We don't need the globbing to work across repositories, only the ability to not stop at BUILD.bazel (probably also WORKSPACE.bazel would be good, but BUILD.bazel is the primary need) subdir boundaries within a single repository. Sorry for the confusion I created from that inaccuracy. I'll edit to improve.
lberki@ is partially correct that we want to filter out existing files that control Bazel from code we vendored so that we can provide our own, but we also have ChromeOS-maintained code that might include Bazel build files so it's not all vendored third-party code. We're generating our own Bazel wrappers around each package and then we invoke the existing package-level build (which might or might not eventually call a child Bazel invocation), and we don't want to consider the BUILD files that might be present in that package-level code when applying a glob() over the code.
brandjon@'s idea about the ability to specify patterns for which locations should have their build files considered (or not) for globbing boundaries seems more in line with what I think would work best. Partitioning our "outer" BUILD files from our "inner" BUILD files by filename might work, but I think it would be awkward to control this based on file naming.
I hear you! Globbing across packages (and workspaces) seems to be something a lot of people want. Technically, it's easy to cook up cross-package globbing using a macro, glob() and subpackages(), but that requires changing the intermediate BUILD.bazel files, which, if I understand correctly, is not on the table this time.
I have been toying with the idea of making the name of BUILD.bazel files configurable for a good while; it'd be a big change, but it would be a pretty thorough solution to the problems of vendoring projects that have their own BUILD.bazel files but which don't quite work in the context of the outer project.
The best current workaround is to create a symlink tree with the undesired BUILD.bazel / WORKSPACE.bazel files edited out; from what I hear, that's what you are already doing? (that's what Android is doing, too)
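For readers unfamiliar with the symlink-tree workaround, a minimal sketch of the idea in Python follows. It is not the tooling ChromeOS or Android actually uses; build_symlink_tree and CONTROL_FILE_NAMES are hypothetical names, and the set of skipped file names is an assumption.

```python
import os
import tempfile

# Assumption: these are the Bazel control files to leave out of the shadow tree.
CONTROL_FILE_NAMES = {"BUILD", "BUILD.bazel", "WORKSPACE", "WORKSPACE.bazel"}


def build_symlink_tree(src_root, dst_root, skip_names=CONTROL_FILE_NAMES):
    """Mirror src_root into dst_root with symlinks, omitting control files.

    Directories are created for real; every other file becomes a symlink to
    its original. Returns the sorted relative paths that were linked.
    dst_root is expected to live outside src_root.
    """
    linked = []
    for root, _dirs, files in os.walk(src_root):
        rel_dir = os.path.relpath(root, src_root)
        out_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(out_dir, exist_ok=True)
        for name in files:
            if name in skip_names:
                continue  # leave the vendored BUILD/WORKSPACE file behind
            target = os.path.abspath(os.path.join(root, name))
            os.symlink(target, os.path.join(out_dir, name))
            rel = os.path.normpath(os.path.join(rel_dir, name))
            linked.append(rel.replace(os.sep, "/"))
    return sorted(linked)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "pkg"))
        for name in ("BUILD.bazel", "main.c"):
            open(os.path.join(src, "pkg", name), "w").close()
        print(build_symlink_tree(src, os.path.join(tmp, "shadow")))
```

The sketch walks the whole source tree and creates one symlink per file, which is the extra I/O this approach costs.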
cc @brandjon and @comius
re: UTF-8, what is not supported by Bazel? I do realize that we embarrassingly don't support a number of characters in file names, but those are <0x80. UTF-8 should work.
Yep, the symlink tree approach is exactly how we're currently handling this. It works, but involves I/O it would be nice to avoid.
Re: UTF-8, my understanding is that we have a small number (O(tens to hundreds)) of files in third-party packages we control that have characters that aren't valid in Bazel names, so we create a shadow hierarchy with escaped names for any non-valid characters. At one point I wrote some Bash commands to find out what the invalid characters we use actually are, and I could recreate it if you need specifics.
Can you find out which files these are? My understanding was that Bazel supports every character in file names except : (colon), \ (backslash), 0x7f (DEL) and control characters 0x00 - 0x1F inclusive.
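The rule as stated in the comment above can be written down as a small check. This Python sketch encodes only that stated understanding (colon, backslash, DEL, and control characters 0x00 through 0x1F); it is not taken from Bazel's source, and unsupported_chars is a hypothetical helper name.

```python
def unsupported_chars(path):
    """Return the sorted set of characters in path that the stated rule rejects.

    Assumption: the rejected set is ':' and '\\', DEL (0x7F), and the control
    characters 0x00-0x1F, exactly as described in the comment above.
    """
    bad = set()
    for ch in path:
        code = ord(ch)
        if ch in ":\\" or code == 0x7F or code <= 0x1F:
            bad.add(ch)
    return sorted(bad)


if __name__ == "__main__":
    for candidate in ("docs/readme.txt", "a/b:c\\d.txt", "emoji/🌐.png"):
        print(repr(candidate), unsupported_chars(candidate))
```

Under this rule, non-ASCII characters such as emoji or accented letters pass, and only the colon and backslash in the second path are flagged.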
Here are the non-[0-9a-zA-Z] characters we have in our file paths:
!%&'()*+,-./:;=?@[\]_~🌐😀$áíőú
The input includes directory names (not just filenames) as well as the '/' path separator.
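A survey like the one above can be reproduced along these lines. This is a Python sketch and not the original Bash commands, which are not shown in this thread; paths_under and non_alnum_chars are hypothetical names.

```python
import os
import re

# Everything outside [0-9a-zA-Z], matching the character class used above.
NON_ALNUM = re.compile(r"[^0-9a-zA-Z]")


def paths_under(root):
    """Yield '/'-separated relative paths of every directory and file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            yield rel.replace(os.sep, "/")


def non_alnum_chars(paths):
    """Return the distinct non-alphanumeric characters in paths, sorted by code point."""
    found = set()
    for path in paths:
        found.update(NON_ALNUM.findall(path))
    return "".join(sorted(found))


if __name__ == "__main__":
    print(non_alnum_chars(paths_under(".")))
```

Because whole relative paths are scanned, directory names and the '/' separator show up in the result, as in the list above.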
Ow. I was hoping that : and \ are not there, but they are, which means that we'll have to put our brains into gear and come up with something :(
I'm also completely disappointed by the lack of poop emojis there.
@lberki I'll triage this to P4 for now but let me know if you'd like to prioritize.
As long as @tbaing has a reasonable path forward, it's fine to keep this as P4. It's mighty embarrassing that Bazel cannot handle files with colons in their names, but as long as he can live with the status quo, there is no pressure to do something about this right now.
In the longer term, what I expect is that once "proper" Unicode support lands, we'll be able to start thinking about how to represent files with colons etc. in their names; @haxorz had a vague plan, but I didn't spend too much time evaluating it because, IIRC, it depended on proper Unicode support.
Then there will still be the problem of ignoring existing BUILD.bazel files, and that's orthogonal to the question of supporting colons, so let's tackle them separately?
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.
Description of the feature request:
Please allow globs to operate across directories that contain Bazel BUILD files, rather than stopping at a BUILD file. In our specific context, we would like to use globs to include files in directories that contain Bazel BUILD files, and stopping at a BUILD file means we can't do that.
What underlying problem are you trying to solve with this feature?
ChromeOS currently creates a shadow hierarchy of symlinks with altered names to work around the fact that there are several categories of files that aren't currently valid in the source tree that Bazel is working over. These cases are:
We'd like to avoid the need for this shadow hierarchy, which will require several changes that are probably best captured independently since each one is self-contained in its function/implementation.
Which operating system are you running Bazel on?
gLinux
What is the output of bazel info release?
release 6.0.0-pre.20221012.2
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response