Open csitarichie opened 1 year ago
As to the change in environment variables being confusing, agreed, I think we hope to add a warning for that at some point.
- There is no auto discovery - the expectation is that you set up your toolchain so it works. We ship toolchains that pull things off the PATH, but it isn't meant to auto-discover, and you may want to specify the toolchain more precisely.
Unfortunately, the problem is a bit more complicated, since my setup without `lld` is a valid setup. The default toolchain that comes with buck2 expects the toolchain on the PATH to include `lld`, which is optional. It is a hardcoded requirement that comes from the default toolchain, not from the application.
In my opinion, the correct behavior would be:
1.) If the default C++ toolchain expects lld to be installed and configured, it probes for it and provides a proper error message, ideally with a suggestion of what to run to solve the problem, rather than failing (at the end) with a cause of a cause.
or alternatively (this is how Linux `make configure` or CMake probing works):
2.) The default C++ toolchain probes for compiler/linker versions/options/types and sets up default options accordingly. The application under BUILD might specify certain requirements in the buck2 rule, like the C++ std version or linking with lld. In that case the build might exit with a proper error message saying that the toolchain supplied from the PATH does not meet the application's minimum requirements. But if a common subset of options is possible, the application compiles.
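To make option 1 concrete, here is a minimal sketch in Python (not Buck2 or prelude code) of a fail-fast check: it only verifies that the tools a toolchain names are present and reports every missing one in a single readable error. It does not choose between tools, so it is not auto-discovery. The names `check_required_tools` and `INSTALL_HINTS`, and the hint text, are hypothetical.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional

# Hypothetical install hints; not taken from Buck2 or its prelude.
INSTALL_HINTS = {
    "ld.lld": "install lld and make sure ld.lld is on PATH",
}


def check_required_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, str]:
    """Return {tool: resolved path}, or raise one error listing every missing tool."""
    found: Dict[str, str] = {}
    missing = []
    for tool in tools:
        path = which(tool)  # None when the tool is not on PATH
        if path is None:
            missing.append(tool)
        else:
            found[tool] = path
    if missing:
        lines = ["toolchain requirements not met:"]
        for tool in missing:
            hint = INSTALL_HINTS.get(tool, "install it and make sure it is on PATH")
            lines.append(f"  - {tool} not found: {hint}")
        raise RuntimeError("\n".join(lines))
    return found
```

Calling `check_required_tools(["clang", "ld.lld"])` before any compile step would surface the missing linker up front instead of as a nested failure at link time. The `which` parameter exists only so the lookup can be replaced in tests.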
- Buck2 is not hermetic unless you use remote execution. We are potentially going to address that by having a local remote execution, see Idea for a local execution and cache #105.
This one is critical. I am considering buck2 / bazel precisely because I hate non-hermetic builds. Today I work around it by delivering docker images to execute, build and test my application, but that is not native to the build environment. I would like this to be the default behavior in buck2, with a set of toolchain rules that provide hermetic toolchains, so that the user can just pick one that matches the application, and the application can restrict the default hermetic toolchains.
Going even further, it would be nice for toolchain providers to have an infrastructure where they can create an easy declarative configuration for their toolchain in buck2. That way we could build an ecosystem in which SDKs, compilers, and code generators are provided as toolchains. Buck2 would abstract the execution environment away, so if I provide my toolchain rules they would work on local machines and on a remote cloud, potentially on all major platforms, as long as native binaries are provided.
- As to the change in environment variables being confusing, agreed, I think we hope to add a warning for that at some point.
Yes, that would be a nice-to-have feature; it would save beginners like me some time :)
IMO, probing for compiler and linker flags the same way autotools does is absolutely a footgun for Buck. It immediately makes results between two users non-hermetic, even with a local RE cache. Imagine you only have `ld.bfd` while someone else has `ld.lld` installed, and Buck "helpfully" discovers `lld` and uses it by default instead. Then you already have a cache miss and a hermeticity violation, because the action graph changes based on the ambient environment! Now, if the second person installed `ld.lld`, then maybe they would get a cache hit the next time they `buck build` (because the graph would change, resulting in some recompilation and possibly a hit) and everything would return to normal. But this behavior is not immediately apparent.
Of course, you could say that this is just a natural extension of the existing problem, because User A and User B could be using clang-14 and clang-15 respectively, both called `clang` in the ambient environment in `$PATH`, so you'd have a cache miss anyway once `buck2` tries to hash the results and do a cache lookup. It's true that the default toolchain examples materializing things from "thin air", AKA `$PATH`, isn't ideal. But I strongly believe the solution is not to exacerbate the problem with auto-discovery.
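As a toy illustration of the clang-14 vs clang-15 point (a sketch in Python, not Buck2's actual digest scheme): assume a cache key that covers both the command line and the bytes of the tool that runs it. The function `action_key` and the stand-in byte strings are made up for the example.

```python
import hashlib
from typing import List


def action_key(argv: List[str], tool_contents: bytes) -> str:
    """Digest over the command line and the contents of the tool that runs it."""
    h = hashlib.sha256()
    for arg in argv:
        h.update(arg.encode())
        h.update(b"\0")  # separator so ["ab", "c"] and ["a", "bc"] hash differently
    h.update(hashlib.sha256(tool_contents).digest())
    return h.hexdigest()


# Both users type the same command; only the binary behind "clang" differs.
argv = ["clang", "-c", "main.c", "-o", "main.o"]
user_a = action_key(argv, b"stand-in bytes for clang 14")
user_b = action_key(argv, b"stand-in bytes for clang 15")
user_a_again = action_key(argv, b"stand-in bytes for clang 14")
```

Under that assumption `user_a` and `user_b` differ even though the command lines are identical, while `user_a_again` matches `user_a`. Sharing a cache only works when everyone resolves `clang` to the same bytes, which is what a pinned toolchain gives you.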
The solution is instead to specify your toolchain `.bzl` files precisely, so that they call `clang-15`, or `/usr/bin/clang-15`, or `/absolute/path/to/llvm/tarball/that/i/downloaded/from/llvm/dot/org/bin/clang-15`, and also do things like mandate which linker to use. There should be nothing stopping you, for example, from using something like `http_archive` in Starlark to download toolchains as a target, and then making the toolchains use that downloaded file. I think this should be possible?
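Setting aside how `http_archive` itself is implemented, the core of "pinning" a downloaded toolchain is checking the archive against a digest committed next to the source. A minimal Python sketch of that check, assuming the archive is already on disk; `verify_pinned_archive` is a hypothetical helper, not part of Buck2:

```python
import hashlib
from pathlib import Path


def verify_pinned_archive(path: Path, expected_sha256: str) -> Path:
    """Return path if the file's SHA-256 matches the pinned digest, else raise."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large toolchain tarballs need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    actual = h.hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"{path}: expected sha256 {expected_sha256}, got {actual}")
    return path
```

With a check like this, two developers either end up with byte-identical toolchain archives or get an immediate error, instead of silently building with whatever happens to be installed.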
This crosses paths with your "my setup without `lld` is valid." I know what you mean, but if the CXX toolchain for project `foobar` specifies that it's necessary, then it sort of isn't valid, because by definition it doesn't have the tools needed to build the project. The toolchain specification is committed alongside the source code as a record of what tools to use. Particular projects will have particular toolchain specifications, and they may really need them for a reason. I know what you mean: this particular example doesn't strictly require lld. That should be fixed. But beyond that, the specification of the toolchain for a project might say "use lld", and that requires you to install `lld`, by definition. By analogy: a Linux system without `gcc` installed is also "valid" by many definitions, and useful, but certainly not useful for "being able to run `gcc` to compile code."
I do agree that if the default toolchain specification can't be satisfied (e.g. "I want to use `lld` but it isn't found") there should be real error messages, though. I also strongly agree that supporting non-lld linkers and non-clang compilers would be great and should be a feature if it doesn't work already. But it should never auto-discover these things. That's an attempt to make Buck into something it isn't, I think, and ultimately just a hack to emulate existing systems (waf, scons, cmake) in order to make the initial impression slightly more familiar. That's the wrong path, I think.
So zooming out a bit, I feel there are some real pain points hidden here:
1) Publicly available binary toolchain packages with a known hash (i.e. all developers use the exact same copy no matter what) that Buck can immediately leverage don't exist, and
2) It isn't immediately apparent how to use your own toolchain binaries, if you built them. Like, let's say you have a build of GCC you want people to use: how can you make `buck` download that and use it? And,
3) The existing example toolchains seem to endorse a non-hermetic structure, which seems confusing and the antithesis of buck's goal!
There is a related, golden rule that governs point 3, as well:
So, first. I wonder if just adding an example file where `toolchains//:cxx` was provided by an `http_archive` download of a pre-built binary would work? This is basically how the Go toolchain in `examples/no_prelude` works by default. I think an example that downloaded a toolchain and used it would be really valuable for showing people how it's intended to work. Shouldn't this be possible already with a little elbow grease? You could use a binary download from llvm.org as an example.
Second: in light of the above, I think #105 is possibly a critical feature to have. Users, and really anybody, really want the build tools to immediately provide a "uniform" surface area. It's weird that you would need to be using RE to find out you were depending on `ld.lld` being installed in the ambient environment, for instance, when it might not be by default. If the local execution cache was implemented and on by default, then it would act as a forcing function to get all of this correct. You wouldn't be able to make such simple mistakes. Arguably, the example toolchain files would then always work OOTB! This is a clear example where they failed.
Finally: it seems inevitable that people will start providing binary toolchains for popular languages, one way or another. Their rules will then require this, so a rule ecosystem for buck2 will also lead to a toolchain ecosystem. It's mostly a matter of who is responsible for them, what quality they are, and whether they're "official". This is part of the reason I started buck2-nix (you just get every toolchain from Nixpkgs and use that, problem solved), but it isn't generally viable for end users, I think. So perhaps the Meta team should think about this: assuming you also provided toolchains for end users, what might that look like in the FOSS world? It's a hard question with no immediate answer, and it would take time to plan out, but it might be worth thinking about.
buck2-nix
I believe this repo. I generally think that when people provide SDKs for a certain product or piece of hardware, for example the Raspberry Pi Pico SDK, they could do it with buck2 and buck2-nix to be cutting edge and avoid all the problems that come from local installs and CMake. (As a home project I am trying to convert this SDK to buck2-nix and build it with buck2 instead of CMake; let's see how far I get. I am doing it as FOSS.) I think having toolchains use a local installation is very thin ice and sets the wrong example for users; it should be discouraged.
Thank you for your comments.
Be careful with buck2-nix. It's very experimental and might bite you, and definitely not any kind of officially endorsed project; you'll need to talk to me if you have questions (unless you're also a relative expert in all the relevant things.) That said -- you should also be able to combine Nix with the existing prelude, too! Or just write completely custom rules for things. There's a bunch of options, but I think it's powerful.
I've just started to play with Buck2 on macOS. I have clang installed from Xcode, but I did not install `llvm` / `lld` with e.g. `brew install llvm` and did not set it up in zsh.
If I run with the setup above, building the hello world example fails like this:
After installing llvm and setting up the linker and PATH, and more importantly restarting buck2 with `buck2 kill` (it took some time to figure out that changes in the environment take effect only after a restart of the buck daemon), it started to work like this:
I have 2 general questions:
If the default toolchain is doing some kind of auto-discovery, why is there no check that the default settings work (e.g. CMake's compiler diagnostics)?
To achieve hermetic builds, I was expecting that buck2 would install a toolchain matching my default platform in a sandbox. Are there examples of how to create hermetic builds with sandboxes where the compiler toolchain is populated during the build?