Roadmap for local hermetic builds

zjturner commented 1 year ago

If I have a target:

cxx_library(
    name="foo",
    srcs=["foo.cpp"],
    include_directories=["include"]
)

And foo.cpp looks like this:

#include "foo.h"

this will work on a local build, despite not having specified a headers attribute.

I understand this is because builds are only sandboxed remotely, but I’m curious if there are any concrete plans or roadmaps to make this work locally, out of the box, without standing up any kind of build farm (even over localhost).

I actually thought I had heard comments to the effect of buck2 would use symlinks to establish a local sandbox, but in practice buck2 doesn’t seem to be doing this.

Would appreciate any insight or future plans around this area.

steveklabnik commented 1 year ago

I've heard someone (@stepancheg, maybe?) say that local builds aren't hermetic yet, but they plan to be eventually.

In the meantime, you can do a "local remote" build, where you run the "remote" build locally.

Some community members (myself included) have said that we really like this feature, because it helps ease people into things when onboarding, so we'll see how the final feature is designed and shipped.

zjturner commented 1 year ago

> In the meantime, you can do a "local remote" build, where you run the "remote" build locally.

I do plan to try this, I just haven't taken the time yet to look for something that is well supported on Windows, which is my primary platform at the moment. Can you share what you're using for this? I know there are various options around, but if you have a recommendation that goes a long way :)

thoughtpolice commented 1 year ago

See https://github.com/facebook/buck2/issues/105#issuecomment-1509307503, as well as #225 and #297. There's been quite a bit of talk about this. In short, local execution is basically a special case of remote execution, and so the hope to make local builds hermetic is just to reuse the remote path against a server on localhost, leaving the code path for local execution as it is. This will result in a spectrum of options.

Whenever I get time to implement reapi-server correctly, the hope is that buck2d can just reuse the code entirely as a Rust dependency; then buck2d will simply offer up the needed gRPC endpoint to itself, and it will all "just work." But nobody has committed to it yet on the Meta team, since they already have a solution, and I'm the only person I'm aware of who has talked about something like reapi-server.

Currently I think symlink farms are used for some parts of the build, but only on headers actually. But it's only one part of the overall attack plan. The symlinking solution would probably be how you make things work on Windows for example, but on Linux you could use cgroups and on macOS there's another thing.

Unfortunately, this requires a full implementation of all the major RE APIs including the CAS, ActionCache, and Execution services. That's a bit of work to plan out. I'd have to do something like fork turbocache if I wanted to avoid rewriting the first two from scratch (every other implementation is written in Go), but it's unclear that all of the design decisions for a local exe+action cache make sense for a big multi-user one (for example, you can just not support anything but Linux, but reapi-server needs to support everything). And this needs to be "click one button" level of ease of use, so using third party servers isn't ideal IMO.
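To make the three services concrete, here is a toy in-memory sketch of how a CAS and an action cache fit together. It is not the real RE API (which is gRPC/protobuf based); the class names, `action_digest`, `run_cached`, and the `execute` callback standing in for the Execution service are all assumptions made for illustration.

```python
import hashlib
import json


class ContentAddressableStore:
    """Toy CAS: blobs are stored and fetched by the SHA-256 of their bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest):
        return self._blobs[digest]


class ActionCache:
    """Toy action cache: maps an action digest to the digest of its output."""

    def __init__(self):
        self._results = {}

    def lookup(self, action_digest):
        return self._results.get(action_digest)

    def store(self, action_digest, output_digest):
        self._results[action_digest] = output_digest


def action_digest(argv, input_digests):
    # Hypothetical key format: the command line plus the digests of its inputs.
    payload = json.dumps({"argv": argv, "inputs": input_digests}).encode()
    return hashlib.sha256(payload).hexdigest()


def run_cached(cas, cache, argv, inputs, execute):
    """Return (output, was_cache_hit); `execute` stands in for the Execution service."""
    input_digests = [cas.put(blob) for blob in inputs]
    key = action_digest(argv, input_digests)
    hit = cache.lookup(key)
    if hit is not None:
        return cas.get(hit), True
    output = execute(argv, [cas.get(d) for d in input_digests])
    cache.store(key, cas.put(output))
    return output, False


def concat(argv, blobs):
    # Stand-in "action": concatenate the input blobs.
    return b"".join(blobs)


cas, cache = ContentAddressableStore(), ActionCache()
first = run_cached(cas, cache, ["cat"], [b"a", b"b"], concat)
second = run_cached(cas, cache, ["cat"], [b"a", b"b"], concat)
```

The second call never invokes `execute`: the same command and inputs hash to the same key, so the output comes straight from the cache.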

Naturally I have a bunch of other things I'm dealing with now, but I do have a project using reindeer and buck2-prelude, so a hermetic solution for my own builds is a higher priority now (I guess). I have some thoughts about a nice, space-efficient design that can work for Linux at least. But unless someone actually cajoles me into it I probably won't spend time on it soon. That said, all I would need to do is fill out all the unimplemented!() bits.

Regarding the non-hermetic local mode, my opinion is split between "It's OK" and "It's an abomination." (But my opinion doesn't matter much, so don't fear, it's probably not going anywhere.) This is based on my experience with Nix, where the single-user non-sandbox mode (a comparable feature) is almost constantly broken, nearly to the point of uselessness, and is a source of massive issues for newcomers and the inexperienced.

The TL;DR is that if most people use sandboxing, but you don't, you basically can't guarantee anything works ever when you try to build it — it just seems OK now but, inevitably, once people test with sandboxes, and write Buck rules that are only tested in sandboxes, they never test outside of sandboxes. So things that are naturally "purified" by the sandbox can leak through easily. For example if you execute in a sandbox, there can't be a PYTHONPATH set with incompatible libraries that might not work in the interpreter. So you write the buck rules with, say, .py scripts to go along that do things (like rustc_action.py in buck2-prelude) and they are only tested in places without PYTHONPATH available (remote execution). Now a user has a fancy PYTHONPATH setup because they're a "Python Packaging Master", they execute without the sandbox locally, and somehow the .py script then picks up their custom PYTHONPATH and could load another library or load something else implicitly. That's very bad.

So over time as people write more and more reusable Buck2 code, if that code is predominantly tested inside sandboxes, it will probably find ways to break in a user's environment without sandboxing, because it can't possibly think up every necessary defense. So it might as well not exist at that point since you are inflicting vast pain on them. The most confusing example is cache misses, which can sometimes be nearly totally inexplicable, but it can also result in execution errors. Think of it as a whitelist versus a blacklist. It's very hard to blacklist and immunize your build against every customized workstation setup for every user. It's very easy to whitelist your build to work in an exact environment and then have everyone reproduce that.
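The whitelist-versus-blacklist point can be sketched with environment variables. This is only an illustration: the `ALLOWED` and `BLOCKED` sets, the helper names, and the `MY_TOOL_PLUGINS` variable are made up, and nothing here reflects what Buck2 actually does.

```python
import os
import subprocess
import sys

# Hypothetical lists, for illustration only.
BLOCKED = {"PYTHONPATH"}
ALLOWED = {"PATH", "HOME", "TMPDIR", "SYSTEMROOT"}


def blocklist_env(env):
    """Drop the variables someone remembered to ban; everything else leaks through."""
    return {k: v for k, v in env.items() if k not in BLOCKED}


def allowlist_env(env):
    """Keep only the variables the build is known to need."""
    return {k: v for k, v in env.items() if k in ALLOWED}


def visible_vars(env, names):
    """Ask a child interpreter which of `names` it can see in its environment."""
    probe = "import os, sys; print(','.join(n for n in sys.argv[1:] if n in os.environ))"
    done = subprocess.run(
        [sys.executable, "-c", probe, *names],
        env=env, capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()


# A customized workstation: one variable the rule author anticipated, one they did not.
workstation = dict(
    os.environ,
    PYTHONPATH="/home/user/custom-libs",
    MY_TOOL_PLUGINS="/home/user/plugins",
)
names = ["PYTHONPATH", "MY_TOOL_PLUGINS"]

inherited = visible_vars(workstation, names)
blocked = visible_vars(blocklist_env(workstation), names)
allowed = visible_vars(allowlist_env(workstation), names)
```

With the full environment the child sees both variables, the blocklist still leaks the one nobody thought of, and the allowlist leaks neither.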

But the problem, very simply, is that packaging pre-built language toolchains for N platforms that work well with the prelude rules is a lot of work, and it's understandably something that nobody (Meta included) has committed to yet. It basically requires absolute experts in the language and ecosystem in question to commit to supporting it. Or you need to be smart and reuse the work from elsewhere (e.g. Nix.) Therefore setting up hermetic toolchains is difficult and time-consuming. If we had a local hermetic sandbox, and a lot of high-quality toolchains, well, it wouldn't be nearly as big of a problem. But that's not the world we live in.

And honestly few prospective users tend to care about anything I just wrote until much later on, they mainly care about how fast things are and feature support and expect everything else to be someone else's problem. The good news though is that it's probably easier to make your existing Buck2 builds hermetic than it is to port your project to Buck2 and make it hermetic at the same time, once you're crippled from technical debt. The best place to start with Buck2, paradoxically, is very early on. So, if the un-sandboxed mode helps people adopt it earlier rather than later — it's probably still a net win. And we can still have a "cache-only" mode for reapi-server in that case, so even unsandboxed builds will get to have ccache on steroids if they choose.

krallin commented 1 year ago

> In short, local execution is basically a special case of remote execution, and so the hope to make local builds hermetic is just to reuse the remote path against a server on localhost, leaving the code path for local execution as it is.

If we were to do this (and I think we should and probably will — this is definitely causing some problems for us internally as well, to be clear), I am not certain that this is how we would go about it.

This kind of approach is conceptually elegant, but I'm afraid it would probably not be practical to implement it as described here. Here's why:

First, there's going to be substantial overhead to sending local inputs over GRPC to a "local" RE implementation, and considering we often use local execution when things are a bit too large to reasonably send over GRPC, that's likely to be a problem.

Second, Buck2 does not currently support talking to 2 different RE backends. There are a lot of assumptions baked throughout the codebase that there is only one RE backend. We support multiple platforms, and we could even have multiple backends for execution or action cache (not that I'd expect that to be useful), but there can only be one CAS.

So, concretely, if we were to implement local sandboxing by having our local executor actually be RE, then that would prevent you from using local sandboxed execution and a RE backend together. That's arguably not desirable for most users but it also means it would definitely not be something we can use at Meta (whereas I think it is best if the community is using the same features we are: that guarantees better support).

FWIW, I think the changes that'd be needed in our local executor to support sandboxing are actually fairly minimal. We'd have to allocate a temporary directory, then:

For inputs in buck-out (those tend to be largest), hardlink them
For inputs not in buck-out (those tend to be smaller), copy them (for OSS users, hardlinking could work, but at Meta those two things are on different file systems).
Then run from said temporary directory

There is definitely some overhead to doing this, especially for actions with large symlink trees, but it's going to be strictly less than going over a RE backend.
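The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not Buck2's executor: `stage_inputs`, `run_sandboxed`, and the assumption that inputs are given as paths relative to the project root are all hypothetical.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path


def stage_inputs(project_root, inputs, sandbox_root, out_dir_name="buck-out"):
    """Materialize only the declared inputs into sandbox_root, keeping relative paths."""
    for rel in inputs:
        src = Path(project_root) / rel
        dst = Path(sandbox_root) / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if Path(rel).parts[0] == out_dir_name:
            # Inputs under buck-out tend to be large: hardlink them.
            try:
                os.link(src, dst)
            except OSError:
                # Assumed fallback, e.g. when the sandbox is on another file system.
                shutil.copy2(src, dst)
        else:
            # Source inputs tend to be small: copy them.
            shutil.copy2(src, dst)


def run_sandboxed(project_root, inputs, argv):
    """Allocate a temporary directory, stage inputs into it, and run argv from there."""
    with tempfile.TemporaryDirectory() as sandbox:
        stage_inputs(project_root, inputs, sandbox)
        done = subprocess.run(
            argv, cwd=sandbox, capture_output=True, text=True, check=True
        )
        return done.stdout
```

An action run this way sees the declared files under its working directory and nothing else from the project, although it can still reach undeclared files through absolute paths, so this is input isolation rather than a security boundary.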

With regard to "should sandboxing be the default or not", I think there's space to provide this as an option here.

I think if you buck2 init we should probably turn it on by default by injecting it into your config, but turning it off should be a matter of just switching up said config.

As a side note, with regard to caching, we do have support for caching local actions.

I think it's a bit unfortunate that right now enabling it requires also enabling deferred materialization, which is hard to recommend as a default for open source users who may not have the kind of always-connected environment we rely on.

That said, the reasons for requiring deferred materialization there aren't good reasons. That's just happening because we currently tie state keeping and deferred materialization together, but that's not very hard to change.

yeswalrus commented 11 months ago

FWIW: In my experience, having the ability to opt in to non-sandboxed mode for very specific use cases is sometimes super useful, especially when trying to migrate legacy systems. I've done a few Bazel migrations, and one of the most challenging things has been figuring out how to deal with external tools that expect the build to place files in certain places: compile_commands.json in Bazel is just one example, a BUILD folder with a particular layout is another. Arguably neither of these is good practice, and I would have them print warning messages every time or something like that, but having the ability in the toolkit would potentially be really helpful in the migration process. I think of it a little like allowing unsafe {}. You wanna minimize it, but sometimes it's really what you need.

Timmmm commented 11 months ago

Any plans to use Landlock on Linux? That seems like a better solution than symlinks.

ndmitchell commented 11 months ago

Landlock does seem plausible - I imagine we'd have a trait representing the isolation mechanism, since Bazel has a number of options to choose from with various trade offs. Landlock might plausibly be one of those options.

burdiyan commented 7 months ago

This is a major pain point preventing me from getting into Buck2. I've been using Bazel for many years, lately mostly like a glorified Make, doing pretty coarse grained targets, mainly just taking advantage of the inputs isolation and explicit dependency graph. No remote execution, no remote caching. I really like features Buck2 introduces for "dynamism" in the build, but I do need the isolation, at least for the input files (don't care that much about network sandboxing and so on).

Hope Buck2 will soon be able to catch up with Bazel on this.

cormacrelf commented 7 months ago

This is where local materialize deps + output dir creation + execution happens, if anyone wants to give it a crack. A reasonable first step would be to define a trait and make LocalExecutor either generic over it, or accept a boxed trait object. Then start working on describing everything Landlock would need as a trait and some types.

https://github.com/facebook/buck2/blob/4939e906b3cd4f9abebc5e45680b01ba2c3c9c83/app/buck2_execute_impl/src/executors/local.rs#L210-L537
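The shape of that refactor can be sketched as below. It is written in Python rather than Rust, with an abstract base class standing in for the trait; `IsolationStrategy`, `NoIsolation`, `SymlinkForest`, and this `LocalExecutor` are hypothetical names and do not mirror the code in local.rs.

```python
import subprocess
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class IsolationStrategy(ABC):
    """Stand-in for the trait: decides what the action can see and where it runs."""

    @abstractmethod
    def prepare(self, project_root, inputs, scratch):
        """Set up the execution directory and return it."""


class NoIsolation(IsolationStrategy):
    """Today's behaviour in this sketch: run directly in the project root."""

    def prepare(self, project_root, inputs, scratch):
        return project_root


class SymlinkForest(IsolationStrategy):
    """Expose only declared inputs via symlinks (Unix-oriented sketch)."""

    def prepare(self, project_root, inputs, scratch):
        for rel in inputs:
            link = scratch / rel
            link.parent.mkdir(parents=True, exist_ok=True)
            link.symlink_to((project_root / rel).resolve())
        return scratch


class LocalExecutor:
    """Executor parameterized over the isolation mechanism."""

    def __init__(self, strategy):
        self.strategy = strategy

    def run(self, project_root, inputs, argv):
        with tempfile.TemporaryDirectory() as scratch:
            cwd = self.strategy.prepare(Path(project_root), inputs, Path(scratch))
            done = subprocess.run(
                argv, cwd=cwd, capture_output=True, text=True, check=True
            )
            return done.stdout
```

A Landlock-backed strategy would be one more implementation of the same interface, which is why settling the interface first seems like a reasonable first step.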

Edit, some notes:

there is a landlock crate that looks pretty good, and includes compatibility detection. Doesn't include the headers for the somewhat experimental (?) network sandboxing that landlock can do. (I just noticed: it is written by Mickaël Salaün, aka the author of Landlock.)
the Bazel docs on sandboxing are an excellent overview of what the various sandboxing technologies are capable of. Notably the linux one using PID namespaces to be able to kill daemons would be nice.
Jart's landlock-make is programmed against the OpenBSD pledge API (which cosmopolitan libc offers), and that may be a useful starting point to make a sandbox trait. Landlock itself is rather fine grained, more fine grained than we would need I think.
For configuring sandboxes, Bazel's approach doesn't allow you to declare the capabilities you rely on and have it fail if it can't select a sandbox to satisfy them. It just lets you limit the executor to picking from a smaller list of sandboxes.
If we did allow declaring sandbox capabilities, then I'm not sure where they would be declared. Rules? Toolchains? Executors? Buckconfig?
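One way to picture "declare capabilities and fail if nothing satisfies them" is the sketch below. The sandbox names and capability labels are invented for illustration and do not describe what any real sandbox provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sandbox:
    name: str
    capabilities: frozenset


class NoSuitableSandbox(Exception):
    """Raised when no available sandbox provides every required capability."""


def select_sandbox(available, required):
    """Return the first sandbox whose capabilities cover `required`, else fail."""
    required = frozenset(required)
    for sandbox in available:
        if required <= sandbox.capabilities:
            return sandbox
    raise NoSuitableSandbox(f"no sandbox provides {sorted(required)}")


# Hypothetical sandboxes, listed in order of preference (cheapest first).
AVAILABLE = [
    Sandbox("symlink-forest", frozenset({"input-isolation"})),
    Sandbox("pid-namespace", frozenset({"input-isolation", "kill-daemons"})),
]

cheap = select_sandbox(AVAILABLE, {"input-isolation"})
strict = select_sandbox(AVAILABLE, {"input-isolation", "kill-daemons"})
```

Here a rule asking for `{"network-isolation"}` would raise `NoSuitableSandbox` instead of silently running with less isolation than it declared.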

ndmitchell commented 7 months ago

The Bazel folks seem to have spent a lot of time doing sandboxing implementations - ideally we just get a Rust library that supplies everything and is cross-platform and we can keep most of the complexity outside Buck2, and have it reused elsewhere in the Rust ecosystem. We're not keen to make a big ongoing investment in local hermetic builds, ideally it would be shared with others and we are just a consumer.

As to where it gets declared, you really need some rules to use the sandbox, and some to escape it. Maybe we need ctx.actions.run to have sandbox = False to declare things incompatible with the sandbox? If you did that, then maybe Buckconfig is the right place to opt in? Or maybe the execution platform? I could see arguments for both.

burdiyan commented 7 months ago

One sandbox strategy that Bazel has which is very poorly documented (but it's actually my favorite 😁) is processwrapper-sandbox, which basically creates a symlink forest for the inputs, and makes it available to the build actions. No other fancy sandboxing is done in this strategy. Doesn't work on Windows though, because Bazel keeps believing that symlinks are hard on Windows, which is becoming less and less true over time.

Without knowing anything about the internals of Buck2, I'd assume a sandbox strategy like this would be the easiest to implement.

zjturner commented 7 months ago

buck2 is already symlinking individual headers into the buck-private-headers folder and then adding the include paths for the buck-private-headers folder to each target. But it's also adding the original directories you specify into the include paths. It seems like if it just didn't do that, local builds would be much more hermetic, because it's already doing a lot of the other work.
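A simplified model of include lookup shows why the extra search path matters. This ignores real compiler details (such as looking next to the including file first), and `resolve_include` and `demo` are hypothetical helpers.

```python
import tempfile
from pathlib import Path


def resolve_include(header, search_dirs):
    """Return the first match for `header` across the include search path, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / header
        if candidate.is_file():
            return candidate
    return None


def demo():
    with tempfile.TemporaryDirectory() as root:
        include_dir = Path(root) / "include"
        private_dir = Path(root) / "buck-private-headers"
        include_dir.mkdir()
        private_dir.mkdir()
        (include_dir / "declared.h").write_text("")
        (include_dir / "undeclared.h").write_text("")
        # Stands in for the symlink created for a header listed on the target.
        (private_dir / "declared.h").write_text("")

        # Both directories on the search path: the undeclared header still resolves.
        leaks = resolve_include("undeclared.h", [private_dir, include_dir]) is not None
        # Only the private-headers directory: the undeclared header is not found.
        hermetic = resolve_include("undeclared.h", [private_dir]) is None
        return leaks, hermetic


leaks, hermetic = demo()
```

In this model, dropping the original directory from the search path is what turns a missing headers declaration into a build error.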

burdiyan commented 7 months ago

Is that only for C/C++ then? Bazel does it for any build action and any kind of input (which always end up being some kind of a file anyway).

ndmitchell commented 7 months ago

@burdiyan - that is only for C++. And it's really being done to make sure private headers don't leak too much - so even though the action has access to the headers, it doesn't use them too much.

As for making it an execution strategy, that seems entirely reasonable. Although I would caution that on Windows while there are symlinks, they have different constraints and different rules and have upset antiviruses (including Microsoft Defender). That said, having it work only on Linux/Mac would still have value. PR welcome.

@zjturner - this seems like a reasonable thing to control on the C++ toolchain. Internally we have lots of headers that are reached by both reasonable and unreasonable means, so it's unlikely it would be compatible with our stuff. Hopefully you have better header hygiene than we do!

burdiyan commented 6 months ago

The other day I downloaded the latest buck2 release, and I'm not sure if something changed recently in Buck2 or I am missing something.

$ buck2 --version
buck2 db95957ed6423045823901fd83baab09 <build-id>

It seems like now Buck2 does do some isolation for source files. I'm not sure if it's something new, but I remember the last time I tried buck2, a long time ago, it didn't do that.

So now, I wrote a simple genrule that concatenates some text files and appends the output of ls -la to the same output file. Before, the rule would have had access to all the files in the package; now it has access only to the declared sources, which are symlinked into the execution directory.

This is exactly the level of sandboxing and hermeticity I want :) In addition to having clean wiped environment variables in actions, which still doesn't happen (all environment variables are inherited from the machine running the action).

burdiyan commented 6 months ago

OK, I think I now understand what's happening. Apparently the first time I tried to see whether Buck2 had some sort of isolation I did this with a custom action, and now I was doing it with a genrule. So indeed, a custom action doesn't seem to have any isolation by default, but a genrule does seem to symlink the sources into a separate directory (i.e. does some sort of sandboxing).

Roadmap for local hermetic builds #358