Build-time sandboxing - Githubissues

tarcieri commented 5 years ago

There's been a lot of discussion in the WG (and in the past, several pre-RFC style proposals) to add some sort of sandboxing around code executed during the build process, including things like build.rs files and procedural macros.

I wrote down my rationale for what I'd like to defend against with such a sandbox:

https://tonyarcieri.com/rust-in-2019-security-maturity-stability#sandboxing-for-code-classprettyprintbuildrsco_2

tl;dr: build-time attacks are stealthier than trojans in build targets, and permit lateral movement between projects when attacking a build system.

The devil is in the details, though. This issue is for discussing them.

Some prior art / discussion:

tarcieri commented 5 years ago

In broad strokes, I think there are two paths to pursue. They are not mutually exclusive and do not depend on each other, so anyone interested can pursue either one independently:

1. "Do No Harm" sandbox: retrofit sandboxing in such a way that it does not break any existing crates, e.g. add sandboxing which does not break crater. This severely limits what can be sandboxed, but also means we can turn it on for everyone today.
2. "Ambitious" sandbox: off-by-default sandboxing, but where we can explore stricter restrictions which would be bigger wins. These would be things like restricting network access, filesystem access, and things like seccomp policies or running all builds under gaol.

Regarding number 2, what I think would be nice is for crates to opt into a more restrictive sandbox initially, then make that sandbox the default for the next Rust edition (with the ability to opt out). Then in the next edition after that, make it mandatory.
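
To make the staged rollout above concrete, here is a minimal sketch of how the "is the strict sandbox enforced?" decision could change from edition to edition. The type and function names are hypothetical and do not correspond to any existing Cargo or rustc interface; the three stages simply mirror the opt-in, default-with-opt-out, and mandatory steps described above.

```rust
/// Rollout stage of the stricter sandbox (illustrative names only,
/// not part of any Cargo or rustc interface).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Rollout {
    /// Sandbox is off unless the crate asks for it.
    OptIn,
    /// Sandbox is on unless the crate opts out.
    DefaultOn,
    /// Sandbox is always on.
    Mandatory,
}

/// What a crate declares about the sandbox (a hypothetical manifest setting).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CrateChoice {
    Unspecified,
    OptIn,
    OptOut,
}

/// Decide whether the strict sandbox applies to a crate's build-time code.
fn sandbox_enforced(stage: Rollout, choice: CrateChoice) -> bool {
    match (stage, choice) {
        // Once mandatory, the crate's own preference no longer matters.
        (Rollout::Mandatory, _) => true,
        // Default-on: only an explicit opt-out disables it.
        (Rollout::DefaultOn, CrateChoice::OptOut) => false,
        (Rollout::DefaultOn, _) => true,
        // Opt-in: only an explicit opt-in enables it.
        (Rollout::OptIn, CrateChoice::OptIn) => true,
        (Rollout::OptIn, _) => false,
    }
}
```

The point of the sketch is only that a crate which says nothing flips from unsandboxed to sandboxed at the second stage, and that opting out stops working at the third.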

ckaran commented 5 years ago

"Ambitious" sandbox: off-by-default sandboxing, but where we can explore stricter restrictions which would be bigger wins. These would be things like restricting network access, filesystem access, and things like seccomp policies or running all builds under gaol.

I like/prefer this approach. The hard problem is determining what should/shouldn't be allowed. I propose that we create an 'underhanded rust' competition similar to the underhanded C competition. I don't think we need to go to the extent of actually writing code, at least not until someone starts writing a real sandbox, but it would be a good way to look for bugs in ideas. If we set up a wiki or other permanent area where all the ideas that are generated are gathered, then we can start figuring out what a real sandbox implementation will need (and what it will need to defend against).

tarcieri commented 5 years ago

The hard problem is determining what should/shouldn't be allowed.

For this sort of approach, I think it might be nice to start with the most restrictive sandbox that makes sense, and allow crate consumers to opt into the special permissions some crates might need to do unusual things at build time.

Here's a longer writeup of that idea: https://internals.rust-lang.org/t/build-script-capabilities/8635
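
As a rough illustration of the "most restrictive by default, consumer opts in" idea, the following sketch models build-time permissions as a default-deny grant table kept by the crate consumer. Everything here is an assumption made for illustration: the capability names, the `Grants` type, and the crate name used in the usage note are hypothetical and are not taken from the linked writeup or from Cargo.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Things a build script or proc macro might want to do at build time.
/// The set of capabilities is hypothetical.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Capability {
    Network,
    ReadOutsideCrate,
    WriteOutsideOutDir,
    SpawnProcess,
}

/// Permissions the consumer has explicitly granted, keyed by crate name.
/// Anything not listed here is denied.
#[derive(Default)]
struct Grants {
    by_crate: BTreeMap<String, BTreeSet<Capability>>,
}

impl Grants {
    /// Record that the consumer allows `krate` to use `cap`.
    fn allow(&mut self, krate: &str, cap: Capability) {
        self.by_crate
            .entry(krate.to_string())
            .or_default()
            .insert(cap);
    }

    /// Return the requested capabilities that have NOT been granted,
    /// in the order they were requested.
    fn denied(&self, krate: &str, requested: &[Capability]) -> Vec<Capability> {
        let granted = self.by_crate.get(krate);
        requested
            .iter()
            .copied()
            .filter(|cap| !granted.map_or(false, |set| set.contains(cap)))
            .collect()
    }
}
```

For example, after `grants.allow("openssl-sys", Capability::SpawnProcess)`, a request for `[SpawnProcess, Network]` from that crate would come back with only `Network` denied, while any request from a crate the consumer never mentioned is denied in full.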

Shnatsel commented 5 years ago

underhanded Rust

https://underhanded.rs

"Do no harm"

"Ambitious"

Why not both?

ckaran commented 5 years ago

@Shnatsel And that is why I should google more and talk less...

tarcieri commented 5 years ago

Why not both?

I think both are potentially valuable. To be clear this isn't necessarily an either-or decision.

tarcieri commented 5 years ago

Interestingly enough, it looks like crater now forbids network access: https://github.com/rust-lang-nursery/crater/pull/336

ckaran commented 5 years ago

FYI, there is a tracking issue that might be of interest to others reading this thread.

Shnatsel commented 5 years ago

Do we have a threat model defined? I don't think we could move forward very effectively without one.

tarcieri commented 5 years ago

@Shnatsel I wrote up a bit about that in my blog post:

Build scripts / proc macros allow attackers to move laterally between low value and high value targets
They also permit stealthy attacks, whereas trojan payloads in targets leave forensic evidence

Now, the above isn't universally true of course. Some people don't have low vs high value targets. Some people do builds in virtual machines or isolated clusters so as to avoid lateral movement. The extent to which these threats can be addressed by other means than adding a compile-time sandbox to Rust is certainly debatable.

Shnatsel commented 5 years ago

I.e. we assume that a dependency that's pulled in is malicious or compromised?

tarcieri commented 5 years ago

Yes, this all presumes an attacker has published a malicious crate and a victim has consumed it


ckaran commented 5 years ago

Would Docker or LXC be a good way to mitigate the compile-time problems? My thought is that the rust community shouldn't reinvent the wheel; if at all possible, we should reuse whatever security methods are available to us.

tarcieri commented 5 years ago

@ckaran personally I already do containerized builds using Docker. I agree that mitigates a lot of the same problems, and that Rust shouldn't reinvent Docker, but not everyone is using containerized builds, and even then I'd prefer a "belt and suspenders" approach.

ckaran commented 5 years ago

@tarcieri I agree about the 'belt and suspenders' approach, but I was thinking of what could be done to mitigate issues quickly. One method would be to make docker/LXC/something else containers be the standard; when you install rust, you get the containerized versions.

Once that is done, we've got a little breathing room and can consider what to do next properly, rather than stomping out fires if they come up.

tarcieri commented 5 years ago

I don't think containers make sense for that. It's great if they're part of an (existing) build system, but a containerization environment is an extremely heavyweight dependency to require for Rust itself.

I think something like gaol, as a lightweight, self-contained Rust library (and one which the core team is already somewhat familiar with, I'd wager) designed specifically for the purposes of sandboxing, would make more sense:

https://github.com/servo/gaol

ckaran commented 5 years ago

@tarcieri Gaol is a good idea, but as the project states on its own page "At the moment, gaol is only lightly reviewed for correctness and security. It should not be considered mature or "battle-tested". Use at your own risk." Given that, I figure that it might be better to get something that plugs the hole now, and then replace it with something better when that something (gaol, or something else) improves to the point that it can be used instead.

tarcieri commented 5 years ago

I suspect making Docker, runc, or any sort of containerization tool a mandatory dependency of Rust is going to be a nonstarter.

ckaran commented 5 years ago

OK, so how do you suggest we proceed? Use gaol (or something similar) ASAP, and fix bugs as they come up? While I accept that gaol is the better approach, my concern is one of time; it may be quite a long while before gaol is considered to be as good as the current available containerization technologies. How do we deal with any problems that crop up between now and then?

tarcieri commented 5 years ago

I think we're at the point we can attempt a prototype (perhaps an experimental out-of-tree one). I think that's something @alex expressed interest in.

I can provide a writeup of the sort of thing I'd like to see.

kpcyrd commented 5 years ago

If I recall correctly gaol requires special privileges that a regular user doesn't have. Restricting certain syscalls with seccomp is probably the best way forward.

tarcieri commented 5 years ago

gaol allows you to configure a profile regarding which kinds of sandboxing you'd like performed and also detect whether that type of sandboxing is applicable to the current platform.

I'm not sure which of its sandboxing features require elevated privileges, but if they exist, we can shut them off for non-superusers, and enable them for superusers. This is particularly helpful as it's quite common for builds to run as root. If anything, I would like to see that functionality leveraged in those scenarios.

One of the many things gaol does is configure Linux seccomp policies, so if that's what people would like to see, we could start there with a gaol profile.
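
To illustrate the "enable what the platform and privilege level allow" approach, here is a small sketch that filters a table of sandboxing mechanisms by operating system and by whether the build runs as a superuser. This is not gaol's API: the `Mechanism` type, the `usable` function, and the contents of the example table are assumptions for illustration, and which mechanisms actually need elevated privileges would have to be checked per platform.

```rust
/// One sandboxing mechanism and the conditions under which it can be used.
struct Mechanism {
    name: &'static str,
    /// Operating system name, in the form used by `std::env::consts::OS`.
    os: &'static str,
    /// Whether the mechanism is assumed to need superuser privileges.
    needs_root: bool,
}

/// Illustrative table only; the entries are assumptions, not a survey.
fn example_table() -> Vec<Mechanism> {
    vec![
        Mechanism { name: "seccomp", os: "linux", needs_root: false },
        Mechanism { name: "chroot", os: "linux", needs_root: true },
        Mechanism { name: "seatbelt", os: "macos", needs_root: false },
    ]
}

/// Names of the mechanisms that apply on `os` at the given privilege level.
/// Privileged mechanisms are skipped for non-superusers and kept for superusers.
fn usable(all: &[Mechanism], os: &str, is_root: bool) -> Vec<&'static str> {
    all.iter()
        .filter(|m| m.os == os && (is_root || !m.needs_root))
        .map(|m| m.name)
        .collect()
}
```

A caller would pass `std::env::consts::OS` as `os`. Under the table above, a non-root Linux build would get only the seccomp entry, and a root build would additionally get the privileged one.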

zachreizner commented 5 years ago

Jotting this idea down that we came up with on the zulip channel: If we don't want sandboxed build scripts to have networking, but we do want to support the use case of downloading a native source dependency that is missing (such as zlib or openssl), one potential solution would be to have optional "sidecar" zips that include the necessary source. If the build script determines that it needs it, it signals this to cargo which will download a whitelisted zip from crates.io.

tarcieri commented 5 years ago

@zachreizner what makes a "sidecar zip" different from publishing the same contents as a crate?

zachreizner commented 5 years ago

The difference is that most crate content is not optional, whereas this explicitly would be. That being said, the idea of a "sidecar zip" could be implemented using existing crate mechanisms.

tarcieri commented 5 years ago

Yeah, I think leaning on crates as the archive format is the way to go. Potentially multi-stage builds could be used to e.g. run a script to determine if the next stage needs those optional resources, potentially enabling them as cargo features, which are the existing conditional compilation mechanism for selectively downloading crate dependencies.

DoumanAsh commented 5 years ago

Why isn't the user's security the user's own responsibility? Sandboxing by default seems like an excessive measure; it should instead be an option the user enables. Consider that downloading/fetching FFI code is common for C wrapper libraries. It makes no sense to require special actions from such libraries.

tarcieri commented 5 years ago

@DoumanAsh the only thing it'd require is packaging the FFI code in the -sys crate itself, rather than invoking arbitrary commands (git, curl, etc) to download arbitrary artifacts at build time.

Alternatively, as discussed above, it could also be put in a separate crate (e.g. foobar-src) which is pulled in via a cargo feature of foobar-sys, e.g. src = ["foobar-src"]. With something like that, build.rs could drive a second stage build with that cargo feature active if no system install of a particular library is available, which would use cargo to fetch the source code. Figuring out how to make that work properly with sandboxing will be a little tricky, but more doable if cargo is the main distribution mechanism everything leans on.
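
A minimal sketch of the decision a foobar-sys build.rs might make under this convention. It assumes the feature is named src (Cargo exposes enabled features to build scripts as CARGO_FEATURE_<NAME> environment variables) and that the library links as foobar. The function names and the `system_lib_found` flag are hypothetical, and the sketch does not attempt the second-stage build itself; it only reports that the feature is needed.

```rust
use std::env;

/// Where the native library comes from.
#[derive(Debug, PartialEq, Eq)]
enum Source {
    /// Link against a copy already installed on the system.
    System,
    /// Build the sources shipped via the (assumed) foobar-src crate.
    Vendored,
}

/// True if the `src` feature is active for this build.
/// The feature name is an assumption taken from the example above.
fn src_feature_enabled() -> bool {
    env::var_os("CARGO_FEATURE_SRC").is_some()
}

/// Prefer a system install; fall back to vendored sources only when
/// the `src` feature is active. No network access is involved either way.
fn choose_source(system_lib_found: bool, src_feature: bool) -> Result<Source, String> {
    if system_lib_found {
        Ok(Source::System)
    } else if src_feature {
        Ok(Source::Vendored)
    } else {
        Err("foobar not found on this system; rebuild with the `src` feature".to_string())
    }
}

/// Lines a build script would print to stdout to tell Cargo how to link.
fn link_directives(source: &Source, out_dir: &str) -> Vec<String> {
    match source {
        Source::System => vec!["cargo:rustc-link-lib=foobar".to_string()],
        Source::Vendored => vec![
            format!("cargo:rustc-link-search=native={}", out_dir),
            "cargo:rustc-link-lib=static=foobar".to_string(),
        ],
    }
}
```

In a real build script, `main` would detect the system library by whatever means the crate already uses, call `choose_source` with the result of `src_feature_enabled()`, compile the vendored sources into OUT_DIR when needed, and print each returned directive.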

What that really requires is establishing some better conventions for handling of third party artifacts like this in cargo projects, and getting people to adopt them. The other positive side effect of this approach is that builds are reproducible, whereas builds that shell out to arbitrary tools may encounter artifacts they require disappearing, thus breaking them.

DianaNites commented 5 years ago

I was looking into FFI wrappers recently and it looks like most if not all of them use git submodules/vendor the third party sources directly if they need to compile it. Possibly behind a feature.

This is much better, easier, and quicker than trying to download them in a build script, which brings in a huge number of dependencies for hyper and ssl and takes forever to compile.

DoumanAsh commented 5 years ago

@tarcieri FFI code in -sys crates still needs to pull in its dependencies, either by downloading them or by fetching them from a submodule, so you cannot get away from arbitrary commands

arbitrary commands

This is not arbitrary: if a command is required as part of a build step, it is part of the build procedure.

it could also be put in a separate crate (e.g. foobar-src) which is pulled in via a cargo feature of foobar-sys,

I don't think fetching source code requires for each ffi crate to have additional -src crates that would only contain C/C++ code. There is no point in such crates for us, as Rust users, because they cannot be used directly. Not to mention someone would need to constantly update them and publish. While with download links/git submodules updates can be done as part of existing -ffi crate. No need to increase maintenance burden

This is much better, easier, and quicker than trying to download them in a build script,

This is the same as downloading though

tarcieri commented 5 years ago

When I say "arbitrary commands", I mean ad hoc, non-cargo-based mechanisms for fetching what are fundamentally build artifacts. It seems there's a plethora of such mechanisms in use: grabbing artifacts from git using the git2 crate, executing git as a subprocess, using curl-rust or invoking wget, curl, etc.

Ultimately all of these mechanisms are just different ways of downloading some code. However, using any mechanism other than a crate does not provide immutability, and runs the risk that the resource being fetched will disappear in the future, or be changed/updated to a new version.

I don't think fetching source code requires for each ffi crate to have additional -src crates that would only contain C/C++ code

The simple alternative to this is to simply ship the code in the -sys crate itself. It's still free to link to the system version if it's present, or otherwise compile the vendored source.

Not to mention someone would need to constantly update them and publish.

The only alternative to someone updating the -sys crate whenever there is a release of the crate it binds to is to have a -sys crate that grabs the "latest version" somehow, be it through git or using something like curl to grab a latest.tgz...

...but that risks an upstream change being incompatible with a given version of your -sys crate, which means something that compiled today does not guarantee it will compile tomorrow. This makes builds non-deterministic, which I would personally consider an antipattern.

While with download links/git submodules updates can be done as part of existing -ffi crate. No need to increase maintenance burden

There are two options here:

Update a link to a resource in the -sys crate
Update a submodule ref in the -sys crate, as @DianaNites suggested

Both of them require the same amount of work. The latter guarantees a reproducible build where the upstream resource will not go away, and avoids the need to use non-cargo based mechanisms to fetch external resources.

ckaran commented 5 years ago

Going back to what @DoumanAsh said earlier about why we need sandboxing: in addition to there being a risk to the person who is compiling the code, stealthy attacks risk rust's reputation as a better language. Right now, I can compile arbitrary C code knowing that its macro system is too primitive to do any damage to my system. I think I might be able to do the same with C++. I'm pretty sure that I don't have that luxury with rust at the moment.

Does that mean rust is worse than other languages? I don't think so; after all, if you're running arbitrary python code, then you're already at the same risk level as rust is, and I suspect that the same is true for a lot of other code out there. However, since a major selling point of rust is safety in one form or another, making it safer from an end user's point of view just seems to make sense.

tarcieri commented 5 years ago

In an attempt to move this forward, I've created a cargo-sandbox GitHub project and associated crate with some initial boilerplate:

I've also opened an initial issue to discuss the project's goals and initial design:

https://github.com/rust-secure-code/cargo-sandbox/issues/3

I am going to go ahead and close this issue and would suggest that anyone interested in this particular topic head over to that GH issue / repo.

rust-secure-code / wg

Build-time sandboxing #29