buildkite / feedback

Got feedback? Please let us know!
https://buildkite.com

Taking external contributions from untrusted contributors more safely #360

Open petemounce opened 6 years ago

petemounce commented 6 years ago

So, one use-case that I have is offering CI for the various open-source projects and products that improbable.io has and will offer. I assume I'm not unique.

Some requirements that I have:

  1. contributors need to be covered by a contributor-license agreement.

    • I was planning on representing "has signed / not signed" via a github label and a bot to apply it.
  2. contributions from untrusted contributors should be assessed for "safety", since continuous integration is effectively "Remote Code Execution as a Service". Examples of recon are below. I'm sure we could collectively come up with others with some effort; penetration testing is entertaining. Thing is, I'm not really sure how to defend against things like this without also being extremely restrictive. Blocking known-bad things is a blacklist sort of approach, and that's inherently reactive. There are probably the makings of a blog post as I do this...

    • I was planning on representing "is trusted / is random person on internet" via a github label and a bot to apply it.
    • Unfortunately, the best I can come up with here is that a trusted contributor reviews the PR before "unlocking" CI for the commit at which the review happened. I don't think it's enough to whitelist the whole PR from then on, since a contributor could, after having their PR whitelisted, push a commit containing some recon/exploit.
  3. I would very much like external contributors to be able to see the build logs, so that they can self-service debugging their PRs.

    • However, I'd also like the build logs to have some auto-scrubbing of secrets, but I'm not really sure how that would work without also giving Buildkite the secrets themselves so that it could redact them from build logs as they're emitted. That might be reasonable if those secrets were encrypted at rest on Buildkite's side, and Buildkite doesn't know what the secrets are for...
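One way to approach the scrubbing idea without handing the secrets to a third party is to redact on the agent side, before log output leaves the machine. The following is only a minimal sketch under that assumption: `scrub_secrets` is a hypothetical helper, and the secrets file (one literal value per line) is an invented convention, not something Buildkite provides. It matches literal values only, so it would not catch a secret that is base64-encoded, split across lines, or otherwise transformed.

```shell
# scrub_secrets SECRETS_FILE
# Copies stdin to stdout, replacing every literal occurrence of each
# non-empty line of SECRETS_FILE with [REDACTED].
# Reading the values from a file keeps them out of the process arguments.
scrub_secrets() {
  awk -v secrets_file="$1" '
    BEGIN {
      n = 0
      # Load the secret values once, skipping blank lines.
      while ((getline s < secrets_file) > 0)
        if (s != "") secrets[++n] = s
      close(secrets_file)
    }
    {
      line = $0
      for (i = 1; i <= n; i++) {
        out = ""
        # index() does a literal (non-regex) search, so no escaping is needed.
        while ((p = index(line, secrets[i])) > 0) {
          out = out substr(line, 1, p - 1) "[REDACTED]"
          line = substr(line, p + length(secrets[i]))
        }
        line = out line
      }
      print line
    }
  '
}
```

A build command's output could then be piped through it, for example `./build.sh 2>&1 | scrub_secrets /path/to/secrets-file`, where both paths are placeholders.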

Some outcomes that I'd like to explore:

I'd be really interested to see some links to good reading on setting up a CI cluster that accepts external contributions but in as safe a way as possible, since this is exactly one use case that I'm going to be enabling.

I'd also be really interested to see to what degree people that have done that, have also set up the workflow for vetting external contributions, and then whether there are ways to track "trustedness" over time beyond subjective "well, they've done x PRs and look legit, let's tick the box now".

My context

I'm going to be running buildkite agents within GCE for Linux and Windows (on naked VMs first, then probably shifting to managed Kubernetes), and on-premise for macOS.

I'm likely to, later, spin up clusters of Linux and Windows within AWS (again, probably naked VMs first, but possibly skipping that if I got to Kubernetes within GCE earlier than I expect).

I'm very likely to be using Hashicorp Vault to give secrets to agents.

The on-premise estate will also include some Windows machines because we need to involve games consoles in CI, as well. This is a really entertaining prospect, too.

Recon/exploit examples

Brute force

Get me the registration token, now I can set up my own agents and start to steal IP...

(I'm pretty sure you've got this covered, since I've seen this value be hunter2'd)

echo $BUILDKITE_AGENT_TOKEN

Default credential locations for cloud things

Both of these recons will probably get me at least read/write to the artifacts bucket(s), if successful, allowing it to be scraped or poisoned. Additionally, it would get me read/write to my Bazel remote cache, which would be Really Bad.

Probably also read-access to the secrets bucket if https://github.com/buildkite-plugins/s3-secrets-buildkite-plugin is in play, for the duration of the build running at any rate. I think the plugin could be improved by using session-tokens (so, the IAM profile allows the machine to assume a role that can read from the bucket, rather than allowing it directly) and then explicitly revoking the issued token inside the pre-exit hook? I recently discovered the AWS credentials files support role-assumption built-in (https://docs.aws.amazon.com/cli/latest/userguide/cli-roles.html), which would have saved me a tonne of typing if I'd realised that four years ago...

AWS

On a competently run AWS instance, this will get me that instance's IAM profile until the next 60 minute rollover (default, I think; might be out of date on that), but I might luck out and get an AWS instance that has been overprivileged...

echo $AWS_ACCESS_KEY_ID
echo $AWS_SECRET_ACCESS_KEY
echo $AWS_SESSION_TOKEN 

GCE

On a competently run GCE instance, this will do similar:

# if set, points to a service account file
echo $GOOGLE_APPLICATION_CREDENTIALS
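Because these probes depend on credentials being present in the job's environment, one allowlist-style mitigation is to start untrusted commands with an almost empty environment. This is a minimal sketch assuming a POSIX `env`; `run_with_clean_env` is a hypothetical helper name. It does nothing about credential files on disk, the metadata service, or a same-user process reading the parent's environment via `/proc`, so it is one layer at most.

```shell
# run_with_clean_env COMMAND [ARGS...]
# Runs COMMAND with every inherited environment variable dropped except PATH,
# so variables such as AWS_SECRET_ACCESS_KEY are not passed to the child.
run_with_clean_env() {
  env -i PATH="$PATH" "$@"
}
```

For example, `run_with_clean_env sh -c 'echo "${AWS_SECRET_ACCESS_KEY:-unset}"'` prints `unset` even when the variable is exported in the calling shell.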

Metadata services

Both clouds expose metadata services that can be reached, without restriction, from a shell on the instance. So, probably remove curl/wget. But... on a build machine? You probably need both of those, and that's a blacklist approach anyhow.

Metadata services contain things like ssh-keys and cloud-credentials, as well as other metadata.

ssh keys

The buildkite agent docs guide how to set up ssh keys so that the agent can checkout from git etc, and mention ssh-agent only in the context of using more than one ssh key.

I'm not an expert, but I think at the end of following that guide, a build script would be able to do:

ls -al "${HOME}/.ssh"
cat "${HOME}/.ssh/id_rsa"

and then start checking out my source code shortly thereafter.

I think I've read that it'd be possible for one user to give access to a running ssh-agent to another user, so that the 2nd user (here, buildkite-agent) doesn't have access to the keys at rest on disk. I think that would be a great addition to that documentation page - a directive to make the ssh private keys not readable to the buildkite-agent user, but have some other user mount them into the ssh-agent, and then share the socket to the buildkite-agent.
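Whichever way the keys end up being provided, the property described above (no private key readable at rest by the build user) can be checked from inside a job. The sketch below is only an audit helper, not the ssh-agent setup itself; `readable_private_keys` is a hypothetical name, and it recognises keys purely by their PEM/OpenSSH header line.

```shell
# readable_private_keys DIR
# Prints each regular file directly inside DIR that the current user can read
# and that contains a "-----BEGIN ... PRIVATE KEY-----" header.
# Returns 1 if at least one such file was found, 0 otherwise.
readable_private_keys() {
  dir=$1
  found=0
  # Cover both normal and dot-prefixed file names.
  for f in "$dir"/* "$dir"/.[!.]*; do
    [ -f "$f" ] && [ -r "$f" ] || continue
    if grep -q -e '-----BEGIN .*PRIVATE KEY-----' "$f" 2>/dev/null; then
      printf '%s\n' "$f"
      found=1
    fi
  done
  return "$found"
}
```

Run as the buildkite-agent user, `readable_private_keys "${HOME}/.ssh"` should print nothing and succeed if the keys are only reachable through a shared agent socket.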

docker

Oh, docker. Apparently, docker installs by default to run with root privileges. I gather that might not be A Great Plan.

@lox pointed me at https://github.com/buildkite/sockguard which looks pretty promising.

lox commented 6 years ago

Thanks for the writeup @petemounce! I'll copy my response to your initial email in here and then follow up on the bits that are new:

Securely running third-party builds is a really tricky problem that we've just started to think about in some depth. Generally our advice here is to have a block step at the start that checks a white list of repos and contributors and then do a code review before running the build, but eventually we'd like to have some good answers on how to have enough defense-in-depth in place that we feel comfortable recommending automated third-party builds.
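The check such a gate performs could be sketched roughly as follows, combining the whitelist with the earlier point that an approval should apply to one reviewed commit rather than to the whole PR. Everything here is an assumption for illustration: `is_build_allowed` is a hypothetical helper, and the two files (one trusted username per line, one approved full commit SHA per line) are an invented convention that a bot or reviewer would have to maintain.

```shell
# is_build_allowed AUTHOR COMMIT ALLOWLIST_FILE APPROVALS_FILE
# Succeeds if AUTHOR is listed in ALLOWLIST_FILE, or if COMMIT is listed in
# APPROVALS_FILE. Matching is whole-line and literal (grep -x -F), so a
# prefix of an approved SHA or of a trusted name does not pass.
is_build_allowed() {
  author=$1
  commit=$2
  allowlist=$3
  approvals=$4
  if [ -n "$author" ] && grep -qxF -- "$author" "$allowlist" 2>/dev/null; then
    return 0
  fi
  [ -n "$commit" ] && grep -qxF -- "$commit" "$approvals" 2>/dev/null
}
```

A first pipeline step could call something like `is_build_allowed "$PR_AUTHOR" "$BUILDKITE_COMMIT" trusted.txt approved-commits.txt || exit 1`, where `PR_AUTHOR` and both file names are placeholders; a new push changes the commit SHA, so it would need a fresh approval.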

As you've raised, if you run third-party builds directly on your buildkite-agent host, it's close to impossible to achieve a meaningful level of security. Thankfully we have containers on Windows and Linux. If you run your builds inside an ephemeral container, you have a much better chance of locking down what can and can't be accessed.

I just spent some time putting together https://github.com/buildkite/docker-bootstrap-example, which might be a starting point for you. It mentions two pieces of tech that are going to be a key part of secure, ephemeral third-party builds, namely an IAM metadata service proxy and sockguard. Check it out and we can discuss. With the IAM service in place (which can deliver limited AWS permissions to a single container), it would handle a lot of the secrets questions (via SSM Parameter Store).

One piece we don't have a good answer to (yet) is the Agent Access Token. At the minute the Agent Registration Token is used by the agent to connect to our APIs, where it gets an Agent Access Token, which is a session-based token for the agent's connection to our APIs. Leaking the Registration Token is a major deal, because it allows new agents to be registered, but the Access Token is scoped much more strictly. Still, you could do enough damage with an Agent Access Token, for instance restarting a build and injecting some new steps via a pipeline upload. I'd suggest not exposing the Access Token to third-party builds and perhaps dealing with artifacts and metadata some other way.

petemounce commented 6 years ago

Well Hello: https://cloudplatform.googleblog.com/2018/05/Open-sourcing-gVisor-a-sandboxed-container-runtime.html

lox commented 6 years ago

Started work on an alternative to passing an Agent Access Token to jobs in https://github.com/buildkite/agent/pull/759.

lox commented 6 years ago

Agent v3.1.2 has an experiment in it for passing a socket to builds rather than $BUILDKITE_AGENT_ACCESS_TOKEN. We're hoping this will become a standard feature in the agent soon!

KevinGrandon commented 6 years ago

This is mentioned in the other bug thread, but if you haven't seen this little tool, it's a great starting place for secure open source contributions. I highly recommend it for the time being. https://github.com/mvines/ci-gate

petemounce commented 6 years ago

Oh, excellent. We'd been intending to write a plugin for the Kubernetes team's workflow automator, prow (https://prow.k8s.io/plugins), if one didn't already exist. Basically, /ok-to-test from a trusted contributor.

petemounce commented 5 years ago

https://diogomonica.com/2017/03/27/why-you-shouldnt-use-env-variables-for-secret-data/ is a good-sounding writeup of why not to use environment variables for secrets.

lox commented 5 years ago

Yup, completely agree with that @petemounce!

petemounce commented 5 years ago

Highly relevant: https://edoverflow.com/2019/ci-knew-there-would-be-bugs-here/

petemounce commented 2 years ago

Similarly relevant: https://research.nccgroup.com/2022/01/13/10-real-world-stories-of-how-weve-compromised-ci-cd-pipelines/