Closed mattmoor closed 6 years ago
There are several places where we need secrets for the Build CRD.
For the first, this can already be done today, but only by attaching imagePullSecrets
to the default service account for the namespace hosting the Build (since w/o the linked PR we don't allow overriding that currently).
I like the cleanliness of certain types of secrets that automatically mount from the service account.
I propose that to achieve this, we introduce special "types" of Secret
that direct us to mount secret volumes in particular ways and potentially pass additional environment variables.
Given a secret of this type:
apiVersion: v1
kind: Secret
metadata:
name: foo-bar
type: cloudbuild.dev/image-push-secrets
data:
config.json: <base64 encoded ~/.docker/config.json>
Which is added to the service account linked to a Build as:
apiVersion: v1
kind: ServiceAccount
metadata:
name: baz
secrets:
- name: foo-bar
This will cause the Job
to get the following Volume
definition:
volumes:
- name: foo-bar-secret-volume
secret:
secretName: foo-bar
This will cause the Step
containers to get the following VolumeMount
entries:
volumeMounts:
- name: foo-bar-secret-volume
mountPath: /var/build-secrets/foo-bar
This will also cause the Step containers to get the following EnvVar
entries:
env:
- name: DOCKER_CONFIG
value: /var/build-secrets/foo-bar
@ImJasonH WDYT?
Continuing in a new comment, I thought I'd expand on some of the Git credentials I envision here.
@ImJasonH I'm sure your git
-fu is waaaaay beyond mine, so feel free to beat me up, this is mostly just something that works based on man
pages :)
This builds on top of Git's support for XDG_CONFIG_HOME
, which enables us to put assorted Git configuration files in more arbitrary places in the filesystem.
Given a secret of this type:
apiVersion: v1
kind: Secret
metadata:
name: baz-blah
type: cloudbuild.dev/git-config
data:
# These will be materialized as files under $XDG_CONFIG_HOME/git
# See bottom for samples for username/password
config: <base64 encoded config file>
credentials: <base64 encoded credentials file>
Which is added to the service account linked to a Build as:
apiVersion: v1
kind: ServiceAccount
metadata:
name: baz
secrets:
- name: baz-blah
This will cause the Job
to get the following Volume
definition:
volumes:
- name: baz-blah-secret-volume
secret:
secretName: baz-blah
This will cause the Source containers to get the following VolumeMount
entry:
volumeMounts:
- name: baz-blah-secret-volume
mountPath: /var/build-secrets/baz-blah/git
This will also cause the Source containers to get the following EnvVar
entry:
env:
- name: XDG_CONFIG_HOME
value: /var/build-secrets/baz-blah # No .../git is intentional
Sample with username / password:
$ cat $XDG_CONFIG_HOME/git/config
[credential]
helper = store
$ cat $XDG_CONFIG_HOME/git/credentials
https://username:password@github.com
TODO(mattmoor): Figure out if this pattern can work for SSH auth as well.
I've never heard of $XDG_CONFIG_HOME
, what is it? Is it just whatever supported/documented place Git looks for config, that isn't $HOME
?
Yeah, as I was crawling through the Git docs looking for what their answer for DOCKER_CONFIG
is, I found this. It's some kind of standardized configuration home path, but I also hadn't heard of it.
Sadly the other two options (more standard) were essentially:
HOME
: which GCB's already overriding / mounting, and/etc
: which is far too broad for us to just mount over.I'm definitely open to ideas :)
It's a freedesktop.org standard.
https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
Which BTW specifies falling back to $HOME/.config if not set.
On Wed, Feb 7, 2018, 07:56 Jason Hall notifications@github.com wrote:
I've never heard of $XDG_CONFIG_HOME, what is it? Is it just whatever supported/documented place Git looks for config, that isn't $HOME?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/google/build-crd/issues/4#issuecomment-363814382, or mute the thread https://github.com/notifications/unsubscribe-auth/AA4Bq2HFAy4f6fThIhOXqmK-JuzncjVdks5tScekgaJpZM4R4dK8 .
@BenTheElder @cjwagner Does Prow deal with this at all?
@bparees I'd love your thoughts on this. This is one of the "possible future conventions" I listed on my slides.
We do mount secrets to jobs frequently and currently we use an env to point to an SSH key (mounted as a volume / secret) for one of our repos.
I'm not sure there's exactly a broader convention for this yet, our git checkout logic should be redone :-)
@fejta as well
@BenTheElder I smell opportunity knocking :)
Does this seem sane to you? What would you change?
Bear in mind that you'd still be able to mount secrets via volumes and do a bunch of explicit stuff, but I'd like common cases to be more streamlined (e.g. /workspace
).
It looks pretty nice, though I'm not sure how to integrate this with the currently planned init container style checkout. We like the idea of moving the entire checkout process out of the test / build images.
Supposing I have multiple "git" secrets and all of them are mounted, how does the git clone logic decide which secret to use (presumably all of them are mounted)? (it looks like you're anticipating using XDG_CONFIG_HOME, but if there are multiple git secrets that falls down)
Similar issues w/ the pull secrets if you end up having to pull from multiple docker repositories (or just have multiple pull secrets defined w/ your service account)
Another thought might be to consider the approach we took for auto-matching secrets to builds via an annotation on the secret:
In general we took a more opinionated approach in which we specifically determine the git, push, and pull secrets for a given buildconfiguration and mount only them, in specific locations where our build logic knows to look for them, to avoid that ambiguity. It's definitely more work for the build controller up front, though.
I would also have expected to see the mounting path be "/var/build-secrtes/git/secret1" and "/var/build-secrets/docker/secret1", etc, rather than the current "/var/secret1/git" as it would make iterating all available secrets easier, for a given type of secret. Was there a reason you did it the other way? Anticipation of a single secret having multiple types embedded?
Supposing I have multiple
@bparees My initial bias was towards having users coalesce secrets for the multiple-{Git, Docker} cases, and allowing only one. e.g. these are all the {Git,Docker} creds available to this K8s build robot. You should be able to achieve "least privilege" through vigilance creating K8s service accounts, but I agree this could get tedious.
Some of this stems from the underlying tools picking the "one true" config file, I briefly debated writing a tool to blend these as a pre-step, but am currently biasing away from that (if we can achieve it thru just env/volume).
Another thought might be to consider the approach we took for auto-matching secrets to builds via an annotation on the secret
Awesome, I'll TAL in a bit.
I would also have expected to see the mounting path be
Yes, this was half laziness and half structural requirements from XDG
. I would prefer a common scheme, so perhaps given a 1:1 requirement we could make the paths:
/var/build-secrets/docker/...
/var/build-secrets/git/...
I think this may be consistent with how the K8s service account's token is mounted (to a fixed path), but I'd have to look to confirm.
@bparees Still digesting, please forgive stupid questions :)
The annotation trick is neat, and I can see how it would work with GitSourceSpec
, but I don't think it would work with Custom
(type: corev1.Container
), which someone might use if they need to do something more specialized than how we're checking things out. I'm assuming the filtering is done before the credential is mounted, vs. synthesizing a blended credential configuration? I also waffled on whether the credentials should be made available to just Source or also Steps and biased towards both to enable folks to potentially do things like cut release branches (that's perhaps a mistake, and folks should use explicit Volumes
). The relevance being that perhaps another Git credential could/would match the destination repo, and it's unclear how to detect that.
This (manual secrets on git:
) is another approach I was considering, but biased away for similar reasons (e.g. re-use by custom git clone
logic).
This (custom .gitconfig
) actually seems pretty similar to what I'm describing here, but the examples aren't as detailed as other places, so hard to say for sure?
I totally missed kubernetes.io/basic-auth
and kubernetes.io/ssh-auth
when I looked through the built-in types (I'm honestly facepalming, so thank you). This is making me rethink some of the choices here. I'll try and sketch it here (feels closer to what y'all do):
Secret
types to the K8s service account assigned to the Build
.Secret
with TBD metadata to guide it's translation (see below) into Git/Docker metadata (e.g. scope to URLs)Secrets
attached (tho Env/Volume isn't good enough, since no annotations), which sets up $HOME
with this metadata appropriately.I briefly debated writing a tool to blend these as a pre-step
FYI, the third step is a throwback to this.
Something I still don't like about this is that it doesn't (easily) enable a container (e.g. docker
) to provide its own configuration for installed credential helpers as a fallback when there's no matching secret... I liked this about the approach above optionally overriding DOCKER_CONFIG
...
I have a whole ton of meetings starting shortly, but I'll keep thinking on this and perhaps expand upon it if it doesn't seem crazy :)
I don't think it would work with Custom (type: corev1.Container), which someone might use if they need to do something more specialized than how we're checking things out.
Certainly true. It's only intended to smooth out the "typical" flow which for us is "my code is in git and i'm going to tell you the git repo". If you want to clone from svn manually, we don't offer to help you (but you can use buildsecrets manually at that point which again is similar to what you're proposing, except for us it's very explicit: you tell us what secrets you want to consume in your build, rather than indirect via a service account+secrettype)
I'm assuming the filtering is done before the credential is mounted, vs. synthesizing a blended credential configuration?
correct. When a build is requested we do a bunch of preprocessing to determine the shape of the build pod we will create, including what secrets need to be mounted into that pod+where.
I also waffled on whether the credentials should be made available to just Source or also Steps and biased towards both to enable folks to potentially do things like cut release branches
We have mechanism to make secrets available to the equivalent of both of those steps (secrets for cloning, secrets that the build step itself(e.g. maven) can consume). We've found our users have a need for both. Some users have crazy proxy setups that require credentials for npm to pull dependencies, for example, so they want to inject the proxy credentials as a secret. A secret for pulling dependencies from a private maven repo also be an example.
This (custom .gitconfig) actually seems pretty similar to what I'm describing here, but the examples aren't as detailed as other places, so hard to say for sure?
yes i think that's accurate. It's our grand punt option for "tell us how to authenticate when cloning from your repo"
Something I still don't like about this is that it doesn't (easily) enable a container (e.g. docker) to provide its own configuration for installed credential helpers as a fallback when there's no matching secret.
I'd separate "how secrets get provided to the container" from "how the step/container decides which secrets(if any) to consume". I'd probably even separate it out to:
1) how to pick which secrets are applicable 2) how to provide them to each step/container 3) how each step/container picks from the available secrets(and/or content baked into the image) to perform its work
2+3 are slightly related. 1 should ideally have no influence on 2+3.
The pain point we had which lead to creating the secret annotation matching mechanism was that for our users, they were frustrated at having to create a buildconfig and then update it to ref the git clone secret they needed, and doing that for every buildconfig in their project when they often used the same secret. I think as long as you're attempting to solve that problem for users (allow them to create the secret once, and automatically consume it many times), you're on the right track.
I'd separate "how secrets get provided to the container" from "how the step/container decides which secrets(if any) to consume". ...
I think for the level of specificity you outline that folks should be going through Volume[Mount]
, which I believe gives this level of control: I can access a secret and provide it by a configurable method to precise steps.
This is perhaps naive, but under what circumstances is it beneficial to have an alternate system with this level of control?
It's only intended to smooth out the "typical" flow which for us is "my code is in git and i'm going to tell you the git repo"
This! I'd characterize this issue as going after a convention for what we expect will be a sweet spot for Builds, sort of like /workspace
where we know that most steps will want to mount in some working tree.
I think that I could live with limiting magic Git auth to the GitSourceSpec
using similar logic to your annotations.
I don't think the same is true for Docker auth because my expectation is that we will have a wide variety of steps that potentially publish images based on some sort of Docker auth: docker push
, "FTL", bazel
, JIB, ...
How do you guys deal with image push secrets?
This is perhaps naive, but under what circumstances is it beneficial to have an alternate system with this level of control?
sorry not sure what you're referring to as the alternate system. You mean the annotation mechanism vs the explicit reference mechanism? It's in part just a product of evolution. We had the explcit mechanism, then we later added the annotation mechanism. If we'd had the annotation mechanism the whole time we might not have used the explicit mechanism. That said, there are probably cases where someone wants to use two different credentials for two different builds, against the same git repo (in which case the annotation mechanism isn't sufficient). I'm always amazed at the use cases people have :)
How do you guys deal with image push secrets?
we interrogate all the "docker-type" secrets associated w/ the builder service account, and find one that applies to the image registry in question. We then mount that secret into the build and use it as needed (we actually handle multiple image secrets since we have the builder image, input images, and output image..so we can have a different secret for each)
sorry not sure what you're referring to as the alternate system.
I was just suggesting that besides the corev1.Container
Volume[Mount]
that this already supports, is more needed for what you were suggesting?
find one that applies to the image registry in question
For the breadth of options I described above, this doesn't really work for us. :(
I was just suggesting that besides the corev1.Container Volume[Mount] that this already supports, is more needed for what you were suggesting?
no, i think that is sufficient
For the breadth of options I described above, this doesn't really work for us. :(
Sure, more flexible use cases makes it hard to do "magic" for users.
Sure, more flexible use cases makes it hard to do "magic" for users.
Ack :) I'm simply wondering if: by biasing towards mounting all associated secrets in a way that the builder can do selection (e.g. [credential "<url>"]
in .gitconfig
, or the domain in .docker/config.json
) we can capture a broader audience with our magic? e.g. still handling Custom
(but git
), or more arbitrary tooling compatible with the Docker keychain.
For "Least Privilege", fine control can still be established at two levels:
I can try and write something up (in excruciating detail) as above before we next meet, and then maybe we can beat that up (I like examples).
Ack :) I'm simply wondering if: by biasing towards mounting all associated secrets in a way that the builder can do selection
yes, you're just solving the same problem we solved(how to pick the right secret to access something), but at a different level. We solved it in our build controller before creating the build pod, you're proposing asking the individual builders to solve it.
There is perhaps a slight security concern (do i want to expose all my secrets to all build steps?). I guess you might be able to address that w/ appropriate mount configurations in each build step, depending who/what controls which secrets get mounted.
you're proposing asking the individual builders to solve it.
More precisely, I'd expect the builders to work with their native auth. I'd expect some prestep that's spun up by the controller as the first init container to solve it. (I need to write this up, so that it's completely clear)
do i want to expose all my secrets to all build steps?
Exactly right, and here I think the answer is that for tighter intra-build security you should use explicitly volumes for maximum control. I don't think "least privilege" mixes well with implicit magic (at least for unstructured step specifications like this has).
This tool sets up credentials for the builder images that run as part of
a Build
. That process is outlined in detail here.
One of the fundamental differences between this proposal, and what's outlined
above is the use of K8s-native (vs. custom) Secret
"types" (thanks to @bparees
for the pointer).
In particular, the following types will be supported initially:
kubernetes.io/basic-auth
kubernetes.io/ssh-auth
These will be made available to the Build
by attaching them to the
ServiceAccount
as which it runs.
Having one of these secret types in insufficient for turning it into a usable secret. e.g. basic auth (username / password) is usable with both Git and Docker repositories, and I may have N Git secrets and M Docker secrets.
The answer to this (learning from Openshift's solution) is to guide it with
annotations on the Secret
objects, which will take the form:
metadata:
annotations:
cloudbuild.dev/git-0: https://github.com
cloudbuild.dev/git-1: https://gitlab.com
cloudbuild.dev/docker-0: https://gcr.io
type: kubernetes.io/basic-auth
...
Or for SSH:
metadata:
annotations:
cloudbuild.dev/git-0: github.com
type: kubernetes.io/ssh-auth
...
In their native form, these credentials are unsuitable for consumption by Git
and Docker. For Git, they need to be turned into (some form of) .gitconfig
.
For Docker, they need to be turned into a ~/.docker/config.json
file. Also,
while each of these supports having multiple credentials for multiple domains,
those credentials typically need to be blended into a single canonical keyring.
To solve this, prior to even the Source
step, we will run a credential
initialization process that:
/var/build-secrets/<secret name>
$HOME
.
basic-auth
Given URLs, usernames, and passwords of the form: https://url{n}.com
,
user{n}
, and pass{n}
. We will generate the following for Docker:
=== ~/.docker/config.json ===
{
"auths": {
"https://url1.com": {
"auth": "$(echo -n user1:pass1 | base64)",
"email": "not@val.id",
},
"https://url2.com": {
"auth": "$(echo -n user2:pass2 | base64)",
"email": "not@val.id",
},
...
}
}
Docker doesn't support kubernetes.io/ssh-auth
, so annotations on these types
will be ignored.
basic-auth
Given URLs, usernames, and passwords of the form: https://url{n}.com
,
user{n}
, and pass{n}
. We will generate the following for Git:
=== ~/.gitconfig ===
[credential "https://url1.com"]
username = "user1"
[credential "https://url2.com"]
username = "user2"
...
=== ~/.git-credentials ===
https://user1:pass1@url1.com
https://user2:pass2@url2.com
...
ssh-auth
Given hostnames, and private keys of the form: url{n}.com
, and key{n}
. We
will generate the following for Git:
=== ~/.ssh/id_key1 ===
{contents of key1}
=== ~/.ssh/id_key2 ===
{contents of key2}
...
=== ~/.ssh/config ===
Host url1.com
HostName url1.com
IdentityFile ~/.ssh/id_key1
Host url2.com
HostName url2.com
IdentityFile ~/.ssh/id_key2
...
=== ~/.ssh/known_hosts ===
$(ssh-keyscan -H url1.com)
$(ssh-keyscan -H url2.com)
...
The secrets as outlined here will be stored into $HOME
(by convention the
volume: /builder/home
), and will be available to Source
and all Steps
.
For sensitive credentials that should not be made available to some steps, the
mechanisms outlined here should not be used. Instead the user should declare an
explicit Volume
from the Secret
and manually VolumeMount
it into the
Step
.
👍 I really like this.
Some clarifying questions:
Docker doesn't support kubernetes.io/ssh-auth, so annotations on these types will be ignored.
Can this be an error instead? Silently ignoring invalid things can lead to the false sense that it's working.
basic-auth
and ssh-auth
in the future?Anyway, I definitely think this is a good direction. Thanks for writing this up.
Can this be an error instead?
Yeah, I flipped a coin (in my head). :)
Do we anticipate any other types of auth
Probably, but I'd imagine these will go a looooong way.
Still an open question: What is the best way to configure containers to use a credential helper to leverage the cluster's nodes' identity for registry access?
For Git, the tiered config model affords the builder container a fallback mechanism. For Docker, we are kind of stuck. I wonder if we should allow a cluster operator to configure a default for the contents of ~/.docker/config.json
if no auth is specified?
This runs the risk that we configure a credential helper (e.g. "gcr"
), which isn't in the builder container resulting in a failure, but auth might not even have been necessary (e.g. public image)! This has got me wondering about volume mounting cred helpers in, etc. A slippery slope. :-/
this looks promising. I do wonder if it's worth making the credential initializer pluggable, such as by making it effectively a step/initcontainer itself where a user can provide their own image? (if it weren't for the fact the source step runs before any user steps, i'd say this capability already exists because you could just define your own step that manipulates the credentials).
Maybe the more generic answer is to allow for "pre-source" steps and "post-source" steps?
If we can find something clean, I'd be open to pre-/post-steps. I think their interaction with templating needs careful thought.
I think that this is a fairly specialized case because setting up the step itself requires coordination with the Build controller mounting stuff (the init step itself can't necessarily access the K8s API directly). One cool thought to explore would be that if we has Pre-/Post-steps, a mutating webhook could be configured to implement this convention on top of (vs. inside of) the Build CRD.
However, I don't know that we should go to this level of configurability without more use cases in hand. Perhaps we should start another issue to talk about Pre-/Post-Steps?
I think that this is a fairly specialized case because setting up the step itself requires coordination with the Build controller mounting stuff
Well in my mind the pre-step would only be doing manipulation of mounted content (in the same way that you are defining manipulations), like suppose I wanted to further tweak how the .gitconfig is created, or something. So it wouldn't be controlling what got mounted, per se, though certainly I could see that being desirable as well (but again at that point you've provided that mechanism, users define mounts explicitly).
Perhaps we should start another issue to talk about Pre-/Post-Steps?
to be clear, in my mind all the "steps" today are post-steps. But sure, it can be deferred out of this discussion.
I just know i've never regretted having a user exit in an api :)
suppose I wanted to further tweak how the .gitconfig is created, or something
So if you want to adhere to the same contract (e.g. flags) as the container we'd use for this, I expect the container that'd be used would be a flag on the Build controller (this is e.g. how //cmd/git-init
works), and therefore configurable at that level (but not necessarily per-build).
I'm certainly open to exploring more configurability here, but... What I'm trying to balance is:
to be clear, in my mind all the "steps" today are post-steps
Yeah, it felt weird saying it. I left it because I could see "Post" as Build provided steps that run after a Template, but that's a very specific manifestation of this.
But sure, it can be deferred out of this discussion.
I didn't mean to be dismissive, apologies if it came off that way. What I'd meant was that this feels like a more general class of problem, so perhaps a dedicated (and more easily discoverable) discussion is warranted.
I just know i've never regretted having a user exit in an api
Big +1, this is precisely why Source
supports taking any corev1.Container
:)
I didn't mean to be dismissive, apologies if it came off that way.
not at all. Just agreeing that it's relatively orthogonal and we can revisit it.
I have some thoughts on this, will follow up with that.