Open TJM opened 3 years ago
So far, I am running Vault Agent and dumping the gitlab-runner-sa.json (key.json) file out, then using the following command to "create" the secret. The `.dockerconfigjson` is a PITA: embedding that entire mess of JSON into the "password" field seems really wrong to me.
template:

```
{{ with secret "path/to/gcp/key/gitlab-runner-sa" }}
{{ base64Decode .Data.private_key_data }}
{{ end }}
```
relevant part of agent config template:

```
command = "bash -c 'EMAIL=$(jq -r .client_email gitlab-runner-sa.json); kubectl create secret docker-registry gcr-secret --docker-server gcr.io --docker-username _json_key --docker-email \"$EMAIL\" --docker-password \"$(cat gitlab-runner-sa.json)\" --dry-run -o yaml | kubectl apply --namespace gitlab-runner -f -'"
```
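For what it's worth, the quote-escaping pain largely disappears if the `.dockerconfigjson` payload is built programmatically instead of via `kubectl`. A sketch (the key file contents and email below are made up for illustration) that lets `json.dumps` handle the JSON-in-JSON nesting:

```python
import base64
import json

def make_dockerconfigjson(server: str, username: str, password: str, email: str) -> str:
    """Build the payload for a kubernetes.io/dockerconfigjson secret.

    For GCR, `username` is the literal "_json_key" and `password` is the
    entire service-account key file as a JSON string; json.dumps takes
    care of the quote escaping that is painful to do by hand in shell.
    """
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {
        "auths": {
            server: {
                "username": username,
                "password": password,
                "email": email,
                "auth": auth,
            }
        }
    }
    return json.dumps(config)

# Dummy stand-in for gitlab-runner-sa.json
key_json = json.dumps({"type": "service_account",
                       "client_email": "runner@example.iam.gserviceaccount.com"})
payload = make_dockerconfigjson("gcr.io", "_json_key", key_json,
                                "runner@example.iam.gserviceaccount.com")
print(payload)
```

The resulting string goes into the secret's `.dockerconfigjson` key (base64-encoded, as with any Kubernetes secret `data` value).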
Hi @TJM, I think adding support for the Google Cloud Secrets Engine would be the best way to support your use case. Unfortunately I do not have that much experience with GCP and Vault.
If you or someone else wants to add support for the Google Cloud Secrets Engine, I think it could be done in a similar way to how it was done for Azure: https://github.com/ricoberger/vault-secrets-operator/pull/114
That PR looks like an authentication mechanism, rather than a secrets engine. The GCP Secrets Engine would replace KV or KVV2, which hold manual/static secrets, with dynamic secrets (like AWS it creates the service account and returns the credentials). I have a vault agent based solution (hack?) working, but I would consider it to be very alpha. https://github.com/TJM/vault-gcr-secrets
I think it would be better to tie it into VSO at some point. I just need to wrap my head around the secrets engine interfaces and see how hard it would be to add GCP. :)
~tommy
Ah ok, sorry for the misunderstanding, hopefully I got it now: you run

```
vault read gcp/roleset/my-key-roleset/key
```

so that the `private_key_data` can be used within a Kubernetes secret? Did I get it right now?

If this is correct I think we can support this. We already have a `secretEngine` field in the CRD, which can be set to `gcp`.

If this is the case we can check for the `secretEngine` in the `GetSecret` function and try to get the secret in a similar way as for the KV engine. Does this make sense?
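To sketch that dispatch idea (Python purely for illustration — the operator itself is Go, and the function and engine names here are placeholders, not the operator's real API):

```python
def get_secret(path: str, secret_engine: str, vault_read) -> dict:
    """Illustrative dispatch on the secretEngine field (placeholder names,
    not the operator's actual code)."""
    response = vault_read(path)
    if secret_engine == "gcp":
        # GCP roleset keys are returned in the top-level "data" map,
        # much like KV v1 secrets.
        return response["data"]
    if secret_engine == "kv2":
        # KV v2 nests the payload one level deeper, under data.data.
        return response["data"]["data"]
    # Default: KV v1 and other engines with a flat data map.
    return response["data"]

# Usage with a fake reader standing in for the Vault client:
gcp_read = lambda path: {"data": {"private_key_data": "BASE64..."}}
print(get_secret("gcp/roleset/my-key-roleset/key", "gcp", gcp_read))
```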
Yes, sir! This sounds right.
To make things a little more complicated, we would have to do some trickery with the data returned, because of the way that Kubernetes formats the "docker-registry" type secrets, or rather the fact that Google wants the entire key.json as the docker-password option (JSON in JSON). I think we can probably handle it, but the escaping of all the quotes in the JSON string had me stumped, so I just used `kubectl create secret .... --dry-run=client -o yaml | kubectl apply -f -` to get the secret added/updated.
~tommy
Can you maybe share the output of the following command, `vault read -format=json gcp/roleset/my-key-roleset/key`, without any confidential data?

Edit: Maybe also the output of `vault read -output-curl-string gcp/roleset/my-key-roleset/key` can be helpful.
@ricoberger The output is pretty boring... I can tell you that the part that we care about is base64 encoded in the JSON key `private_key_data`.
We use the vault template:

```
{{ with secret "path/to/gcp/key/gitlab-runner-sa" }}
{{ base64Decode .Data.private_key_data }}
{{ end }}
```
Here is the censored version:

```json
{
  "request_id": "2b0f5057-411d-aa4f-8df7-8b6b91b849de",
  "lease_id": "path/to/gcp/key/gitlab-runner-sa/(CENSORED-SOME-ID)",
  "lease_duration": 2592000,
  "renewable": true,
  "data": {
    "key_algorithm": "KEY_ALG_RSA_2048",
    "key_type": "TYPE_GOOGLE_CREDENTIALS_FILE",
    "private_key_data": "(CENSORED-BASE64-ENCODED-keyfile)="
  },
  "warnings": null
}
```
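For reference, extracting the usable key file from that response is just a base64 decode of `data.private_key_data`. A small sketch, with a dummy payload standing in for the censored one:

```python
import base64

# Dummy stand-in for the censored `vault read -format=json` output above
response = {
    "data": {
        "key_algorithm": "KEY_ALG_RSA_2048",
        "key_type": "TYPE_GOOGLE_CREDENTIALS_FILE",
        "private_key_data": base64.b64encode(b'{"type": "service_account"}').decode(),
    }
}

# Decode to get the raw key.json, ready to write to disk or into a Secret
key_file = base64.b64decode(response["data"]["private_key_data"]).decode()
print(key_file)
```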
the last one is pretty straightforward too, as it doesn't really connect to vault...

```console
[tmcneely@den3l1cliqa74077 vault-agent]$ vault read -output-curl-string gcp/roleset/my-key-roleset/key
curl -H "X-Vault-Request: true" https://127.0.0.1:8200/v1/gcp/roleset/my-key-roleset/key
```
Hi @TJM, I tried to implement support for GCP. I would need some help to verify that it is working. Can you maybe take a look at #130 and test the Docker image with the `dev` tag? This would be very helpful.
@ricoberger THANKS! ... I saw this yesterday, but was in meetings all day ;( ... I need to get some project work completed today, but will jump on testing this next week. Sorry for the delay
Looks like this was released a couple days ago? I will hopefully get a chance to try it out :) I was running into an issue finding an environment where I could put a development version of VSO... now that it is released, maybe I can talk them into it :-/
Hi, this is only available via the `dev` tag for now. I would like to verify that it is working as expected before an official release.
This is looking OK, so far...
NOTE: We already had the kubernetes auth setup...
Created the prerequisites: `vault_gcp_secret_roleset`, `kubernetes_service_account`, `vault_kubernetes_auth_backend_role`, `vault_policy` ...
Then the helm chart:
```hcl
resource "helm_release" "a0000_vault_secrets_operator" {
  name       = "vault-secrets-operator"
  namespace  = kubernetes_namespace.a0000_anthos_namespace.metadata.0.name
  repository = "https://ricoberger.github.io/helm-charts"
  chart      = "vault-secrets-operator"

  values = [
    templatefile("${path.module}/files/vault_secrets_operator.yaml.tpl", {
      KUBERNETES_PATH = "auth/${vault_auth_backend.kubernetes.path}"
      KUBERNETES_ROLE = vault_kubernetes_auth_backend_role.a0000_vault_secrets_operator.role_name
      KUBERNETES_SA   = kubernetes_service_account.a0000_vault_secrets_operator.metadata.0.name
      RBAC_NAMESPACED = false # Limit VSO to its own namespace
      NAMESPACES      = [kubernetes_namespace.a0000_anthos_namespace.metadata.0.name]
    })
  ]

  depends_on = [
    vault_kubernetes_auth_backend_role.a0000_vault_secrets_operator,
    vault_policy.a0000,
    kubernetes_service_account.a0000_vault_secrets_operator,
  ]

  // FIXME: Create VaultSecrets for each of the three service account keys
}
```
with the following yaml tmpl mentioned above:
```yaml
### WARNING! Changing this file changes *all* VaultSecretOperators
### Use an instance variable whenever possible
image:
  repository: docker-remote.artifactory.company.com/ricoberger/vault-secrets-operator
  tag: dev

resources:
  limits:
    cpu: 100m
    memory: 128Mi
  requests:
    cpu: 100m
    memory: 128Mi

vault:
  address: https://vault.company.com
  authMethod: kubernetes
  kubernetesPath: ${KUBERNETES_PATH}
  kubernetesRole: ${KUBERNETES_ROLE}
  namespaces: ${join(",", NAMESPACES)}

rbac:
  namespaced: ${RBAC_NAMESPACED}

serviceAccount:
  create: false
  name: ${KUBERNETES_SA}
```
Then, the secret looks like:

```yaml
apiVersion: ricoberger.de/v1alpha1
kind: VaultSecret
metadata:
  name: a0000-bq-secret
spec:
  isBinary: true
  keys:
    - private_key_data
  path: ai/np/gcp/key/a0000-bq
  secretEngine: gcp
  type: Opaque
```
... which I had to apply manually, because apparently raw YAML (CRs) is not well liked by Terraform... I have created a helm chart in the past to get around this, and am still thinking about how to manage these in Terraform.
Logs:
```
{"level":"info","ts":1638403415.9145195,"logger":"vault","msg":"Reconciliation is enabled.","ReconciliationTime":0}
I1202 00:03:37.569527 1 request.go:665] Waited for 1.04742487s due to client-side throttling, not priority and fairness, request: GET:https://10.202.208.1:443/apis/node.k8s.io/v1beta1?timeout=32s
{"level":"info","ts":1638403419.1731763,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1638403419.1738043,"logger":"setup","msg":"starting manager"}
I1202 00:03:39.174152 1 leaderelection.go:248] attempting to acquire leader lease a0000-ro-feat-v-8430/vaultsecretsoperator.ricoberger.de...
{"level":"info","ts":1638403419.1742282,"msg":"starting metrics server","path":"/metrics"}
I1202 00:04:02.307290 1 leaderelection.go:258] successfully acquired lease a0000-ro-feat-v-8430/vaultsecretsoperator.ricoberger.de
{"level":"info","ts":1638403442.3074925,"logger":"controller.vaultsecret","msg":"Starting EventSource","reconciler group":"ricoberger.de","reconciler kind":"VaultSecret","source":"kind source: /, Kind="}
{"level":"info","ts":1638403442.307584,"logger":"controller.vaultsecret","msg":"Starting EventSource","reconciler group":"ricoberger.de","reconciler kind":"VaultSecret","source":"kind source: /, Kind="}
{"level":"info","ts":1638403442.3075905,"logger":"controller.vaultsecret","msg":"Starting Controller","reconciler group":"ricoberger.de","reconciler kind":"VaultSecret"}
{"level":"info","ts":1638403442.4085374,"logger":"controller.vaultsecret","msg":"Starting workers","reconciler group":"ricoberger.de","reconciler kind":"VaultSecret","worker count":1}
{"level":"info","ts":1638403442.4087484,"logger":"controllers.VaultSecret","msg":"Use shared client to get secret from Vault","vaultsecret":"a0000-ro-feat-v-8430/a0000-bq-secret"}
{"level":"info","ts":1638403442.4087753,"logger":"vault","msg":"Read secret ai/np/gcp/key/a0000-bq"}
{"level":"info","ts":1638403442.798935,"logger":"controllers.VaultSecret","msg":"Updating a Secret","vaultsecret":"a0000-ro-feat-v-8430/a0000-bq-secret","Secret.Namespace":"a0000-ro-feat-v-8430","Secret.Name":"a0000-bq-secret"}
{"level":"info","ts":1638403442.8157718,"logger":"controllers.VaultSecret","msg":"Use shared client to get secret from Vault","vaultsecret":"a0000-ro-feat-v-8430/a0000-bq-secret"}
{"level":"info","ts":1638403442.8158085,"logger":"vault","msg":"Read secret ai/np/gcp/key/a0000-bq"}
{"level":"info","ts":1638403443.4009356,"logger":"controllers.VaultSecret","msg":"Updating a Secret","vaultsecret":"a0000-ro-feat-v-8430/a0000-bq-secret","Secret.Namespace":"a0000-ro-feat-v-8430","Secret.Name":"a0000-bq-secret"}
```
... though I am a little concerned that it appeared to update the secret twice. Every time the secret is "read", a new key is created (max 10)... so it can't just "read" the secret to reconcile; it needs to pay attention to the lease time. I have not had a chance to look at the code yet to see if it does that...
AND NOW, other than the secret having the "wrong" key name (they were expecting `a0000-bq.json`, but it's using `private_key_data`, as expected), the value of the secret is "good"!!! :)
One thing we found is that it is a little too aggressive on retries... I messed up the path and it looked like it tried ~45 times in just a few minutes. I can send the logs if you want, but I think it needs to back off when it gets permission denied (403). I don't think this is "gcp" specific?
FYI: We are using https://github.com/TJM/vault-secret-helmchart to create our VaultSecret objects through Terraform ;)
Hi @TJM, thanks for testing.
- Yes, the secret will be read twice when it doesn't already exist. That's the current behavior of the operator. I will check if we can somehow omit the second reconciliation after the secret was created.
That would be best if it was only "read" once, since each time it is "read" it creates a new key (lease).
- If I get you right, the returned secret for GCP has a lease time? The operator currently only supports a global setting for this, but not on a per secret basis. Does this already help?
Oh, Vault leases :)
I think with both types of leases, you can "request" a certain length, but vault policies may limit that.
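A lease-aware reconciliation could boil down to re-reading at some fraction of `lease_duration` instead of on a fixed interval. A rough sketch of that arithmetic (illustrative only, not the operator's actual logic):

```python
def next_requeue_seconds(lease_duration: int) -> int:
    """Re-read at two thirds of the lease, so a new key is only requested
    when the old one is actually approaching expiry (integer arithmetic
    keeps the result exact)."""
    return max(1, lease_duration * 2 // 3)

# A 30-day lease (2592000s, as in the output above) would be re-read
# roughly every 20 days instead of on every reconciliation tick.
print(next_requeue_seconds(2592000))  # → 1728000
```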
- For the expected key name, maybe this can be changed to the expected value by the user via https://github.com/ricoberger/vault-secrets-operator#using-templated-secrets
Yep, I am keeping templating in my back pocket, in case there is pushback on using the name that Vault/Google call the secret. I think it should be ok, or even "better", to use a "common" name like `private_key_data`. Time will tell :)
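If the consumers do insist on `a0000-bq.json`, a templated secret along these lines might do it (a sketch only, based on the templating README linked above — the `templates` field and `{% %}` delimiters should be double-checked against those docs):

```yaml
apiVersion: ricoberger.de/v1alpha1
kind: VaultSecret
metadata:
  name: a0000-bq-secret
spec:
  isBinary: true
  path: ai/np/gcp/key/a0000-bq
  secretEngine: gcp
  type: Opaque
  templates:
    # Rename the vault/google key to what the consumers expect
    a0000-bq.json: "{% .Secrets.private_key_data %}"
```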
- After an error the operator tries to reconcile the secret immediately. I'm not sure if we can change this behavior. Maybe we can save the number of failed reconciliations in the status field and use this for an exponential backoff logic.
Some sort of backoff does seem like a good idea, to reduce the load against Vault. Or I could stop making mistakes? Nope. That doesn't sound like it will happen. :)
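The status-counter idea could drive a capped exponential backoff roughly like this (a Python sketch of the policy only; the real implementation would live in the Go reconciler):

```python
import random

def backoff_delay(failures: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Capped exponential backoff with jitter: 5s, 10s, 20s, ... up to 5 min.

    `failures` would come from a retry counter persisted in the VaultSecret
    status field, as suggested above; jitter spreads out retries so many
    failing secrets don't hammer Vault in lockstep.
    """
    delay = min(cap, base * (2 ** failures))
    return delay * random.uniform(0.5, 1.0)

# ~45 retries in a few minutes becomes a handful, settling at up to ~5 min apart.
for n in range(6):
    print(round(backoff_delay(n), 1))
```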
Hi all,
I was looking at #54 to create imagePullSecrets, and that looks like it might work, but the secret that I am trying to access is not a "kv" type; the credential comes from the gcp secrets engine. So, as my goal is to get a secret (`imagePullSecrets`) to access GCR, would it be better to try to hack at this code to use the GCP secrets engine in Vault, or to hack at something else like the vault agent to create a kubernetes secret?