hashicorp / vault-csi-provider

HashiCorp Vault Provider for Secret Store CSI Driver

not a general issue. Couldn't find any docs. #87

Closed hanem100k closed 3 years ago

hanem100k commented 3 years ago

Apologies for raising this here (it might not be an issue with the provider), but I couldn't find any documentation on how to set up TLS support properly for vault-csi-provider.

Inside a Kubernetes cluster I have Vault set up via the Helm chart with the following config:


# Vault Helm Chart Value Overrides
global:
  enabled: true
  tlsDisable: false

csi:
  enabled: true
  image:
    repository: hashicorp/vault-csi-provider
    tag: latest
  volumes:
    - name: tls
      secret:
        secretName: vault-csi-tls

  volumeMounts:
    - name: tls
      mountPath: /vault/tls
      readOnly: true

  resources:
    requests:
      cpu: 50m
      memory: 128Mi
    limits:
      cpu: 50m
      memory: 128Mi

  daemonSet:
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 100m
        memory: 256Mi

injector:
  enabled: true
  # Use the Vault K8s Image https://github.com/hashicorp/vault-k8s/
  image:
    repository: "hashicorp/vault-k8s"
    tag: "latest"

  resources:
    requests:
      memory: 128Mi
      cpu: 125m
    limits:
      memory: 256Mi
      cpu: 250m

server:
  image:
    repository: "vault"
    tag: "1.7.0"
    # Overrides the default Image Pull Policy
    pullPolicy: IfNotPresent
  resources:
    requests:
      memory: 128Mi
      cpu: 125m
    limits:
      memory: 512Mi
      cpu: 250m

  # For HA configuration and because we need to manually init the vault,
  # we need to define custom readiness/liveness Probe settings
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
  livenessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true"
    initialDelaySeconds: 60

  # extraEnvironmentVars is a list of extra environment variables to set with the stateful set. These could be
  # used to include variables required for auto-unseal.
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
    GOOGLE_APPLICATION_CREDENTIALS: /vault/userconfig/kms-creds/<project-name>-adeb5c46bc2b.json

  # extraVolumes is a list of extra volumes to mount. These will be exposed
  # to Vault in the path `/vault/userconfig/<name>/`.
  extraVolumes:
    - type: secret
      name: vault-server-tls
    - type: secret
      name: 'kms-creds'

  # This configures the Vault Statefulset to create a PVC for audit logs.
  # See https://www.vaultproject.io/docs/audit/index.html to know more
  auditStorage:
    enabled: true

  standalone:
    enabled: false

  # Run Vault in "HA" mode.
  ha:
    enabled: true
    replicas: 5
    raft:
      enabled: true
      setNodeId: true

      config: |
        ui = true
        listener "tcp" {
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          tls_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
        }

        seal "gcpckms" {
          project     = "test"
          region      = "global"
          key_ring    = "test"
          crypto_key  = "test-key"
        }

        storage "raft" {
          path = "/vault/data"
          retry_join {
            leader_api_addr = "https://vault-0.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-1.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-2.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-3.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-4.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
        }

        service_registration "kubernetes" {}

I did set up TLS for both vault and csi provider following this guide:

https://www.vaultproject.io/docs/platform/k8s/helm/examples/standalone-tls

In another namespace where I have my Secret provider manifest defined:

apiVersion: secrets-store.csi.x-k8s.io/v1alpha1
kind: SecretProviderClass
metadata:
  namespace: heimdall-dev
  name: heimdall-config
spec:
  provider: vault
  secretObjects:
  - secretName: heimdall-dev-secrets
    type: Opaque
    data:
    - objectName: DATABASE_URL  # References the objectName in parameters.objects below
      key: DATABASE_URL         # Key within the k8s secret for this value
    - objectName: test
      key: test
  parameters:
    roleName: "heimdall-dev"
    vaultAddress: "https://vault.vault:8200"
    vaultCACertPath: "/vault/tls/vault.crt"
    objects: |
      - objectName: "DATABASE_URL"
        secretPath: "heimdall-dev/config/env"
        secretKey: "DATABASE_URL"
      - objectName: "test"
        secretPath: "heimdall-dev/config/env"
        secretKey: "test"

Volume mount in the Deployment:

                  volumeMounts:                     
                      - name: heimdall-config
                        mountPath: /mnt/secrets-store
                        readOnly: true

Volume

            volumes:
                - name: heimdall-config
                  csi:
                    driver: secrets-store.csi.k8s.io
                    readOnly: true
                    volumeAttributes:
                        providerName: vault
                        secretProviderClass: heimdall-config

When firing the above up I get this error from the deployment:

MountVolume.SetUp failed for volume "heimdall-config" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod heimdall-dev/heimdall-5c86564fcb-qq5px, err: rpc error: code = Unknown desc = error making mount request: failed to login: Post "https://vault.vault:8200/v1/auth/kubernetes/login": x509: certificate signed by unknown authority

and these from the vault:

2021-04-23T10:54:17.831Z [INFO] http: TLS handshake error from 10.20.2.24:60484: remote error: tls: bad certificate
2021-04-23T10:54:20.144Z [INFO] http: TLS handshake error from 10.20.2.24:60538: remote error: tls: bad certificate

My cluster is hosted in GCP. I was thinking the leaf certificates should be fine because they are signed by the Kubernetes intermediate CA (we send the CSR to it), which chains to a proper root, so everything should work like in the movies! However, it's a bad movie by now. We have cert-manager inside our cluster; is there a way to point the CSI provider at that? Or Vault, for that matter?
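When Vault logs `bad certificate` and the client reports `x509: certificate signed by unknown authority`, the quickest sanity check is to confirm, outside the cluster, that the CA bundle handed to the provider actually verifies the certificate Vault serves. A minimal local sketch (a toy CA and server stand in for the Kubernetes CA and Vault; all filenames here are made up):

```shell
set -e
workdir=$(mktemp -d) && cd "$workdir"

# Toy CA standing in for the Kubernetes CA that signed Vault's cert.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout ca.key -out ca.crt -subj "/CN=toy-k8s-ca"

# Leaf cert standing in for Vault's serving certificate.
openssl req -newkey rsa:2048 -nodes \
  -keyout vault.key -out vault.csr -subj "/CN=vault.vault"
openssl x509 -req -days 1 -in vault.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial -out vault.crt

# 1. Offline: does the CA bundle verify the leaf? This is essentially the
#    check that fails with "certificate signed by unknown authority".
openssl verify -CAfile ca.crt vault.crt

# 2. Live: ask a running server which chain it actually presents. Against
#    a real cluster you would target vault.vault:8200 instead.
openssl s_server -accept 18200 -cert vault.crt -key vault.key -quiet &
srv=$!; sleep 1
echo | openssl s_client -connect 127.0.0.1:18200 -CAfile ca.crt 2>/dev/null \
  | grep "Verify return code"
kill $srv
```

If step 1 fails when run against the real vault.ca and the cert Vault serves, the bundle mounted at vaultCACertPath is not the CA that signed Vault's serving certificate.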

Could you guys point me to a general guide/direction?

Thanks a lot!

tomhjp commented 3 years ago

Thanks for the detailed configs. A couple of issues I've spotted first:

Otherwise it looks like you're mostly on the right track. In particular, you've recognised that the SecretProviderClass TLS parameters refer to files on the CSI provider pod's file system, which is a bit of a gotcha.

Currently, the best worked example is in the e2e tests in this repo, but I'm definitely planning to add a documentation example that addresses TLS specifically.

hanem100k commented 3 years ago

Thanks, @eyenx @tomhjp.

> Hi, you need to set the issuer correctly when creating the vault-auth binding (auth/kubernetes) as done here:

I double-checked this; no help, unfortunately. However, I think if this were the issue, the API would just return a 403 or something along those lines. I mean, the request would not be short-circuited, right?

> • vaultCACertPath: "/vault/tls/vault.crt" should be vaultCACertPath: "/vault/tls/vault.ca"; this field is setting the trusted CA, which should be the Kubernetes CA if you have created your certificates as per the linked docs.

The manifest I copied above was actually outdated; I had the right reference to the CA, but I also had the key pair configured in addition. I fixed all that, and this is the current state:

apiVersion: secrets-store.csi.x-k8s.io/v1alpha1
kind: SecretProviderClass
metadata:
  namespace: heimdall-dev
  name: heimdall-config
spec:
  provider: vault
  secretObjects:
  - secretName: heimdall-dev-secrets
    type: Opaque
    data:
    - objectName: DATABASE_URL  # References the objectName in parameters.objects below
      key: DATABASE_URL         # Key within the k8s secret for this value
    - objectName: test
      key: test
  parameters:
    roleName: "heimdall-dev"
    vaultAddress: "https://vault.vault:8200"
    vaultCACertPath: /vault/tls/vault.ca

    objects: |
      - objectName: "DATABASE_URL"
        secretPath: "heimdall-dev/config/env"
        secretKey: "DATABASE_URL"
      - objectName: "test"
        secretPath: "heimdall-dev/config/env"
        secretKey: "test"

With this setting I'm still getting a 500 and the same error (screenshot attached).

Some other details: for the CSI provider certificate, I have the following CSR config:

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names
[alt_names]
DNS.1 = vault-csi
DNS.2 = vault-csi-provider
DNS.3 = vault.csi-*

IP.1 = 127.0.0.1

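For what it's worth, a quick way to confirm the SANs from a config like the above actually end up in the CSR (hypothetical filenames; the config body is copied from the snippet above):

```shell
# Recreate the CSR config from above (saved here as csi.cnf).
cat > csi.cnf <<'EOF'
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names
[alt_names]
DNS.1 = vault-csi
DNS.2 = vault-csi-provider
DNS.3 = vault.csi-*
IP.1 = 127.0.0.1
EOF

# Generate a key and CSR with it, then print the requested SANs;
# if the grep comes back empty, the extensions were silently dropped.
openssl req -new -newkey rsa:2048 -nodes -config csi.cnf \
  -subj "/CN=vault-csi" -keyout csi.key -out csi.csr
openssl req -in csi.csr -noout -text | grep -A1 "Subject Alternative Name"
```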

I'm thinking it might be a DNS issue at this point. The CSR for Vault is almost the same except for the DNS values, of course.

Is there a way I could get more verbose logs from Vault? Or what would be the next step in solving this issue? (Right now Vault just logs a TLS handshake error.)

Also, here is the error from the CSI provider (screenshot attached).

Thanks a lot guys for the support!

Edit:

Found the issue: the issuer in one of the auth configurations had a typo in the cluster name. It was really annoying to track down, because it could have been so many things across all the certs and policies. (Maybe separate errors for each of these failure modes would help.)

Still, I'm getting more and more familiar with Vault and its ecosystem. Thank you all for your hard work. Closing the issue.

tomhjp commented 3 years ago

Well done tracking down the issue. For anyone else stumbling across something similar: when you get a `claim "iss" is invalid` error, it most likely means the kubernetes auth mount configuration needs updating, in particular the issuer.

Default service account tokens are created by the Kubernetes ServiceAccount admission controller, and the JWTs it creates typically have kubernetes/serviceaccount as the issuer, which is also the default value for issuer in Vault's kubernetes auth method, so everything normally "just works" when using pre-existing tokens. However, the service account tokens created and used by the Vault CSI provider for auth use the value of kube-apiserver's --service-account-issuer flag as the issuer, and the kubernetes auth mount needs a matching value for issuer validation. I noted a few common values in the tests, but I think we should probably add a more prominent note about this in the docs as well.
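To find the issuer value your cluster actually uses, you can decode the claims segment of any projected service account token (on a pod, that's the file at /var/run/secrets/kubernetes.io/serviceaccount/token). A sketch, using a locally constructed token so the decode step is reproducible anywhere; the issuer URL below is a made-up GKE-style value:

```shell
# Fake a JWT so this runs anywhere; a real one comes from the token file above.
claims='{"iss":"https://container.googleapis.com/v1/projects/my-proj/locations/us/clusters/my-cluster"}'
b64url() { base64 | tr -d '=\n' | tr '+/' '-_'; }
token="$(printf '%s' '{"alg":"none"}' | b64url).$(printf '%s' "$claims" | b64url).sig"

# Decode the middle (claims) segment: re-pad and swap back the URL-safe alphabet.
seg=$(printf '%s' "$token" | cut -d. -f2)
while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
printf '%s' "$seg" | tr '_-' '/+' | base64 -d; echo
```

Whatever `iss` that prints for a real token is what the auth mount needs, e.g. `vault write auth/kubernetes/config issuer="<that value>" ...` alongside the usual kubernetes_host and CA settings.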

EDIT: The simpler alternative, of course, is to set disable_iss_validation=true, but that's not recommended.