external-secrets / external-secrets

External Secrets Operator reads information from a third-party service like AWS Secrets Manager and automatically injects the values as Kubernetes Secrets.
https://external-secrets.io/main
Apache License 2.0

AWS SecretsManager requiring AWS_CA_BUNDLE #3009

Closed j-wozniack closed 6 days ago

j-wozniack commented 5 months ago

Describe the bug: When I provide a CA bundle, my deployment appears healthy, but no secret is ultimately created. If I don't provide the CA, I see x509 TLS errors, as I would expect.

To Reproduce: I deployed the latest Helm chart (0.9.11) to a cluster with values like:

extraVolumes:
  - name: aws-ca-bundle
    secret:
      secretName: aws-ca-bundle
extraVolumeMounts:
  - name: aws-ca-bundle
    mountPath: "/tmp/ca-bundle"
extraEnv:
  - name: AWS_CA_BUNDLE
    value: "/tmp/ca-bundle/tls-ca-bundle.pem"

certController:
  extraVolumes:
    - name: aws-ca-bundle
      secret:
        secretName: aws-ca-bundle
  extraVolumeMounts:
    - name: aws-ca-bundle
      mountPath: "/tmp/ca-bundle"
  extraEnv:
    - name: AWS_CA_BUNDLE
      value: "/tmp/ca-bundle/tls-ca-bundle.pem"

webhook:
  extraVolumes:
    - name: aws-ca-bundle
      secret:
        secretName: aws-ca-bundle
  extraVolumeMounts:
    - name: aws-ca-bundle
      mountPath: "/tmp/ca-bundle"
  extraEnv:
    - name: AWS_CA_BUNDLE
      value: "/tmp/ca-bundle/tls-ca-bundle.pem"

I have the following ClusterSecretStore, which reports a healthy status:

apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secretsmanager
spec:
  provider:
    aws:
      service: SecretsManager
      region: <my region>

And the following ClusterExternalSecret, which also reports a healthy status:

apiVersion: external-secrets.io/v1beta1
kind: ClusterExternalSecret
metadata:
  name: test-secret
spec:
  namespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: default
  refreshTime: "15s"
  externalSecretSpec:
    refreshInterval: 1m
    secretStoreRef:
      name: aws-secretsmanager
      kind: ClusterSecretStore
    target:
      name: test-secret
    data:
    - secretKey: test-key
      remoteRef:
        key: test-secret

I can see that the ClusterExternalSecret properly creates an ExternalSecret in the default namespace.

Each pod's logs look healthy as well, but ultimately no secret is created. I've also tried setting the log level for each pod to debug, but the additional logs don't indicate any problems.

When I remove the CA bundle, the external-secrets pod throws x509 TLS errors as expected (indicating the bundle is being respected).

Expected behavior: The service should pull the secret from SecretsManager, using AWS_CA_BUNDLE to verify the TLS connection.

Screenshots None

Additional context N/A

moolen commented 5 months ago

Do you have some sort of proxy running between AWS and ESO? The default trust chain is enough to connect with AWS unless there's a man in the middle like a proxy.

j-wozniack commented 5 months ago

We are not using a proxy between AWS and ESO. For additional context, this is within AWS GovCloud and the nodes are RKE2 instances. Each node is an EC2 instance with the proper IAM role to pull from SecretsManager.

moolen commented 5 months ago

:thinking: What's the CA of the certificate presented?

ESO uses distroless, which comes with a set of CA certificates, see their repo: https://github.com/GoogleContainerTools/distroless

I guess the issue is that GovCloud uses a different CA, so you'll have to work around it. I don't think this is a bug. Are you able to verify my assumption?

j-wozniack commented 5 months ago

When I get into the office tomorrow I will verify the CA and the distroless repo.

In my opinion, the issue isn't that we are using a different CA; it's that I should be able to set the AWS_CA_BUNDLE environment variable and have ESO honor it. However, when we set that environment variable, it blocks ESO from pulling any secrets. I was digging through pkg/providers/AWS/auth and see that it uses the aws-go-sdk-sessions, which recognizes and sets that environment variable. I'm just confused about why that prevents ESO from pulling secrets.

If we are unable to set AWS_CA_BUNDLE, can we configure ESO to use additional CAs some other way?

moolen commented 5 months ago

You can mount custom CA certs on the operator; by default, Go looks at these locations.

People have reported that this has worked for them.

j-wozniack commented 5 months ago

I will note that another one of our developers referenced that Go documentation and we tried mounting them there (we do that for other charts that require it). I will try that second recommendation. I'm just more concerned that even if we put a dummy cert or a real cert in place and set the path with AWS_CA_BUNDLE, it in a sense bricks ESO.

We have used the CA in question to resolve other TLS x509 issues with other go projects using the aws-sdk so from our perspective it isn't necessarily a certificate issue.

I appreciate all the help and support! Thank you.

j-wozniack commented 5 months ago

We verified our CA is a private CA, and even when we mounted it to those suggested locations, it still fails. In the issue you sent, they are still able to point to that private certificate (the GitLab CRD provider has the option to point to a private CA), whereas we are unable to set AWS_CA_BUNDLE to point to the private CA.

moolen commented 5 months ago

I dug into the AWS Go SDK, and it looks like AWS_CA_BUNDLE should indeed be loaded. I think it's worth investigating why ESO doesn't create the secret. ESO should show a reconciled secret log message when it has reconciled an ExternalSecret. Do you see that line?

Further suggestions for debugging:

Can you do a kubectl describe es <name> and share the output?
Is ESO able to connect with AWS SM on the network level? Maybe try to kubectl debug and attach a debug pod to ESO and verify that you're able to talk to Secrets Manager.
Do you use VPC endpoints?
Do NACL or security groups prevent egress towards AWS SM?

j-wozniack commented 5 months ago

Sorry for the delay; we were able to get it to work. We mounted the AWS_CA_BUNDLE certs to /etc/ssl/certs and unset AWS_CA_BUNDLE. For some reason, when that variable is set, it stops ESO from handling secrets properly. Mounting the cert and leaving the environment variable unset resolved this.

I know that is not ideal, but that was our fix!
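
For readers who land here with the same problem, the workaround described above might look roughly like the following Helm values. This is only a sketch, not the reporter's actual configuration: it reuses the aws-ca-bundle secret and the tls-ca-bundle.pem key from the original report. The use of subPath is an assumption added here so that the mount places a single file in /etc/ssl/certs rather than shadowing the CA files shipped in the image. It relies on Go's crypto/x509 reading PEM files found in /etc/ssl/certs on Linux.

```yaml
# Sketch of the workaround: mount the private CA into Go's default
# certificate directory and do NOT set AWS_CA_BUNDLE.
extraVolumes:
  - name: aws-ca-bundle
    secret:
      secretName: aws-ca-bundle        # secret name taken from the original report
extraVolumeMounts:
  - name: aws-ca-bundle
    # Assumption: subPath mounts only this one file, so the image's own
    # /etc/ssl/certs contents stay in place next to it.
    mountPath: /etc/ssl/certs/tls-ca-bundle.pem
    subPath: tls-ca-bundle.pem         # key name taken from the original report
    readOnly: true
# Note: there is no extraEnv entry for AWS_CA_BUNDLE in this variant.
```

The original values applied the same volume settings under certController and webhook as well; whether those components need the CA is not established in this thread. One trade-off of subPath is that Kubernetes does not propagate later updates of the secret into a subPath mount, so the pod would need a restart after the CA is rotated.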

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] commented 6 days ago

This issue was closed because it has been stalled for 30 days with no activity.