GoogleCloudPlatform / cloud-sql-proxy-operator

A Kubernetes Operator to automatically configure secure connections to Cloud SQL
Apache License 2.0

QUESTION: Why is cert-manager required or is it really? #603

Closed mike-pt closed 5 months ago

mike-pt commented 5 months ago

This is really just a question, mostly out of curiosity, and honestly because I would prefer to keep deps to the bare minimum, especially in PCI environments. Those introduce certain requirements, including internal CVE scanning and patching, and the more images there are, the more tedious this gets ofc :D

I currently deploy cloud-sql-proxy as a sidecar in our Helm charts, something like this:

containers:
....
      - name: cloud-sql-proxy
        image: {{ .Values.db.image.name }}:{{ .Values.db.image.tag }}
        command:
          - "/cloud_sql_proxy"
          - "-ip_address_types=PRIVATE"
          - "-instances={{ .Values.project }}:{{ .Values.region }}:{{ .Values.db.server }}=tcp:{{ .Values.db.localPort }}"
          - "-credential_file={{ .Values.db.serviceAccount.mountPath }}"

However, I started using operators for some things and found this one, which seems very handy for getting cloud-sql-proxy onto the pods started by those operators. I was surprised to see the requirement on cert-manager, though. Come to think of it, there's also no support for credential_file. That is actually not required on GKE (it's a leftover I have and should probably remove), but I can see how it would be handy for non-GKE clusters.

But anyway, I just want to understand the cert-manager requirement better, and whether it's possible to run this without it or if it's really a core component.

Thanks

hessjcg commented 5 months ago

Hi @mike-pt,

cert-manager helps the operator communicate with the Kubernetes API. It is technically possible to run the operator without cert-manager, but cert-manager makes it a lot easier.

The operator has its own REST API, which it serves with an SSL certificate so that it can receive webhook events from the Kubernetes API server. cert-manager automatically manages the Cloud SQL Operator's SSL certificates.

The certificates need to be configured on the operator's Deployment and on the AuthProxyWorkload CRD. It's complex. cert-manager is not strictly required, but it makes the job a lot easier. Instead of installing cert-manager in your cluster, you could maintain the operator's SSL certificates yourself. I once had a script that used OpenSSL and modified the operator installation YAMLs so that cert-manager was not necessary. But to make it production-worthy, that script would have become too complex to maintain.
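
As a rough sketch of what maintaining the certificates by hand involves (this is not the operator's actual manifest; every name, namespace, and path below is a hypothetical placeholder), two pieces have to be kept in sync: a TLS Secret that the operator's Deployment mounts, and the `caBundle` field that the API server uses to verify the webhook server. Each time the certificate is rotated, both need to be updated together.

```yaml
# Illustrative only: all names, namespaces, and paths are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: example-webhook-server-cert
  namespace: example-operator-system
type: kubernetes.io/tls
data:
  tls.crt: "<base64-encoded PEM certificate>"
  tls.key: "<base64-encoded PEM private key>"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-validating-webhook
webhooks:
  - name: validate.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      # Must be the CA that signed tls.crt above, or the API server
      # rejects the connection to the webhook.
      caBundle: "<base64-encoded PEM CA certificate>"
      service:
        name: example-webhook-service
        namespace: example-operator-system
        path: /validate-example
    rules:
      - apiGroups: ["example.com"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["examples"]
```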

The Cloud SQL Proxy containers do not interact with cert-manager.

mike-pt commented 5 months ago

I see, so it's for the webhook, validation, etc.?

In GCP, what about using Google Certificate Manager? I have used Google Cert Manager with DNS authorizations to get automatic cert generation/renewal for ingress... but I would need to see if this is possible with internal DNS.

Actually, how is cert-manager issuing those? This is all local to our K8s cluster, after all. Sorry if it's a dumb question, but I was just not aware cert-manager could work with internal DNS too. But perhaps this is supported.

hessjcg commented 5 months ago

That's correct: the certificate is created by cert-manager and used by the operator as its serving certificate for the webhook REST endpoint. The K8s control plane validates the server certificate when it calls the operator's webhooks.

The Google Cert Manager, in the context of GKE, will generate a certificate for an Ingress Controller with a Load Balancer. Google Cert Manager cannot generate a certificate for a Service used within the cluster. Since the K8s control plane reaches the operator using a Service with no Ingress Controller, it can't use Google Cert Manager.

The cert-manager reads the annotations on the AuthProxyWorkload CRD and the operator's Deployment. It will create a secret with a key and a self-signed certificate, and then add the certificate details to the correct places so that the operator's Deployment reads the secret, and the AuthProxyWorkload CRD webhooks include the certificate.
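
A minimal sketch of that pattern, assuming a typical cert-manager setup rather than the operator's real install YAML (all resource names and namespaces here are hypothetical): a self-signed Issuer, a Certificate for the in-cluster Service name, and the `cert-manager.io/inject-ca-from` annotation that tells cert-manager where to inject the CA bundle. Because the issuer is self-signed, no DNS or ACME validation is involved; the DNS names are just the Service's cluster-internal names.

```yaml
# Illustrative only: all names and namespaces are placeholders.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: example-selfsigned-issuer
  namespace: example-operator-system
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-serving-cert
  namespace: example-operator-system
spec:
  # cert-manager writes the key and certificate into this Secret,
  # which the operator's Deployment would mount.
  secretName: example-webhook-server-cert
  dnsNames:
    - example-webhook-service.example-operator-system.svc
    - example-webhook-service.example-operator-system.svc.cluster.local
  issuerRef:
    kind: Issuer
    name: example-selfsigned-issuer
---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-mutating-webhook
  annotations:
    # Format is <namespace>/<Certificate name>; cert-manager's CA injector
    # fills in clientConfig.caBundle on each webhook entry.
    cert-manager.io/inject-ca-from: example-operator-system/example-serving-cert
webhooks: []  # webhook entries elided in this sketch
```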

mike-pt commented 5 months ago

I thought as much. I suppose another option would be long-lived certs that we create ourselves, but I suppose that's the approach you tried before, and it is hard to maintain.

Since I don't want to add more dependencies in this env, I decided not to use this and instead just add cloud-sql-proxy as a sidecar (most operators allow this), e.g.:

  unsupported:
    podTemplate:
      spec:
        initContainers:
          - name: cloud-sql-proxy
            image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.11-alpine
            restartPolicy: Always
            args:
              - "--structured-logs"
              - "--private-ip"
              - "project:us-central1:alpha-postgres-main"

For other projects I might still use this, but I noticed you state that we need to use a specific version of cert-manager, so e.g. using cert-manager's own operator won't work?

hessjcg commented 5 months ago

Roll-your-own sidecar is a good way to go if you don't want to use the operator. It will work fine. Be sure to follow best practices in securing your secrets.

The operator installation pins a specific cert-manager version so that we know the combination has been tested. We also update it to the latest released cert-manager at each release of this operator.

The cert-manager API is not entirely stable across minor versions. They release upgrades frequently, and occasionally we have needed to adjust the Cloud SQL Operator's configuration when they do.