GoogleCloudPlatform / alloydb-auth-proxy

A utility for connecting securely to your AlloyDB instances
https://cloud.google.com/alloydb/docs/auth-proxy/overview?hl=hu
Apache License 2.0

Health check endpoint when running as sidecar #662

Closed: a-martynovich closed this issue 2 months ago

a-martynovich commented 2 months ago

Question

I'm running alloydb-auth-proxy as a "native" sidecar container in Kubernetes. My main container needs to connect to AlloyDB immediately at startup. The obvious solution is to wait until alloydb-auth-proxy is ready and then start the main container.

Kubernetes provides a startupProbe which can open a TCP connection to the specified port, and I can use that to check if alloydb-auth-proxy has started. But the probe requires the pod to expose the port, and I don't want to expose the port since nobody else needs to connect to it other than the main container in the pod (all traffic between sidecar and main should stay inside the pod).
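
For reference, a minimal sketch of the TCP probe described above. It assumes the kubelet reaches the probe port through the pod IP, so the proxy would have to listen on a non-loopback address (the --address value shown here is an assumption about how that would be configured, and the probe timings are illustrative). That requirement is what makes the port reachable from outside the pod.

```yaml
# Sketch only: TCP startup probe on the proxy sidecar.
initContainers:
- name: alloydb-proxy
  image: gcr.io/alloydb-connectors/alloydb-auth-proxy
  restartPolicy: Always
  args:
    - "--auto-iam-authn"
    - "--port=5432"
    # Assumption: bind to all interfaces so the kubelet can reach the port.
    - "--address=0.0.0.0"
    - "$(DATABASE_URI)"
  startupProbe:
    tcpSocket:
      port: 5432
    # Illustrative timings: probe every second, give up after 30 failures.
    periodSeconds: 1
    failureThreshold: 30
```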

Is there any other way to do a readiness check for the alloydb-auth-proxy container? A /healthz endpoint maybe? The image has no shell, so scripting isn't an option.

Code

spec:
  initContainers:
  - name: alloydb-proxy
    image: gcr.io/alloydb-connectors/alloydb-auth-proxy
    restartPolicy: Always
    envFrom:
    - configMapRef:
        name: ${CONFIGMAP_REF}
    args:
      - "--auto-iam-authn"
      - "--port=5432"
      - "$(DATABASE_URI)"
    ports:
      - containerPort: 5432
        protocol: TCP
  containers:
  - name: main
    env:
      - name: DATABASE_URL
        value: postgresql://localhost:5432/postgres

enocom commented 2 months ago

Reading through the docs on sidecar containers, I wonder, have you just tried your deployment as written above? Sidecar containers (and init containers) start before the main container (see the docs). So your application should be able to just connect and the Proxy will be ready.

If you do have problems with this approach, though, I'd be curious to hear about it.

The Auth Proxy has a built-in wait command that might also be useful -- although it was built prior to first class sidecar container support. The best documentation is in the help message (run ./alloydb-auth-proxy wait --help), but you can see that same message here: https://github.com/GoogleCloudPlatform/alloydb-auth-proxy/blob/4a501f6c10563cb2fbf30f0eed14d214af4c3525/cmd/root.go#L400-L427

If your deployment above doesn't work (please let me know), then you could use wait like this:

a-martynovich commented 2 months ago

Oh wow, this actually worked! Here's what I used:

spec:
  initContainers:
  - name: alloydb-proxy
    image: gcr.io/alloydb-connectors/alloydb-auth-proxy
    restartPolicy: Always
    startupProbe:
      exec:
        command: ["/alloydb-auth-proxy", "wait"]
    args:
      - "--auto-iam-authn"
      - "--port=5432"
      - "--health-check"
      - "$(DATABASE_URI)"
    ports:
      - containerPort: 5432
        protocol: TCP
    ...
  containers:
  - name: postgres
    image: postgres
    env:
      - name: DATABASE_URL
        value: postgresql://localhost:5432/postgres
...

I can see connections to the health check port in the logs, which means the probe is actually running.

As for the reasoning, and in response to this:

"I wonder, have you just tried your deployment as written above? Sidecar containers (and init containers) start before the main container (see the docs). So your application should be able to just connect and the Proxy will be ready."

The issue is that without a startup or readiness probe, a sidecar init container is considered ready as soon as its process starts, but it takes another few milliseconds to actually establish the AlloyDB connection. The main container may start faster and try to connect to the alloydb-proxy container before that. This is what I was seeing, and it is why I needed a health check endpoint. So, thanks for solving this!
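
A hedged sketch of how the probe timing could be made explicit. Kubernetes gives exec probes a default timeoutSeconds of 1, while the wait help text describes a default wait of 30 seconds, so it may help to align the two. The numbers below are illustrative, and the --max flag should be confirmed with ./alloydb-auth-proxy wait --help for the version in use.

```yaml
# Sketch only: exec startup probe with explicit timing.
startupProbe:
  exec:
    # Assumption: --max caps how long `wait` polls the proxy's startup endpoint.
    command: ["/alloydb-auth-proxy", "wait", "--max=5s"]
  # Give the probe slightly longer than the wait itself.
  timeoutSeconds: 6
  # Retry every second, up to 10 times, before the container is restarted.
  periodSeconds: 1
  failureThreshold: 10
```

The proxy itself still needs --health-check in its args, as in the manifest above, since wait relies on the health check server.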

enocom commented 2 months ago

Nice -- the startup probe configuration looks great.

FWIW the Auth Proxy establishes a connection to AlloyDB lazily so I'd expect once the process was up, it would be ready to receive connection attempts. But in any case, if wait is working for you, then I'm happy to hear it.
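
For readers who prefer an HTTP probe over the exec-based wait, here is a minimal sketch. It assumes that --health-check serves a /startup endpoint on the port set by --http-port (9090 by default, as far as I know), and that the server listens on localhost unless --http-address says otherwise. Check these flags against the proxy's --help output before relying on them.

```yaml
# Sketch only: HTTP startup probe against the proxy's health check server.
initContainers:
- name: alloydb-proxy
  image: gcr.io/alloydb-connectors/alloydb-auth-proxy
  restartPolicy: Always
  args:
    - "--auto-iam-authn"
    - "--port=5432"
    - "--health-check"
    # Assumption: the kubelet probes via the pod IP, so the health server
    # must listen on more than loopback.
    - "--http-address=0.0.0.0"
    - "--http-port=9090"
    - "$(DATABASE_URI)"
  startupProbe:
    httpGet:
      path: /startup
      port: 9090
    periodSeconds: 1
    failureThreshold: 30
```

With this variant only the health check port is reachable on the pod IP. The database listener on 5432 can stay bound to its default address.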