PrefectHQ / prefect-operator

A Kubernetes operator for managing Prefect servers and work pools

Add initContainer to wait for Postgres to be ready #95

Closed mitchnielsen closed 1 month ago

mitchnielsen commented 1 month ago

Summary

This ensures that workloads that depend on the database first check that the database is ready. This should help avoid CrashLoopBackOff states and can, in certain cases, improve overall spin-up time.

This mostly helps in scenarios where the database is not running yet (fresh instances or database upgrades). It is still worth pursuing, especially because it makes a Pod's dependencies explicit and helps us avoid CrashLoopBackOff problems.

Related to https://linear.app/prefect/issue/PLA-358/optimize-the-time-it-takes-for-the-prefect-operator-to-create-a-new

Testing

First, confirm the unit tests still pass. Additionally, you can manually check the logs for the new initContainer:

$ kubectl logs -f prefect-postgres-migration-3232e736-cmh79 -c wait-for-database
Waiting for PostgreSQL...
postgres:5432 - no response
Waiting for PostgreSQL...
postgres:5432 - no response
Waiting for PostgreSQL...
postgres:5432 - no response
Waiting for PostgreSQL...
postgres:5432 - accepting connections
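
For readers who want to reproduce output like the log above, here is a minimal sketch of a wait loop built around `pg_isready`, whose status lines match the `postgres:5432 - no response` format shown. This is an assumption about how such an initContainer could work, not the operator's actual implementation. The function name `wait_for_database` and the default 2-second interval are hypothetical.

```shell
#!/bin/sh
# Hypothetical wait loop; the real initContainer command may differ.
# Polls pg_isready until the server reports it is accepting connections.
# Arguments: host, port, and an optional polling interval in seconds.
wait_for_database() {
  host="$1"
  port="$2"
  interval="${3:-2}"  # assumed default, not taken from the PR

  while true; do
    echo "Waiting for PostgreSQL..."
    # pg_isready exits 0 only when the server accepts connections.
    if pg_isready -h "$host" -p "$port"; then
      return 0
    fi
    sleep "$interval"
  done
}

# Example call (commented out so sourcing this file has no side effects):
# wait_for_database postgres 5432
```

Because the loop never gives up on its own, a Pod using it stays in the `Init` phase rather than crash looping while the database starts. Whether to add an upper bound on attempts is a design choice this sketch leaves open.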

I also ran a fairly unscientific test to compare the total time it took for the PrefectServer to become Ready:

#!/bin/bash

# usage:
#   time ./test.sh

# Create instance
kubectl apply -f deploy/samples/v1_prefectserver_postgres.yaml

# Wait for the instance to be ready
kubectl wait --for=jsonpath='{.status.ready}'=true prefectserver/prefect-postgres

# Clean up
kubectl delete -f deploy/samples/v1_prefectserver_postgres.yaml
kubectl delete pvc postgres-database-postgres-0

Results:

Pretty significant difference here, mostly because the Prefect Server and migration Pods aren't crash looping while the database comes up.