reactive-tech / kubegres

Kubegres is a Kubernetes operator allowing to deploy one or many clusters of PostgreSql instances and manage databases replication, failover and backup.
https://www.kubegres.io
Apache License 2.0
1.32k stars, 74 forks

backup job doesn't start cause: Not starting job because prior execution is running and concurrency policy is Forbid #143

Open antonioluzzi opened 1 year ago

antonioluzzi commented 1 year ago

Hi, I have a simple cluster: 1 master and 2 replicas. I added a backup configuration to the deployment of my cluster:

backup:
  schedule: "0 /1 "
  pvcName: my-backup-pvc
  volumeMount: /var/lib/backup

But it doesn't work well. When I describe my cronjob, I get this message:

Normal JobAlreadyActive 112s (x5 over 121m) cronjob-controller Not starting job because prior execution is running and concurrency policy is Forbid

And inside the backup cronjob pod I have the following log:

[root@worker-sgpd postgis]# kubectl logs -f backup-mypostgres-27851640-tlqcx
15/12/2022 14:54:37 - Starting DB backup of Kubegres resource mypostgres into file: /var/lib/backup/mypostgres-backup-15_12_2022_14_54_37.gz
15/12/2022 14:54:37 - Running: pg_dumpall -h mypostgres-replica -U postgres -c | gzip > /var/lib/backup/mypostgres-backup-15_12_2022_14_54_37.gz
pg_dumpall: error: connection to server at "mypostgres-replica" (192.168.108.126), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
connection to server at "mypostgres-replica" (192.168.108.100), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?

If I start a Postgres client pod like this:

kubectl run postgresql-dev-client --rm --tty -i --restart='Never' --namespace default --image docker.io/bitnami/postgresql:14.1.0-debian-10-r80 --env="PGPASSWORD=admin" -- bash

then inside the container I can connect to my Postgres:

psql --host mypostgres -U postgres -d postgres -p 5432

postgres-# \conninfo
You are connected to database "postgres" as user "postgres" on host "mypostgres" (address "192.168.108.123") at port "5432".

I don't know why I can connect to my Postgres from inside a pod while the connection from the cronjob fails. Another strange thing: if I delete the pod of my backup cronjob, the first backup works and the second one fails because of a connection timeout. Can you help me?

Regards,
Antonio

alazycoder101 commented 1 year ago

You can run kubectl get cronjob to find the cronjob, then create a one-off job from it:

kubectl create job --from=cronjob/{your cronjob} mybackup

Then run kubectl get pod to watch the pod, and debug it with kubectl exec.
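The steps above can be sketched as a small shell function. This is only an illustration under assumptions: the CronJob name `backup-mypostgres` is inferred from the pod name in the log (`backup-mypostgres-27851640-tlqcx`) and may differ in your cluster, `mybackup` is just the job name used in this comment, and `run_backup_once` is a hypothetical helper name.

```shell
# Sketch: trigger the backup CronJob once by hand and inspect the result.
# Both default names below are assumptions; pass your own as arguments.
run_backup_once() {
  cronjob="${1:-backup-mypostgres}"   # assumed CronJob name, inferred from the pod name
  job="${2:-mybackup}"                # name for the one-off Job

  # Confirm the CronJob exists before creating a Job from it.
  kubectl get cronjob "$cronjob" || return 1

  # Create a one-off Job from the CronJob's job template.
  kubectl create job --from="cronjob/$cronjob" "$job" || return 1

  # Pods created by a Job carry the label job-name=<job>.
  kubectl get pods -l "job-name=$job"

  # Print the Job's logs; this may need a retry while the pod is still starting.
  kubectl logs "job/$job"
}
```

Calling `run_backup_once` with no arguments uses the assumed names. A job created this way is started directly rather than by the cronjob-controller's schedule, so it should not hit the Forbid check that blocks the scheduled run, which helps to reproduce the backup failure in isolation.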

It seems that the backup runs against the replica (mypostgres-replica), while the connection you tested, and which works, goes to the primary (mypostgres).
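To repeat the connection test against the same host the backup uses, something like the following might help. It is a rough sketch, not a confirmed fix: it reuses the image from the original post, assumes the `default` namespace, and the pod name `pg-replica-check` and function name `check_replica` are made up for illustration. `pg_isready` is the standard PostgreSQL client utility and is assumed to be available in that image.

```shell
# Sketch: check whether the replica service accepts connections.
check_replica() {
  svc="${1:-mypostgres-replica}"   # host used by pg_dumpall in the backup log

  # Show which pod IPs currently back the replica service.
  kubectl get endpoints "$svc"

  # Compare those IPs with the pods and their status.
  kubectl get pods -o wide

  # Run a throwaway client pod (hypothetical name) and probe port 5432 on the service.
  kubectl run pg-replica-check --rm -i --restart=Never \
    --namespace default \
    --image docker.io/bitnami/postgresql:14.1.0-debian-10-r80 \
    -- pg_isready -h "$svc" -p 5432
}
```

If `pg_isready` reports that the replica service is not accepting connections while `mypostgres` is, the problem is with the replica pods rather than with the backup job itself.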