reactive-tech / kubegres

Kubegres is a Kubernetes operator allowing to deploy one or many clusters of PostgreSql instances and manage databases replication, failover and backup.
https://www.kubegres.io
Apache License 2.0

WAL segment ... has already been removed #115

Open nicraMarcin opened 2 years ago

nicraMarcin commented 2 years ago

Hi, on my instances I often get errors saying that a WAL segment has already been removed:

postgres-8 2022-05-26 06:19:49.228 GMT [25675] ERROR:  requested WAL segment 0000000400000001000000B1 has already been removed
postgres-8 2022-05-26 06:19:49.228 GMT [25675] STATEMENT:  START_REPLICATION 1/B1000000 TIMELINE 3
postgres-8 2022-05-26 06:19:54.233 GMT [25676] ERROR:  requested WAL segment 0000000400000001000000B1 has already been removed
postgres-8 2022-05-26 06:19:54.233 GMT [25676] STATEMENT:  START_REPLICATION 1/B1000000 TIMELINE 3
setup-replica-data-directory 26/05/2022 00:07:24 - Attempting to promote a Replica PostgreSql to Primary...
setup-replica-data-directory 26/05/2022 00:07:24 - Promoting by creating the promotion trigger file: '/var/lib/postgresql/data/pgdata/promote_replica_to_primary.lo

In this situation the replica was promoted to primary by the manager. I don't know why the previous master pod (postgres-7) was killed :( When I manually restart an instance, for example postgres-2, a new instance postgres-4 is started (because I've set 3 replicas) instead of the second instance being restarted.

themicster commented 1 year ago

I have the same problem, but the way I got there is a little different. I'm running on OpenShift and did a cluster upgrade, which resulted in psql-3-0 and psql-4-0 pods. Since this was in the middle of the night and I know no data that I care about has changed, I just deleted the deployment and redeployed, resulting in psql-1-0 and psql-2-0 as normal, and I expect it to be using a PVC from earlier as well. The database is up, but I'm getting the same error:

2023-05-31 08:15:54.124 GMT [1] LOG: starting PostgreSQL 13.10 (Debian 13.10-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2023-05-31 08:15:54.126 GMT [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2023-05-31 08:15:54.126 GMT [1] LOG: listening on IPv6 address "::", port 5432
2023-05-31 08:15:54.158 GMT [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-05-31 08:15:54.238 GMT [14] LOG: database system was shut down at 2023-05-31 07:53:13 GMT
2023-05-31 08:15:54.291 GMT [1] LOG: database system is ready to accept connections
2023-05-31 08:16:09.781 GMT [29] ERROR: requested WAL segment 0000000100000002000000C4 has already been removed
2023-05-31 08:16:09.781 GMT [29] STATEMENT: START_REPLICATION 2/C4000000 TIMELINE 1
2023-05-31 08:16:09.800 GMT [30] ERROR: requested WAL segment 0000000100000002000000C4 has already been removed
2023-05-31 08:16:09.800 GMT [30] STATEMENT: START_REPLICATION 2/C4000000 TIMELINE 1
2023-05-31 08:16:14.803 GMT [38] ERROR: requested WAL segment 0000000100000002000000C4 has already been removed
2023-05-31 08:16:14.803 GMT [38] STATEMENT: START_REPLICATION 2/C4000000 TIMELINE 1
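
For anyone reading the ERROR/STATEMENT pairs in this thread: a PostgreSQL WAL file name is 24 hex digits, made up of 8 for the timeline, 8 for the high half of the LSN, and 8 for the segment number. The sketch below decodes the two names from the logs above so they can be compared with the START_REPLICATION positions. It assumes the default 16 MB WAL segment size, and `parse_wal_segment_name` is a hypothetical helper name, not part of Kubegres or PostgreSQL.

```python
def parse_wal_segment_name(name: str, segment_size: int = 16 * 1024 * 1024):
    """Split a 24-hex-digit WAL file name into (timeline, start LSN).

    Assumes the default 16 MB segment size; pass segment_size if the
    cluster was initialised with a different value.
    """
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)    # timeline ID
    log_id = int(name[8:16], 16)     # high 32 bits of the LSN
    seg_no = int(name[16:24], 16)    # segment index within that log_id
    offset = seg_no * segment_size   # low 32 bits of the segment's start LSN
    return timeline, f"{log_id:X}/{offset:X}"


# Segment names taken from the error lines in this thread.
print(parse_wal_segment_name("0000000400000001000000B1"))  # (4, '1/B1000000')
print(parse_wal_segment_name("0000000100000002000000C4"))  # (1, '2/C4000000')
```

In both reports the decoded start LSN matches the position in the STATEMENT line, so the replica is asking for the very beginning of a segment that the primary no longer has. In the first report the file name carries timeline 4 while the replica requested TIMELINE 3.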