zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License
4.24k stars, 968 forks

pg: cannot set transaction read write mode during recovery #2354

Open lkgGitHub opened 1 year ago

lkgGitHub commented 1 year ago

I have two pods in Kubernetes, postgres-service-0 and postgres-service-1. Initially, postgres-service-1 was the master. After a PostgreSQL master-slave failover, the application's client connections to PostgreSQL fail with the error "pg: cannot set transaction read write mode during recovery", even though both PostgreSQL pods are running. Restarting the application restores the connection.

postgres-service-1 log:

2023-06-13 07:31:47,890 INFO: Lock owner: postgres-service-1; I am postgres-service-1
2023-06-13 07:31:52,898 ERROR: Request to server https://10.96.0.1:443 failed: ReadTimeoutError("HTTPSConnectionPool(host='10.96.0.1', port=443): Read timed out. (read timeout=4.9869120344519615)",)
2023-06-13 07:31:53,858 WARNING: Concurrent update of postgres-service
2023-06-13 07:31:54,172 INFO: starting after demotion in progress
2023-06-13 07:31:54,174 INFO: Lock owner: postgres-service-0; I am postgres-service-1
2023-06-13 07:31:54,174 INFO: establishing a new patroni connection to the postgres cluster
2023-06-13 07:31:54,181 INFO: Local timeline=4 lsn=11/EC60FAD8
2023-06-13 07:31:54,212 INFO: master_timeline=5
2023-06-13 07:31:54,213 INFO: master: history=1	0/570000A0	no recovery target specified
2	4/2248FB28	no recovery target specified
3	A/D3FE9300	no recovery target specified
4	11/EC60FAD8	no recovery target specified
server signaled
2023-06-13 07:31:54,323 INFO: no action. I am (postgres-service-1), a secondary, and following a leader (postgres-service-0)
2023-06-13 07:31:54,325 INFO: Lock owner: postgres-service-0; I am postgres-service-1
2023-06-13 07:31:54,330 INFO: Local timeline=4 lsn=11/EC60FAD8
2023-06-13 07:31:54,361 INFO: master_timeline=5
2023-06-13 07:31:54,362 INFO: master: history=1	0/570000A0	no recovery target specified
2	4/2248FB28	no recovery target specified
3	A/D3FE9300	no recovery target specified
4	11/EC60FAD8	no recovery target specified
2023-06-13 07:31:54,372 INFO: no action. I am (postgres-service-1), a secondary, and following a leader (postgres-service-0)

postgres-service-0 log:

Got response from postgres-service-1 http://10.244.1.77:8008/patroni: {"state": "running", "postmaster_start_time": "2023-06-13 07:31:47.021588+00:00", "role": "replica", "server_version": 140005, "xlog": {"received_location": 76980222680, "replayed_location": 76980222680, "replayed_timestamp": "2023-06-13 07:31:41.062583+00:00", "paused": false}, "timeline": 4, "replication": [{"usename": "standby", "application_name": "postgres-service-0", "client_addr": "10.244.2.24", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1686641507, "database_system_identifier": "7215965247549096005", "patroni": {"version": "2.1.4", "scope": "postgres-service"}}
2023-06-13 07:30:43,392 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2023-06-13 07:30:43,522 INFO: promoted self to leader by acquiring session lock
server promoting
2023-06-13 07:30:43,549 INFO: cleared rewind state after becoming the leader
2023-06-13 07:30:43,524 INFO: Lock owner: postgres-service-0; I am postgres-service-0
2023-06-13 07:30:43,893 INFO: updated leader lock during promote
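
The report says the error disappears once the application is restarted. One possible reading, which nothing in this thread confirms, is that the client keeps reusing connections to the node that was demoted to a replica. Under that assumption, a client-side workaround could look roughly like the Python sketch below: recognise the error text, discard the connection, and open a new one. The names `is_recovery_mode_error`, `run_with_reconnect`, `connect` and `work` are hypothetical and not part of the operator, Patroni, or any driver.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def is_recovery_mode_error(message: str) -> bool:
    """Return True if the text looks like the error quoted in this issue.

    Matching on message text is an assumption made for illustration; hyphens
    are normalised so that both "read write" and "read-write" spellings match.
    """
    normalized = message.lower().replace("-", " ")
    return "read write mode during recovery" in normalized


def run_with_reconnect(connect: Callable[[], object],
                       work: Callable[[object], T],
                       max_attempts: int = 3) -> T:
    """Run work(conn) on a fresh connection, retrying on the recovery error.

    connect() is a hypothetical factory returning a new connection; it should
    resolve the primary again (for example through the master service) rather
    than reuse a cached address.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        conn = connect()
        try:
            return work(conn)
        except Exception as exc:
            # Unrelated failures are re-raised untouched.
            if not is_recovery_mode_error(str(exc)):
                raise
            last_error = exc
        finally:
            # Always drop the connection so a stale one is never reused.
            close = getattr(conn, "close", None)
            if callable(close):
                close()
    raise last_error
```

This only hides the symptom on the client side. It does not explain why connections to the old master survived the demotion in the first place.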


noahge commented 1 year ago

same problem

gtejasvi commented 1 year ago

Observed the exact same issue more than once on an instance. It resolves if the read replica is restarted.

obitoquilt commented 7 months ago

same problem

meltingrock commented 6 months ago

same issue

danpe commented 4 months ago

same here

Danieloni1 commented 2 months ago

Still reproducible on latest operator v1.12.2. Any updates about this issue?

Bohooslav commented 1 month ago

Same problem

ddh27 commented 1 month ago

Same problem