vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
MIT License

eviction problem #170

Closed orlakwahr closed 1 year ago

orlakwahr commented 2 years ago

Hi.

I set up a cluster with 3 nodes with haproxy, keepalived and etcd - that works. Then I added a node as per the documentation with add_pgnode.yml and later set up etcd, haproxy and keepalived on it manually, so now I have 4 nodes in the cluster - that works too. Now I start sending load to the cluster and I get very good RPS all around.

Now I disconnect the network on node 4 - within 10 seconds or so the node is removed from the /usr/local/bin/patronictl list postgres-cluster output. Even before those 10 seconds, haproxy stops balancing traffic to the node, so I get 0 errors and the node is nicely removed from the cluster. If I later bring the network on node 4 back up, it gets in sync and appears again in /usr/local/bin/patronictl list postgres-cluster.

Now one more test: I disconnect the network on node 4 and it is removed from the cluster automatically. Then I disconnect the network on any other read replica - and now things get strange. That node is not removed from the /usr/local/bin/patronictl list postgres-cluster output.

It just sits there with state "running" even though it is not even on the network.

Things get even stranger if I disconnect the network on the master - it also never gets removed from the cluster, and no new master is elected.

In both of the latter scenarios there are a lot of errors, and RPS degrades dramatically - to an unusable extent in the case of the master disconnection.

Any pointers on how to fix this?

thanks

vitabaks commented 2 years ago

Hi @orlakwahr

Run the patronictl utility on a server whose network is still in order.
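
For example, using the hostnames from this thread (pg-db3 still had a working network):

# run the same command from a node that can still reach the etcd cluster
ssh pg-db3
/usr/local/bin/patronictl list postgres-cluster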

vitabaks commented 2 years ago

and later setup etcd, haproxy and keepalived on it manually now I have 4 nodes in the cluster

I don't understand why you do it manually; the load balancer can also be added automatically.

Use the add_balancer.yml playbook for this.

https://github.com/vitabaks/postgresql_cluster#cluster-scaling
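
A sketch of the invocation (assuming the new balancer host has already been added to the balancers group of your inventory, as described at the link above):

ansible-playbook add_balancer.yml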

vitabaks commented 2 years ago

Also, note:

If you want the cluster to survive the failure of two servers, then you need an etcd cluster of 5 nodes (quorum = N/2 + 1). In a cluster of three nodes it is possible to lose only 1 server; if more nodes are unavailable, the etcd cluster is not healthy because there is no quorum.
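
For reference, the standard etcd fault-tolerance arithmetic (quorum = floor(N/2) + 1):

Cluster size    Quorum    Failures tolerated
1               1         0
3               2         1
5               3         2
7               4         3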


See how raft works: http://thesecretlivesofdata.com/raft/

orlakwahr commented 2 years ago

Thanks. But the problem is that a cluster of 3 nodes cannot automatically recover from the loss of 1 node. I am trying to understand how to make it handle the loss automatically.

vitabaks commented 2 years ago

Patroni does a great job of handling auto failover.

Please show examples of your problem and attach the Patroni logs; maybe I didn't understand the question.

orlakwahr commented 2 years ago
Apr 28 13:49:01 localhost patroni[988]: 2022-04-28 13:48:58,543 INFO: Lock owner: pg-db3; I am pg-db1
Apr 28 13:49:01 localhost patroni[988]: 2022-04-28 13:49:01,048 INFO: Selected new etcd server http://10.16.18.135:2379
Apr 28 13:49:01 localhost patroni[988]: 2022-04-28 13:49:01,048 ERROR: Request to server http://10.16.18.133:2379 failed: MaxRetryError('HTTPConnectionPool(host=\'10.16.18.133\', port=2379): Max retries exceeded with url: /v2/keys/service/postgres-cluster/members/pg-db1 (Caused by ReadTimeoutError("HTTPConnectionPool(host=\'10.16.18.133\', port=2379): Read timed out. (read timeout=2.499780117010232)",))',)
Apr 28 13:49:01 localhost patroni[988]: 2022-04-28 13:49:01,048 INFO: Reconnection allowed, looking for another server.
Apr 28 13:49:01 localhost patroni[988]: 2022-04-28 13:49:01,048 ERROR:
Apr 28 13:49:01 localhost patroni[988]: Traceback (most recent call last):
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 449, in _make_request
Apr 28 13:49:01 localhost patroni[988]:    six.raise_from(e, None)
Apr 28 13:49:01 localhost patroni[988]:  File "<string>", line 3, in raise_from
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 444, in _make_request
Apr 28 13:49:01 localhost patroni[988]:    httplib_response = conn.getresponse()
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/lib64/python3.6/http/client.py", line 1361, in getresponse
Apr 28 13:49:01 localhost patroni[988]:    response.begin()
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/lib64/python3.6/http/client.py", line 311, in begin
Apr 28 13:49:01 localhost patroni[988]:    version, status, reason = self._read_status()
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/lib64/python3.6/http/client.py", line 272, in _read_status
Apr 28 13:49:01 localhost patroni[988]:    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/lib64/python3.6/socket.py", line 586, in readinto
Apr 28 13:49:01 localhost patroni[988]:    return self._sock.recv_into(b)
Apr 28 13:49:01 localhost patroni[988]: socket.timeout: timed out
Apr 28 13:49:01 localhost patroni[988]: During handling of the above exception, another exception occurred:
Apr 28 13:49:01 localhost patroni[988]: Traceback (most recent call last):
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 710, in urlopen
Apr 28 13:49:01 localhost patroni[988]:    chunked=chunked,
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 451, in _make_request
Apr 28 13:49:01 localhost patroni[988]:    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 341, in _raise_timeout
Apr 28 13:49:01 localhost patroni[988]:    self, url, "Read timed out. (read timeout=%s)" % timeout_value
Apr 28 13:49:01 localhost patroni[988]: urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='10.16.18.133', port=2379): Read timed out. (read timeout=2.499780117010232)
Apr 28 13:49:01 localhost patroni[988]: During handling of the above exception, another exception occurred:
Apr 28 13:49:01 localhost patroni[988]: Traceback (most recent call last):
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 211, in _do_http_request
Apr 28 13:49:01 localhost patroni[988]:    response = request_executor(method, base_uri + path, **kwargs)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/request.py", line 79, in request
Apr 28 13:49:01 localhost patroni[988]:    method, url, fields=fields, headers=headers, **urlopen_kw
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/request.py", line 170, in request_encode_body
Apr 28 13:49:01 localhost patroni[988]:    return self.urlopen(method, url, **extra_kw)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/poolmanager.py", line 376, in urlopen
Apr 28 13:49:01 localhost patroni[988]:    response = conn.urlopen(method, u.request_uri, **kw)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 786, in urlopen
Apr 28 13:49:01 localhost patroni[988]:    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/urllib3/util/retry.py", line 592, in increment
Apr 28 13:49:01 localhost patroni[988]:    raise MaxRetryError(_pool, url, error or ResponseError(cause))
Apr 28 13:49:01 localhost patroni[988]: urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='10.16.18.133', port=2379): Max retries exceeded with url: /v2/keys/service/postgres-cluster/members/pg-db1 (Caused by ReadTimeoutError("HTTPConnectionPool(host='10.16.18.133', port=2379): Read timed out. (read timeout=2.499780117010232)",))
Apr 28 13:49:01 localhost patroni[988]: During handling of the above exception, another exception occurred:
Apr 28 13:49:01 localhost patroni[988]: Traceback (most recent call last):
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 566, in wrapper
Apr 28 13:49:01 localhost patroni[988]:    retval = func(self, *args, **kwargs) is not None
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 659, in touch_member
Apr 28 13:49:01 localhost patroni[988]:    return self._client.set(self.member_path, data, None if permanent else self._ttl)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/etcd/client.py", line 721, in set
Apr 28 13:49:01 localhost patroni[988]:    return self.write(key, value, ttl=ttl)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/etcd/client.py", line 500, in write
Apr 28 13:49:01 localhost patroni[988]:    response = self.api_execute(path, method, params=params)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 256, in api_execute
Apr 28 13:49:01 localhost patroni[988]:    response = self._do_http_request(retry, machines_cache, request_executor, method, path, **kwargs)
Apr 28 13:49:01 localhost patroni[988]:  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 230, in _do_http_request
Apr 28 13:49:01 localhost patroni[988]:    raise etcd.EtcdException('{0} {1} request failed'.format(method, path))
Apr 28 13:49:01 localhost patroni[988]: etcd.EtcdException: PUT /v2/keys/service/postgres-cluster/members/pg-db1 request failed
Apr 28 13:49:01 localhost patroni[988]: 2022-04-28 13:49:01,050 INFO: no action. I am (pg-db1), a secondary, and following a leader (pg-db3)
Apr 28 13:49:01 localhost etcd[1078]: got unexpected response error (etcdserver: request timed out)
Apr 28 13:49:02 localhost etcd[1078]: 33070dbf2451ad42 [term: 261] received a MsgVote message with higher term from 99de4181ac8b022 [term: 262]
Apr 28 13:49:02 localhost etcd[1078]: 33070dbf2451ad42 became follower at term 262
Apr 28 13:49:02 localhost etcd[1078]: 33070dbf2451ad42 [logterm: 203, index: 16576203, vote: 0] cast MsgVote for 99de4181ac8b022 [logterm: 203, index: 16576203] at term 262
Apr 28 13:49:03 localhost etcd[1078]: health check for peer a5b9a4993cb72fad could not connect: dial tcp 10.16.18.134:2380: connect: no route to host (prober "ROUND_TRIPPER_RAFT_MESSAGE")
Apr 28 13:49:03 localhost etcd[1078]: health check for peer f12d636a65b65d2f could not connect: dial tcp 10.16.18.141:2380: connect: no route to host (prober "ROUND_TRIPPER_RAFT_MESSAGE")
Apr 28 13:49:03 localhost etcd[1078]: got unexpected response error (etcdserver: request timed out)
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 is starting a new election at term 262
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 became candidate at term 263
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 received MsgVoteResp from 33070dbf2451ad42 at term 263
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 [logterm: 203, index: 16576203] sent MsgVote request to a5b9a4993cb72fad at term 263
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 [logterm: 203, index: 16576203] sent MsgVote request to f12d636a65b65d2f at term 263
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 [logterm: 203, index: 16576203] sent MsgVote request to 99de4181ac8b022 at term 263
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 received MsgVoteResp from 99de4181ac8b022 at term 263
Apr 28 13:49:07 localhost etcd[1078]: 33070dbf2451ad42 [quorum:3] has received 2 MsgVoteResp votes and 0 vote rejections
Apr 28 13:49:07 localhost etcd[1078]: got unexpected response error (etcdserver: request timed out)
Apr 28 13:49:08 localhost etcd[1078]: health check for peer a5b9a4993cb72fad could not connect: dial tcp 10.16.18.134:2380: connect: no route to host (prober "ROUND_TRIPPER_RAFT_MESSAGE")
Apr 28 13:49:08 localhost etcd[1078]: health check for peer f12d636a65b65d2f could not connect: dial tcp 10.16.18.141:2380: connect: no route to host (prober "ROUND_TRIPPER_RAFT_MESSAGE")
Apr 28 13:49:08 localhost patroni[988]: 2022-04-28 13:49:08,539 INFO: Selected new etcd server http://10.16.18.133:2379
Apr 28 13:49:11 localhost patroni[988]: 2022-04-28 13:49:08,542 INFO: Lock owner: pg-db3; I am pg-db1
Apr 28 13:49:11 localhost patroni[988]: 2022-04-28 13:49:11,047 INFO: Selected new etcd server http://10.16.18.135:2379
Apr 28 13:49:11 localhost patroni[988]: 2022-04-28 13:49:11,047 ERROR: Request to server http://10.16.18.133:2379 fail

root@pg-db1:keepalived# /usr/local/bin/patronictl list postgres-cluster
2022-04-28 14:42:52,915 - ERROR - Request to server http://10.16.18.141:2379 failed: MaxRetryError("HTTPConnectionPool(host='10.16.18.141', port=2379): Max retries exceeded with url: /v2/keys/service/postgres-cluster/?recursive=true (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fe4ce701d68>, 'Connection to 10.16.18.141 timed out. (connect timeout=1.25)'))",)

+--------------+----------------+---------+---------+----+-----------+
| Member       | Host           | Role    | State   | TL | Lag in MB |
+ Cluster: postgres-cluster (7068655480869002509) --+----+-----------+
| pg-db1       | 10.16.18.133   | Replica | running | 13 |         0 |
| pg-db2       | 10.16.18.134   | Replica | running | 13 |         0 |
| pg-db3       | 10.16.18.135   | Leader  | running | 13 |           |
+--------------+----------------+---------+---------+----+-----------+
orlakwahr commented 2 years ago

And so, when these 3 nodes are up and running, the cluster is OK. But if I disconnect the network on any one of them, it still says "running", even though there are obviously a lot of errors.

vitabaks commented 2 years ago

INFO: no action. I am (pg-db1), a secondary, and following a leader (pg-db3)

1) If there is no network on a replica, then nothing happens: the replica will continue to work, and the replication lag will simply accumulate since there is no access to the primary. When the network is restored, the replica will try to catch up with the primary by replaying all the WALs (if they are still available or are in the archive). In a load-balancing scheme (haproxy), such a lagging replica is excluded from balancing when its replication lag exceeds maximum_lag_on_failover.
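
As a rough sketch of what that exclusion looks like on the haproxy side (a minimal example, not this playbook's exact template; the backend port and check timings are illustrative), haproxy health-checks every node against Patroni's REST API on port 8008 and only keeps nodes that answer 200 on /replica in the read-only pool:

listen postgres_read_only
    bind *:5001
    option httpchk OPTIONS /replica        # Patroni answers 200 only on a healthy running replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg-db1 10.16.18.133:5432 check port 8008
    server pg-db2 10.16.18.134:5432 check port 8008
    server pg-db3 10.16.18.135:5432 check port 8008

Recent Patroni versions also accept a lag threshold in the check URI (e.g. OPTIONS /replica?lag=1048576) if you want the exclusion to be tied to a specific amount of replication lag.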

2) If the network disappears on the primary (leader), then it will no longer be able to update the leader key in DCS (etcd), and after a while, usually 30 seconds (ttl), a new leader (primary) will be elected. The former leader will then be restarted as a replica.
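
The timings involved are the DCS parameters from patroni.yml (the values below are the ones quoted later in this thread); in the worst case, failure detection takes roughly ttl seconds, after which the replicas hold a new leader election:

bootstrap:
  dcs:
    ttl: 30            # leader key lifetime in etcd; the leader must renew it or lose the lock
    loop_wait: 10      # seconds between Patroni HA-loop runs
    retry_timeout: 10  # DCS/network failures shorter than this will not demote the leader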

orlakwahr commented 2 years ago

I have made one more cluster with Patroni 2.1.1 and there everything indeed works as expected. But this one is 2.1.3 and it basically does not work as expected. I guess the Python trace hints at the cause of the problem - I do not get such a trace on 2.1.1.

vitabaks commented 2 years ago

Try opening an issue in the Patroni project repository.

rkazak07 commented 1 year ago

Hello. I have created a cluster with 3 nodes: 3 etcd instances, PostgreSQL 14.5, and 3 HAProxy instances. I'm getting the following error. I set everything up again from scratch because I thought it was related to my configuration, but I get the same error again. How can I go about troubleshooting this etcd synchronization problem?

etcd service log:

Nov 11 12:50:18 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:22 etcd[1234]: f067dab16206035b [logterm: 44, index: 11008, vote: fff1b3af9b1bfc49] ignored MsgVote from be385cae4113fc0b [logterm: 44, index>
Nov 11 12:50:22 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:25 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:29 etcd[1234]: f067dab16206035b [logterm: 44, index: 11008, vote: fff1b3af9b1bfc49] ignored MsgVote from be385cae4113fc0b [logterm: 44, index>
Nov 11 12:50:32 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:35 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:38 etcd[1234]: f067dab16206035b [logterm: 44, index: 11008, vote: fff1b3af9b1bfc49] ignored MsgVote from be385cae4113fc0b [logterm: 44, index>
Nov 11 12:50:42 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:45 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:47 etcd[1234]: f067dab16206035b [logterm: 44, index: 11008, vote: fff1b3af9b1bfc49] ignored MsgVote from be385cae4113fc0b [logterm: 44, index>
Nov 11 12:50:52 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:50:53 etcd[1234]: got unexpected response error (etcdserver: request timed out) [merged 1 repeated lines in 1.88s]
Nov 11 12:50:55 etcd[1234]: f067dab16206035b [logterm: 44, index: 11008, vote: fff1b3af9b1bfc49] ignored MsgVote from be385cae4113fc0b [logterm: 44, index>
Nov 11 12:51:02 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:51:03 etcd[1234]: f067dab16206035b [logterm: 44, index: 11414, vote: fff1b3af9b1bfc49] ignored MsgVote from be385cae4113fc0b [logterm: 44, index>
Nov 11 12:51:04 etcd[1234]: got unexpected response error (etcdserver: request timed out) [merged 1 repeated lines in 1.93s]
Nov 11 12:51:09 etcd[1234]: f067dab16206035b [logterm: 44, index: 11414, vote: fff1b3af9b1bfc49] ignored MsgVote from fff1b3af9b1bfc49 [logterm: 44, index>
Nov 11 12:51:09 etcd[1234]: f067dab16206035b [term: 44] received a MsgApp message with higher term from fff1b3af9b1bfc49 [term: 64]
Nov 11 12:51:09 etcd[1234]: f067dab16206035b became follower at term 64
Nov 11 12:51:12 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:51:13 etcd[1234]: got unexpected response error (etcdserver: request timed out) [merged 1 repeated lines in 1.83s]
Nov 11 12:51:18 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:51:22 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:51:28 etcd[1234]: f067dab16206035b [logterm: 64, index: 11415, vote: 0] ignored MsgVote from be385cae4113fc0b [logterm: 64, index: 11458] at ter>
Nov 11 12:51:36 etcd[1234]: got unexpected response error (etcdserver: request timed out)
Nov 11 12:51:37 etcd[1234]: f067dab16206035b [logterm: 64, index: 11415, vote: 0] ignored MsgVote from be385cae4113fc0b [logterm: 64, index: 11458] at ter>
Nov 11 12:51:42 etcd[1234]: got unexpected response error (etcdserver: request timed out)

Patroni.yml:

bootstrap:
  method: initdb
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300

vitabaks commented 1 year ago

hello

Please see a similar issue https://github.com/etcd-io/etcd/issues/11809

And my recommendations - https://github.com/vitabaks/postgresql_cluster#recommendations
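
If the "request timed out" errors come from slow disks or network latency (as in the etcd issue linked above), the usual etcd knobs to look at are roughly these (illustrative values in etcd's YAML config format, not this project's defaults):

# /etc/etcd/etcd.conf.yml
heartbeat-interval: 100   # ms; increase on high-latency networks
election-timeout: 1000    # ms; keep roughly 10x the heartbeat interval
# also keep the etcd data-dir on a fast (SSD) disk that is not shared
# with the PostgreSQL data/WAL volume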
