Open BenchmarkingBuffalo opened 4 years ago
After 25 more seconds, the node started to promote itself. Is there a way I can reduce this amount of time? What I basically want is for the former standby to promote itself to master as soon as the master stops renewing its lock.
The promotion happens when the leader doesn't update the lock for 30 seconds (ttl). It is possible to reduce ttl, but I would strongly advise you not to do that. What if it is a temporary network glitch that will resolve itself soon?
Hi, I changed the ttl to two seconds (see the manifest above), so that can’t be it or am I wrong?
Please, don't do it! No one can beat the laws of physics!
If one changes ttl, loop_wait and retry_timeout also should be adjusted.
There is a formula which must hold: loop_wait + 2*retry_timeout <= ttl
See: https://patroni.readthedocs.io/en/latest/SETTINGS.html
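To make the constraint above concrete, here is a minimal sketch of a check you could run against your own values before applying them. The function name check_patroni_timings and the min_ttl parameter are hypothetical helpers for illustration, not part of Patroni; the 20-second floor is simply the figure quoted in this thread.

```python
def check_patroni_timings(ttl, loop_wait, retry_timeout, min_ttl=20):
    """Return a list of problems with the given timings (empty list means OK).

    Hypothetical helper: it only encodes the two rules mentioned in this
    thread, it does not call into Patroni itself.
    """
    problems = []
    # Rule quoted above: loop_wait + 2*retry_timeout <= ttl
    required = loop_wait + 2 * retry_timeout
    if required > ttl:
        problems.append(
            f"loop_wait + 2*retry_timeout = {required} exceeds ttl = {ttl}"
        )
    # Floor taken from the advice in this thread (assumption, not an API).
    if ttl < min_ttl:
        problems.append(f"ttl = {ttl} is below the suggested minimum of {min_ttl}")
    return problems


if __name__ == "__main__":
    # ttl=30, loop_wait=10, retry_timeout=10 satisfies both rules.
    print(check_patroni_timings(ttl=30, loop_wait=10, retry_timeout=10))
    # A ttl of 2 seconds with illustrative companion values violates both.
    for problem in check_patroni_timings(ttl=2, loop_wait=1, retry_timeout=1):
        print(problem)
```

With ttl=2 the check reports both violations, which matches the warning above: shrinking ttl alone is not enough, and going that low is unsafe in any case.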
Besides that, there are some hardcoded timeouts, so it is absolutely unsafe to go below 20 sec!
Ok, thank you for your reply. I had already adjusted the other values as well, but I did not know about the hardcoded timeouts.
Please answer some short questions, which should help us understand your problem / question better.
I want to check what happens when I disconnect the network from the worker node the master is running on. What I saw was this. Logs from the former standby:
2020-09-22 13:34:23,786 INFO: Lock owner: acid-minimal-cluster-0; I am acid-minimal-cluster-1
2020-09-22 13:34:23,786 INFO: does not have lock
2020-09-22 13:34:23,860 INFO: no action. i am a secondary and i am following a leader
2020-09-22 13:34:23,862 WARNING: Loop time exceeded, rescheduling immediately.
2020-09-22 13:34:25,386 WARNING: Request failed to acid-minimal-cluster-0: GET http://10.36.0.1:8008/patroni (HTTPConnectionPool(host='10.36.0.1', port=8008): Max retries exceeded with url: /patroni (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe1b8fe3828>: Failed to establish a new connection: [Errno 113] No route to host',)))
2020-09-22 13:34:25,520 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2020-09-22 13:34:25,582 INFO: promoted self to leader by acquiring session lock
2020-09-22 13:34:25,584 WARNING: Loop time exceeded, rescheduling immediately.
2020-09-22 13:34:25,584 INFO: Lock owner: acid-minimal-cluster-1; I am acid-minimal-cluster-1
2020-09-22 13:34:25,634 INFO: updated leader lock during promote
server promoting
2020-09-22 13:34:25,671 INFO: cleared rewind state after becoming the leader
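If you want to put a number on how long the standby took between two events in a log like the one above, a small script can do the subtraction for you. This is only a sketch: log_time and seconds_between are hypothetical helper names, and it assumes every line of interest starts with a timestamp in the "YYYY-MM-DD HH:MM:SS,mmm" format shown in these logs.

```python
from datetime import datetime


def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def seconds_between(lines, start_marker, end_marker):
    """Seconds from the first line containing start_marker to the first
    line containing end_marker (raises StopIteration if either is missing)."""
    start = next(log_time(l) for l in lines if start_marker in l)
    end = next(log_time(l) for l in lines if end_marker in l)
    return (end - start).total_seconds()


if __name__ == "__main__":
    # Two lines copied from the standby log in this thread.
    lines = [
        "2020-09-22 13:34:23,786 INFO: does not have lock",
        "2020-09-22 13:34:25,582 INFO: promoted self to leader by acquiring session lock",
    ]
    elapsed = seconds_between(lines, "does not have lock", "promoted self to leader")
    print(f"{elapsed:.3f} s")
```

Note that this only measures the gap between two log lines on the standby. It says nothing about when the network was actually disconnected, so the end-to-end failover time still has to be timed from the moment you pull the cable.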