cybertec-postgresql / vip-manager

Manages a virtual IP based on state kept in etcd or Consul
BSD 2-Clause "Simplified" License

IP-Address not switched on Hard-Shutdown #194

Closed mnietz closed 9 months ago

mnietz commented 9 months ago

We have a three-node PostgreSQL / Patroni / etcd / vip-manager cluster. When we do a planned switchover or shutdown, vip-manager works as expected and moves the IP to the new leader. But when we kill the leader VM (Proxmox: stop) for testing purposes, vip-manager does not bring up the IP on the new leader, even though the Patroni failover works fine.

{"level":"warn","ts":"2024-01-11T16:09:30.601759+0100","logger":"etcd-client","caller":"v3@v3.5.11/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00014bc00/db1:2379","attempt":1,"error":"rpc error: code = Unavailable desc = transport is closing"}

etcd: 3.5.11 using APIv3
vip-manager: 2.2 (same behaviour with 2.1)

config:

interval: 1000 
trigger-key: "/patroni/cluster-name/leader"
trigger-value: "db1"
ip: 192.168.1.1
netmask: 24
interface: ens18
hosting-type: basic

dcs-type: etcd
dcs-endpoints:
  - http://db1:2379
  - http://db2:2379
  - http://db3:2379

retry-num: 3
retry-after: 250

dreamingdeer commented 9 months ago

vip-manager on the new leader still reports: IP address 192.168.1.1/24 state is false, desired false, because this call waits for 15 minutes:

        resp, err := e.kapi.Get(ctx, e.key)

dreamingdeer commented 9 months ago

https://github.com/etcd-io/etcd/issues/8905 - nice

mnietz commented 9 months ago

We did some testing and it now works as expected. Thanks for fixing!

pashagolub commented 9 months ago

Kudos to @dreamingdeer! 👍

mnietz commented 9 months ago

@pashagolub when do you plan to release it?

pashagolub commented 9 months ago

We are reviewing one more PR touching this functionality. We can publish a new release after testing it.

pashagolub commented 9 months ago

@mnietz would you please try #199? If everything is OK, we can release a new version this week.

Thanks in advance!

mnietz commented 9 months ago

@pashagolub Looks good to me. Tested with switchover, reboot, and hard shutdown, and it works as expected. The additional output (watch and current leader from DCS) is helpful as well.

pashagolub commented 9 months ago

Thanks for letting us know!