sorintlab / stolon

PostgreSQL cloud native High Availability and more.
https://talk.stolon.io
Apache License 2.0

Stolon proxy is flapping ip address of master after docker upgrade #826

Closed warmanton closed 3 years ago

warmanton commented 3 years ago

First of all, sorry for posting here, but https://talk.stolon.io was last updated on 20 May 2020 (General discussion would be the most likely place for my issue there).

What happened: after upgrading Docker from 19.03.13 to 20.10.5 on a Docker Swarm cluster (stolon and etcd run there as two stacks), the proxy started flapping between two IP addresses. The first is the master keeper's address and the second is the master keeper's service address. Both lead to the same postgres instance, but each time the proxy flaps it drops existing client connections. Here is part of the proxy log:

2021-03-11T08:57:53.455Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.93:5432"}
2021-03-11T08:57:53.499Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.93:5432"}
2021-03-11T08:57:58.525Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.57:5432"}
2021-03-11T08:57:58.580Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.57:5432"}
2021-03-11T08:58:03.586Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.93:5432"}
2021-03-11T08:58:04.052Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.93:5432"}
2021-03-11T08:58:09.059Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.57:5432"}
2021-03-11T08:58:09.074Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.57:5432"}
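For anyone trying to quantify the flapping from a longer log, here is a minimal Python sketch that lists every change of the reported master address. It assumes the log format shown in the excerpt above; the function name `master_changes` and the regular expression are illustrative and not part of stolon.

```python
import re

# Matches the "master address" entries in the format shown in the excerpt.
# Both straight and typographic quotes are accepted, since pasted logs
# often contain the latter. "proxying to master address" entries are skipped.
MASTER_RE = re.compile(
    r'(\d{4}-\d{2}-\d{2}T[\d:.]+Z)\s+INFO\s+\S+\s+master address\s+'
    r'\{[“"]address[”"]:\s*[“"]([^”"]+)[”"]\}'
)


def master_changes(log_text):
    """Return (timestamp, old_address, new_address) for each address change."""
    changes = []
    previous = None
    for timestamp, address in MASTER_RE.findall(log_text):
        if previous is not None and address != previous:
            changes.append((timestamp, previous, address))
        previous = address
    return changes


# Sample input: the log lines quoted in this issue.
sample = """
2021-03-11T08:57:53.455Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.93:5432"}
2021-03-11T08:57:53.499Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.93:5432"}
2021-03-11T08:57:58.525Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.57:5432"}
2021-03-11T08:57:58.580Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.57:5432"}
2021-03-11T08:58:03.586Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.93:5432"}
2021-03-11T08:58:04.052Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.93:5432"}
2021-03-11T08:58:09.059Z INFO cmd/proxy.go:268 master address {"address": "10.0.17.57:5432"}
2021-03-11T08:58:09.074Z INFO cmd/proxy.go:286 proxying to master address {"address": "10.0.17.57:5432"}
"""

for ts, old, new in master_changes(sample):
    print(f"{ts}: {old} -> {new}")
```

On the excerpt this reports three changes in about 15 seconds, that is, one switch roughly every 5 seconds.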

What you expected to happen: Expected no flapping

How to reproduce it (as minimally and precisely as possible): upgrade a Docker Swarm cluster from 19.03.13 to 20.10.5? It is too complex to reproduce :-(.

Anything else we need to know?:

Environment:

Most probably this is new Docker behavior, or some new feature that reports the IP address to the stolon component. Is there any way to fix it? I am going to roll back the Docker version tomorrow since I have no other ideas. Please help.

warmanton commented 3 years ago

For Your Information:

Rolling back to Docker 19.03.15 resolved this issue. The proxy now uses only the keeper service IP address (the logs show the IP address of the corresponding Docker Swarm service and not the keeper's container IP address).

Something about networking changed in Docker 20.10.5.

sgotti commented 3 years ago

> First of all sorry but https://talk.stolon.io last updated 20 may 2020 (General discussion - most probable place for my issue there).

Open a discussion here so the last update time will change :smile:

Anyway I'm not sure this is something related to stolon.