Closed pavelkogen closed 3 years ago
I used the default pip packages from the RedHat.yml file.
Are you using installation_method: "file" or "repo"? patroni_installation_type: "pip" or "rpm"?
Jan 18 12:47:26 2th_node patroni[21439]: yaml.scanner.ScannerError: mapping values are not allowed here
Jan 18 12:47:26 2th_node patroni[21439]: in "/etc/patroni/patroni.yml", line 3, column 4
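For reference, PyYAML's "mapping values are not allowed here" usually means an unexpected ": " appeared where a plain value was expected, i.e. a missing line break or indentation near the reported position. A minimal hypothetical fragment (not the actual patroni.yml from this cluster) that fails with this error on line 3:

```yaml
# Hypothetical example only -- two keys collapsed onto one line.
scope: postgres-cluster
name: pgnode01
restapi: listen: 0.0.0.0:8008   # should be "restapi:" followed by an indented "listen:" line
```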
Please share your patroni.yml
Are you using installation_method: "file" or "repo"? patroni_installation_type: "pip" or "rpm"?
Yes, of course. I'm using the installation method from file.
Solution: since I was using Postgres version 12, the postgresql12-devel package was not installed because of the missing llvm-toolset-7-clang dependency (that package is absent from our repository). I switched to Postgres version 10 and it worked.
However, I ran into another problem. Cluster installation stops while restarting the vip-manager service and waiting for a response from the VIP address:
RUNNING HANDLER [vip-manager : Restart vip-manager service]
changed: [XX.XX.XX.XX]
changed: [YY.YY.YY.YY]
changed: [NN.NN.NN.NN]
RUNNING HANDLER [vip-manager : Wait for the cluster ip address (VIP) "VV.VV.VV.VV" is running]
fatal: [XX.XX.XX.XX]: FAILED! => {"changed": false, "elapsed": 60, "msg": "Timeout when waiting for VV.VV.VV.VV:22"}
fatal: [YY.YY.YY.YY]: FAILED! => {"changed": false, "elapsed": 60, "msg": "Timeout when waiting for VV.VV.VV.VV:22"}
fatal: [NN.NN.NN.NN]: FAILED! => {"changed": false, "elapsed": 60, "msg": "Timeout when waiting for VV.VV.VV.VV:22"}
I am using a VIP address that has firewall restrictions. On the network equipment, I only allow requests to the VIP address on port 5432 from the machines that need access to the database.
The question is: do I need to additionally allow requests to the VIP address, and if so, on which ports and from which addresses?
The vip-manager service is enabled on the servers, but its status looks like this:
{RHEL7.9}{N/A}{1th_node}[root@~]$ systemctl status vip-manager.service
● vip-manager.service - Manages Virtual IP for Patroni
Loaded: loaded (/etc/systemd/system/vip-manager.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Wed 2021-01-20 11:36:27 MSK; 3min 2s ago
Process: 18386 ExecStopPost=/sbin/ip addr del VV.VV.VV.VV/23 dev ens192 (code=exited, status=2)
Process: 18381 ExecStart=/usr/bin/vip-manager --config=/etc/patroni/vip-manager.yml (code=exited, status=1/FAILURE)
Main PID: 18381 (code=exited, status=1/FAILURE)
Jan 20 11:36:26 1th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:26 1th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 1th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:27 1th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:27 1th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:27 1th_node systemd[1]: start request repeated too quickly for vip-manager.service
Jan 20 11:36:27 1th_node systemd[1]: Failed to start Manages Virtual IP for Patroni.
Jan 20 11:36:27 1th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:27 1th_node systemd[1]: vip-manager.service failed.
{RHEL7.9}{N/A}{2th_node}[root@~]$ systemctl status vip-manager.service
● vip-manager.service - Manages Virtual IP for Patroni
Loaded: loaded (/etc/systemd/system/vip-manager.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Wed 2021-01-20 11:36:26 MSK; 15min ago
Process: 15816 ExecStopPost=/sbin/ip addr del VV.VV.VV.VV/23 dev ens192 (code=exited, status=2)
Process: 15810 ExecStart=/usr/bin/vip-manager --config=/etc/patroni/vip-manager.yml (code=exited, status=1/FAILURE)
Main PID: 15810 (code=exited, status=1/FAILURE)
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:26 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:26 2th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: start request repeated too quickly for vip-manager.service
Jan 20 11:36:26 2th_node systemd[1]: Failed to start Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service failed.
{RHEL7.9}{N/A}{3th_node}[root@~]$ systemctl status vip-manager.service
● vip-manager.service - Manages Virtual IP for Patroni
Loaded: loaded (/etc/systemd/system/vip-manager.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Wed 2021-01-20 11:36:27 MSK; 15min ago
Process: 2148 ExecStopPost=/sbin/ip addr del VV.VV.VV.VV/23 dev ens192 (code=exited, status=2)
Process: 2143 ExecStart=/usr/bin/vip-manager --config=/etc/patroni/vip-manager.yml (code=exited, status=1/FAILURE)
Main PID: 2143 (code=exited, status=1/FAILURE)
Jan 20 11:36:27 3th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:27 3th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:27 3th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:27 3th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:27 3th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:27 3th_node systemd[1]: start request repeated too quickly for vip-manager.service
Jan 20 11:36:27 3th_node systemd[1]: Failed to start Manages Virtual IP for Patroni.
Jan 20 11:36:27 3th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:27 3th_node systemd[1]: vip-manager.service failed.
Solution: since I was using Postgres version 12, the postgresql12-devel package was not installed because of the missing llvm-toolset-7-clang dependency (that package is absent from our repository). I switched to Postgres version 10 and it worked.
The llvm-toolset-7-clang package can be found in the Software Collections (SCL) repository. See this commit: https://github.com/vitabaks/postgresql_cluster/commit/cc24028962b30ba7cc4bd59c6defdb17af2545a5. If for some reason you cannot upload the package to your repository, you can download it and specify the package file (as well as all dependent packages) in the packages_from_file variable.
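If the RPMs can be downloaded but not published to your mirror, the variable mentioned above might be filled in roughly like this (file names, versions, and paths are illustrative, not the exact packages you need):

```yaml
# Illustrative sketch only: real file names, versions, and the full dependency set will differ.
packages_from_file:
  - "/tmp/llvm-toolset-7-clang-5.0.1-4.el7.x86_64.rpm"
  - "/tmp/postgresql12-devel-12.5-1PGDG.rhel7.x86_64.rpm"
```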
The question is, do I need to additionally allow requests to the VIP address, if so, on which ports and to which addresses?
vip-manager must have access to the DCS; if you use etcd, this is port 2379. To access the VIP address from the application side, you need to open "pgbouncer_listen_port", or, if you do not use pgbouncer, access via "postgresql_port" is required.
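A quick way to verify those paths from each node is a plain TCP connect test. The sketch below is illustrative: the hosts and ports are placeholders for your etcd endpoints and VIP, not values taken from this cluster.

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder endpoints -- substitute your real addresses.
checks = {
    "etcd (DCS)":        ("10.0.0.11", 2379),   # vip-manager -> DCS
    "pgbouncer via VIP": ("10.0.0.100", 6432),  # app -> VIP, pgbouncer_listen_port
    "postgres via VIP":  ("10.0.0.100", 5432),  # app -> VIP, postgresql_port
}

for name, (host, port) in checks.items():
    print(f"{name}: {'open' if port_open(host, port) else 'BLOCKED/unreachable'}")
```

Running this from a cluster node and from an application host quickly shows which leg of the path the firewall is cutting.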
Jan 20 11:36:27 3th_node systemd[1]: vip-manager.service failed.
It's not entirely clear yet.
Are there any other errors in the vip-manager log?
sudo journalctl -u vip-manager
The llvm-toolset-7-clang package can be found in the Software Collections (SCL) repository. See this commit: cc24028. If for some reason you cannot upload the package to your repository, you can download it and specify the package file (as well as all dependent packages) in the packages_from_file variable.
Ok, thanks! I will try.
vip-manager must have access to DCS, if you use etcd, then this is port 2379.
I use etcd by default. All cluster nodes use port 2379 on their regular address (not the virtual one).
To access the VIP address from the application side, you need to open "pgbouncer_listen_port" or if you do not use pgbouncer, then access via the "postgresql_port" is required.
Yes, exactly. I missed the pgbouncer part: port 6432 is not open in the firewall for either the regular or the virtual address.
Are there any other errors in the vip-manager log?
{RHEL7.9}{N/A}{2th_node}[root@~]$ journalctl -u vip-manager
-- Logs begin at Tue 2021-01-19 21:18:20 MSK, end at Wed 2021-01-20 14:04:34 MSK. --
Jan 20 11:36:25 2th_node systemd[1]: Started Manages Virtual IP for Patroni.
Jan 20 11:36:25 2th_node vip-manager[15626]: 2021/01/20 11:36:25 reading config from /etc/patroni/vip-manager.yml
Jan 20 11:36:25 2th_node vip-manager[15626]: 2021/01/20 11:36:25 Setting network interface is mandatory
Jan 20 11:36:25 2th_node systemd[1]: vip-manager.service: main process exited, code=exited, status=1/FAILURE
Jan 20 11:36:25 2th_node ip[15631]: RTNETLINK answers: Cannot assign requested address
Jan 20 11:36:25 2th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:25 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:25 2th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:26 2th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: Started Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node vip-manager[15662]: 2021/01/20 11:36:26 reading config from /etc/patroni/vip-manager.yml
Jan 20 11:36:26 2th_node vip-manager[15662]: 2021/01/20 11:36:26 Setting network interface is mandatory
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: main process exited, code=exited, status=1/FAILURE
Jan 20 11:36:26 2th_node ip[15667]: RTNETLINK answers: Cannot assign requested address
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:26 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:26 2th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: Started Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node vip-manager[15726]: 2021/01/20 11:36:26 reading config from /etc/patroni/vip-manager.yml
Jan 20 11:36:26 2th_node vip-manager[15726]: 2021/01/20 11:36:26 Setting network interface is mandatory
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: main process exited, code=exited, status=1/FAILURE
Jan 20 11:36:26 2th_node ip[15732]: RTNETLINK answers: Cannot assign requested address
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:26 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:26 2th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: Started Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: main process exited, code=exited, status=1/FAILURE
Jan 20 11:36:26 2th_node vip-manager[15796]: 2021/01/20 11:36:26 reading config from /etc/patroni/vip-manager.yml
Jan 20 11:36:26 2th_node vip-manager[15796]: 2021/01/20 11:36:26 Setting network interface is mandatory
Jan 20 11:36:26 2th_node ip[15803]: RTNETLINK answers: Cannot assign requested address
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:26 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:26 2th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: Started Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node vip-manager[15810]: 2021/01/20 11:36:26 reading config from /etc/patroni/vip-manager.yml
Jan 20 11:36:26 2th_node vip-manager[15810]: 2021/01/20 11:36:26 Setting network interface is mandatory
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: main process exited, code=exited, status=1/FAILURE
Jan 20 11:36:26 2th_node ip[15816]: RTNETLINK answers: Cannot assign requested address
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service: control process exited, code=exited status=2
Jan 20 11:36:26 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service failed.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service holdoff time over, scheduling restart.
Jan 20 11:36:26 2th_node systemd[1]: Stopped Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: start request repeated too quickly for vip-manager.service
Jan 20 11:36:26 2th_node systemd[1]: Failed to start Manages Virtual IP for Patroni.
Jan 20 11:36:26 2th_node systemd[1]: Unit vip-manager.service entered failed state.
Jan 20 11:36:26 2th_node systemd[1]: vip-manager.service failed.
@pavelkogen
What package version do you have specified in the vip_manager_package_file variable? It must be at least version 1.0.
Try vip_manager_package_file: "vip-manager_1.0.1-1_amd64.rpm"
Download the file here: https://github.com/cybertec-postgresql/vip-manager/releases/download/v1.0.1/vip-manager_1.0.1-1_amd64.rpm
@pavelkogen What package version do you have specified in the vip_manager_package_file variable? It must be at least version 1.0. Try
vip_manager_package_file: "vip-manager_1.0.1-1_amd64.rpm"
Download the file here: https://github.com/cybertec-postgresql/vip-manager/releases/download/v1.0.1/vip-manager_1.0.1-1_amd64.rpm
Yeah, I saw your issue in the vip-manager repository and used the most recent version, 1.0.1, just in case. In my comments above, I am already using this version.
From the log I see that the ens192 interface is specified.
If your server has several network interfaces, make sure that you have specified the correct interface name in the vip_interface variable (or vip_manager_iface).
From the log I see that the ens192 interface is specified. If your server has several network interfaces, make sure that you have specified the correct interface name in the vip_interface variable (or vip_manager_iface).
Yes, this interface name was created automatically during server installation. Ansible assigned the vip_manager_iface variable from the ansible_default_ipv4.interface fact. I am using a single interface on these servers.
{RHEL7.9}{N/A}{1th_node}[root@~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:bf:a1:72 brd ff:ff:ff:ff:ff:ff
inet XX.XX.XX.XX/23 brd NETMASK scope global ens192
valid_lft forever preferred_lft forever
I suspect that vip-manager cannot configure the network service to add the VIP address. However, I cannot verify this, because I am using all the default parameters and I don't see anything unusual.
Perhaps I should manually make changes to the server's network configuration?
My vip-manager config:
$ cat /etc/patroni/vip-manager.yml
interval: 1000
trigger-key: "/service/postgres-cluster/leader"
trigger-value: "1th_node"
ip: VV.VV.VV.VV # the virtual ip address to manage
netmask: 23 # netmask for the virtual ip
interface: ens192 # interface to which the virtual ip will be added
hosting-type: basic # possible values: basic, or hetzner.
dcs-type: etcd # etcd or consul
dcs-endpoints:
- http://XX.XX.XX.XX:2379
- http://YY.YY.YY.YY:2379
- http://NN.NN.NN.NN:2379
retry-num: 2
retry-after: 250 #in milliseconds
verbose: false
When trying to run it manually:
{RHEL7.9}{N/A}{1th_node}[root@~]$ vip-manager --config=/etc/patroni/vip-manager.yml
2021/01/21 16:44:05 reading config from /etc/patroni/vip-manager.yml
2021/01/21 16:44:05 Setting network interface is mandatory
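Since the config above clearly contains interface: ens192 yet the binary still reports "Setting network interface is mandatory", one plausible cause is a binary/config mismatch (older vip-manager releases reportedly used a different config layout, which is why the maintainer asked about the package version). As a rough sanity check, the flat keys the posted config relies on can be verified with a small sketch. This is a naive line scanner for illustration, not a real YAML parser, and the required-key set is inferred from the config shown above:

```python
# Keys the posted vip-manager 1.0 config relies on (inferred from this thread, not an official list).
REQUIRED_KEYS = {"ip", "netmask", "interface", "trigger-key", "trigger-value", "dcs-endpoints"}

def flat_keys(config_text: str) -> set:
    """Collect top-level keys from a flat YAML-style config (naive sketch)."""
    keys = set()
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].rstrip()        # drop trailing comments
        if not line or line.startswith((" ", "\t", "-")):
            continue                                  # skip blanks, nested lines, list items
        if ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return keys

def missing_keys(config_text: str) -> set:
    return REQUIRED_KEYS - flat_keys(config_text)

sample = """\
interval: 1000
trigger-key: "/service/postgres-cluster/leader"
trigger-value: "1th_node"
ip: 10.0.0.100
netmask: 23
interface: ens192
dcs-type: etcd
dcs-endpoints:
  - http://10.0.0.11:2379
"""
print(missing_keys(sample))  # prints set(), i.e. nothing missing
```

If the keys are all present (as they are here), the fault is more likely in the installed binary's version than in the config file.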
Perhaps the problem is not related to vip-manager itself; there may be a problem with adding a second IP address to the network card.
Try adding the VIP manually; do you get an error?
Example:
ip addr add VV.VV.VV.VV/23 dev ens192
where VV.VV.VV.VV is your VIP address
Try adding the VIP manually; do you get an error?
There are no problems with adding the VIP address manually. I can add and remove it successfully. I still have not been able to determine why vip-manager does not work.
But I tried plan A, and the cluster seems ready to go. However, services like keepalived and confd were not started. I started them manually and found that keepalived does not move the VIP address between servers (the VIP address is enabled on all servers at once). Perhaps the problem is that I have blocked multicast on NSX (I use VMware to virtualize my servers).
I also wanted to clarify: why is the keepalived configuration the same on all three nodes? They all have the BACKUP state and the same weight. Is this normal?
keepalived.conf.j2:
vrrp_instance VI_1 {
priority 100
state BACKUP
I also wanted to clarify: why is the keepalived configuration the same on all three nodes? They all have the BACKUP state and the same weight. Is this normal?
Yes.
In the Type A scheme, the VIP address is not tied to the master role. In our configuration, keepalived checks the status of the HAProxy service and, in case of a failure, delegates the VIP to another balancer server.
If necessary, you can manually increase the weight on one of the servers to move the VIP address there.
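For context, the HAProxy check described above is typically implemented with a vrrp_script block in keepalived.conf; a minimal illustrative fragment follows (the values and the check command are examples, not the role's exact template):

```
vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"   # exits 0 while an haproxy process exists
    interval 2
    weight 5                               # added to priority while the check passes
}

vrrp_instance VI_1 {
    interface ens192
    state BACKUP
    priority 100                           # raise this on one node to pin the VIP there
    virtual_router_id 51
    virtual_ipaddress {
        VV.VV.VV.VV/23
    }
    track_script {
        chk_haproxy
    }
}
```

With identical BACKUP/priority settings on every node, whichever node passes the check (and wins the VRRP election) holds the VIP, which matches the behavior described above.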
I have not been able to determine the reason why vip-manager does not work.
For now, this remains a mystery.
Let me know if you manage to fix this problem. I haven't been able to reproduce it yet; everything works fine for me.
Hi everyone!
I'm very grateful to vitabaks for building and maintaining such a large project to automate the creation of a Postgres cluster! 🥇 I use the postgresql_cluster role to deploy a production Postgres cluster in a closed (offline) infrastructure. Unfortunately, I ran into a problem starting the patroni service.
I used the default pip packages from the RedHat.yml file:
Role deployment stops at this moment:
Going to the 2nd node, I see:
What could be the reason for this?
Red Hat 7.9 with all the latest updates. All role settings are at their defaults.