Closed fatmaAliGamal closed 10 months ago
Make sure that you have specified only private addresses in the inventory, not public.
Use of Internal and External IP Addresses in Ansible Inventory: https://github.com/vitabaks/postgresql_cluster/issues/358#issuecomment-1580650911
I use EC2 on AWS, and my inventory entries have the form private_ip_address ansible_host=public_ip_address, but the error still appears:
TASK [sysctl : Build a sysctl_conf dynamic variable] ****
ok: [10.0.4.214] => (item=etcd_cluster)
ok: [10.0.4.214] => (item=master)
ok: [10.0.4.27] => (item=etcd_cluster)
ok: [10.0.4.214] => (item=postgres_cluster)
ok: [10.0.4.27] => (item=postgres_cluster)
ok: [10.0.5.212] => (item=etcd_cluster)
ok: [10.0.4.27] => (item=replica)
ok: [10.0.5.212] => (item=postgres_cluster)
ok: [10.0.5.212] => (item=replica)
TASK [sysctl : Setting kernel parameters] ***
fatal: [10.0.4.27]: FAILED! => {"msg": "Failed to connect to the host via ssh: "} ...ignoring
fatal: [10.0.4.214]: FAILED! => {"msg": "Failed to connect to the host via ssh: "} ...ignoring
fatal: [10.0.5.212]: FAILED! => {"msg": "Failed to connect to the host via ssh: "} ...ignoring
TASK [etcd : Make sure the unzip/tar packages are present] **
fatal: [10.0.4.214]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ", "unreachable": true}
fatal: [10.0.4.27]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ", "unreachable": true}
fatal: [10.0.5.212]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ", "unreachable": true}
This is a problem with SSH access to the servers.
Try ansible all -m ping
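When the ping is run against several hosts, it can help to reduce the output to just the hosts that did not answer. The following is a minimal sketch, assuming the ad-hoc output format seen in this thread (lines such as "10.0.4.97 | UNREACHABLE! => {"). The function name failing_hosts is hypothetical and not part of Ansible.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ansible): read ad-hoc command output on
# stdin and print the hosts whose status line is UNREACHABLE! or FAILED!.
# It assumes status lines of the form "<host> | <STATUS> => {".
failing_hosts() {
    awk '$2 == "|" && ($3 == "UNREACHABLE!" || $3 == "FAILED!") { print $1 }'
}

# Usage (assumes ansible and an inventory are already configured):
#   ansible all -m ping | failing_hosts
```

Running the same pipeline before and after the playbook makes it easier to see whether hosts that answered earlier have stopped responding.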
Before running ansible-playbook deploy_pgcluster.yml, ansible all -m ping succeeds:
10.0.4.97 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python3"
},
"changed": false,
"ping": "pong"
}
10.0.5.175 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python3"
},
"changed": false,
"ping": "pong"
}
10.0.4.224 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python3"
},
"changed": false,
"ping": "pong"
}
After running the playbook, this appears:
TASK [sysctl : Build a sysctl_conf dynamic variable] *****
ok: [10.0.4.97] => (item=etcd_cluster)
ok: [10.0.4.97] => (item=master)
ok: [10.0.4.224] => (item=etcd_cluster)
ok: [10.0.4.97] => (item=postgres_cluster)
ok: [10.0.4.224] => (item=postgres_cluster)
ok: [10.0.5.175] => (item=etcd_cluster)
ok: [10.0.4.224] => (item=replica)
ok: [10.0.5.175] => (item=postgres_cluster)
ok: [10.0.5.175] => (item=replica)
When I run ansible all -m ping again, I get:
10.0.4.97 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: Connection closed by 184.73.144.179 port 22",
"unreachable": true
}
10.0.4.224 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: Connection closed by 3.89.251.124 port 22",
"unreachable": true
}
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: MemoryError
10.0.5.175 | FAILED! => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python3"
},
"changed": false,
"module_stderr": "Shared connection to 34.228.22.118 closed.\r\n",
"module_stdout": "Traceback (most recent call last):\r\n File \"/home/ubuntu/.ansible/tmp/ansible-tmp-1688324331.0731466-102147-27517743771944/AnsiballZ_ping.py\", line 107, in
Note: I use AWS EC2 with Ubuntu 22.04 and instance_type = "t2.micro".
This is the problem: the server has too little memory to run Ansible modules.
Try a server with at least 2 or 4 GiB of memory.
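Total memory can be checked on a node before rerunning the playbook. This is a minimal sketch, assuming a Linux host that exposes /proc/meminfo. The function names mem_total_mib and mem_is_enough are hypothetical, and the default threshold of 1800 MiB is an assumption chosen to sit slightly under 2 GiB, because MemTotal reports a little less than the nominal instance size.

```shell
#!/bin/sh
# Hypothetical helper: print total memory in MiB from a meminfo-style file.
# Defaults to /proc/meminfo; the optional argument only makes testing easier.
mem_total_mib() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: succeed if the MiB value in $1 reaches the threshold.
# The 1800 MiB default is an assumption (just under a nominal 2 GiB).
mem_is_enough() {
    [ "$1" -ge "${2:-1800}" ]
}

if [ -r /proc/meminfo ]; then
    total=$(mem_total_mib)
    if mem_is_enough "$total"; then
        echo "OK: ${total} MiB of memory"
    else
        echo "LOW: ${total} MiB of memory"
    fi
fi
```

The same figure is also available from the control node as the ansible_memtotal_mb fact, for example with ansible all -m setup -a 'filter=ansible_memtotal_mb', provided the hosts still have enough memory to run the setup module.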
I now use 8 GB of memory, but another error appears:
fatal: [10.0.4.132]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'postgres_cluster_nodes' is undefined\n\nThe error appears to be in '/mnt/396A486035A35D5E/soa/task-cluster-db/soa-db/postgresql_cluster/roles/deploy-finish/tasks/main.yml': line 162, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: PostgreSQL Cluster connection info\n ^ here\n"} ...ignoring
'postgres_cluster_nodes' is undefined
The likely cause is that set_fact was not executed for the variable "postgres_cluster_nodes", which is built from the list of hosts (inventory_hostname) defined in the postgres_cluster group.
Please show the result of the "Create list of nodes" task
And please show your inventory file.
TASK [deploy-finish : Virtual IP Address (VIP) info] *****
skipping: [10.0.4.27]
skipping: [10.0.4.246]
skipping: [10.0.5.227]
TASK [deploy-finish : Create list of nodes] **
fatal: [10.0.4.27]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'balancers'\n\nThe error appears to be in '/mnt/396A486035A35D5E/soa/task-cluster-db/soa-db/postgresql_cluster/roles/deploy-finish/tasks/main.yml': line 128, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n- block: # if cluster_vip is not defined\n - name: Create list of nodes\n ^ here\n"} ...ignoring
TASK [deploy-finish : PostgreSQL Cluster connection info] ****
skipping: [10.0.4.27]
TASK [deploy-finish : PostgreSQL Cluster connection info] ****
skipping: [10.0.4.27]
TASK [deploy-finish : PostgreSQL Cluster connection info] ****
fatal: [10.0.4.27]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'postgres_cluster_nodes' is undefined\n\nThe error appears to be in '/mnt/396A486035A35D5E/soa/task-cluster-db/soa-db/postgresql_cluster/roles/deploy-finish/tasks/main.yml': line 162, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: PostgreSQL Cluster connection info\n ^ here\n"} ...ignoring
TASK [deploy-finish : PostgreSQL Cluster connection info] ****
skipping: [10.0.4.27]
TASK [deploy-finish : PostgreSQL Cluster connection info] ****
skipping: [10.0.4.27]
PLAY RECAP ***
10.0.4.246 : ok=94 changed=61 unreachable=0 failed=0 skipped=308 rescued=0 ignored=0
10.0.4.27 : ok=106 changed=62 unreachable=0 failed=0 skipped=326 rescued=0 ignored=2
10.0.5.227 : ok=94 changed=61 unreachable=0 failed=0 skipped=308 rescued=0 ignored=0
localhost : ok=0 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
hosts.ini
[etcd_cluster]
10.0.4.27 ansible_host=3.87.225.7
10.0.4.246 ansible_host=44.211.199.162
10.0.5.227 ansible_host=54.157.19.230
[master]
10.0.4.27 ansible_host=3.87.225.7
[replica]
10.0.4.246 ansible_host=44.211.199.162
10.0.5.227 ansible_host=54.157.19.230
[postgres_cluster:children]
master
replica
[all:vars]
ansible_connection='ssh'
ansible_ssh_port='22'
ansible_ssh_user=ubuntu
ansible_ssh_private_key_file=./postgres-db-key
Which server IP should I put in cluster_vip= ""?
proxy_env: {}
@fatmaAliGamal Why did you decide to remove the required group "balancers" from the inventory file?
Please use the suggested version of the inventory file.
P.S. I will make the "balancers" group optional.
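A generated inventory can be checked for the expected groups before the playbook is run. This is a minimal sketch, assuming an INI-style inventory. The function name has_group and the default file name hosts.ini are illustrative, and the group list is simply the set of group names that appear in this thread, not an authoritative list of what the playbook requires.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the INI inventory in $1 declares group $2,
# either as "[group]" or as "[group:children]".
has_group() {
    grep -Eq "^\[$2(:children)?\][[:space:]]*$" "$1"
}

# Inventory path is an assumption; pass another path as the first argument.
inventory="${1:-hosts.ini}"

if [ -f "$inventory" ]; then
    # Group names taken from this thread.
    for group in etcd_cluster master replica postgres_cluster balancers; do
        has_group "$inventory" "$group" || echo "missing group: $group"
    done
fi
```

For the inventory shown above, this sketch would report only the balancers group as missing.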
Because I generate the inventory dynamically when I create the 3 EC2 instances. What is wrong with my inventory? This is the only difference between them:
[master]
10.128.64.140 hostname=pgnode01 postgresql_exists=false
[replica]
10.128.64.142 hostname=pgnode02 postgresql_exists=false
10.128.64.143 hostname=pgnode03 postgresql_exists=false
Must hostname and postgresql_exists=false be added after 10.0.5.227 ansible_host=54.157.19.230? If they must be added, is there any code in the playbook that changes the hostname?
"P.S. I will make the "balancers" group optional" ==> I use Type B, so can I ignore this group, or not?
I will try it now, but could you please answer this question? I can't tell whether this is optional or mandatory: must hostname and postgresql_exists=false be added after 10.0.5.227 ansible_host=54.157.19.230? If they must be added, is there any code in the playbook that changes the hostname?
This is optional, you don't have to define these variables in the inventory
# "postgresql_exists='true'" if PostgreSQL is already exists and running
# "hostname=" variable is optional (used to change the server name)
Thanks for your support. However, when I test [Type B] PostgreSQL High-Availability only by running sudo systemctl stop postgresql.service on the primary server, no replica is promoted from secondary to primary to cover the failover.
no replica is promoted from secondary to primary to cover the failover
Please create a separate issue and describe the details, and please attach Patroni logs.
Okay, thanks again for your fast response.
TASK [sysctl : Setting kernel parameters] **
changed: [54.163.22.109] => (item={'name': 'net.ipv4.ip_nonlocal_bind', 'value': '1'})
changed: [3.95.217.76] => (item={'name': 'net.ipv4.ip_nonlocal_bind', 'value': '1'})
changed: [18.234.60.224] => (item={'name': 'net.ipv4.ip_nonlocal_bind', 'value': '1'})
changed: [54.163.22.109] => (item={'name': 'net.ipv4.ip_forward', 'value': '1'})
changed: [3.95.217.76] => (item={'name': 'net.ipv4.ip_forward', 'value': '1'})
changed: [18.234.60.224] => (item={'name': 'net.ipv4.ip_forward', 'value': '1'})
changed: [3.95.217.76] => (item={'name': 'net.ipv4.ip_local_port_range', 'value': '10000 65535'})
changed: [54.163.22.109] => (item={'name': 'net.ipv4.ip_local_port_range', 'value': '10000 65535'})
changed: [18.234.60.224] => (item={'name': 'net.ipv4.ip_local_port_range', 'value': '10000 65535'})
changed: [3.95.217.76] => (item={'name': 'net.core.netdev_max_backlog', 'value': '10000'})
changed: [54.163.22.109] => (item={'name': 'net.core.netdev_max_backlog', 'value': '10000'})
changed: [18.234.60.224] => (item={'name': 'net.core.netdev_max_backlog', 'value': '10000'})
changed: [3.95.217.76] => (item={'name': 'net.ipv4.tcp_max_syn_backlog', 'value': '8192'})
changed: [54.163.22.109] => (item={'name': 'net.ipv4.tcp_max_syn_backlog', 'value': '8192'})
changed: [18.234.60.224] => (item={'name': 'net.ipv4.tcp_max_syn_backlog', 'value': '8192'})
changed: [3.95.217.76] => (item={'name': 'net.core.somaxconn', 'value': '65535'})
changed: [54.163.22.109] => (item={'name': 'net.core.somaxconn', 'value': '65535'})
changed: [18.234.60.224] => (item={'name': 'net.core.somaxconn', 'value': '65535'})
changed: [3.95.217.76] => (item={'name': 'net.ipv4.tcp_tw_reuse', 'value': '1'})
changed: [54.163.22.109] => (item={'name': 'net.ipv4.tcp_tw_reuse', 'value': '1'})
changed: [18.234.60.224] => (item={'name': 'net.ipv4.tcp_tw_reuse', 'value': '1'})
fatal: [3.95.217.76]: FAILED! => {"msg": "Failed to connect to the host via ssh: "} ...ignoring
fatal: [54.163.22.109]: FAILED! => {"msg": "Failed to connect to the host via ssh: "} ...ignoring
fatal: [18.234.60.224]: FAILED! => {"msg": "Failed to connect to the host via ssh: "} ...ignoring
TASK [etcd : Make sure the unzip/tar packages are present] *****
fatal: [3.95.217.76]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ", "unreachable": true}
fatal: [54.163.22.109]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ", "unreachable": true}
fatal: [18.234.60.224]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ", "unreachable": true}
NO MORE HOSTS LEFT *****
PLAY RECAP *****
18.234.60.224 : ok=8 changed=4 unreachable=1 failed=0 skipped=29 rescued=0 ignored=1
3.95.217.76 : ok=8 changed=4 unreachable=1 failed=0 skipped=29 rescued=0 ignored=1
54.163.22.109 : ok=8 changed=4 unreachable=1 failed=0 skipped=29 rescued=0 ignored=1