Juniper / contrail-charms

Juju charms for Contrail services.
Apache License 2.0

Contrail-agent stuck in hook failed: "tls-certificates-relation-joined" #148

Closed kashif-nawaz closed 4 years ago

kashif-nawaz commented 4 years ago

contrail-agent gets stuck in hook failed: "tls-certificates-relation-joined" once the easyrsa relation [ "contrail-agent:tls-certificates", "easyrsa:client" ] is added to the bundle file.

kashif-nawaz commented 4 years ago

ubuntu@jumphost:~$ juju status
Model    Controller       Cloud/Region  Version  SLA          Timestamp
default  maas-controller  maas/default  2.7.3    unsupported  23:04:29Z

App                      Version  Status   Scale  Charm                   Store       Rev  OS      Notes
contrail-agent                    error    2      contrail-agent          jujucharms  13   ubuntu
contrail-analytics       1912-32  active   1      contrail-analytics      jujucharms  11   ubuntu
contrail-analyticsdb     1912-32  active   1      contrail-analyticsdb    jujucharms  11   ubuntu
contrail-controller      1912-32  active   1      contrail-controller     jujucharms  12   ubuntu
contrail-haproxy                  active   1      haproxy                 jujucharms  55   ubuntu
contrail-keepalived               active   1      keepalived              jujucharms  28   ubuntu
contrail-keystone-auth            active   1      contrail-keystone-auth  jujucharms  11   ubuntu
contrail-openstack       1912-32  active   4      contrail-openstack      jujucharms  12   ubuntu
dashboard-hacluster               active   1      hacluster               jujucharms  62   ubuntu
easyrsa                  3.0.1    active   1      easyrsa                 jujucharms  254  ubuntu
external-policy-routing           waiting  0      policy-routing          jujucharms  3    ubuntu
glance                   16.0.1   active   1      glance                  jujucharms  290  ubuntu
glance-hacluster                  active   1      hacluster               jujucharms  62   ubuntu
heat                     10.0.2   active   1      heat                    jujucharms  271  ubuntu
heat-hacluster                    active   1      hacluster               jujucharms  62   ubuntu
keystone                 13.0.2   active   1      keystone                jujucharms  309  ubuntu
keystone-hacluster                active   1      hacluster               jujucharms  62   ubuntu
memcached                         active   1      memcached               jujucharms  26   ubuntu
mysql                    5.7.20   active   1      percona-cluster         jujucharms  281  ubuntu
mysql-hacluster                   active   1      hacluster               jujucharms  62   ubuntu
ncc-hacluster                     active   1      hacluster               jujucharms  62   ubuntu
neutron-api              12.1.0   active   1      neutron-api             jujucharms  281  ubuntu
neutron-hacluster                 active   1      hacluster               jujucharms  62   ubuntu
nova-cloud-controller    17.0.12  active   1      nova-cloud-controller   jujucharms  339  ubuntu
nova-compute             17.0.12  active   2      nova-compute            jujucharms  309  ubuntu
ntp                      3.2      active   8      ntp                     jujucharms  36   ubuntu
openstack-dashboard      13.0.2   active   1      openstack-dashboard     jujucharms  295  ubuntu
rabbitmq-server          3.6.10   active   1      rabbitmq-server         jujucharms  97   ubuntu
ubuntu                   18.04    active   1      ubuntu                  jujucharms  15   ubuntu

Unit                      Workload  Agent  Machine  Public address  Ports                                 Message
contrail-analytics/0      active    idle   0/kvm/0  192.168.24.126                                        Unit is ready
  ntp/6                   active    idle            192.168.24.126  123/udp                               chrony: Ready
contrail-analyticsdb/0    active    idle   0/kvm/1  192.168.24.119                                        Unit is ready
  ntp/4                   active    idle            192.168.24.119  123/udp                               chrony: Ready
contrail-controller/0     active    idle   0/kvm/2  192.168.24.120                                        Unit is ready
  ntp/7                   active    idle            192.168.24.120  123/udp                               chrony: Ready
contrail-haproxy/0        active    idle   0/lxd/0  192.168.24.130  8081/tcp,8082/tcp,8143/tcp,10000/tcp  Unit is ready
  contrail-keepalived/0   active    idle            192.168.24.130                                        VIP ready
contrail-keystone-auth/0  active    idle   0/lxd/1  192.168.24.121                                        Unit is ready
easyrsa/0                 active    idle   0/lxd/2  192.168.24.125                                        Certificate Authority connected.
glance/0                  active    idle   0/lxd/3  192.168.24.131  9292/tcp                              Unit is ready
  glance-hacluster/0      active    idle            192.168.24.131                                        Unit is ready and clustered
heat/0                    active    idle   0/kvm/3  192.168.24.122  8000/tcp,8004/tcp                     Unit is ready
  contrail-openstack/3    active    idle            192.168.24.122                                        Unit is ready
  heat-hacluster/0        active    idle            192.168.24.122                                        Unit is ready and clustered
  ntp/5                   active    idle            192.168.24.122  123/udp                               chrony: Ready
keystone/0                active    idle   0/lxd/4  192.168.24.127  5000/tcp                              Unit is ready
  keystone-hacluster/0    active    idle            192.168.24.127                                        Unit is ready and clustered
memcached/0               active    idle   0/lxd/5  192.168.24.124  11211/tcp                             Unit is ready
mysql/0                   active    idle   0/lxd/6  192.168.24.132  3306/tcp                              Unit is ready
  mysql-hacluster/0       active    idle            192.168.24.132                                        Unit is ready and clustered
neutron-api/0             active    idle   0/kvm/4  192.168.24.118  9696/tcp                              Unit is ready
  contrail-openstack/2    active    idle            192.168.24.118                                        Unit is ready
  neutron-hacluster/0     active    idle            192.168.24.118                                        Unit is ready and clustered
  ntp/3                   active    idle            192.168.24.118  123/udp                               chrony: Ready
nova-cloud-controller/0   active    idle   0/lxd/7  192.168.24.123  8774/tcp,8778/tcp                     Unit is ready
  ncc-hacluster/0         active    idle            192.168.24.123                                        Unit is ready and clustered
nova-compute/0            active    idle   1        192.168.24.116                                        Unit is ready
  contrail-agent/0        error     idle            192.168.24.116                                        hook failed: "tls-certificates-relation-joined"
  contrail-openstack/0    active    idle            192.168.24.116                                        Unit is ready
  ntp/1                   active    idle            192.168.24.116  123/udp                               chrony: Ready
nova-compute/1            active    idle   2        192.168.24.117                                        Unit is ready
  contrail-agent/1        error     idle            192.168.24.117                                        hook failed: "tls-certificates-relation-joined"
  contrail-openstack/1    active    idle            192.168.24.117                                        Unit is ready
  ntp/2                   active    idle            192.168.24.117  123/udp                               chrony: Ready
openstack-dashboard/0     active    idle   0/lxd/8  192.168.24.128  80/tcp,443/tcp                        Unit is ready
  dashboard-hacluster/0   active    idle            192.168.24.128                                        Unit is ready and clustered
rabbitmq-server/0         active    idle   0/lxd/9  192.168.24.129  5672/tcp                              Unit is ready
ubuntu/0                  active    idle   0        192.168.24.115                                        ready
  ntp/0                   active    idle            192.168.24.115  123/udp                               chrony: Ready

kashif-nawaz commented 4 years ago

root@compute-1:/var/log/juju# tail -f unit-contrail-agent-0.log
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined File "/var/lib/juju/agents/unit-contrail-agent-0/charm/hooks/tls-certificates-relation-joined", line 98, in tls_certificates_relation_joined
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined settings = common_utils.get_tls_settings(utils.get_vhost_ip())
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined File "/var/lib/juju/agents/unit-contrail-agent-0/charm/hooks/common_utils.py", line 184, in get_tls_settings
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined res = check_output(['getent', 'hosts', control_ip]).decode('UTF-8')
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined **kwargs).stdout
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined File "/usr/lib/python3.6/subprocess.py", line 438, in run
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined output=stdout, stderr=stderr)
2020-03-22 12:16:02 DEBUG tls-certificates-relation-joined subprocess.CalledProcessError: Command '['getent', 'hosts', '192.168.5.27']' returned non-zero exit status 2.
2020-03-22 12:16:02 ERROR juju.worker.uniter.operation runhook.go:132 hook "tls-certificates-relation-joined" failed: exit status 1
^C
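
The traceback shows the hook aborting because getent hosts 192.168.5.27 exits with status 2, which getent uses when the key is not found, and check_output turns any non-zero exit into CalledProcessError. The Python sketch below reproduces that failure mode and shows one tolerant way to handle it. It is not the charm's code: reverse_lookup, its run parameter, and the no_entry stub are hypothetical names introduced only for illustration.

```python
import subprocess


def reverse_lookup(ip, run=subprocess.check_output):
    """Return the names `getent hosts <ip>` reports, or [] if there is no entry.

    `run` is injectable (hypothetical parameter) so the sketch can be
    exercised without a real resolver.
    """
    try:
        out = run(["getent", "hosts", ip]).decode("UTF-8")
    except subprocess.CalledProcessError as exc:
        if exc.returncode == 2:
            # getent exit status 2: the key was not found in the database.
            return []
        raise
    fields = out.split()
    return fields[1:]  # drop the address, keep canonical name and aliases


def no_entry(cmd):
    """Stub that behaves like getent when reverse resolution fails."""
    raise subprocess.CalledProcessError(2, cmd)


print(reverse_lookup("192.168.5.27", run=no_entry))  # prints []
```

Called without the run argument, the helper invokes the real getent binary, so on an affected compute node it would return an empty list for the vhost address instead of raising.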

kashif-nawaz commented 4 years ago

root@compute-2:~# getent hosts
127.0.1.1       compute-2.maas compute-2
127.0.0.1       localhost
127.0.0.1       ip6-localhost ip6-loopback

root@compute-1:~# getent hosts
127.0.1.1       compute-1.maas compute-1
127.0.0.1       localhost
127.0.0.1       ip6-localhost ip6-loopback

kashif-nawaz commented 4 years ago

Is it normal that getent hosts returns two host names on one line, e.g. "127.0.1.1 compute-2.maas compute-2"?

On the contrail-controller node, getent returns a single host name for the node's address:

getent hosts
192.168.24.144  juju-fe9b4d-0-kvm-2-contrail-rmq
127.0.0.1       localhost
127.0.0.1       ip6-localhost ip6-loopback
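
On the question of two names per line: getent hosts prints entries in the /etc/hosts layout, an address followed by a canonical name and any aliases, so a line with two names is expected. What matters for the failing hook is that no line exists for the vrouter address at all. A minimal Python sketch of that check, assuming the compute-1 output shown earlier (parse_getent_hosts is a hypothetical helper, not part of the charm):

```python
def parse_getent_hosts(output):
    """Map each address to the list of names reported for it."""
    hosts = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        address, names = fields[0], fields[1:]
        # The same address can appear on several lines (127.0.0.1 here).
        hosts.setdefault(address, []).extend(names)
    return hosts


sample = """\
127.0.1.1       compute-1.maas compute-1
127.0.0.1       localhost
127.0.0.1       ip6-localhost ip6-loopback
"""

hosts = parse_getent_hosts(sample)
print(hosts["127.0.1.1"])       # ['compute-1.maas', 'compute-1']
print("192.168.5.27" in hosts)  # False: no entry for the vrouter address
```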

kashif-nawaz commented 4 years ago

The problem was caused by rdns_mode=0 (reverse DNS disabled) on the vrouter transport network. With that setting, the compute nodes could not resolve names on the vrouter transport network, so the getent hosts lookup for the vhost IP failed and the CSR was probably not generated properly.

Either of the following options fixes the problem:

Option 1: Add a manual entry to /etc/hosts, e.g. 192.168.5.27 compute-1.maas, and make /etc/hosts immutable.

Option 2 (the more logical and natural way): Set rdns_mode=2 on the vrouter transport network.
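
Whichever option is used, it may help to confirm that each vhost address reverse-resolves before retrying the failed hook. The Python sketch below is one way to do that under stated assumptions: missing_reverse_dns and fake_resolver are hypothetical names, 192.168.5.28 is a made-up second address, and socket.gethostbyaddr is assumed to consult the same name service sources (/etc/hosts, then DNS) that getent hosts uses on these nodes.

```python
import socket


def missing_reverse_dns(addresses, resolver=socket.gethostbyaddr):
    """Return the addresses for which a reverse lookup fails."""
    missing = []
    for address in addresses:
        try:
            resolver(address)
        except OSError:
            # socket.herror and socket.gaierror are both OSError subclasses.
            missing.append(address)
    return missing


def fake_resolver(address):
    """Stand-in resolver: only 192.168.5.27 has a reverse entry."""
    known = {"192.168.5.27": ("compute-1.maas", [], ["192.168.5.27"])}
    if address not in known:
        raise socket.herror(1, "Unknown host")
    return known[address]


print(missing_reverse_dns(["192.168.5.27", "192.168.5.28"], resolver=fake_resolver))
# ['192.168.5.28']
```

On a real compute node the resolver argument would be left at its default, and an empty result would suggest the lookup that broke the hook should now succeed.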