Closed pradvara closed 8 years ago
Though it is a 6-node cluster, the docker info
command shows only 2 nodes.
@mapuri @vvb
@gaurav-dalvi @pradvara
The node contiv-b3
is probably missing the latest Cisco enic driver. The VIP 10.106.240.121
should be present on only one of the master nodes.
[stack@contiv-b3 ~]$ ip a | grep "\.121"
inet 10.106.240.121/32 scope global enp133s0_0
[stack@contiv-b3 ~]$
UCP failed to start with the error below:
INFO[0000] Unable to connect to 10.106.240.121:443: dial tcp 10.106.240.121:443: getsockopt: connection refused
FATA[0000] Post https://10.106.240.121:443/auth/login: dial tcp 10.106.240.121:443: getsockopt: connection refused
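A quick way to confirm that nothing is answering on the VIP (a sketch; UCP normally serves a `/_ping` health endpoint on the controller port):

```shell
# Probe the VIP directly; -k skips certificate verification, --max-time bounds the wait.
# A "connection refused" here confirms no controller is bound to the VIP yet.
curl -k --max-time 5 https://10.106.240.121:443/_ping || echo "controller not reachable on VIP"
```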
@gaurav-dalvi @vvb let me try updating the enic driver on all the nodes.
@gaurav-dalvi @vvb: I updated the enic driver on that blade (kmod-enic-2.3.0.20-rhel7u2.el7.x86_64.rpm).
I still see the service VIP getting assigned to that node, and UCP is failing to start.
[stack@contiv-b3 ~]$ systemctl status ucp.service
● ucp.service - Ucp
   Loaded: loaded (/etc/systemd/system/ucp.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2016-08-31 22:56:29 IST; 32s ago
 Main PID: 15033 (code=exited, status=1/FAILURE)

Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-auth-worker-data
Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-client-root-ca
Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-cluster-root-ca
Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-controller-client-certs
Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-controller-server-certs
Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-kv
Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-kv-certs
Aug 31 22:56:29 contiv-b3 ucp.sh[17163]: ucp-node-certs
Aug 31 22:56:29 contiv-b3 systemd[1]: Unit ucp.service entered failed state.
Aug 31 22:56:29 contiv-b3 systemd[1]: ucp.service failed.
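Installing the rpm does not by itself reload the running module, so it is worth checking which enic version the kernel is actually using (a sketch; paths assume the standard kmod layout):

```shell
# Version recorded in the module file on disk:
modinfo enic | grep -i '^version'
# Version of the module currently loaded by the kernel:
cat /sys/module/enic/version
# If the two differ, a reboot (or module reload) is still needed.
```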
The issue is not seen with the latest VNIC driver, kmod-enic-2.3.0.30-rhel7u2.el7.x86_64; all the nodes in the cluster are now detected:
[stack@contiv-b1 ucp-bundle-admin]$ docker info
Containers: 47
Running: 45
Paused: 0
Stopped: 2
Images: 80
Server Version: swarm/1.2.3
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 6
contiv-b1: 10.106.240.108:12376
└ ID: ZTKV:22OG:WGLB:X646:EJLO:4CFZ:UPUU:E3A2:LKBY:KGWN:IJ5C:GYJH
└ Status: Healthy
└ Containers: 12
└ Reserved CPUs: 0 / 33
└ Reserved Memory: 0 B / 107.2 GiB
└ Labels: executiondriver=, kernelversion=3.10.0-327.22.2.el7.x86_64, operatingsystem=Storage, storagedriver=devicemapper
└ UpdatedAt: 2016-09-16T10:14:38Z
└ ServerVersion: 1.11.1
contiv-b2: 10.106.240.111:12376
└ ID: IF65:LLRY:GJCQ:USO4:XFCA:UTSX:Y3SN:5LRB:BQDS:EEI2:BANB:6I2Z
└ Status: Healthy
└ Containers: 14
└ Reserved CPUs: 0 / 25
└ Reserved Memory: 0 B / 98.9 GiB
└ Labels: executiondriver=, kernelversion=3.10.0-327.22.2.el7.x86_64, operatingsystem=Storage, storagedriver=devicemapper
└ UpdatedAt: 2016-09-16T10:14:35Z
└ ServerVersion: 1.11.1
contiv-b3: 10.106.240.112:12376
└ ID: W2MY:VPZN:7WZD:GMNM:NDU2:IDSJ:523Y:REJS:456X:75YS:LDWZ:65UK
└ Status: Healthy
└ Containers: 12
└ Reserved CPUs: 0 / 8
└ Reserved Memory: 0 B / 98.9 GiB
└ Labels: executiondriver=, kernelversion=3.10.0-327.28.3.el7.x86_64, operatingsystem=Red Hat Enterprise Linux, storagedriver=devicemapper
└ UpdatedAt: 2016-09-16T10:14:45Z
└ ServerVersion: 1.11.1
contiv-b4: 10.106.240.110:12376
└ ID: GKIP:EC4B:X3Q7:YGJQ:WO3A:DF66:PC6Y:5QCM:3HPL:EZRX:SI6W:ZKCI
└ Status: Healthy
└ Containers: 3
└ Reserved CPUs: 0 / 8
└ Reserved Memory: 0 B / 115.4 GiB
└ Labels: executiondriver=, kernelversion=3.10.0-327.22.2.el7.x86_64, operatingsystem=Storage, storagedriver=devicemapper
└ UpdatedAt: 2016-09-16T10:15:12Z
└ ServerVersion: 1.11.1
contiv-b5: 10.106.240.109:12376
└ ID: KLOJ:VIYP:Q3I5:QNSR:OWDQ:L6S3:LFLZ:2JF7:25HU:3WO6:ELMX:MTRX
└ Status: Healthy
└ Containers: 3
└ Reserved CPUs: 0 / 8
└ Reserved Memory: 0 B / 107.2 GiB
└ Labels: executiondriver=, kernelversion=3.10.0-327.22.2.el7.x86_64, operatingsystem=Storage, storagedriver=devicemapper
└ UpdatedAt: 2016-09-16T10:14:51Z
└ ServerVersion: 1.11.1
contiv-b6: 10.106.240.116:12376
└ ID: IH5V:TLFD:Q4LW:FZQT:7NVX:BDYF:2YFD:N56Z:XVQQ:MQ3T:UZ6T:VFGU
└ Status: Healthy
└ Containers: 3
└ Reserved CPUs: 0 / 8
└ Reserved Memory: 0 B / 65.83 GiB
└ Labels: executiondriver=, kernelversion=3.10.0-327.22.2.el7.x86_64, operatingsystem=Storage, storagedriver=devicemapper
└ UpdatedAt: 2016-09-16T10:14:35Z
└ ServerVersion: 1.11.1
Cluster Managers: 3
10.106.240.108: Healthy
└ Orca Controller: https://10.106.240.108:443
└ Swarm Manager: tcp://10.106.240.108:2376
└ KV: etcd://10.106.240.108:12379
10.106.240.111: Healthy
└ Orca Controller: https://10.106.240.111:443
└ Swarm Manager: tcp://10.106.240.111:2376
└ KV: etcd://10.106.240.111:12379
10.106.240.112: Healthy
└ Orca Controller: https://10.106.240.112:443
└ Swarm Manager: tcp://10.106.240.112:2376
└ KV: etcd://10.106.240.112:12379
Plugins:
Volume:
Network:
Kernel Version: 3.10.0-327.22.2.el7.x86_64
Operating System: linux
Architecture: amd64
CPUs: 90
Total Memory: 593.4 GiB
Name: ucp-controller-contiv-b1
ID: LTYN:X5ZL:MZLP:EJIO:N5SG:NAVJ:ZZXU:VFIS:5YS2:SEEC:F567:XNRT
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
Labels:
com.docker.ucp.license_key=IBuElytqSzSQ35i-ef5o80aupB2NmxBX6TJQVrsZ6Njq
com.docker.ucp.license_max_engines=10
com.docker.ucp.license_expires=2017-03-05 18:30:59 +0000 UTC
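As a quick sanity check after the driver update, the healthy node count can be pulled straight out of docker info (a sketch run from the client bundle; it simply counts the per-node Status lines):

```shell
# Count nodes the Swarm manager reports as Healthy; should print 6 here.
docker info 2>/dev/null | grep -c 'Status: Healthy'
```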
Thanks @pradvara. Closing this issue now.
Imported the global variables and commissioned the nodes.
The setup has three master nodes and three worker nodes; the first master also acts as the controller.
Set the Docker Host:
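The prompt above shows the commands being run from `ucp-bundle-admin`; a UCP client bundle ships an env.sh that points the local Docker CLI at the controller. A hedged sketch of that last step (the bundle path and controller address are taken from this setup):

```shell
# Source the UCP client bundle so the CLI talks to the controller
# over TLS instead of the local engine.
cd ucp-bundle-admin
source env.sh        # exports DOCKER_HOST, DOCKER_TLS_VERIFY, DOCKER_CERT_PATH
docker info          # should now list all 6 cluster nodes
```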