Closed: Vincenttoolate closed this issue 5 years ago.
Hi,
I am also getting the same issue: "FAILED! => {"changed": false, "msg": "Control plane pods didn't come up"}"
Is there any solution for this issue? Please let me know how to fix it.
You need to change your inventory.download to use ${OKD_MASTER_HOSTNAME} instead of ${OKD_MASTER_IP}. The same goes for the worker and infra nodes. Make sure you're using the FQDN (hostname.domain.com) and not just the short hostname.
Your inventory.download file should look like this:
[masters]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP} openshift_schedulable=true
[etcd]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP}
[nodes]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP} openshift_node_group_name='node-config-master'
${OKD_WORKER_NODE_1_HOSTNAME} openshift_ip=${OKD_WORKER_NODE_1_IP} openshift_node_group_name='node-config-compute'
${OKD_INFRA_NODE_1_HOSTNAME} openshift_ip=${OKD_INFRA_NODE_1_IP} openshift_node_group_name='node-config-infra'
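Before re-running the installer it also helps to confirm that every FQDN you put in the inventory actually resolves to the address you set in openshift_ip. A minimal sketch, assuming hypothetical FQDNs like the ones below (substitute your own):

```bash
# Each inventory hostname should resolve (via DNS or /etc/hosts)
# to the IP given in openshift_ip for that node.
for host in okd-master-node.example.com okd-worker-node-1.example.com okd-infra-node-1.example.com; do
  echo "== $host =="
  getent hosts "$host" || echo "WARNING: $host does not resolve"
done
```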
This resolved my issue. Thanks zilmarr.
Since we are now using hostnames in the inventory.ini file, I also set up the SSH keys using the hostnames:
cat ~/.ssh/id_rsa.pub | ssh root@okd-master-node "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh root@okd-worker-node-1 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh root@okd-infra-node-1 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
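After copying the keys, a quick way to confirm that passwordless SSH by hostname works for all three nodes (same node names as above) is something like:

```bash
# BatchMode makes ssh fail instead of prompting for a password,
# so any node that still requires interactive auth is reported.
for node in okd-master-node okd-worker-node-1 okd-infra-node-1; do
  ssh -o BatchMode=yes "root@$node" 'hostname -f' || echo "SSH to $node failed"
done
```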
Hi Zilmarr,
I'm trying to install OpenShift on 3 nodes: one master, one worker, and one infra node. I'm following this tutorial: https://github.com/SubhakarKotta/okd-installation-centos
I get stuck at the step TASK [openshift_control_plane : Wait for all control plane pods to come up and become ready]. The task keeps retrying in a loop.
Can anyone help me solve this problem, or has anyone hit the same condition?
Here is the output of journalctl -xe:
Jan 04 12:27:00 okd-master-node origin-node[73416]: ] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:master-config ReadOnly:false MountPath:/etc/origin/master/ SubPath: MountPropagation:<nil>} {Name:master-cloud-provider ReadOnly:false MountPath:/etc/origi
Jan 04 12:27:00 okd-master-node origin-node[73416]: I0104 12:27:00.972686 73416 kuberuntime_manager.go:757] checking backoff for container "api" in pod "master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"
Jan 04 12:27:00 okd-master-node origin-node[73416]: I0104 12:27:00.972880 73416 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=api pod=master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)
Jan 04 12:27:00 okd-master-node origin-node[73416]: E0104 12:27:00.972966 73416 pod_workers.go:186] Error syncing pod b24b15710309f0062b93e07af49cb464 ("master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"), skipping: failed to "StartContainer" for "api" with CrashLoopBackOff: "Back-o
Jan 04 12:27:02 okd-master-node origin-node[73416]: W0104 12:27:02.436153 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 12:27:02 okd-master-node origin-node[73416]: E0104 12:27:02.436791 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 04 12:27:06 okd-master-node origin-node[73416]: E0104 12:27:06.624443 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://master-node:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.11
Jan 04 12:27:06 okd-master-node origin-node[73416]: E0104 12:27:06.625229 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:455: Failed to list *v1.Service: Get https://master-node:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 04 12:27:06 okd-master-node origin-node[73416]: E0104 12:27:06.626837 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:464: Failed to list *v1.Node: Get https://master-node:8443/api/v1/nodes?fieldSelector=metadata.name%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.114.4.24
Jan 04 12:27:06 okd-master-node origin-node[73416]: W0104 12:27:06.718432 73416 status_manager.go:482] Failed to get status for pod "master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)": Get https://master-node:8443/api/v1/namespaces/kube-system/pods/master-api-okd-master-node: dial t
Jan 04 12:27:07 okd-master-node origin-node[73416]: E0104 12:27:07.202887 73416 eviction_manager.go:243] eviction manager: failed to get get summary stats: failed to get node info: node "okd-master-node" not found
Jan 04 12:27:07 okd-master-node origin-node[73416]: W0104 12:27:07.438296 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 12:27:07 okd-master-node origin-node[73416]: E0104 12:27:07.438541 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 04 12:27:07 okd-master-node origin-node[73416]: E0104 12:27:07.636190 73416 event.go:212] Unable to write event: 'Post https://master-node:8443/api/v1/namespaces/default/events: dial tcp 40.114.4.244:8443: i/o timeout' (may retry after sleeping)
Jan 04 12:27:12 okd-master-node origin-node[73416]: W0104 12:27:12.439963 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 12:27:12 okd-master-node origin-node[73416]: E0104 12:27:12.440318 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.663237 73416 kubelet_node_status.go:269] Setting node annotation to enable volume controller attach/detach
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670168 73416 kubelet_node_status.go:441] Recording NodeHasSufficientDisk event message for node okd-master-node
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670204 73416 kubelet_node_status.go:441] Recording NodeHasSufficientMemory event message for node okd-master-node
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670221 73416 kubelet_node_status.go:441] Recording NodeHasNoDiskPressure event message for node okd-master-node
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670233 73416 kubelet_node_status.go:441] Recording NodeHasSufficientPID event message for node okd-master-node
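The journal shows the master-api static pod crash-looping, so the next step is usually to read the failing container's own log on the master. A rough sketch using plain docker commands (the container name pattern is what OKD 3.11 typically generates, so adjust it to whatever docker ps actually shows; on 3.11 masters there is often also a /usr/local/bin/master-logs helper):

```bash
# List the (possibly exited) api containers of the master-api static pod
docker ps -a | grep -E 'k8s_api.*master-api'

# Show the last lines of the newest one to see why the apiserver exits
API_CONTAINER=$(docker ps -a --format '{{.ID}} {{.Names}}' | grep -E 'k8s_api.*master-api' | head -n1 | awk '{print $1}')
docker logs --tail 100 "$API_CONTAINER"
```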
Here is the output of cat /var/log/messages:
Jan 4 12:35:48 master-node origin-node: exec openshift start master api --config=/etc/origin/master/master-config.yaml --loglevel=${DEBUG_LOGLEVEL:-2}
Jan 4 12:35:48 master-node origin-node: ] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:master-config ReadOnly:false MountPath:/etc/origin/master/ SubPath: MountPropagation:<nil>} {Name:master-cloud-provider ReadOnly:false MountPath:/etc/origin/cloudprovider/ SubPath: MountPropagation:<nil>} {Name:master-data ReadOnly:false MountPath:/var/lib/origin/ SubPath: MountPropagation:<nil>} {Name:master-pki ReadOnly:false MountPath:/etc/pki SubPath: MountPropagation:<nil>} {Name:host-localtime ReadOnly:false MountPath:/etc/localtime SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:healthz,Port:8443,Host:,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:45,TimeoutSeconds:10,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:healthz/ready,Port:8443,Host:,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:10,TimeoutSeconds:10,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Jan 4 12:35:48 master-node origin-node: I0104 12:35:48.972452 73416 kuberuntime_manager.go:757] checking backoff for container "api" in pod "master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"
Jan 4 12:35:48 master-node origin-node: I0104 12:35:48.972772 73416 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=api pod=master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)
Jan 4 12:35:48 master-node origin-node: E0104 12:35:48.972816 73416 pod_workers.go:186] Error syncing pod b24b15710309f0062b93e07af49cb464 ("master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"), skipping: failed to "StartContainer" for "api" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=api pod=master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"
Jan 4 12:35:49 master-node origin-node: E0104 12:35:49.206288 73416 certificate_manager.go:299] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Post https://master-node:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:51 master-node origin-node: I0104 12:35:51.205402 73416 certificate_manager.go:287] Rotating certificates
Jan 4 12:35:52 master-node origin-node: W0104 12:35:52.631136 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 4 12:35:52 master-node origin-node: E0104 12:35:52.631850 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 4 12:35:53 master-node origin-node: E0104 12:35:53.637772 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://master-node:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:53 master-node origin-node: E0104 12:35:53.638613 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:455: Failed to list *v1.Service: Get https://master-node:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:53 master-node origin-node: E0104 12:35:53.639958 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:464: Failed to list *v1.Node: Get https://master-node:8443/api/v1/nodes?fieldSelector=metadata.name%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:57 master-node origin-node: E0104 12:35:57.218678 73416 eviction_manager.go:243] eviction manager: failed to get get summary stats: failed to get node info: node "okd-master-node" not found
Jan 4 12:35:57 master-node origin-node: W0104 12:35:57.633342 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 4 12:35:57 master-node origin-node: E0104 12:35:57.634064 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.663091 73416 kubelet_node_status.go:269] Setting node annotation to enable volume controller attach/detach
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670927 73416 kubelet_node_status.go:441] Recording NodeHasSufficientDisk event message for node okd-master-node
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670962 73416 kubelet_node_status.go:441] Recording NodeHasSufficientMemory event message for node okd-master-node
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670977 73416 kubelet_node_status.go:441] Recording NodeHasNoDiskPressure event message for node okd-master-node
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670988 73416 kubelet_node_status.go:441] Recording NodeHasSufficientPID event message for node okd-master-node
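Both logs end in dial tcp 40.114.4.244:8443: i/o timeout against https://master-node:8443, so it is worth checking from each node whether that endpoint is reachable at all and whether a firewall (or cloud security group) is blocking the port. A quick sketch with standard tools:

```bash
# On the master: is anything listening on the API port?
ss -tlnp | grep 8443

# From every node (master included): can we reach the API health endpoint?
curl -kv https://master-node:8443/healthz

# Check local firewall rules; cloud security groups must allow 8443 as well
firewall-cmd --list-all
```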
And here is my environment:
docker version
Docker version 1.13.1
ansible version
ansible 2.9.2
Master node hostname: okd-master-node
/etc/hosts:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
40.114.4.244 master-node console console.okd.nip.io
104.45.157.104 worker-node-1
40.86.80.127 infra-node-1
/etc/resolv.conf:
# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
search cluster.local fn5jzfpofi2ujjdt5gdtujcdkh.bx.internal.cloudapp.net
nameserver 172.16.0.4
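Since the advice earlier in this thread is to use FQDNs, it may also be worth confirming on each node that the hostname and the /etc/hosts entries agree. A small sketch (the names are the ones from the hosts file above):

```bash
# The node's own idea of its fully qualified hostname
hostname -f

# All three node names should resolve via /etc/hosts or DNS
for n in master-node worker-node-1 infra-node-1; do
  getent hosts "$n" || echo "WARNING: $n does not resolve"
done
```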
inventory.ini:
[OSEv3:children]
masters
nodes
etcd
[masters]
master-node openshift_ip=40.114.4.244 openshift_schedulable=true
[etcd]
master-node openshift_ip=40.114.4.244
[nodes]
master-node openshift_ip=40.114.4.244 openshift_node_group_name='node-config-master'
worker-node-1 openshift_ip=104.45.157.104 openshift_node_group_name='node-config-compute'
infra-node-1 openshift_ip=40.86.80.127 openshift_node_group_name='node-config-infra'
[OSEv3:vars]
openshift_additional_repos=[{'id': 'centos-paas', 'name': 'centos-paas', 'baseurl' :'https://buildlogs.centos.org/centos/7/paas/x86_64/openshift-origin311', 'gpgcheck' :'0', 'enabled' :'1'}]
ansible_ssh_user=root
enable_excluders=False
enable_docker_excluder=False
ansible_service_broker_install=False
containerized=True
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability
deployment_type=origin
openshift_deployment_type=origin
template_service_broker_selector={"region":"infra"}
openshift_metrics_image_version="v3.11"
openshift_logging_image_version="v3.11"
openshift_logging_elasticsearch_proxy_image_version="v1.0.0"
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra":"true"}
logging_elasticsearch_rollout_override=false
osm_use_cockpit=true
openshift_metrics_install_metrics=False
openshift_logging_install_logging=False
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_htpasswd_file='/etc/origin/master/htpasswd'
openshift_public_hostname=console.okd.nip.io
openshift_master_default_subdomain=apps.okd.nip.io
openshift_master_api_port=8443
openshift_master_console_port=8443
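With an inventory like this, a quick way to confirm that Ansible can reach every node under the names used in it, before kicking off the long playbook run, is something like:

```bash
# Show how ansible parses the inventory groups
ansible-inventory -i inventory.ini --graph

# SSH-ping every host in the [nodes] group
ansible -i inventory.ini nodes -m ping
```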
inventory.download:
[OSEv3:children]
masters
nodes
etcd
[masters]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP} openshift_schedulable=true
[etcd]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP}
[nodes]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP} openshift_node_group_name='node-config-master'
${OKD_WORKER_NODE_1_HOSTNAME} openshift_ip=${OKD_WORKER_NODE_1_IP} openshift_node_group_name='node-config-compute'
${OKD_INFRA_NODE_1_HOSTNAME} openshift_ip=${OKD_INFRA_NODE_1_IP} openshift_node_group_name='node-config-infra'
[OSEv3:vars]
openshift_additional_repos=[{'id': 'centos-paas', 'name': 'centos-paas', 'baseurl' :'https://buildlogs.centos.org/centos/7/paas/x86_64/openshift-origin311', 'gpgcheck' :'0', 'enabled' :'1'}]
ansible_ssh_user=root
enable_excluders=False
enable_docker_excluder=False
ansible_service_broker_install=False
containerized=True
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability
deployment_type=origin
openshift_deployment_type=origin
template_service_broker_selector={"region":"infra"}
openshift_metrics_image_version="v${OKD_VERSION}"
openshift_logging_image_version="v${OKD_VERSION}"
openshift_logging_elasticsearch_proxy_image_version="v1.0.0"
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra":"true"}
logging_elasticsearch_rollout_override=false
osm_use_cockpit=true
openshift_metrics_install_metrics=${INSTALL_METRICS}
openshift_logging_install_logging=${INSTALL_LOGGING}
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_htpasswd_file='/etc/origin/master/htpasswd'
openshift_public_hostname=console.${DOMAIN}
openshift_master_default_subdomain=apps.${DOMAIN}
openshift_master_api_port=${API_PORT}
openshift_master_console_port=${API_PORT}
Please help me solve this problem.
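For reference, one way the inventory.download template above could be rendered into inventory.ini is plain environment substitution. This is only a sketch: the repo's own setup scripts may do it differently, and the values below are just this thread's hostnames and IPs with an example domain suffix.

```bash
# Example values only; use your own FQDNs, IPs, domain and ports.
export OKD_MASTER_HOSTNAME=okd-master-node.example.com
export OKD_MASTER_IP=40.114.4.244
export OKD_WORKER_NODE_1_HOSTNAME=okd-worker-node-1.example.com
export OKD_WORKER_NODE_1_IP=104.45.157.104
export OKD_INFRA_NODE_1_HOSTNAME=okd-infra-node-1.example.com
export OKD_INFRA_NODE_1_IP=40.86.80.127
export OKD_VERSION=3.11 DOMAIN=okd.nip.io API_PORT=8443
export INSTALL_METRICS=False INSTALL_LOGGING=False

# envsubst (from gettext) replaces the ${...} placeholders
envsubst < inventory.download > inventory.ini
```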
Hi Subhakar,
I followed your instructions, but I got an error complaining that the control plane did not come up.
Any idea how to fix this?
TASK [openshift_control_plane : Report control plane errors] ****
fatal: [192.168.56.110]: FAILED! => {"changed": false, "msg": "Control plane pods didn't come up"}
NO MORE HOSTS LEFT **
PLAY RECAP **
192.168.56.110 : ok=310 changed=139 unreachable=0 failed=1 skipped=245 rescued=0 ignored=4
192.168.56.111 : ok=104 changed=56 unreachable=0 failed=0 skipped=99 rescued=0 ignored=0
192.168.56.112 : ok=104 changed=56 unreachable=0 failed=0 skipped=99 rescued=0 ignored=0
localhost : ok=11 changed=0 unreachable=0 failed=0 skipped=5 rescued=0 ignored=0
INSTALLER STATUS ****
Initialization : Complete (0:00:21)
Health Check : Complete (0:00:05)
Node Bootstrap Preparation : Complete (0:04:25)
etcd Install : Complete (0:00:33)
Master Install : In Progress (0:17:17)
This phase can be restarted by running: playbooks/openshift-master/config.yml
Failure summary:
Many thanks.
Cheers, Vincent
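As the installer status above notes, the failing phase can be re-run on its own once the underlying issue (hostnames/DNS, firewall, etc.) is fixed. Assuming the openshift-ansible checkout lives at ~/openshift-ansible, that would look roughly like:

```bash
# Re-run only the master/control-plane phase, with extra verbosity
ansible-playbook -i inventory.ini \
  ~/openshift-ansible/playbooks/openshift-master/config.yml -vvv
```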