s-u-b-h-a-k-a-r / okd-installation-centos

This repository is used to create an OKD 3.11 cluster in 9 simple steps on bare VMs.

Stuck on Step TASK [openshift_control_plane : Wait for all control plane pods to come up and become ready] #6

Open adesurya opened 4 years ago

adesurya commented 4 years ago

Hi,

I'm trying to install OpenShift on 3 nodes: one as master, one as worker, and one as infra. I am following this tutorial: ---xxxx---

I got stuck at the step TASK [openshift_control_plane : Wait for all control plane pods to come up and become ready]. The installer keeps retrying this task in a loop.

Can anyone help me solve this problem, or has anyone run into the same condition?

Here is the output of `journalctl -xe`:

```
Jan 04 12:27:00 okd-master-node origin-node[73416]: ] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:master-config ReadOnly:false MountPath:/etc/origin/master/ SubPath: MountPropagation:<nil>} {Name:master-cloud-provider ReadOnly:false MountPath:/etc/origi
Jan 04 12:27:00 okd-master-node origin-node[73416]: I0104 12:27:00.972686 73416 kuberuntime_manager.go:757] checking backoff for container "api" in pod "master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"
Jan 04 12:27:00 okd-master-node origin-node[73416]: I0104 12:27:00.972880 73416 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=api pod=master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)
Jan 04 12:27:00 okd-master-node origin-node[73416]: E0104 12:27:00.972966 73416 pod_workers.go:186] Error syncing pod b24b15710309f0062b93e07af49cb464 ("master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"), skipping: failed to "StartContainer" for "api" with CrashLoopBackOff: "Back-o
Jan 04 12:27:02 okd-master-node origin-node[73416]: W0104 12:27:02.436153 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 12:27:02 okd-master-node origin-node[73416]: E0104 12:27:02.436791 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 04 12:27:06 okd-master-node origin-node[73416]: E0104 12:27:06.624443 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://master-node:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.11
Jan 04 12:27:06 okd-master-node origin-node[73416]: E0104 12:27:06.625229 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:455: Failed to list *v1.Service: Get https://master-node:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 04 12:27:06 okd-master-node origin-node[73416]: E0104 12:27:06.626837 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:464: Failed to list *v1.Node: Get https://master-node:8443/api/v1/nodes?fieldSelector=metadata.name%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.114.4.24
Jan 04 12:27:06 okd-master-node origin-node[73416]: W0104 12:27:06.718432 73416 status_manager.go:482] Failed to get status for pod "master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)": Get https://master-node:8443/api/v1/namespaces/kube-system/pods/master-api-okd-master-node: dial t
Jan 04 12:27:07 okd-master-node origin-node[73416]: E0104 12:27:07.202887 73416 eviction_manager.go:243] eviction manager: failed to get get summary stats: failed to get node info: node "okd-master-node" not found
Jan 04 12:27:07 okd-master-node origin-node[73416]: W0104 12:27:07.438296 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 12:27:07 okd-master-node origin-node[73416]: E0104 12:27:07.438541 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 04 12:27:07 okd-master-node origin-node[73416]: E0104 12:27:07.636190 73416 event.go:212] Unable to write event: 'Post https://master-node:8443/api/v1/namespaces/default/events: dial tcp 40.114.4.244:8443: i/o timeout' (may retry after sleeping)
Jan 04 12:27:12 okd-master-node origin-node[73416]: W0104 12:27:12.439963 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 12:27:12 okd-master-node origin-node[73416]: E0104 12:27:12.440318 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.663237 73416 kubelet_node_status.go:269] Setting node annotation to enable volume controller attach/detach
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670168 73416 kubelet_node_status.go:441] Recording NodeHasSufficientDisk event message for node okd-master-node
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670204 73416 kubelet_node_status.go:441] Recording NodeHasSufficientMemory event message for node okd-master-node
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670221 73416 kubelet_node_status.go:441] Recording NodeHasNoDiskPressure event message for node okd-master-node
Jan 04 12:27:13 okd-master-node origin-node[73416]: I0104 12:27:13.670233 73416 kubelet_node_status.go:441] Recording NodeHasSufficientPID event message for node okd-master-node
```
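
As a minimal diagnostic sketch for this symptom (not from the original report): since the `api` container of the `master-api` static pod is in CrashLoopBackOff, its own container logs usually show why it exits. The exact container name is docker-generated, so treat the grep pattern as an assumption and adjust to whatever `docker ps -a` actually prints:

```bash
# Find the master-api container (include exited ones, since it keeps restarting).
docker ps -a | grep master-api

# Dump its last log lines to see why the "api" container exits; replace
# <container-id> with the ID printed above.
docker logs --tail 50 <container-id>

# Once the API server is actually up, it should answer on 8443.
curl -k https://master-node:8443/healthz
```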

Here is the output of `cat /var/log/messages`:

```
Jan 4 12:35:48 master-node origin-node: exec openshift start master api --config=/etc/origin/master/master-config.yaml --loglevel=${DEBUG_LOGLEVEL:-2}
Jan 4 12:35:48 master-node origin-node: ] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:master-config ReadOnly:false MountPath:/etc/origin/master/ SubPath: MountPropagation:<nil>} {Name:master-cloud-provider ReadOnly:false MountPath:/etc/origin/cloudprovider/ SubPath: MountPropagation:<nil>} {Name:master-data ReadOnly:false MountPath:/var/lib/origin/ SubPath: MountPropagation:<nil>} {Name:master-pki ReadOnly:false MountPath:/etc/pki SubPath: MountPropagation:<nil>} {Name:host-localtime ReadOnly:false MountPath:/etc/localtime SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:healthz,Port:8443,Host:,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:45,TimeoutSeconds:10,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:healthz/ready,Port:8443,Host:,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:10,TimeoutSeconds:10,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Jan 4 12:35:48 master-node origin-node: I0104 12:35:48.972452 73416 kuberuntime_manager.go:757] checking backoff for container "api" in pod "master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"
Jan 4 12:35:48 master-node origin-node: I0104 12:35:48.972772 73416 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=api pod=master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)
Jan 4 12:35:48 master-node origin-node: E0104 12:35:48.972816 73416 pod_workers.go:186] Error syncing pod b24b15710309f0062b93e07af49cb464 ("master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"), skipping: failed to "StartContainer" for "api" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=api pod=master-api-okd-master-node_kube-system(b24b15710309f0062b93e07af49cb464)"
Jan 4 12:35:49 master-node origin-node: E0104 12:35:49.206288 73416 certificate_manager.go:299] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Post https://master-node:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:51 master-node origin-node: I0104 12:35:51.205402 73416 certificate_manager.go:287] Rotating certificates
Jan 4 12:35:52 master-node origin-node: W0104 12:35:52.631136 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 4 12:35:52 master-node origin-node: E0104 12:35:52.631850 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 4 12:35:53 master-node origin-node: E0104 12:35:53.637772 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://master-node:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:53 master-node origin-node: E0104 12:35:53.638613 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:455: Failed to list *v1.Service: Get https://master-node:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:53 master-node origin-node: E0104 12:35:53.639958 73416 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:464: Failed to list *v1.Node: Get https://master-node:8443/api/v1/nodes?fieldSelector=metadata.name%3Dokd-master-node&limit=500&resourceVersion=0: dial tcp 40.114.4.244:8443: i/o timeout
Jan 4 12:35:57 master-node origin-node: E0104 12:35:57.218678 73416 eviction_manager.go:243] eviction manager: failed to get get summary stats: failed to get node info: node "okd-master-node" not found
Jan 4 12:35:57 master-node origin-node: W0104 12:35:57.633342 73416 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 4 12:35:57 master-node origin-node: E0104 12:35:57.634064 73416 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.663091 73416 kubelet_node_status.go:269] Setting node annotation to enable volume controller attach/detach
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670927 73416 kubelet_node_status.go:441] Recording NodeHasSufficientDisk event message for node okd-master-node
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670962 73416 kubelet_node_status.go:441] Recording NodeHasSufficientMemory event message for node okd-master-node
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670977 73416 kubelet_node_status.go:441] Recording NodeHasNoDiskPressure event message for node okd-master-node
Jan 4 12:35:57 master-node origin-node: I0104 12:35:57.670988 73416 kubelet_node_status.go:441] Recording NodeHasSufficientPID event message for node okd-master-node
```
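
The repeated `dial tcp 40.114.4.244:8443: i/o timeout` entries suggest the kubelet cannot reach its own API endpoint at all. A small sketch for checking whether 8443 is listening and reachable, assuming stock CentOS 7 tooling:

```bash
# On the master: is anything listening on 8443?
ss -tlnp | grep 8443

# What address does master-node resolve to?
getent hosts master-node

# Is 8443 reachable on that address? (bash's /dev/tcp avoids needing extra tools)
timeout 5 bash -c 'cat < /dev/null > /dev/tcp/master-node/8443' \
  && echo "8443 reachable" || echo "8443 NOT reachable"

# Look for firewall rules that could drop 8443 between the nodes.
iptables -L -n | grep 8443
```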

And here is my environment:

docker version: Docker version 1.13.1

ansible version: ansible 2.9.2

/etc/hosts:

```
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 console console.
40.114.4.244 master-node console console.okd.nip.io
104.45.157.104 worker-node-1
40.86.80.127 infra-node-1
```
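
One small check worth running here (a sketch, not part of the original post): the kubelet logs register the node as `okd-master-node`, while `/etc/hosts` and the inventory use `master-node`, so it may help to confirm that hostnames and name resolution are consistent on every node:

```bash
# Run on each node; the output should agree with what the inventory expects.
hostname
hostname -f
getent hosts master-node worker-node-1 infra-node-1
```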

inventori.ini:

```ini
[OSEv3:children]
masters
nodes
etcd

[masters]
master-node openshift_ip=40.114.4.244 openshift_schedulable=true

[etcd]
master-node openshift_ip=40.114.4.244

[nodes]
master-node openshift_ip=40.114.4.244 openshift_node_group_name='node-config-master'
worker-node-1 openshift_ip=104.45.157.104 openshift_node_group_name='node-config-compute'
infra-node-1 openshift_ip=40.86.80.127 openshift_node_group_name='node-config-infra'

[OSEv3:vars]
openshift_additional_repos=[{'id': 'centos-paas', 'name': 'centos-paas', 'baseurl' :'https://buildlogs.centos.org/centos/7/paas/x86_64/openshift-origin311', 'gpgcheck' :'0', 'enabled' :'1'}]

ansible_ssh_user=root
enable_excluders=False
enable_docker_excluder=False
ansible_service_broker_install=False

containerized=True
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability

deployment_type=origin
openshift_deployment_type=origin

template_service_broker_selector={"region":"infra"}
openshift_metrics_image_version="v3.11"
openshift_logging_image_version="v3.11"
openshift_logging_elasticsearch_proxy_image_version="v1.0.0"
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra":"true"}
logging_elasticsearch_rollout_override=false
osm_use_cockpit=true

openshift_metrics_install_metrics=False
openshift_logging_install_logging=False

openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_htpasswd_file='/etc/origin/master/htpasswd'

openshift_public_hostname=console.okd.nip.io
openshift_master_default_subdomain=apps.okd.nip.io

openshift_master_api_port=8443
openshift_master_console_port=8443
```
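
For reference, a sketch of how an inventory like this is typically fed to openshift-ansible 3.11. The playbook paths assume a local checkout of the release-3.11 branch of openshift/openshift-ansible; the repository's own step scripts may wrap these commands:

```bash
# Assumes openshift-ansible (release-3.11 branch) is checked out next to the inventory.
ansible-playbook -i inventori.ini openshift-ansible/playbooks/prerequisites.yml
ansible-playbook -i inventori.ini openshift-ansible/playbooks/deploy_cluster.yml

# After fixing the underlying issue, the control-plane phase can be retried on its
# own (path as found in the release-3.11 layout).
ansible-playbook -i inventori.ini openshift-ansible/playbooks/openshift-master/config.yml
```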

inventori.download:

```ini
[OSEv3:children]
masters
nodes
etcd

[masters]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP} openshift_schedulable=true

[etcd]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP}

[nodes]
${OKD_MASTER_HOSTNAME} openshift_ip=${OKD_MASTER_IP} openshift_node_group_name='node-config-master'
${OKD_WORKER_NODE_1_HOSTNAME} openshift_ip=${OKD_WORKER_NODE_1_IP} openshift_node_group_name='node-config-compute'
${OKD_INFRA_NODE_1_HOSTNAME} openshift_ip=${OKD_INFRA_NODE_1_IP} openshift_node_group_name='node-config-infra'

[OSEv3:vars]
openshift_additional_repos=[{'id': 'centos-paas', 'name': 'centos-paas', 'baseurl' :'https://buildlogs.centos.org/centos/7/paas/x86_64/openshift-origin311', 'gpgcheck' :'0', 'enabled' :'1'}]

ansible_ssh_user=root
enable_excluders=False
enable_docker_excluder=False
ansible_service_broker_install=False

containerized=True
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability

deployment_type=origin
openshift_deployment_type=origin

template_service_broker_selector={"region":"infra"}
openshift_metrics_image_version="v${OKD_VERSION}"
openshift_logging_image_version="v${OKD_VERSION}"
openshift_logging_elasticsearch_proxy_image_version="v1.0.0"
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra":"true"}
logging_elasticsearch_rollout_override=false
osm_use_cockpit=true

openshift_metrics_install_metrics=${INSTALL_METRICS}
openshift_logging_install_logging=${INSTALL_LOGGING}

openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_htpasswd_file='/etc/origin/master/htpasswd'

openshift_public_hostname=console.${DOMAIN}
openshift_master_default_subdomain=apps.${DOMAIN}

openshift_master_api_port=${API_PORT}
openshift_master_console_port=${API_PORT}
```
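
A sketch of how this template might be rendered into the concrete `inventori.ini` above, using `envsubst` from the gettext package (the repository's own scripts may fill in the variables differently; the values here are taken from the inventory shown earlier):

```bash
# Export the values the template references, then substitute them.
export OKD_MASTER_HOSTNAME=master-node OKD_MASTER_IP=40.114.4.244
export OKD_WORKER_NODE_1_HOSTNAME=worker-node-1 OKD_WORKER_NODE_1_IP=104.45.157.104
export OKD_INFRA_NODE_1_HOSTNAME=infra-node-1 OKD_INFRA_NODE_1_IP=40.86.80.127
export OKD_VERSION=3.11 DOMAIN=okd.nip.io API_PORT=8443
export INSTALL_METRICS=False INSTALL_LOGGING=False

envsubst < inventori.download > inventori.ini
```
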
tancou commented 4 years ago

I have the same problem with a KVM cloud-init CentOS image.

cakhanif commented 4 years ago

I experienced the same thing; the problem was the Docker version. After I changed it to Docker version 1.13, the installation could proceed.
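
If the fix really is pinning the Docker package, a sketch of how one might check and install a specific build on CentOS 7 (the CentOS extras repo normally ships the 1.13.1 series; the version-release string below is a placeholder, not a known-good value):

```bash
# List every docker build available from the enabled repos.
yum --showduplicates list docker

# Install (or downgrade to) one specific build from that list; replace
# <version-release> with an entry printed above -- illustrative only.
yum install -y docker-<version-release>
systemctl enable --now docker
docker version
```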

tancou commented 4 years ago

Thanks @cakhanif. How did you manage to install Docker 1.13 and not 1.13.1?