openshift / openshift-ansible

Install and config an OpenShift 3.x cluster
https://try.openshift.com
Apache License 2.0

Openshift enterprise 3.11 deploy fails #11027

Closed: vjernej closed this issue 5 years ago

vjernej commented 5 years ago

Description

A clean installation with 3 masters, 2 nodes, and 1 load balancer server, all running on up-to-date RHEL 7, fails to deploy.

Version

ansible 2.6.11
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Sep 12 2018, 05:31:16) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

openshift-ansible-3.11.59-1.git.0.ba8e948.el7.noarch

Steps To Reproduce

ansible-playbook -i /etc/ansible/hosts /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml

Expected Results

Cluster deployed successfully.

Observed Results

Deploy fails on the task "Wait for the sync daemonset to become ready and available" (see the failure summary at the end of the log below).

SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/55a7f9cf3f ctmosmn1.cdm.com '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245 `" && echo ansible-tmp-1547756440.22-71191695970245="` echo /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245 `" ) && sleep 0'"'"'' (0, 'ansible-tmp-1547756440.22-71191695970245=/root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245\n', '') Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py PUT /root/.ansible/tmp/ansible-local-77186Js0F0r/tmpSuauRn TO /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245/oc_obj.py SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/55a7f9cf3f '[ctmosmn1.cdm.com]' (0, 'sftp> put /root/.ansible/tmp/ansible-local-77186Js0F0r/tmpSuauRn /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245/oc_obj.py\n', '') ESTABLISH SSH CONNECTION FOR USER: root SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/55a7f9cf3f ctmosmn1.cdm.com '/bin/sh -c '"'"'chmod u+x /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245/ /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245/oc_obj.py && sleep 0'"'"'' (0, '', '') ESTABLISH SSH CONNECTION FOR USER: root SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 
KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/55a7f9cf3f -tt ctmosmn1.cdm.com '/bin/sh -c '"'"'/usr/bin/python /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245/oc_obj.py && sleep 0'"'"'' (0, '\r\n{"invocation": {"module_args": {"files": null, "kind": "daemonset", "force": false, "all_namespaces": null, "field_selector": null, "namespace": "openshift-node", "delete_after": false, "kubeconfig": "/etc/origin/master/admin.kubeconfig", "content": null, "state": "list", "debug": false, "selector": null, "name": "sync"}}, "state": "list", "changed": false, "results": {"returncode": 0, "cmd": "/usr/bin/oc get daemonset sync -o json -n openshift-node", "results": [{"status": {"numberReady": 3, "observedGeneration": 9, "numberAvailable": 3, "desiredNumberScheduled": 5, "numberUnavailable": 2, "currentNumberScheduled": 5, "numberMisscheduled": 0, "updatedNumberScheduled": 4}, "kind": "DaemonSet", "spec": {"revisionHistoryLimit": 10, "selector": {"matchLabels": {"app": "sync"}}, "templateGeneration": 9, "updateStrategy": {"rollingUpdate": {"maxUnavailable": "50%"}, "type": "RollingUpdate"}, "template": {"spec": {"priorityClassName": "system-node-critical", "dnsPolicy": "ClusterFirst", "securityContext": {}, "serviceAccountName": "sync", "schedulerName": "default-scheduler", "hostNetwork": true, "serviceAccount": "sync", "terminationGracePeriodSeconds": 1, "restartPolicy": "Always", "hostPID": true, "volumes": [{"hostPath": {"path": "/etc/origin/node", "type": ""}, "name": "host-config"}, {"hostPath": {"path": "/etc/sysconfig", "type": ""}, "name": "host-sysconfig-node"}, {"hostPath": {"path": "/var/run/dbus", "type": ""}, "name": "var-run-dbus"}, {"hostPath": {"path": "/run/systemd/system", "type": ""}, "name": "run-systemd-system"}], "tolerations": [{"operator": "Exists"}], "containers": [{"securityContext": 
{"privileged": true, "runAsUser": 0}, "name": "sync", "image": "registry.redhat.io/openshift3/ose-node:v3.11.59", "volumeMounts": [{"mountPath": "/etc/origin/node/", "name": "host-config"}, {"readOnly": true, "mountPath": "/etc/sysconfig", "name": "host-sysconfig-node"}, {"readOnly": true, "mountPath": "/var/run/dbus", "name": "var-run-dbus"}, {"readOnly": true, "mountPath": "/run/systemd/system", "name": "run-systemd-system"}], "terminationMessagePolicy": "File", "command": ["/bin/bash", "-c", "#!/bin/bash\\nset -euo pipefail\\n\\n# set by the node image\\nunset KUBECONFIG\\n\\ntrap \'kill $(jobs -p); exit 0\' TERM\\n\\n# track the current state of the config\\nif [[ -f /etc/origin/node/node-config.yaml ]]; then\\n md5sum /etc/origin/node/node-config.yaml > /tmp/.old\\nelse\\n touch /tmp/.old\\nfi\\n\\n# loop until BOOTSTRAP_CONFIG_NAME is set\\nwhile true; do\\n file=/etc/sysconfig/origin-node\\n if [[ -f /etc/sysconfig/atomic-openshift-node ]]; then\\n file=/etc/sysconfig/atomic-openshift-node\\n elif [[ -f /etc/sysconfig/origin-node ]]; then\\n file=/etc/sysconfig/origin-node\\n else\\n echo \\"info: Waiting for the node sysconfig file to be created\\" 2>&1\\n sleep 15 & wait\\n continue\\n fi\\n name=\\"$(sed -nE \'s|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\\\1|p\' \\"${file}\\" | head -1)\\"\\n if [[ -z \\"${name}\\" ]]; then\\n echo \\"info: Waiting for BOOTSTRAP_CONFIG_NAME to be set\\" 2>&1\\n sleep 15 & wait\\n continue\\n fi\\n # in the background check to see if the value changes and exit if so\\n pid=$BASHPID\\n (\\n while true; do\\n if ! 
updated=\\"$(sed -nE \'s|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\\\1|p\' \\"${file}\\" | head -1)\\"; then\\n echo \\"error: Unable to check for bootstrap config, exiting\\" 2>&1\\n kill $pid\\n exit 1\\n fi\\n if [[ \\"${updated}\\" != \\"${name}\\" ]]; then\\n echo \\"info: Bootstrap configuration profile name changed, exiting\\" 2>&1\\n kill $pid\\n exit 0\\n fi\\n sleep 15\\n done\\n ) &\\n break\\ndone\\nmkdir -p /etc/origin/node/tmp\\n# periodically refresh both node-config.yaml and relabel the node\\nwhile true; do\\n if ! oc extract \\"configmaps/${name}\\" -n openshift-node --to=/etc/origin/node/tmp --confirm --request-timeout=10s --config /etc/origin/node/node.kubeconfig \\"--token=$( cat /var/run/secrets/kubernetes.io/serviceaccount/token )\\" > /dev/null; then\\n echo \\"error: Unable to retrieve latest config for node\\" 2>&1\\n sleep 15 &\\n wait $!\\n continue\\n fi\\n\\n KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE) || :\\n if ! [[ -z \\"$KUBELET_HOSTNAME_OVERRIDE\\" ]]; then\\n #Patching node-config for hostname override\\n echo \\"nodeName: $KUBELET_HOSTNAME_OVERRIDE\\" >> /etc/origin/node/tmp/node-config.yaml\\n fi\\n\\n # detect whether the node-config.yaml has changed, and if so trigger a restart of the kubelet.\\n if [[ ! 
-f /etc/origin/node/node-config.yaml ]]; then\\n cat /dev/null > /tmp/.old\\n fi\\n\\n md5sum /etc/origin/node/tmp/node-config.yaml > /tmp/.new\\n if [[ \\"$( cat /tmp/.old )\\" != \\"$( cat /tmp/.new )\\" ]]; then\\n mv /etc/origin/node/tmp/node-config.yaml /etc/origin/node/node-config.yaml\\n SYSTEMD_IGNORE_CHROOT=1 systemctl restart tuned || :\\n echo \\"info: Configuration changed, restarting kubelet\\" 2>&1\\n # TODO: kubelet doesn\'t relabel nodes, best effort for now\\n # https://github.com/kubernetes/kubernetes/issues/59314\\n if args=\\"$(openshift-node-config --config /etc/origin/node/node-config.yaml)\\"; then\\n labels=$(tr \' \' \'\\\\n\' <<<$args | sed -ne \'/^--node-labels=/ { s/^--node-labels=//; p; }\' | tr \',\\\\n\' \' \')\\n if [[ -n \\"${labels}\\" ]]; then\\n echo \\"info: Applying node labels $labels\\" 2>&1\\n if ! oc label --config=/etc/origin/node/node.kubeconfig \\"node/${NODE_NAME}\\" ${labels} --overwrite; then\\n echo \\"error: Unable to apply labels, will retry in 10\\" 2>&1\\n sleep 10 &\\n wait $!\\n continue\\n fi\\n fi\\n else\\n echo \\"error: The downloaded node configuration is invalid, retrying later\\" 2>&1\\n sleep 10 &\\n wait $!\\n continue\\n fi\\n if ! 
pkill -U 0 -f \'(^|/)hyperkube kubelet \'; then\\n echo \\"error: Unable to restart Kubelet\\" 2>&1\\n sleep 10 &\\n wait $!\\n continue\\n fi\\n fi\\n # annotate node with md5sum of the config\\n oc annotate --config=/etc/origin/node/node.kubeconfig \\"node/${NODE_NAME}\\" \\\\\\n node.openshift.io/md5sum=\\"$( cat /tmp/.new | cut -d\' \' -f1 )\\" --overwrite\\n cp -f /tmp/.new /tmp/.old\\n sleep 180 &\\n wait $!\\ndone\\n"], "env": [{"valueFrom": {"fieldRef": {"fieldPath": "spec.nodeName", "apiVersion": "v1"}}, "name": "NODE_NAME"}], "imagePullPolicy": "IfNotPresent", "terminationMessagePath": "/dev/termination-log", "resources": {}}]}, "metadata": {"labels": {"component": "network", "app": "sync", "openshift.io/component": "sync", "type": "infra"}, "creationTimestamp": null, "annotations": {"scheduler.alpha.kubernetes.io/critical-pod": ""}}}}, "apiVersion": "extensions/v1beta1", "metadata": {"name": "sync", "generation": 9, "labels": {"component": "network", "app": "sync", "openshift.io/component": "sync", "type": "infra"}, "namespace": "openshift-node", "resourceVersion": "50430", "creationTimestamp": "2019-01-17T12:07:32Z", "annotations": {"image.openshift.io/triggers": "[\\n {\\"from\\":{\\"kind\\":\\"ImageStreamTag\\",\\"name\\":\\"node:v3.11\\"},\\"fieldPath\\":\\"spec.template.spec.containers[?(@.name==\\\\\\"sync\\\\\\")].image\\"}\\n]\\n", "kubectl.kubernetes.io/last-applied-configuration": "{\\"apiVersion\\":\\"apps/v1\\",\\"kind\\":\\"DaemonSet\\",\\"metadata\\":{\\"annotations\\":{\\"image.openshift.io/triggers\\":\\"[\\\\n {\\\\\\"from\\\\\\":{\\\\\\"kind\\\\\\":\\\\\\"ImageStreamTag\\\\\\",\\\\\\"name\\\\\\":\\\\\\"node:v3.11\\\\\\"},\\\\\\"fieldPath\\\\\\":\\\\\\"spec.template.spec.containers[?(@.name==\\\\\\\\\\\\\\"sync\\\\\\\\\\\\\\")].image\\\\\\"}\\\\n]\\\\n\\",\\"kubernetes.io/description\\":\\"This daemon set provides dynamic configuration of nodes and relabels nodes as 
appropriate.\\\\n\\"},\\"name\\":\\"sync\\",\\"namespace\\":\\"openshift-node\\"},\\"spec\\":{\\"selector\\":{\\"matchLabels\\":{\\"app\\":\\"sync\\"}},\\"template\\":{\\"metadata\\":{\\"annotations\\":{\\"scheduler.alpha.kubernetes.io/critical-pod\\":\\"\\"},\\"labels\\":{\\"app\\":\\"sync\\",\\"component\\":\\"network\\",\\"openshift.io/component\\":\\"sync\\",\\"type\\":\\"infra\\"}},\\"spec\\":{\\"containers\\":[{\\"command\\":[\\"/bin/bash\\",\\"-c\\",\\"#!/bin/bash\\\\nset -euo pipefail\\\\n\\\\n# set by the node image\\\\nunset KUBECONFIG\\\\n\\\\ntrap \'kill $(jobs -p); exit 0\' TERM\\\\n\\\\n# track the current state of the config\\\\nif [[ -f /etc/origin/node/node-config.yaml ]]; then\\\\n md5sum /etc/origin/node/node-config.yaml \\\\u003e /tmp/.old\\\\nelse\\\\n touch /tmp/.old\\\\nfi\\\\n\\\\n# loop until BOOTSTRAP_CONFIG_NAME is set\\\\nwhile true; do\\\\n file=/etc/sysconfig/origin-node\\\\n if [[ -f /etc/sysconfig/atomic-openshift-node ]]; then\\\\n file=/etc/sysconfig/atomic-openshift-node\\\\n elif [[ -f /etc/sysconfig/origin-node ]]; then\\\\n file=/etc/sysconfig/origin-node\\\\n else\\\\n echo \\\\\\"info: Waiting for the node sysconfig file to be created\\\\\\" 2\\\\u003e\\\\u00261\\\\n sleep 15 \\\\u0026 wait\\\\n continue\\\\n fi\\\\n name=\\\\\\"$(sed -nE \'s|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\\\\\\\1|p\' \\\\\\"${file}\\\\\\" | head -1)\\\\\\"\\\\n if [[ -z \\\\\\"${name}\\\\\\" ]]; then\\\\n echo \\\\\\"info: Waiting for BOOTSTRAP_CONFIG_NAME to be set\\\\\\" 2\\\\u003e\\\\u00261\\\\n sleep 15 \\\\u0026 wait\\\\n continue\\\\n fi\\\\n # in the background check to see if the value changes and exit if so\\\\n pid=$BASHPID\\\\n (\\\\n while true; do\\\\n if ! 
updated=\\\\\\"$(sed -nE \'s|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\\\\\\\1|p\' \\\\\\"${file}\\\\\\" | head -1)\\\\\\"; then\\\\n echo \\\\\\"error: Unable to check for bootstrap config, exiting\\\\\\" 2\\\\u003e\\\\u00261\\\\n kill $pid\\\\n exit 1\\\\n fi\\\\n if [[ \\\\\\"${updated}\\\\\\" != \\\\\\"${name}\\\\\\" ]]; then\\\\n echo \\\\\\"info: Bootstrap configuration profile name changed, exiting\\\\\\" 2\\\\u003e\\\\u00261\\\\n kill $pid\\\\n exit 0\\\\n fi\\\\n sleep 15\\\\n done\\\\n ) \\\\u0026\\\\n break\\\\ndone\\\\nmkdir -p /etc/origin/node/tmp\\\\n# periodically refresh both node-config.yaml and relabel the node\\\\nwhile true; do\\\\n if ! oc extract \\\\\\"configmaps/${name}\\\\\\" -n openshift-node --to=/etc/origin/node/tmp --confirm --request-timeout=10s --config /etc/origin/node/node.kubeconfig \\\\\\"--token=$( cat /var/run/secrets/kubernetes.io/serviceaccount/token )\\\\\\" \\\\u003e /dev/null; then\\\\n echo \\\\\\"error: Unable to retrieve latest config for node\\\\\\" 2\\\\u003e\\\\u00261\\\\n sleep 15 \\\\u0026\\\\n wait $!\\\\n continue\\\\n fi\\\\n\\\\n KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE) || :\\\\n if ! [[ -z \\\\\\"$KUBELET_HOSTNAME_OVERRIDE\\\\\\" ]]; then\\\\n #Patching node-config for hostname override\\\\n echo \\\\\\"nodeName: $KUBELET_HOSTNAME_OVERRIDE\\\\\\" \\\\u003e\\\\u003e /etc/origin/node/tmp/node-config.yaml\\\\n fi\\\\n\\\\n # detect whether the node-config.yaml has changed, and if so trigger a restart of the kubelet.\\\\n if [[ ! 
-f /etc/origin/node/node-config.yaml ]]; then\\\\n cat /dev/null \\\\u003e /tmp/.old\\\\n fi\\\\n\\\\n md5sum /etc/origin/node/tmp/node-config.yaml \\\\u003e /tmp/.new\\\\n if [[ \\\\\\"$( cat /tmp/.old )\\\\\\" != \\\\\\"$( cat /tmp/.new )\\\\\\" ]]; then\\\\n mv /etc/origin/node/tmp/node-config.yaml /etc/origin/node/node-config.yaml\\\\n SYSTEMD_IGNORE_CHROOT=1 systemctl restart tuned || :\\\\n echo \\\\\\"info: Configuration changed, restarting kubelet\\\\\\" 2\\\\u003e\\\\u00261\\\\n # TODO: kubelet doesn\'t relabel nodes, best effort for now\\\\n # https://github.com/kubernetes/kubernetes/issues/59314\\\\n if args=\\\\\\"$(openshift-node-config --config /etc/origin/node/node-config.yaml)\\\\\\"; then\\\\n labels=$(tr \' \' \'\\\\\\\\n\' \\\\u003c\\\\u003c\\\\u003c$args | sed -ne \'/^--node-labels=/ { s/^--node-labels=//; p; }\' | tr \',\\\\\\\\n\' \' \')\\\\n if [[ -n \\\\\\"${labels}\\\\\\" ]]; then\\\\n echo \\\\\\"info: Applying node labels $labels\\\\\\" 2\\\\u003e\\\\u00261\\\\n if ! oc label --config=/etc/origin/node/node.kubeconfig \\\\\\"node/${NODE_NAME}\\\\\\" ${labels} --overwrite; then\\\\n echo \\\\\\"error: Unable to apply labels, will retry in 10\\\\\\" 2\\\\u003e\\\\u00261\\\\n sleep 10 \\\\u0026\\\\n wait $!\\\\n continue\\\\n fi\\\\n fi\\\\n else\\\\n echo \\\\\\"error: The downloaded node configuration is invalid, retrying later\\\\\\" 2\\\\u003e\\\\u00261\\\\n sleep 10 \\\\u0026\\\\n wait $!\\\\n continue\\\\n fi\\\\n if ! 
pkill -U 0 -f \'(^|/)hyperkube kubelet \'; then\\\\n echo \\\\\\"error: Unable to restart Kubelet\\\\\\" 2\\\\u003e\\\\u00261\\\\n sleep 10 \\\\u0026\\\\n wait $!\\\\n continue\\\\n fi\\\\n fi\\\\n # annotate node with md5sum of the config\\\\n oc annotate --config=/etc/origin/node/node.kubeconfig \\\\\\"node/${NODE_NAME}\\\\\\" \\\\\\\\\\\\n node.openshift.io/md5sum=\\\\\\"$( cat /tmp/.new | cut -d\' \' -f1 )\\\\\\" --overwrite\\\\n cp -f /tmp/.new /tmp/.old\\\\n sleep 180 \\\\u0026\\\\n wait $!\\\\ndone\\\\n\\"],\\"env\\":[{\\"name\\":\\"NODE_NAME\\",\\"valueFrom\\":{\\"fieldRef\\":{\\"fieldPath\\":\\"spec.nodeName\\"}}}],\\"image\\":\\" \\",\\"name\\":\\"sync\\",\\"securityContext\\":{\\"privileged\\":true,\\"runAsUser\\":0},\\"volumeMounts\\":[{\\"mountPath\\":\\"/etc/origin/node/\\",\\"name\\":\\"host-config\\"},{\\"mountPath\\":\\"/etc/sysconfig\\",\\"name\\":\\"host-sysconfig-node\\",\\"readOnly\\":true},{\\"mountPath\\":\\"/var/run/dbus\\",\\"name\\":\\"var-run-dbus\\",\\"readOnly\\":true},{\\"mountPath\\":\\"/run/systemd/system\\",\\"name\\":\\"run-systemd-system\\",\\"readOnly\\":true}]}],\\"hostNetwork\\":true,\\"hostPID\\":true,\\"priorityClassName\\":\\"system-node-critical\\",\\"serviceAccountName\\":\\"sync\\",\\"terminationGracePeriodSeconds\\":1,\\"tolerations\\":[{\\"operator\\":\\"Exists\\"}],\\"volumes\\":[{\\"hostPath\\":{\\"path\\":\\"/etc/origin/node\\"},\\"name\\":\\"host-config\\"},{\\"hostPath\\":{\\"path\\":\\"/etc/sysconfig\\"},\\"name\\":\\"host-sysconfig-node\\"},{\\"hostPath\\":{\\"path\\":\\"/var/run/dbus\\"},\\"name\\":\\"var-run-dbus\\"},{\\"hostPath\\":{\\"path\\":\\"/run/systemd/system\\"},\\"name\\":\\"run-systemd-system\\"}]}},\\"updateStrategy\\":{\\"rollingUpdate\\":{\\"maxUnavailable\\":\\"50%\\"},\\"type\\":\\"RollingUpdate\\"}}}\\n", "kubernetes.io/description": "This daemon set provides dynamic configuration of nodes and relabels nodes as appropriate.\\n"}, "selfLink": 
"/apis/extensions/v1beta1/namespaces/openshift-node/daemonsets/sync", "uid": "7835e5be-1a50-11e9-857c-00155d540c67"}}]}}\r\n', 'Shared connection to ctmosmn1.cdm.com closed.\r\n') ESTABLISH SSH CONNECTION FOR USER: root SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/55a7f9cf3f ctmosmn1.cdm.com '/bin/sh -c '"'"'rm -f -r /root/.ansible/tmp/ansible-tmp-1547756440.22-71191695970245/ > /dev/null 2>&1 && sleep 0'"'"'' (0, '', '') fatal: [ctmosmn1.cdm.com]: FAILED! => { "attempts": 80, "changed": false, "invocation": { "module_args": { "all_namespaces": null, "content": null, "debug": false, "delete_after": false, "field_selector": null, "files": null, "force": false, "kind": "daemonset", "kubeconfig": "/etc/origin/master/admin.kubeconfig", "name": "sync", "namespace": "openshift-node", "selector": null, "state": "list" } }, "results": { "cmd": "/usr/bin/oc get daemonset sync -o json -n openshift-node", "results": [ { "apiVersion": "extensions/v1beta1", "kind": "DaemonSet", "metadata": { "annotations": { "image.openshift.io/triggers": "[\n {\"from\":{\"kind\":\"ImageStreamTag\",\"name\":\"node:v3.11\"},\"fieldPath\":\"spec.template.spec.containers[?(@.name==\\\"sync\\\")].image\"}\n]\n", "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"apps/v1\",\"kind\":\"DaemonSet\",\"metadata\":{\"annotations\":{\"image.openshift.io/triggers\":\"[\\n {\\\"from\\\":{\\\"kind\\\":\\\"ImageStreamTag\\\",\\\"name\\\":\\\"node:v3.11\\\"},\\\"fieldPath\\\":\\\"spec.template.spec.containers[?(@.name==\\\\\\\"sync\\\\\\\")].image\\\"}\\n]\\n\",\"kubernetes.io/description\":\"This daemon set provides dynamic configuration of nodes and relabels nodes as 
appropriate.\\n\"},\"name\":\"sync\",\"namespace\":\"openshift-node\"},\"spec\":{\"selector\":{\"matchLabels\":{\"app\":\"sync\"}},\"template\":{\"metadata\":{\"annotations\":{\"scheduler.alpha.kubernetes.io/critical-pod\":\"\"},\"labels\":{\"app\":\"sync\",\"component\":\"network\",\"openshift.io/component\":\"sync\",\"type\":\"infra\"}},\"spec\":{\"containers\":[{\"command\":[\"/bin/bash\",\"-c\",\"#!/bin/bash\\nset -euo pipefail\\n\\n# set by the node image\\nunset KUBECONFIG\\n\\ntrap 'kill $(jobs -p); exit 0' TERM\\n\\n# track the current state of the config\\nif [[ -f /etc/origin/node/node-config.yaml ]]; then\\n md5sum /etc/origin/node/node-config.yaml \\u003e /tmp/.old\\nelse\\n touch /tmp/.old\\nfi\\n\\n# loop until BOOTSTRAP_CONFIG_NAME is set\\nwhile true; do\\n file=/etc/sysconfig/origin-node\\n if [[ -f /etc/sysconfig/atomic-openshift-node ]]; then\\n file=/etc/sysconfig/atomic-openshift-node\\n elif [[ -f /etc/sysconfig/origin-node ]]; then\\n file=/etc/sysconfig/origin-node\\n else\\n echo \\\"info: Waiting for the node sysconfig file to be created\\\" 2\\u003e\\u00261\\n sleep 15 \\u0026 wait\\n continue\\n fi\\n name=\\\"$(sed -nE 's|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\\\1|p' \\\"${file}\\\" | head -1)\\\"\\n if [[ -z \\\"${name}\\\" ]]; then\\n echo \\\"info: Waiting for BOOTSTRAP_CONFIG_NAME to be set\\\" 2\\u003e\\u00261\\n sleep 15 \\u0026 wait\\n continue\\n fi\\n # in the background check to see if the value changes and exit if so\\n pid=$BASHPID\\n (\\n while true; do\\n if ! 
updated=\\\"$(sed -nE 's|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\\\1|p' \\\"${file}\\\" | head -1)\\\"; then\\n echo \\\"error: Unable to check for bootstrap config, exiting\\\" 2\\u003e\\u00261\\n kill $pid\\n exit 1\\n fi\\n if [[ \\\"${updated}\\\" != \\\"${name}\\\" ]]; then\\n echo \\\"info: Bootstrap configuration profile name changed, exiting\\\" 2\\u003e\\u00261\\n kill $pid\\n exit 0\\n fi\\n sleep 15\\n done\\n ) \\u0026\\n break\\ndone\\nmkdir -p /etc/origin/node/tmp\\n# periodically refresh both node-config.yaml and relabel the node\\nwhile true; do\\n if ! oc extract \\\"configmaps/${name}\\\" -n openshift-node --to=/etc/origin/node/tmp --confirm --request-timeout=10s --config /etc/origin/node/node.kubeconfig \\\"--token=$( cat /var/run/secrets/kubernetes.io/serviceaccount/token )\\\" \\u003e /dev/null; then\\n echo \\\"error: Unable to retrieve latest config for node\\\" 2\\u003e\\u00261\\n sleep 15 \\u0026\\n wait $!\\n continue\\n fi\\n\\n KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE) || :\\n if ! [[ -z \\\"$KUBELET_HOSTNAME_OVERRIDE\\\" ]]; then\\n #Patching node-config for hostname override\\n echo \\\"nodeName: $KUBELET_HOSTNAME_OVERRIDE\\\" \\u003e\\u003e /etc/origin/node/tmp/node-config.yaml\\n fi\\n\\n # detect whether the node-config.yaml has changed, and if so trigger a restart of the kubelet.\\n if [[ ! 
-f /etc/origin/node/node-config.yaml ]]; then\\n cat /dev/null \\u003e /tmp/.old\\n fi\\n\\n md5sum /etc/origin/node/tmp/node-config.yaml \\u003e /tmp/.new\\n if [[ \\\"$( cat /tmp/.old )\\\" != \\\"$( cat /tmp/.new )\\\" ]]; then\\n mv /etc/origin/node/tmp/node-config.yaml /etc/origin/node/node-config.yaml\\n SYSTEMD_IGNORE_CHROOT=1 systemctl restart tuned || :\\n echo \\\"info: Configuration changed, restarting kubelet\\\" 2\\u003e\\u00261\\n # TODO: kubelet doesn't relabel nodes, best effort for now\\n # https://github.com/kubernetes/kubernetes/issues/59314\\n if args=\\\"$(openshift-node-config --config /etc/origin/node/node-config.yaml)\\\"; then\\n labels=$(tr ' ' '\\\\n' \\u003c\\u003c\\u003c$args | sed -ne '/^--node-labels=/ { s/^--node-labels=//; p; }' | tr ',\\\\n' ' ')\\n if [[ -n \\\"${labels}\\\" ]]; then\\n echo \\\"info: Applying node labels $labels\\\" 2\\u003e\\u00261\\n if ! oc label --config=/etc/origin/node/node.kubeconfig \\\"node/${NODE_NAME}\\\" ${labels} --overwrite; then\\n echo \\\"error: Unable to apply labels, will retry in 10\\\" 2\\u003e\\u00261\\n sleep 10 \\u0026\\n wait $!\\n continue\\n fi\\n fi\\n else\\n echo \\\"error: The downloaded node configuration is invalid, retrying later\\\" 2\\u003e\\u00261\\n sleep 10 \\u0026\\n wait $!\\n continue\\n fi\\n if ! 
pkill -U 0 -f '(^|/)hyperkube kubelet '; then\\n echo \\\"error: Unable to restart Kubelet\\\" 2\\u003e\\u00261\\n sleep 10 \\u0026\\n wait $!\\n continue\\n fi\\n fi\\n # annotate node with md5sum of the config\\n oc annotate --config=/etc/origin/node/node.kubeconfig \\\"node/${NODE_NAME}\\\" \\\\\\n node.openshift.io/md5sum=\\\"$( cat /tmp/.new | cut -d' ' -f1 )\\\" --overwrite\\n cp -f /tmp/.new /tmp/.old\\n sleep 180 \\u0026\\n wait $!\\ndone\\n\"],\"env\":[{\"name\":\"NODE_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"spec.nodeName\"}}}],\"image\":\" \",\"name\":\"sync\",\"securityContext\":{\"privileged\":true,\"runAsUser\":0},\"volumeMounts\":[{\"mountPath\":\"/etc/origin/node/\",\"name\":\"host-config\"},{\"mountPath\":\"/etc/sysconfig\",\"name\":\"host-sysconfig-node\",\"readOnly\":true},{\"mountPath\":\"/var/run/dbus\",\"name\":\"var-run-dbus\",\"readOnly\":true},{\"mountPath\":\"/run/systemd/system\",\"name\":\"run-systemd-system\",\"readOnly\":true}]}],\"hostNetwork\":true,\"hostPID\":true,\"priorityClassName\":\"system-node-critical\",\"serviceAccountName\":\"sync\",\"terminationGracePeriodSeconds\":1,\"tolerations\":[{\"operator\":\"Exists\"}],\"volumes\":[{\"hostPath\":{\"path\":\"/etc/origin/node\"},\"name\":\"host-config\"},{\"hostPath\":{\"path\":\"/etc/sysconfig\"},\"name\":\"host-sysconfig-node\"},{\"hostPath\":{\"path\":\"/var/run/dbus\"},\"name\":\"var-run-dbus\"},{\"hostPath\":{\"path\":\"/run/systemd/system\"},\"name\":\"run-systemd-system\"}]}},\"updateStrategy\":{\"rollingUpdate\":{\"maxUnavailable\":\"50%\"},\"type\":\"RollingUpdate\"}}}\n", "kubernetes.io/description": "This daemon set provides dynamic configuration of nodes and relabels nodes as appropriate.\n" }, "creationTimestamp": "2019-01-17T12:07:32Z", "generation": 9, "labels": { "app": "sync", "component": "network", "openshift.io/component": "sync", "type": "infra" }, "name": "sync", "namespace": "openshift-node", "resourceVersion": "50430", "selfLink": 
"/apis/extensions/v1beta1/namespaces/openshift-node/daemonsets/sync", "uid": "7835e5be-1a50-11e9-857c-00155d540c67" }, "spec": { "revisionHistoryLimit": 10, "selector": { "matchLabels": { "app": "sync" } }, "template": { "metadata": { "annotations": { "scheduler.alpha.kubernetes.io/critical-pod": "" }, "creationTimestamp": null, "labels": { "app": "sync", "component": "network", "openshift.io/component": "sync", "type": "infra" } }, "spec": { "containers": [ { "command": [ "/bin/bash", "-c", "#!/bin/bash\nset -euo pipefail\n\n# set by the node image\nunset KUBECONFIG\n\ntrap 'kill $(jobs -p); exit 0' TERM\n\n# track the current state of the config\nif [[ -f /etc/origin/node/node-config.yaml ]]; then\n md5sum /etc/origin/node/node-config.yaml > /tmp/.old\nelse\n touch /tmp/.old\nfi\n\n# loop until BOOTSTRAP_CONFIG_NAME is set\nwhile true; do\n file=/etc/sysconfig/origin-node\n if [[ -f /etc/sysconfig/atomic-openshift-node ]]; then\n file=/etc/sysconfig/atomic-openshift-node\n elif [[ -f /etc/sysconfig/origin-node ]]; then\n file=/etc/sysconfig/origin-node\n else\n echo \"info: Waiting for the node sysconfig file to be created\" 2>&1\n sleep 15 & wait\n continue\n fi\n name=\"$(sed -nE 's|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\1|p' \"${file}\" | head -1)\"\n if [[ -z \"${name}\" ]]; then\n echo \"info: Waiting for BOOTSTRAP_CONFIG_NAME to be set\" 2>&1\n sleep 15 & wait\n continue\n fi\n # in the background check to see if the value changes and exit if so\n pid=$BASHPID\n (\n while true; do\n if ! 
updated=\"$(sed -nE 's|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\\1|p' \"${file}\" | head -1)\"; then\n echo \"error: Unable to check for bootstrap config, exiting\" 2>&1\n kill $pid\n exit 1\n fi\n if [[ \"${updated}\" != \"${name}\" ]]; then\n echo \"info: Bootstrap configuration profile name changed, exiting\" 2>&1\n kill $pid\n exit 0\n fi\n sleep 15\n done\n ) &\n break\ndone\nmkdir -p /etc/origin/node/tmp\n# periodically refresh both node-config.yaml and relabel the node\nwhile true; do\n if ! oc extract \"configmaps/${name}\" -n openshift-node --to=/etc/origin/node/tmp --confirm --request-timeout=10s --config /etc/origin/node/node.kubeconfig \"--token=$( cat /var/run/secrets/kubernetes.io/serviceaccount/token )\" > /dev/null; then\n echo \"error: Unable to retrieve latest config for node\" 2>&1\n sleep 15 &\n wait $!\n continue\n fi\n\n KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE) || :\n if ! [[ -z \"$KUBELET_HOSTNAME_OVERRIDE\" ]]; then\n #Patching node-config for hostname override\n echo \"nodeName: $KUBELET_HOSTNAME_OVERRIDE\" >> /etc/origin/node/tmp/node-config.yaml\n fi\n\n # detect whether the node-config.yaml has changed, and if so trigger a restart of the kubelet.\n if [[ ! 
-f /etc/origin/node/node-config.yaml ]]; then\n cat /dev/null > /tmp/.old\n fi\n\n md5sum /etc/origin/node/tmp/node-config.yaml > /tmp/.new\n if [[ \"$( cat /tmp/.old )\" != \"$( cat /tmp/.new )\" ]]; then\n mv /etc/origin/node/tmp/node-config.yaml /etc/origin/node/node-config.yaml\n SYSTEMD_IGNORE_CHROOT=1 systemctl restart tuned || :\n echo \"info: Configuration changed, restarting kubelet\" 2>&1\n # TODO: kubelet doesn't relabel nodes, best effort for now\n # https://github.com/kubernetes/kubernetes/issues/59314\n if args=\"$(openshift-node-config --config /etc/origin/node/node-config.yaml)\"; then\n labels=$(tr ' ' '\\n' <<<$args | sed -ne '/^--node-labels=/ { s/^--node-labels=//; p; }' | tr ',\\n' ' ')\n if [[ -n \"${labels}\" ]]; then\n echo \"info: Applying node labels $labels\" 2>&1\n if ! oc label --config=/etc/origin/node/node.kubeconfig \"node/${NODE_NAME}\" ${labels} --overwrite; then\n echo \"error: Unable to apply labels, will retry in 10\" 2>&1\n sleep 10 &\n wait $!\n continue\n fi\n fi\n else\n echo \"error: The downloaded node configuration is invalid, retrying later\" 2>&1\n sleep 10 &\n wait $!\n continue\n fi\n if ! 
pkill -U 0 -f '(^|/)hyperkube kubelet '; then\n echo \"error: Unable to restart Kubelet\" 2>&1\n sleep 10 &\n wait $!\n continue\n fi\n fi\n # annotate node with md5sum of the config\n oc annotate --config=/etc/origin/node/node.kubeconfig \"node/${NODE_NAME}\" \\\n node.openshift.io/md5sum=\"$( cat /tmp/.new | cut -d' ' -f1 )\" --overwrite\n cp -f /tmp/.new /tmp/.old\n sleep 180 &\n wait $!\ndone\n" ], "env": [ { "name": "NODE_NAME", "valueFrom": { "fieldRef": { "apiVersion": "v1", "fieldPath": "spec.nodeName" } } } ], "image": "registry.redhat.io/openshift3/ose-node:v3.11.59", "imagePullPolicy": "IfNotPresent", "name": "sync", "resources": {}, "securityContext": { "privileged": true, "runAsUser": 0 }, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [ { "mountPath": "/etc/origin/node/", "name": "host-config" }, { "mountPath": "/etc/sysconfig", "name": "host-sysconfig-node", "readOnly": true }, { "mountPath": "/var/run/dbus", "name": "var-run-dbus", "readOnly": true }, { "mountPath": "/run/systemd/system", "name": "run-systemd-system", "readOnly": true } ] } ], "dnsPolicy": "ClusterFirst", "hostNetwork": true, "hostPID": true, "priorityClassName": "system-node-critical", "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "sync", "serviceAccountName": "sync", "terminationGracePeriodSeconds": 1, "tolerations": [ { "operator": "Exists" } ], "volumes": [ { "hostPath": { "path": "/etc/origin/node", "type": "" }, "name": "host-config" }, { "hostPath": { "path": "/etc/sysconfig", "type": "" }, "name": "host-sysconfig-node" }, { "hostPath": { "path": "/var/run/dbus", "type": "" }, "name": "var-run-dbus" }, { "hostPath": { "path": "/run/systemd/system", "type": "" }, "name": "run-systemd-system" } ] } }, "templateGeneration": 9, "updateStrategy": { "rollingUpdate": { "maxUnavailable": "50%" }, "type": "RollingUpdate" } }, "status": { "currentNumberScheduled": 5, 
"desiredNumberScheduled": 5, "numberAvailable": 3, "numberMisscheduled": 0, "numberReady": 3, "numberUnavailable": 2, "observedGeneration": 9, "updatedNumberScheduled": 4 } } ], "returncode": 0 }, "state": "list" }

NO MORE HOSTS LEFT *************************************************************
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry

PLAY RECAP *********************************************************************
ctmoslb1.cdm.com : ok=33  changed=2   unreachable=0 failed=0
ctmosmn1.cdm.com : ok=442 changed=118 unreachable=0 failed=1
ctmosmn2.cdm.com : ok=264 changed=75  unreachable=0 failed=0
ctmosmn3.cdm.com : ok=264 changed=75  unreachable=0 failed=0
ctmoswn1.cdm.com : ok=103 changed=21  unreachable=0 failed=0
ctmoswn2.cdm.com : ok=103 changed=21  unreachable=0 failed=0
localhost        : ok=12  changed=0   unreachable=0 failed=0

INSTALLER STATUS ***************************************************************
Initialization             : Complete (0:03:59)
Health Check               : Complete (0:00:48)
Node Bootstrap Preparation : Complete (0:11:21)
etcd Install               : Complete (0:03:20)
Load Balancer Install      : Complete (0:00:37)
Master Install             : In Progress (0:29:47)
	This phase can be restarted by running: playbooks/openshift-master/config.yml

Failure summary:

  1. Hosts:   ctmosmn1.cdm.com
     Play:    Deploy the central bootstrap configuration
     Task:    Wait for the sync daemonset to become ready and available
     Message: Failed without returning a message.
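The installer status above names the phase that can be restarted on its own. Assuming the same RPM install paths used in Steps To Reproduce, re-running only the failed master phase would look like:

```shell
# Re-run only the master install phase, as suggested by the installer status;
# paths assume the RPM layout from Steps To Reproduce above.
ansible-playbook -i /etc/ansible/hosts \
    /usr/share/ansible/openshift-ansible/playbooks/openshift-master/config.yml
```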
Additional Information

Inventory:

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=openshift-enterprise
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_cluster_method=native
openshift_master_cluster_hostname=csmoscluster-internal.cdm.com
openshift_master_cluster_public_hostname=csmoscluster.cdm.com
openshift_master_default_subdomain=cloud.domain.eu
debug_level = 4
# services subnet; default 172.30.0.0/16
openshift_portal_net=10.195.0.0/16
openshift_hosted_registry_routehost=registry.cloud.domain.eu
openshift_hosted_registry_routetermination=reencrypt
openshift_hosted_registry_routecertificates="{'certfile': '/root/certs/x.domain.eu.crt', 'keyfile': '/root/certs/x.domain.eu.key', 'cafile': '/root/certs/inter.crt'}"
openshift_master_named_certificates=[{"certfile": "/root/certs/x.cdm.com.crt", "keyfile": "/root/certs/x.cdm.com.key", "cafile": "/root/certs/rapidssl.crt", "names": ["csmosmn.cdm.com"]}]
openshift_master_overwrite_named_certificates=true
oreg_auth_user=myuser
oreg_auth_password=mypass
openshift_disable_check=docker_image_availability

[masters]
csmosmn1.cdm.com
csmosmn2.cdm.com
csmosmn3.cdm.com

[etcd]
csmosmn1.cdm.com
csmosmn2.cdm.com
csmosmn3.cdm.com

[lb]
csmoslb1.cdm.com

[nodes]
csmosmn1.cdm.com openshift_node_group_name='node-config-master-infra'
csmosmn2.cdm.com openshift_node_group_name='node-config-master-infra'
csmosmn3.cdm.com openshift_node_group_name='node-config-master-infra'
csmoswn1.cdm.com openshift_node_group_name='node-config-compute'
csmoswn2.cdm.com openshift_node_group_name='node-config-compute'
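The sync pods that report unavailable are running the script embedded in the daemonset JSON above, which loops until it can read BOOTSTRAP_CONFIG_NAME from /etc/sysconfig/origin-node on the node. As a quick local sanity check of that logic, the same sed expression (copied verbatim from the script) can be run against a throwaway file; the profile name here is taken from the inventory and is only an example:

```shell
# Reproduce the sync pod's BOOTSTRAP_CONFIG_NAME lookup locally.
# The sed expression is verbatim from the sync daemonset script;
# the temp file stands in for /etc/sysconfig/origin-node.
file=$(mktemp)
printf 'BOOTSTRAP_CONFIG_NAME=node-config-master-infra\n' > "$file"
name="$(sed -nE 's|^BOOTSTRAP_CONFIG_NAME=([^#].+)|\1|p' "$file" | head -1)"
echo "$name"   # node-config-master-infra
rm -f "$file"
```

If that line is missing or empty on an affected node, the pod loops on "Waiting for BOOTSTRAP_CONFIG_NAME to be set", which is one possible reason the daemonset never reaches the desired number of available pods.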
vrutkovs commented 5 years ago

"numberUnavailable": 2,

Do you have the logs showing why the two sync pods didn't come up?
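Not the reporter's exact commands, but one way to collect those logs, using the namespace and label from the daemonset JSON above (app=sync in openshift-node); the pod name is a placeholder:

```shell
# List the sync pods and see which nodes have pods that are not Ready
oc get pods -n openshift-node -l app=sync -o wide

# Dump logs from one of the failing pods (substitute the real pod name)
oc logs -n openshift-node <sync-pod-name>

# Daemonset events often show scheduling or image-pull problems
oc describe daemonset sync -n openshift-node
```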

vjernej commented 5 years ago

Hi, I've just reinstalled the complete environment just to be sure. If the problem happens again, I'll search the logs for the 2 pods that are unavailable. Thanks for the hint.

vjernej commented 5 years ago

After a complete RHEL re-installation the issue is resolved. Thank you very much for the help anyway.