NVIDIA / deepops

Tools for building GPU clusters
BSD 3-Clause "New" or "Revised" License

Implementation Fails on RHEL 7.6 - UndefinedError: 'dict object' has no attribute 'kube_node' #1231

Closed anieshmathew closed 1 year ago

anieshmathew commented 2 years ago

I am trying to install DeepOps on RHEL 7.6, and the installation fails at the task below with the error shown.

TASK [kubernetes/node : Write kubelet environment config file (kubeadm)] ***
task path: /home/admin/deepops/submodules/kubespray/roles/kubernetes/node/tasks/kubelet.yml:18
<10.2.95.200> ESTABLISH SSH CONNECTION FOR USER: admin
<10.2.95.200> SSH: ansible.cfg set ssh_args: (-o)(ControlMaster=auto)(-o)(ControlPersist=5m)(-o)(ConnectionAttempts=100)(-o)(UserKnownHostsFile=/dev/null)
<10.2.95.200> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<10.2.95.200> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User="admin")
<10.2.95.200> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=60)
<10.2.95.200> SSH: Set ssh_common_args: ()
<10.2.95.200> SSH: Set ssh_extra_args: ()
<10.2.95.200> SSH: found only ControlPersist; added ControlPath: (-o)(ControlPath="~/.ssh/ansible-%r@%h:%p")
<10.2.95.200> SSH: EXEC sshpass -d9 ssh -vvv -o ControlMaster=auto -o ControlPersist=5m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o 'User="admin"' -o ConnectTimeout=60 -o 'ControlPath="~/.ssh/ansible-%r@%h:%p"' 10.2.95.200 '/bin/sh -c '"'"'( umask 77 && mkdir -p "echo /tmp"&& mkdir "echo /tmp/ansible-tmp-1664645489.9844334-25691-50490429029708" && echo ansible-tmp-1664645489.9844334-25691-50490429029708="echo /tmp/ansible-tmp-1664645489.9844334-25691-50490429029708" ) && sleep 0'"'"''
<10.2.95.200> (0, b'ansible-tmp-1664645489.9844334-25691-50490429029708=/tmp/ansible-tmp-1664645489.9844334-25691-50490429029708\n', b'OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 58: Applying options for \r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 21080\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
looking for "kubelet.env.v1beta1.j2" at "/home/admin/deepops/submodules/kubespray/roles/kubernetes/node/templates/kubelet.env.v1beta1.j2"
<10.2.95.200> ESTABLISH SSH CONNECTION FOR USER: admin
<10.2.95.200> SSH: ansible.cfg set ssh_args: (-o)(ControlMaster=auto)(-o)(ControlPersist=5m)(-o)(ConnectionAttempts=100)(-o)(UserKnownHostsFile=/dev/null)
<10.2.95.200> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<10.2.95.200> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User="admin")
<10.2.95.200> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=60)
<10.2.95.200> SSH: Set ssh_common_args: ()
<10.2.95.200> SSH: Set ssh_extra_args: ()
<10.2.95.200> SSH: found only ControlPersist; added ControlPath: (-o)(ControlPath="~/.ssh/ansible-%r@%h:%p")
<10.2.95.200> SSH: EXEC sshpass -d9 ssh -vvv -o ControlMaster=auto -o ControlPersist=5m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o 'User="admin"' -o ConnectTimeout=60 -o 'ControlPath="~/.ssh/ansible-%r@%h:%p"' 10.2.95.200 '/bin/sh -c '"'"'rm -f -r /tmp/ansible-tmp-1664645489.9844334-25691-50490429029708/ > /dev/null 2>&1 && sleep 0'"'"''
<10.2.95.200> (0, b'', b'OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 58: Applying options for \r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 21080\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
The full traceback is:
Traceback (most recent call last):
  File "/opt/deepops/env/lib/python3.6/site-packages/ansible/template/__init__.py", line 1121, in do_template
    res = j2_concat(rf)
  File "