redhat-ztp / ztp-cluster-deploy

[ibm-telco] Worker node addition is failing with latest playbooks #81

Closed maheshd2 closed 3 years ago

maheshd2 commented 3 years ago

@yrobla

With the latest playbooks, we are unblocked on https://github.com/redhat-ztp/ztp-cluster-deploy/issues/75, and cluster deployment now succeeds.

However, we are facing an issue with worker node addition at a later stage.

Issue 1: One variable is missing from the inventory/hosts file under the [all:vars] section. I defined the missing variable 'final_iso_path' and was able to move further.

final_iso_path=/var/www/html
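
A quick pre-flight check along these lines might catch a missing `[all:vars]` entry before the playbook runs. This is only a minimal sketch and not part of the ztp-cluster-deploy playbooks: the `REQUIRED_VARS` list and the helper names are assumptions made for illustration, and the parser only handles simple `key=value` lines.

```python
# Hypothetical pre-flight check. REQUIRED_VARS is an assumed list for
# illustration; it is not taken from the playbooks themselves.
REQUIRED_VARS = ["ai_url", "iso_url", "temporary_path", "final_iso_path"]


def parse_all_vars(inventory_text):
    """Return the key=value pairs found under the [all:vars] section."""
    found = {}
    in_section = False
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            in_section = (line == "[all:vars]")
            continue
        if in_section and "=" in line:
            key, _, value = line.partition("=")
            found[key.strip()] = value.strip().strip("\"'")
    return found


def missing_vars(inventory_text, required=REQUIRED_VARS):
    """List the required variable names that [all:vars] does not define."""
    present = parse_all_vars(inventory_text)
    return [name for name in required if name not in present]


sample = """
[all:vars]
ai_url="http://172.29.231.41:8080"
iso_url="http://172.29.231.41"
temporary_path=/tmp

[provisioner]
localhost ansible_connection=local
"""

print(missing_vars(sample))
```

With the sample above, which omits `final_iso_path` on purpose, the check reports that one name as missing.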

Issue 2: The generated worker ISO is failing to boot (BLOCKER)

To make sure the worker node boot properties are correct, I ran the following test.

I then tried to manually boot the CoreOS worker ISO image generated by the 'modify-iso-day2' play, but it FAILED.

modify-iso-day2 image

maheshd2 commented 3 years ago

Inventory/hosts

[all:vars]

# The pull_secret from https://cloud.redhat.com/openshift/install/crc/installer-provisioned
pull_secret='*****************************************************'

# ssh key of the installer machine
ssh_public_key="**********************************************"

# should be same as the installer machine IP-address
ai_url="http://172.29.231.41:8080"
iso_url="http://172.29.231.41"

# You will need to configure your env DNS server for the cluster_domain and cluster_name
cluster_name="ztpdev"
cluster_domain="rtp.raleigh.ibm.com"
cluster_version="4.6"
#cluster_sdn="OpenShiftSDN" # if not set, defaults to OVNKubernetes

# Make sure api_vip is mapped in your env DNS server to api.{cluster_name}.{cluster_domain} AND ingress_vip to *.apps.{cluster_name}.{cluster_domain}
ingress_vip=172.29.231.155
api_vip=172.29.231.156

bridge_name=ztp-br
libvirt_uri="qemu:///system"
need_racadm=true

# path where to generate temporary directories
temporary_path=/tmp
final_iso_path=/var/www/html

# optional: path where to store images for controlplane
# libvirt_images_path=/var/lib/libvirt/images
#
# network configuration, in nmstate format. Please enable
# that if you need to rely on static ip configuration
#network_config_path=./samples/worker.yaml

[provisioner]
# host from where the installation is performed
localhost ansible_connection=local
#<remote_ip> ansible_connection=ssh

[master_nodes]
master_1 name=master_1 mac_address=52:54:00:55:f3:11
master_2 name=master_2 mac_address=52:54:00:55:f3:12
master_3 name=master_3 mac_address=52:54:00:55:f3:13

[worker_nodes]
# only set ip if you need to embed static network. It needs to match with the nmstate config
ztpdev-worker-0 bmc_type=Dell name=ztpdev-worker-0 bmc_address=172.29.230.44 bmc_user="root" bmc_password="*************"

yrobla commented 3 years ago

I think you are right. We modified the ISO generation, and I haven't been able to test it properly with Dell. Let me push some changes; they should unblock you.

yrobla commented 3 years ago

In the meantime, try adding iso_url=/var/www/html to each of the worker_nodes entries.

maheshd2 commented 3 years ago

@yrobla I tried the stable branch stable/4.7 as suggested, but I'm still facing the issue while adding the worker node.

I'm able to attach the virtual media manually. But via the ZTP script/commands it fails, even after repeated attempts.


TASK [/home/worker_fix/ztp-cluster-deploy/ai-deploy-cluster-remoteworker/../common-roles/enroll-dell : Run racadm] *********************
task path: /home/worker_fix/ztp-cluster-deploy/common-roles/enroll-dell/tasks/main.yml:8
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616216005.6716409-1865433-145891592932278 `" && echo ansible-tmp-1616216005.6716409-1865433-145891592932278="` echo /root/.ansible/tmp/ansible-tmp-1616216005.6716409-1865433-145891592932278 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
<localhost> PUT /root/.ansible/tmp/ansible-local-1865225bshhymx0/tmp3nxf032n TO /root/.ansible/tmp/ansible-tmp-1616216005.6716409-1865433-145891592932278/AnsiballZ_command.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616216005.6716409-1865433-145891592932278/ /root/.ansible/tmp/ansible-tmp-1616216005.6716409-1865433-145891592932278/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616216005.6716409-1865433-145891592932278/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1616216005.6716409-1865433-145891592932278/ > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {
    "changed": true,
    "cmd": "/root/racadm.sh root tncL@b1mm! 172.29.230.44 /var/www/html ztpdev-day2.iso",
    "delta": "0:00:06.470488",
    "end": "2021-03-20 00:53:32.467695",
    "invocation": {
        "module_args": {
            "_raw_params": "/root/racadm.sh root tncL@b1mm! 172.29.230.44 /var/www/html ztpdev-day2.iso",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2021-03-20 00:53:25.997207",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "Security Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \rRemote File Share is Disabled\r\nUserName \r\nPassword \r\nShareName \r\n\nSecurity Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \rDisable Remote File Started. Please check status using -s\noption to know Remote File Share is ENABLED or DISABLED.\r\n\nSecurity Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \rERROR: Unable to perform requested operation.",
    "stdout_lines": [
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "Remote File Share is Disabled",
        "UserName ",
        "Password ",
        "ShareName ",
        "",
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "Disable Remote File Started. Please check status using -s",
        "option to know Remote File Share is ENABLED or DISABLED.",
        "",
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "ERROR: Unable to perform requested operation."
    ]
}

PLAY RECAP *****************************************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=1    skipped=1    rescued=0    ignored=0

I even tried executing the commands one by one:

[root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.44 -u root -p tncL@b1mm! remoteimage -s
Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
Remote File Share is Disabled
UserName
Password
ShareName

[root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.44 -u root -p tncL@b1mm! remoteimage -d
Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
Disable Remote File Started. Please check status using -s
option to know Remote File Share is ENABLED or DISABLED.

[root@ztpdevhost ai-deploy-cluster-remoteworker]#
[root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.44 -u root -p tncL@b1mm! remoteimage -c -l /var/www/html/ztpdev-day2.iso
Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
ERROR: Unable to perform requested operation.

[root@ztpdevhost ai-deploy-cluster-remoteworker]#
[root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.44 -u root -p tncL@b1mm! remoteimage -c -l /var/www/html/ztpdev-day2.iso
Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
ERROR: Unable to perform requested operation.

[root@ztpdevhost ai-deploy-cluster-remoteworker]# ping 172.29.230.44
PING 172.29.230.44 (172.29.230.44) 56(84) bytes of data.
64 bytes from 172.29.230.44: icmp_seq=1 ttl=63 time=0.595 ms
^C
--- 172.29.230.44 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.595/0.595/0.595/0.000 ms
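
Comparing the failing call above with the `remoteimage -c -l http://172.29.231.41/ztpdev-day2.iso` call that succeeds later in this thread, the only visible difference is that the share location here is a local filesystem path rather than a URL. A small guard like the following could flag that before racadm is invoked. It is a sketch under that assumption: `looks_like_remote_share` and `share_location` are hypothetical helpers, and checking only for http(s) is a simplification, not a statement of what iDRAC supports.

```python
from urllib.parse import urlparse


def looks_like_remote_share(location):
    """True if location is an http(s) URL with a host, False for plain paths."""
    parsed = urlparse(location)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def share_location(base, iso_name):
    """Join base and ISO name the way the commands in this thread do."""
    return base.rstrip("/") + "/" + iso_name


# The two base values seen in this thread: a filesystem path and a URL.
for base in ("/var/www/html", "http://172.29.231.41"):
    location = share_location(base, "ztpdev-day2.iso")
    status = "ok" if looks_like_remote_share(location) else "not a URL"
    print(f"{location}: {status}")
```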

pmallan commented 3 years ago

@yrobla did you get a chance to look into this issue? We are blocked because we cannot use the playbooks to add a worker node. Please let us know if you need any further details.

maheshd2 commented 3 years ago

@yrobla I tried again with the latest playbooks. The Virtual Media/Remote File Share now gets attached, but the boot fails to detect the OS and skips the OS installation.

  1. Via ZTP playbooks as well as manually, I'm able to attach the virtual media successfully.
    
    [root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.43 -u root -p tncL@b1mm! remoteimage -d
    Security Alert: Certificate is invalid - self signed certificate
    Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
    Disable Remote File Started. Please check status using -s
    option to know Remote File Share is ENABLED or DISABLED.

    [root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.43 -u root -p tncL@b1mm! remoteimage -c -l http://172.29.231.41/ztpdev-day2.iso
    Security Alert: Certificate is invalid - self signed certificate
    Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
    Remote Image is now Configured

    [root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.43 -u root -p tncL@b1mm! set iDRAC.VirtualMedia.BootOnce 1
    Security Alert: Certificate is invalid - self signed certificate
    Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
    [Key=iDRAC.Embedded.1#VirtualMedia.1]
    Object value modified successfully

    [root@ztpdevhost ai-deploy-cluster-remoteworker]#
    [root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.43 -u root -p tncL@b1mm! set iDRAC.ServerBoot.FirstBootDevice VCD-DVD
    Security Alert: Certificate is invalid - self signed certificate
    Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
    [Key=iDRAC.Embedded.1#ServerBoot.1]
    Object value modified successfully

    [root@ztpdevhost ai-deploy-cluster-remoteworker]# /opt/dell/srvadmin/bin/idracadm7 -r 172.29.230.43 -u root -p tncL@b1mm! serveraction powercycle
    Security Alert: Certificate is invalid - self signed certificate
    Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
    Server power operation initiated successfully

2. But installation failed.

![image](https://user-images.githubusercontent.com/27758674/112188296-e3997e80-8c28-11eb-89b9-cf7d5fbc6846.png)

3. To make sure there is no issue with the server, I tried booting a RHEL ISO. It detected the OS successfully and the install wizard was rendered.

![image](https://user-images.githubusercontent.com/27758674/112188720-3e32da80-8c29-11eb-9564-a8dc1289989c.png)

4. I also tried a CoreOS image generated from the Red Hat online Assisted Installer (AI). It installed successfully as well.

![image](https://user-images.githubusercontent.com/27758674/112189011-923dbf00-8c29-11eb-9136-75f0e45907e0.png)
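
The manual sequence in step 1 can be summarised as an ordered list of racadm invocations. The sketch below only builds and prints the argument lists; it does not run anything. `build_racadm_steps` is a hypothetical helper, the subcommands are copied from the commands shown in step 1, and the password is a placeholder.

```python
RACADM = "/opt/dell/srvadmin/bin/idracadm7"  # binary path as used in step 1


def build_racadm_steps(bmc_address, user, password, iso_location):
    """Return the racadm calls from step 1, in order, as argument lists."""
    base = [RACADM, "-r", bmc_address, "-u", user, "-p", password]
    return [
        base + ["remoteimage", "-d"],                            # detach any current image
        base + ["remoteimage", "-c", "-l", iso_location],        # attach the ISO
        base + ["set", "iDRAC.VirtualMedia.BootOnce", "1"],      # boot from it once
        base + ["set", "iDRAC.ServerBoot.FirstBootDevice", "VCD-DVD"],
        base + ["serveraction", "powercycle"],                   # restart the server
    ]


steps = build_racadm_steps(
    "172.29.230.43",
    "root",
    "<bmc_password>",  # placeholder, not a real credential
    "http://172.29.231.41/ztpdev-day2.iso",
)
for step in steps:
    # Print only the subcommand part, leaving out the connection arguments.
    print(" ".join(step[7:]))
```

Keeping each call as an argument list means it could be handed to something like `subprocess.run` without going through a shell, which sidesteps quoting problems with passwords that contain characters such as `!`.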

maheshd2 commented 3 years ago

@yrobla Ansible Logs


TASK [Gathering Facts] *****************************************************************************************************************
task path: /home/worker_fix/ztp-cluster-deploy/ai-deploy-cluster-remoteworker/playbook.yml:2
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616529604.7317722-3068320-162783656797370 `" && echo ansible-tmp-1616529604.7317722-3068320-162783656797370="` echo /root/.ansible/tmp/ansible-tmp-1616529604.7317722-3068320-162783656797370 `" ) && sleep 0'
<localhost> Attempting python interpreter discovery
<localhost> EXEC /bin/sh -c 'echo PLATFORM; uname; echo FOUND; command -v '"'"'/usr/bin/python'"'"'; command -v '"'"'python3.7'"'"'; command -v '"'"'python3.6'"'"'; command -v '"'"'python3.5'"'"'; command -v '"'"'python2.7'"'"'; command -v '"'"'python2.6'"'"'; command -v '"'"'/usr/libexec/platform-python'"'"'; command -v '"'"'/usr/bin/python3'"'"'; command -v '"'"'python'"'"'; echo ENDFOUND && sleep 0'
<localhost> EXEC /bin/sh -c '/bin/python3.6 && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/system/setup.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmpgi5wdsn0 TO /root/.ansible/tmp/ansible-tmp-1616529604.7317722-3068320-162783656797370/AnsiballZ_setup.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529604.7317722-3068320-162783656797370/ /root/.ansible/tmp/ansible-tmp-1616529604.7317722-3068320-162783656797370/AnsiballZ_setup.py && sleep 0'
<localhost> EXEC /bin/sh -c '/usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529604.7317722-3068320-162783656797370/AnsiballZ_setup.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1616529604.7317722-3068320-162783656797370/ > /dev/null 2>&1 && sleep 0'
ok: [localhost]
META: ran handlers

TASK [Enroll a SuperMicro node] ********************************************************************************************************
task path: /home/worker_fix/ztp-cluster-deploy/ai-deploy-cluster-remoteworker/roles/add-remote-workers/tasks/main.yml:2
skipping: [localhost] => (item=ztpdev-worker-0)  => {
    "ansible_loop_var": "item",
    "changed": false,
    "item": "ztpdev-worker-0",
    "skip_reason": "Conditional result was False"
}

TASK [Enroll a Dell node] **************************************************************************************************************
task path: /home/worker_fix/ztp-cluster-deploy/ai-deploy-cluster-remoteworker/roles/add-remote-workers/tasks/main.yml:15

TASK [/home/worker_fix/ztp-cluster-deploy/ai-deploy-cluster-remoteworker/../common-roles/enroll-dell : Copy template racadm] ***********
task path: /home/worker_fix/ztp-cluster-deploy/common-roles/enroll-dell/tasks/main.yml:1
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603 `" && echo ansible-tmp-1616529606.8792417-3068519-68507767683603="` echo /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/files/stat.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmp7r6l63gh TO /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/AnsiballZ_stat.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/ /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/AnsiballZ_stat.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/AnsiballZ_stat.py && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/files/file.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmp7fz9aefw TO /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/AnsiballZ_file.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/ /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/AnsiballZ_file.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/AnsiballZ_file.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1616529606.8792417-3068519-68507767683603/ > /dev/null 2>&1 && sleep 0'
ok: [localhost] => {
    "changed": false,
    "checksum": "f0487716f474b151207153956d0415a2f1e2c80b",
    "dest": "/root/racadm.sh",
    "diff": {
        "after": {
            "path": "/root/racadm.sh"
        },
        "before": {
            "path": "/root/racadm.sh"
        }
    },
    "gid": 0,
    "group": "root",
    "invocation": {
        "module_args": {
            "_diff_peek": null,
            "_original_basename": "racadm.sh",
            "access_time": null,
            "access_time_format": "%Y%m%d%H%M.%S",
            "attributes": null,
            "backup": null,
            "content": null,
            "delimiter": null,
            "dest": "/root/racadm.sh",
            "directory_mode": null,
            "follow": false,
            "force": false,
            "group": null,
            "mode": 700,
            "modification_time": null,
            "modification_time_format": "%Y%m%d%H%M.%S",
            "owner": "root",
            "path": "/root/racadm.sh",
            "recurse": false,
            "regexp": null,
            "remote_src": null,
            "selevel": null,
            "serole": null,
            "setype": null,
            "seuser": null,
            "src": null,
            "state": "file",
            "unsafe_writes": null
        }
    },
    "mode": "01274",
    "owner": "root",
    "path": "/root/racadm.sh",
    "secontext": "system_u:object_r:admin_home_t:s0",
    "size": 660,
    "state": "file",
    "uid": 0
}

TASK [/home/worker_fix/ztp-cluster-deploy/ai-deploy-cluster-remoteworker/../common-roles/enroll-dell : Run racadm] *********************
task path: /home/worker_fix/ztp-cluster-deploy/common-roles/enroll-dell/tasks/main.yml:8
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616529607.6451797-3068543-194760621425300 `" && echo ansible-tmp-1616529607.6451797-3068543-194760621425300="` echo /root/.ansible/tmp/ansible-tmp-1616529607.6451797-3068543-194760621425300 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmp4kuyzumf TO /root/.ansible/tmp/ansible-tmp-1616529607.6451797-3068543-194760621425300/AnsiballZ_command.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529607.6451797-3068543-194760621425300/ /root/.ansible/tmp/ansible-tmp-1616529607.6451797-3068543-194760621425300/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529607.6451797-3068543-194760621425300/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1616529607.6451797-3068543-194760621425300/ > /dev/null 2>&1 && sleep 0'
changed: [localhost] => {
    "changed": true,
    "cmd": "/root/racadm.sh root tncL@b1mm! 172.29.230.43 http://172.29.231.41 ztpdev-day2.iso",
    "delta": "0:00:17.339003",
    "end": "2021-03-23 16:00:25.307584",
    "invocation": {
        "module_args": {
            "_raw_params": "/root/racadm.sh root tncL@b1mm! 172.29.230.43 http://172.29.231.41 ztpdev-day2.iso",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "rc": 0,
    "start": "2021-03-23 16:00:07.968581",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "Security Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \rRemote File Share is Enabled\r\nUserName \r\nPassword \r\nShareName http://172.29.231.41/ztpdev-day2.iso\r\n\nSecurity Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \rDisable Remote File Started. Please check status using -s\noption to know Remote File Share is ENABLED or DISABLED.\r\n\nSecurity Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \rRemote Image is now Configured\r\n\nSecurity Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \r[Key=iDRAC.Embedded.1#VirtualMedia.1]\r\nObject value modified successfully\r\n\nSecurity Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \r[Key=iDRAC.Embedded.1#ServerBoot.1]\r\nObject value modified successfully\r\n\nSecurity Alert: Certificate is invalid - self signed certificate\nContinuing execution. Use -S option for racadm to stop execution on certificate-related errors.\n\r                                                                             \r\r                                                                             \rServer power operation initiated successfully",
    "stdout_lines": [
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "Remote File Share is Enabled",
        "UserName ",
        "Password ",
        "ShareName http://172.29.231.41/ztpdev-day2.iso",
        "",
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "Disable Remote File Started. Please check status using -s",
        "option to know Remote File Share is ENABLED or DISABLED.",
        "",
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "Remote Image is now Configured",
        "",
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "[Key=iDRAC.Embedded.1#VirtualMedia.1]",
        "Object value modified successfully",
        "",
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "[Key=iDRAC.Embedded.1#ServerBoot.1]",
        "Object value modified successfully",
        "",
        "Security Alert: Certificate is invalid - self signed certificate",
        "Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.",
        "",
        "                                                                             ",
        "",
        "                                                                             ",
        "Server power operation initiated successfully"
    ]
}

TASK [add-remote-workers : Wait until hosts are ready in Assisted Installer] ***********************************************************
task path: /home/worker_fix/ztp-cluster-deploy/ai-deploy-cluster-remoteworker/roles/add-remote-workers/tasks/main.yml:26
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616529625.4084525-3068631-264696509737340 `" && echo ansible-tmp-1616529625.4084525-3068631-264696509737340="` echo /root/.ansible/tmp/ansible-tmp-1616529625.4084525-3068631-264696509737340 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmpu30_af8x TO /root/.ansible/tmp/ansible-tmp-1616529625.4084525-3068631-264696509737340/AnsiballZ_command.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529625.4084525-3068631-264696509737340/ /root/.ansible/tmp/ansible-tmp-1616529625.4084525-3068631-264696509737340/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin AI_URL=http://172.29.231.41:8080 /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529625.4084525-3068631-264696509737340/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1616529625.4084525-3068631-264696509737340/ > /dev/null 2>&1 && sleep 0'
FAILED - RETRYING: Wait until hosts are ready in Assisted Installer (100 retries left).Result was: {
    "attempts": 1,
    "changed": true,
    "cmd": "aicli list hosts | grep ztpdev-day2 | grep 'insufficient\\|known' | wc -l",
    "delta": "0:00:00.298402",
    "end": "2021-03-23 16:00:25.885856",
    "invocation": {
        "module_args": {
            "_raw_params": "aicli list hosts | grep ztpdev-day2 | grep 'insufficient\\|known' | wc -l",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "rc": 0,
    "retries": 101,
    "start": "2021-03-23 16:00:25.587454",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "0",
    "stdout_lines": [
        "0"
    ]
}
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616529655.9780512-3068631-80372126225890 `" && echo ansible-tmp-1616529655.9780512-3068631-80372126225890="` echo /root/.ansible/tmp/ansible-tmp-1616529655.9780512-3068631-80372126225890 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmpm8vu7lqw TO /root/.ansible/tmp/ansible-tmp-1616529655.9780512-3068631-80372126225890/AnsiballZ_command.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529655.9780512-3068631-80372126225890/ /root/.ansible/tmp/ansible-tmp-1616529655.9780512-3068631-80372126225890/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin AI_URL=http://172.29.231.41:8080 /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529655.9780512-3068631-80372126225890/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1616529655.9780512-3068631-80372126225890/ > /dev/null 2>&1 && sleep 0'
FAILED - RETRYING: Wait until hosts are ready in Assisted Installer (99 retries left).Result was: {
    "attempts": 2,
    "changed": true,
    "cmd": "aicli list hosts | grep ztpdev-day2 | grep 'insufficient\\|known' | wc -l",
    "delta": "0:00:00.317610",
    "end": "2021-03-23 16:00:56.488313",
    "invocation": {
        "module_args": {
            "_raw_params": "aicli list hosts | grep ztpdev-day2 | grep 'insufficient\\|known' | wc -l",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "rc": 0,
    "retries": 101,
    "start": "2021-03-23 16:00:56.170703",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "0",
    "stdout_lines": [
        "0"
    ]
}
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616529686.5517852-3068631-136166034286187 `" && echo ansible-tmp-1616529686.5517852-3068631-136166034286187="` echo /root/.ansible/tmp/ansible-tmp-1616529686.5517852-3068631-136166034286187 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmpuc6f43jk TO /root/.ansible/tmp/ansible-tmp-1616529686.5517852-3068631-136166034286187/AnsiballZ_command.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529686.5517852-3068631-136166034286187/ /root/.ansible/tmp/ansible-tmp-1616529686.5517852-3068631-136166034286187/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin AI_URL=http://172.29.231.41:8080 /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529686.5517852-3068631-136166034286187/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1616529686.5517852-3068631-136166034286187/ > /dev/null 2>&1 && sleep 0'
FAILED - RETRYING: Wait until hosts are ready in Assisted Installer (98 retries left).Result was: {
    "attempts": 3,
    "changed": true,
    "cmd": "aicli list hosts | grep ztpdev-day2 | grep 'insufficient\\|known' | wc -l",
    "delta": "0:00:00.296384",
    "end": "2021-03-23 16:01:27.034245",
    "invocation": {
        "module_args": {
            "_raw_params": "aicli list hosts | grep ztpdev-day2 | grep 'insufficient\\|known' | wc -l",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "rc": 0,
    "retries": 101,
    "start": "2021-03-23 16:01:26.737861",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "0",
    "stdout_lines": [
        "0"
    ]
}
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1616529717.1187513-3068631-171640932259195 `" && echo ansible-tmp-1616529717.1187513-3068631-171640932259195="` echo /root/.ansible/tmp/ansible-tmp-1616529717.1187513-3068631-171640932259195 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
<localhost> PUT /root/.ansible/tmp/ansible-local-3068313dmb93mzm/tmp8uy6heey TO /root/.ansible/tmp/ansible-tmp-1616529717.1187513-3068631-171640932259195/AnsiballZ_command.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1616529717.1187513-3068631-171640932259195/ /root/.ansible/tmp/ansible-tmp-1616529717.1187513-3068631-171640932259195/AnsiballZ_command.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/usr/bin/:/usr/local/bin/:/sbin:/bin:/usr/sbin:/usr/bin AI_URL=http://172.29.231.41:8080 /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1616529717.1187513-3068631-171640932259195/AnsiballZ_command.py && sleep 0'
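
For readers following the log above: the retried task pipes `aicli list hosts` through two `grep` filters and `wc -l`, and every attempt shown returns `0`. Below is a minimal Python sketch of that same counting logic, useful for checking by hand what the pipeline would match. The function name `count_matching_hosts` and its parameters are illustrative assumptions, not part of the playbooks or of `aicli`; the function takes the listing text as input and does not assume any particular output format.

```python
import re


def count_matching_hosts(listing: str, cluster: str = "ztpdev-day2") -> int:
    """Mimic: grep <cluster> | grep 'insufficient\\|known' | wc -l.

    `listing` is the raw text printed by `aicli list hosts`.
    Both filters are plain substring matches, exactly like the greps.
    """
    status = re.compile(r"insufficient|known")
    return sum(
        1
        for line in listing.splitlines()
        if cluster in line and status.search(line)
    )
```

Because both filters are substring matches, a line containing `unknown` would also be counted by the `known` pattern, just as it would be by the original `grep`.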
maheshd2 commented 3 years ago

This issue was resolved by the code fix together with changing the boot mode to BIOS. The ISO generated by Assisted Installer appears to have a problem booting in UEFI mode.

The install guide should be updated with instructions to make sure the boot mode is set to BIOS. I opened another issue to track the doc changes and UEFI support: https://github.com/redhat-ztp/ztp-cluster-deploy/issues/84
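
As a quick way to confirm which firmware mode a host actually booted in, here is a minimal sketch. It relies on the fact that a Linux kernel booted through UEFI exposes the directory `/sys/firmware/efi`, while a BIOS (legacy) boot does not. The function name `firmware_boot_mode` and the `sysfs_root` parameter are illustrative assumptions and are not part of the playbooks.

```python
from pathlib import Path


def firmware_boot_mode(sysfs_root: str = "/sys") -> str:
    """Return 'UEFI' if the running kernel exposes firmware/efi, else 'BIOS'."""
    efi_dir = Path(sysfs_root) / "firmware" / "efi"
    return "UEFI" if efi_dir.is_dir() else "BIOS"


if __name__ == "__main__":
    # Run on the host itself, for example from any Linux live image.
    print(firmware_boot_mode())
```

This only reports the mode of a system that has already booted; it does not change the firmware setting, which still has to be switched to BIOS in the server's own configuration.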

This issue can be closed.

yrobla commented 3 years ago

We are waiting for a fix so that we can get UEFI support. I will update the code to consume it as soon as it is available.