dell / dellemc-openmanage-ansible-modules

Dell OpenManage Ansible Modules
GNU General Public License v3.0

lifecycle controller not available after os installation #64

Closed mting806 closed 4 years ago

mting806 commented 5 years ago

The issue can be reproduced. For now I am using an iDRAC reset as a workaround.

- name: install os
  dellemc_boot_to_network_iso: 
    idrac_ip: "{{ idrac_ip }}"
    idrac_password: "{{ idrac_password }}"
    idrac_user: "{{ idrac_user }}"
    share_name: "192.168.1.1:/opt/ansible"
    iso_image: "CentOS-ks.iso"
  register: result_install_os
- debug:
    var: result_install_os

ibt23sec5 commented 5 years ago

I can reproduce the same issue. When I retry the boot, the Lifecycle Controller reports:

LC is not ready

Even if I issue racreset soft via the racadm console, the problem persists, and the LC only becomes available again after about half a day. Setting the image and boot order manually via the web interface, however, works immediately.

Model: PowerEdge R640
Firmware version: 3.34.34.34

Playbook:

  dellemc_boot_to_network_iso:
    idrac_ip: "{{ idrac_ip }}"
    idrac_user: "root"
    idrac_pwd: "{{ idrac_password }}"
    share_name: "{{ nfs_image_path }}"
    iso_image: "{{ paths.dest_iso_image }}"

Output:

task path: /builds/gitlab/devops/debian-installer/roles/idrac/tasks/boot.yml:18
<xxxxxxxxxxxxxxxxx> ESTABLISH LOCAL CONNECTION FOR USER: root
<xxxxxxxxxxxxxxxxx> EXEC /bin/sh -c 'echo ~root && sleep 0'
<xxxxxxxxxxxxxxxxx> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp/ansible-tmp-1563528982.07-281357880619334 `" && echo ansible-tmp-1563528982.07-281357880619334="` echo /root/.ansible/tmp/ansible-tmp-1563528982.07-281357880619334 `" ) && sleep 0'
Using module file /usr/local/lib/python2.7/dist-packages/ansible/modules/remote_management/dellemc/idrac/dellemc_boot_to_network_iso.py
<xxxxxxxxxxxxxxxxx> PUT /root/.ansible/tmp/ansible-local-22s6qFbS/tmpBjY14S TO /root/.ansible/tmp/ansible-tmp-1563528982.07-281357880619334/AnsiballZ_dellemc_boot_to_network_iso.py
<xxxxxxxxxxxxxxxxx> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1563528982.07-281357880619334/ /root/.ansible/tmp/ansible-tmp-1563528982.07-281357880619334/AnsiballZ_dellemc_boot_to_network_iso.py && sleep 0'
<xxxxxxxxxxxxxxxxx> EXEC /bin/sh -c '/usr/bin/python /root/.ansible/tmp/ansible-tmp-1563528982.07-281357880619334/AnsiballZ_dellemc_boot_to_network_iso.py && sleep 0'
<xxxxxxxxxxxxxxxxx> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1563528982.07-281357880619334/ > /dev/null 2>&1 && sleep 0'
fatal: [xxxxxxxxxxxxxxxxx]: FAILED! => {
    "changed": false, 
    "invocation": {
        "module_args": {
            "idrac_ip": "xxx.xxx.xxx.xxx", 
            "idrac_password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER", 
            "idrac_port": 443, 
            "idrac_pwd": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER", 
            "idrac_user": "root", 
            "iso_image": "debian-netinst.iso", 
            "share_name": "nfs-share:/data/nfs-share", 
            "share_password": null, 
            "share_user": null
        }
    }, 
    "msg": {
        "Message": "LC is not ready", 
        "Status": "Failed"
    }
}

anupamaloke commented 5 years ago

@mting806 @ibt23sec5, thank you for reporting this issue. We are looking into it and are working on adding an attribute that will let users specify the attach duration for an ISO image. When this duration expires, the ISO image will be detached automatically so that the LC does not remain busy.

rajeevarakkal commented 4 years ago

@mting806 @ibt23sec5, the latest devel version adds support for an "expose_duration" option, which lets you specify a timeout for the OSD operation. After the timeout, the ISO is detached and the LC becomes free for further operations. Please try it out and let us know the result.
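
For readers following along, a task using this option might look roughly like the sketch below. It reuses the variables and share path from the first playbook in this thread, the module name idrac_os_deployment mentioned later in the thread, and the value 180 that other commenters tried. The unit of expose_duration is assumed here to be minutes; check the module documentation for the version you have installed.

```yaml
# Sketch only: module and option names are taken from this thread,
# not verified against a specific release of the modules.
- name: Boot to the network ISO and detach it after a fixed time
  idrac_os_deployment:
    idrac_ip: "{{ idrac_ip }}"
    idrac_user: "{{ idrac_user }}"
    idrac_password: "{{ idrac_password }}"
    share_name: "192.168.1.1:/opt/ansible"
    iso_image: "CentOS-ks.iso"
    expose_duration: 180  # assumed to be minutes; value used by commenters here
  register: result_install_os

- debug:
    var: result_install_os
```

The duration should be long enough for the installer to finish reading from the ISO; too short a value could detach the image in the middle of the installation.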

wioch commented 4 years ago

I have the same issue. I set expose_duration: 180. I'll let you know the result.

Edit: As for the attached ISO, I do not see it on the iDRAC:

[root@cOS7-01 ansible]#  racadm -r 1.2.3.4 -u root -p"somepas" remoteimage -s
/opt/dell/srvadmin/sbin/racadm: line 13: printf: 0x: invalid hex number
Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
Remote File Share is Disabled
UserName
Password
ShareName

rajeevarakkal commented 4 years ago

@wioch, please let us know the result.

tsquillario commented 4 years ago

I have been running into this issue for a while! As mentioned above, the workaround is to reset the iDRAC with racadm racreset.
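
As a rough illustration, the reset workaround could be scripted as an Ansible task along these lines. This is a sketch under assumptions: racadm is installed on the control node, the remote options (-r, -u, -p) are used as in the racadm call shown earlier in this thread, and the delay and timeout values are placeholders to be tuned for your hardware.

```yaml
# Sketch only: assumes racadm is on the PATH of the control node.
- name: Reset the iDRAC so the Lifecycle Controller becomes available again
  command:
    argv:
      - racadm
      - "-r"
      - "{{ idrac_ip }}"
      - "-u"
      - "{{ idrac_user }}"
      - "-p"
      - "{{ idrac_password }}"
      - racreset
  delegate_to: localhost
  no_log: true        # keep the password out of the task output
  changed_when: true  # a reset always changes the iDRAC state

- name: Wait for the iDRAC HTTPS port to respond again
  wait_for:
    host: "{{ idrac_ip }}"
    port: 443
    delay: 60      # placeholder: give the iDRAC time to go down first
    timeout: 600   # placeholder: upper bound for the restart
  delegate_to: localhost
```

An open port 443 only shows that the iDRAC web service is back; the Lifecycle Controller may need additional time before it reports ready.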

rajeevarakkal commented 4 years ago

@tsquillario The latest module has an expose_duration parameter that lets you set a timeout, after which the LC is freed. You no longer need to use racadm racreset.

rajeevarakkal commented 4 years ago

@tsquillario We have not heard back from you, and the timeout feature is now available for OS deployment, so I am closing this defect.

davisglenn commented 4 years ago

@rajeevarakkal - I just started using the idrac_os_deployment module, and even with expose_duration set to 180, I still get "Lifecycle Controller not available" after I deploy my Windows server. I tried a number of different things, but the only thing that seems to work is a racadm racreset. I have a PowerEdge R740 with iDRAC version 3.36.36.36.

tsquillario commented 4 years ago

@davisglenn Since you're using the modules from the devel repo, I think you need to update your OMSDK. That may be why it's not working. See https://github.com/dell/dellemc-openmanage-ansible-modules/issues/75

davisglenn commented 4 years ago

@tsquillario - I was hoping that was the case, but I did a fresh clone and build yesterday, and it looks like I have the latest:

$ pip list | grep omsdk
omsdk (1.2.387-)

tsquillario commented 4 years ago

I would suggest opening a new issue since this one is closed.