dell / dellemc-openmanage-ansible-modules

Dell OpenManage Ansible Modules
GNU General Public License v3.0

[BUG]: idrac_server_config_profile doesn't work with "job_wait: True" correctly #486

Closed ruic11111 closed 1 year ago

ruic11111 commented 1 year ago

Bug Description

In the idrac_server_config_profile module, the Ansible task completes without waiting for the iDRAC SCP import job to finish, even when "job_wait" is "True".

I have confirmed that the problem is in the following function, which is called from idrac_server_config_profile.

#dellemc-openmanage-ansible-modules/plugins/module_utils/idrac_redfish.py -> wait_for_job_complete
        while job_wait:
            try:
                response = self.invoke_request(task_uri, "GET")
                if response.json_data.get("TaskState") == "Running":
                    time.sleep(10)
                else:
                    break

When the iDRAC loads the SCP, the job passes through several states before it reaches Running, e.g. New, Scheduled, Downloading. However, the code above only checks whether the job state is Running. If the job is still in New (or another pre-Running state) when the module polls the iDRAC, the loop exits without waiting for the SCP import job to complete.

Since I don't know every possible job state, I changed the code above as follows and ran the playbook again. With this change, I confirmed that Ansible works as expected.

        while job_wait:
            try:
                time.sleep(60)  # add sleep
                response = self.invoke_request(task_uri, "GET")
                if response.json_data.get("TaskState") in ["New", "Downloading", "Scheduled", "Running"]:  # add some statuses
                    time.sleep(10)
                else:
                    break

Component or Module Name

idrac_server_config_profile

Ansible Version

any

Python Version

any

iDRAC/OME/OME-M version

iDRAC 6.10.30.00

Operating System

any

Playbook Used

tasks:

Logs

no log

Steps to Reproduce

Run the playbook mentioned above. (In my case, the SCP contains the firmware catalog file.)

Expected Behavior

The Ansible task waits until the iDRAC SCP import job completes.

Actual Behavior

The Ansible task completes without waiting for the iDRAC SCP import job to complete.

Screenshots

No response

Additional Information

No response

sachin-apa commented 1 year ago

@ruic11111 Thanks for reporting the bug. We did observe this issue recently.

I see two changes that need to be made here:

  1. Change the job URI to use /redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs/<JOB_ID> and remove the code that picks the URI from the Location response header (reference code here, for the export operation). The same change is needed for import and preview.
  2. It would be better to reverse the condition, as below.
    while job_wait:
        try:
            response = self.invoke_request(task_uri, "GET")
            if response.json_data.get("PercentComplete") == 100 and response.json_data.get("JobState") == "Completed":
                break
            else:
                time.sleep(30)
        except ValueError:
            response = response.body
            break

    Any chance you can submit a PR for this?

@anupamaloke @felixs88

ruic11111 commented 1 year ago

@sachin-apa Thank you for your message. However, your suggested code also seems to have a problem: the loop never breaks and runs forever when the job status changes to "Failed", "Completed with Errors", and so on.
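
One way to avoid both the early exit and the endless loop is to wait for any terminal state and to cap the total wait. The following is only a minimal sketch, not the module's code: `wait_for_job`, `fetch_job`, and the `TERMINAL_STATES` set are hypothetical names, and the list of terminal states is an assumption that would need to be checked against the iDRAC documentation.

```python
import time

# Assumed, non-exhaustive set of terminal job states (hypothetical list;
# verify the real JobState values for your iDRAC firmware).
TERMINAL_STATES = {"Completed", "CompletedWithErrors", "Failed"}


def wait_for_job(fetch_job, interval=30, timeout=1800, sleep=time.sleep):
    """Poll until the job reaches a terminal state or the wait budget is spent.

    fetch_job is a hypothetical callable that returns the job resource as a
    dict, e.g. a wrapper around a GET on the job URI.
    Returns (job, finished); finished is False if the wait timed out.
    """
    waited = 0  # seconds spent sleeping, not wall-clock time
    job = fetch_job()
    # Any state outside TERMINAL_STATES (New, Scheduled, Downloading,
    # Running, ...) is treated as "still in progress".
    while job.get("JobState") not in TERMINAL_STATES:
        if waited >= timeout:
            return job, False
        sleep(interval)
        waited += interval
        job = fetch_job()
    return job, True
```

With this shape, a job that ends in "Failed" leaves the loop just like one that ends in "Completed", and the caller decides how to report it from the returned job.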

sachin-apa commented 1 year ago

@ruic11111 Oh yes, you are right. I didn't try that case here, though.

We also have to take care of the URL change (1), possibly because the Location header returns the URL /redfish/v1/TaskService/Tasks/<JOB_ID>, which is not persisted in the iDRAC. So it is better to use /redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs/<JOB_ID>. I will confirm and verify the URL change, though.

Please see if you can submit a PR with the changes you proposed. If not, no worries; I will check internally with the team on the timeline for the fix.
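
The URL change described above could look roughly like this. It is a sketch under assumptions: `job_uri_from_location` is a hypothetical helper, and it assumes the job ID is the last path segment of the Location value, as in /redfish/v1/TaskService/Tasks/<JOB_ID>.

```python
# URI named in this thread for monitoring the job.
JOB_URI = "/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs/{job_id}"


def job_uri_from_location(location):
    """Build the Jobs URI from a Location value ending in the job ID.

    Hypothetical helper: assumes the last path segment is the job ID.
    """
    job_id = location.rstrip("/").rsplit("/", 1)[-1]
    if not job_id:
        raise ValueError("no job id in Location: %r" % location)
    return JOB_URI.format(job_id=job_id)
```

The module would then poll the returned URI instead of the task URI taken from the response header.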

ruic11111 commented 1 year ago

Thank you for your message. I understand that the URL needs to be changed.

I'm sorry, but it would be difficult for me to write the fix and submit a PR because I don't know the job statuses exactly.

satoshi-tokyo commented 1 year ago

@sachin-apa @ruic11111 Hi, in this PR, I imported module_utils/utils.py to monitor the job's progress using the endpoint /redfish/v1/Managers/iDRAC.Embedded.1/Jobs/{0}, since modules/idrac_bios.py uses the same one.

There currently seem to be three endpoints that can monitor a job ID, including /redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs/<JOB_ID>. My idea is that URIs are better managed in one place, such as module_utils/utils.py.
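
As an illustration of keeping the URIs in one place, a shared table of templates might look like the sketch below. The names `JOB_URI_TEMPLATES` and `job_uri` and the dictionary keys are hypothetical; the three templates are simply the URIs mentioned in this thread, not necessarily what module_utils/utils.py actually defines.

```python
# Hypothetical central table of job-monitoring URIs (the three URIs
# mentioned in this thread); modules would import from here instead of
# hard-coding their own copies.
JOB_URI_TEMPLATES = {
    "task": "/redfish/v1/TaskService/Tasks/{job_id}",
    "manager_job": "/redfish/v1/Managers/iDRAC.Embedded.1/Jobs/{job_id}",
    "oem_job": "/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs/{job_id}",
}


def job_uri(kind, job_id):
    """Return the URI for the given endpoint kind; raises KeyError if unknown."""
    return JOB_URI_TEMPLATES[kind].format(job_id=job_id)
```

Switching a module from one endpoint to another then becomes a one-word change at the call site.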

sachin-apa commented 1 year ago

Fixed after merging the PR: https://github.com/dell/dellemc-openmanage-ansible-modules/pull/504