dell / iDRAC-Redfish-Scripting

Python and PowerShell scripting for Dell EMC PowerEdge iDRAC REST API with DMTF Redfish
GNU General Public License v2.0

Job history of iDRAC using Redfish #200

Closed by ashokbeh 2 years ago

ashokbeh commented 2 years ago

Hi, I need a Redfish API that can check when a particular job was most recently run on iDRAC 9.

For example: Remote Diagnostics, with its scheduled start time and scheduled end time.

Can we use the Redfish URL /redfish/v1/Managers//Jobs/<Job-Id>? We cannot rely on the job ID because it varies, unlike the job name "Remote Diagnostics". As part of the script, we need to find such a job and collect its ID.

Example from the iDRAC 9 GUI:

ID                Job                                                      Status
RID_414099047932  Reboot: Graceful OS shutdown with powercycle on timeout  Reboot Completed (100%)
JID_414099044945  Remote Diagnostics                                       Completed (100%)
    Scheduled Start Time: 2022-01-05T19:11:44
    Actual Start Time: 2022-01-05T19:17:24
    Expiration Time: Not Applicable
    Actual Completion Time: 2022-01-05T21:18:11
    Message: SYS018: Job completed successfully.

I would appreciate your advice on the most suitable Redfish URL for iDRAC 9.

Regards Ashok Behera

anupamaloke commented 2 years ago

@ashokbeh, I am guessing the following is the workflow that you are looking to automate using Ansible:

  tasks:
  - name: Get the job queue
    ansible.builtin.uri:
      url: "https://{{ inventory_hostname }}/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs?$expand=*"
      user: "root"
      password: "Dell_123"
      validate_certs: False
      force_basic_auth: yes
      method: GET
      headers:
        Accept: "application/json"
        OData-Version: "4.0"
      status_code: 200
    register: job_queue_response

  - name: check if a diagnostics job is still running
    ansible.builtin.set_fact:
      diag_jobs: "{{ job_queue_response.json.Members | selectattr('Name', 'eq', 'Remote Diagnostics') | list }}"
    when:
      - job_queue_response.json["Members@odata.count"] > 0

  - ansible.builtin.debug:
      var: diag_jobs

  - block:
      - name: get the diag job ID if still running
        ansible.builtin.set_fact:
          diag_job_id: "{{ item.Id }}"
          diag_job_running: True
        when:
          - item.JobState == "Running"
        loop: "{{ diag_jobs }}"
    when:
      - diag_jobs is defined and diag_jobs|length > 0

  - ansible.builtin.debug:
      msg: "{{ diag_job_id }} {{ diag_job_running }}"
    when:
      - diag_job_id is defined and diag_job_running is defined

  - block:
      - name: run diagnostics
        ansible.builtin.uri:
          url: "https://{{ inventory_hostname }}/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/DellLCService/Actions/DellLCService.RunePSADiagnostics"
          user: "root"
          password: "Dell_123"
          validate_certs: False
          force_basic_auth: yes
          method: POST
          headers:
            Accept: "application/json"
            Content-Type: "application/json"
            OData-Version: "4.0"
          body:
            RebootJobType: "GracefulRebootWithoutForcedShutdown"
            RunMode: "Express"
          body_format: json
          status_code: [200, 202]
        register: diag_response

      - name: get the job ID
        ansible.builtin.set_fact:
          diag_job_id: "{{ diag_response.location.split('/')[-1] }}"

    when:
      - (diag_job_running is not defined) or (not diag_job_running)

  - name: track job till completion
    dellemc.openmanage.idrac_lifecycle_controller_job_status_info:
      idrac_ip: "{{ inventory_hostname }}"
      idrac_user: "{{ user }}"
      idrac_password: "{{ password }}"
      job_id: "{{ diag_job_id }}"
    register: result
    until: result.job_info.JobStatus == "Completed" or result.job_info.JobStatus == "Failed"
    failed_when: result.job_info.JobStatus == "Failed"
    retries: 6
    delay: 300
    when:
      - diag_job_id is defined
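
For readers scripting this outside Ansible, the two lookups in the playbook above (find a running "Remote Diagnostics" job in the queue, or take the job ID from the Location header of the POST response) might look roughly like this in Python. This is a minimal sketch: the function names are hypothetical, and it assumes the job entries carry the Name, JobState and Id fields used in the playbook. Unlike the playbook loop, it returns the first running match rather than the last.

```python
def running_job_id(members, name="Remote Diagnostics"):
    """Return the Id of the first job with the given Name that is Running, else None."""
    for job in members:
        if job.get("Name") == name and job.get("JobState") == "Running":
            return job.get("Id")
    return None


def job_id_from_location(location):
    """Take the last path segment of a Location header value as the job ID."""
    return location.rstrip("/").split("/")[-1]
```

Here `members` would be the `Members` list from the job queue response, and `location` the Location header returned by the RunePSADiagnostics POST.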

ashokbeh commented 2 years ago

Hi Anupamaloke,

Thank you for sharing the sample Ansible playbook. I will test and confirm.

Also, to find the most recent "Remote Diagnostics" run before we start the next diagnostic test on iDRAC, can we use the URL /redfish/v1/Dell/Managers//DellLCService/Actions/?

The export output contains an end timestamp for every component. I am wondering whether it would be right to treat the end time of the last component's test as the end time of the whole run.

Example: Ended: 09/21/2021 16:31:05, Elapsed time: 00:00:00

System Management - Functional Test

Started: 09/21/2021 16:31:05

IPMI Sep 21 2021 13:15:03 Warning. POST Pkg Repair: Memory sensor, redundancy degraded A4 was asserted.

3 records since last scan (Pass=2, Warning=1, Fail=0 Critical=0, Other=0)

Ended: 09/21/2021 16:31:05, Elapsed time: 00:00:00

Test Results : Warning

Msg : Event log - The log contains failing records **
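
If the exported text were used this way, the latest "Ended:" timestamp could be pulled out as sketched below. This is only an illustration: it assumes every component's end line follows the "Ended: MM/DD/YYYY HH:MM:SS" pattern seen in the excerpt above, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as "Ended: 09/21/2021 16:31:05, Elapsed time: 00:00:00"
ENDED_RE = re.compile(r"Ended:\s*(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})")


def last_ended_time(export_text):
    """Return the latest 'Ended:' timestamp in the export text, or None if there is none."""
    stamps = [
        datetime.strptime(match, "%m/%d/%Y %H:%M:%S")
        for match in ENDED_RE.findall(export_text)
    ]
    return max(stamps) if stamps else None
```

Taking the maximum rather than the last line in the file avoids depending on the order in which components are written to the export.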

Regards Ashok Behera

texroemer commented 2 years ago

Hi @ashokbeh

You can use the URI "redfish/v1/Managers/iDRAC.Embedded.1/Jobs?$expand=*($levels=1)" to get the complete iDRAC job queue. Then filter the JSON output on JobType and CompletionTime to find a job of a specific type that completed after a certain date/time.

To find out whether remote diagnostics have been executed before, or to capture the results of the last execution, use the action DellLCService.ExportePSADiagnosticsResult to export them. The example below runs the export action when remote diagnostics have never been executed, then runs remote diagnostics, and then runs the export again to get the results.
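
The filtering step could be sketched in Python as follows. This is a minimal sketch, not part of the repository scripts: the function name is hypothetical, `members` is assumed to be the `Members` list from the expanded job queue response, and timestamps are assumed to start with the `YYYY-MM-DDTHH:MM:SS` form shown elsewhere in this thread (anything after the seconds, such as a UTC offset, is ignored).

```python
from datetime import datetime


def latest_completed_job(members, job_type="RemoteDiagnostics", completed_after=None):
    """Return the most recently completed job of the given JobType, or None."""
    candidates = []
    for job in members:
        if job.get("JobType") != job_type or job.get("JobState") != "Completed":
            continue
        try:
            # Keep only the date and time part so all values compare as naive datetimes.
            completed = datetime.strptime(job["CompletionTime"][:19], "%Y-%m-%dT%H:%M:%S")
        except (KeyError, TypeError, ValueError):
            continue  # missing or unparseable CompletionTime
        if completed_after is not None and completed <= completed_after:
            continue
        candidates.append((completed, job))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]
```

Calling it with `completed_after=datetime(2022, 1, 1)`, for example, would return the newest RemoteDiagnostics job that completed after that date, and its `Id` field then gives the job ID without knowing it in advance.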

C:\Python39>RunDiagnosticsREDFISH.py -ip 192.168.0.120 -u root -p calvin -e 1

- FAIL, POST command failed for ExportePSADiagnosticsResult method, status code is 400

- POST command failure results:
 {'error': {'@Message.ExtendedInfo': [{'Message': 'Unable to export the diagnostics results because the results do not exist.', 'MessageArgs': [], 'MessageArgs@odata.count': 0, 'MessageId': 'IDRAC.2.7.SYS099', 'RelatedProperties': [], 'RelatedProperties@odata.count': 0, 'Resolution': 'Run the RunEPSADiagnostics method to make sure that the diagnostics results are available, and then retry the operation.', 'Severity': 'Warning'}], 'code': 'Base.1.8.GeneralError', 'message': 'A general error has occurred. See ExtendedInfo for more information'}}

C:\Python39>RunDiagnosticsREDFISH.py -ip 192.168.0.120 -u root -p calvin -m 0 -r 2

- INFO, arguments and values for RunePSADiagnostics method

RebootJobType: PowerCycle
RunMode: Express

- PASS: POST command passed for RunePSADiagnostics method, status code 202 returned
- PASS, job ID JID_428044200872 successfuly created for RunePSADiagnostics method

- INFO, server will now automatically reboot and run remote diagnostics once POST completes. Script will check job status every 1 minute until marked completed

- INFO, job not marked completed, status running, execution time: 0:00:11
- INFO, job not marked completed, status running, execution time: 0:01:11
- INFO, job not marked completed, status running, execution time: 0:02:12
- INFO, job not marked completed, status running, execution time: 0:03:12
- INFO, job not marked completed, status running, execution time: 0:04:13
- INFO, job not marked completed, status running, execution time: 0:05:13
- INFO, job not marked completed, status running, execution time: 0:06:14
- INFO, job not marked completed, status running, execution time: 0:07:15
- INFO, job not marked completed, status running, execution time: 0:08:15
- INFO, job not marked completed, status running, execution time: 0:09:16
- INFO, job not marked completed, status running, execution time: 0:10:16
- INFO, job not marked completed, status running, execution time: 0:11:17
- INFO, job not marked completed, status running, execution time: 0:12:17
- INFO, job not marked completed, status running, execution time: 0:13:18
- INFO, job not marked completed, status running, execution time: 0:14:19
- INFO, job not marked completed, status running, execution time: 0:15:20
- INFO, job not marked completed, status running, execution time: 0:16:22

--- PASS, Final Detailed Job Status Results ---

ActualRunningStartTime: 2022-01-21T16:36:41
ActualRunningStopTime: 2022-01-21T16:50:18
CompletionTime: 2022-01-21T16:50:18
Description: Job Instance
EndTime: TIME_NA
Id: JID_428044200872
JobState: Completed
JobType: RemoteDiagnostics
Message: Job completed successfully.
MessageId: SYS018
Name: Remote Diagnostics
PercentComplete: 100
StartTime: 2022-01-21T16:33:40

C:\Python39>RunDiagnosticsREDFISH.py -ip 192.168.0.120 -u root -p calvin -e 1

- PASS: POST command passed for ExportePSADiagnosticsResult method, status code 202 returned

- INFO, use browser session to view diags text file? Type "y" or "n": y

- WARNING, check you default browser session to view diags text file.
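
In a script, the "never executed" case from the first export attempt above could be told apart from other failures by looking for the SYS099 message ID in the error body. A minimal sketch, assuming the error payload has the shape printed in that 400 response; the function name is hypothetical.

```python
def diagnostics_results_missing(error_body):
    """True if the Redfish error payload reports SYS099 (no diagnostics results exist)."""
    error = error_body.get("error") or {}
    infos = error.get("@Message.ExtendedInfo") or []
    # MessageId looks like "IDRAC.2.7.SYS099"; match on the trailing message code only.
    return any(str(info.get("MessageId", "")).endswith("SYS099") for info in infos)
```

A caller would pass the decoded JSON body of the failed POST and, on `True`, run the diagnostics first instead of treating the 400 status as a hard error.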

Thanks Tex