dell / iDRAC-Redfish-Scripting

Python and PowerShell scripting for Dell EMC PowerEdge iDRAC REST API with DMTF Redfish
GNU General Public License v2.0

createvirtualdisk.py is failing with KeyError on 'JobType' with Dell R750 servers #224

Closed dbf1234 closed 1 year ago

dbf1234 commented 2 years ago

We were using a Dell R840 (iDRAC firmware 4.40.40) to test the createvirtualdisk.py script found at
https://github.com/dell/iDRAC-Redfish-Scripting/blob/master/ and it has worked great in all our testing.

We then received some Dell R750s for another project (iDRAC firmware 5.10.30), and the script keeps failing at random times with the same error (see below). createvirtualdisk.py starts creating the virtual disks we pass in, and then we get a KeyError on 'JobType'. It's pretty random; I've seen it fail all over the place. Sometimes it fails right out of the box on the first drive. On Saturday we had a server that ran all the way through with no errors at all, i.e. it created all the VDs.

I downloaded the latest version and am seeing the same issue. These are Hadoop servers with 14 disks: 1 x RAID 1 and 12 x RAID 0. Could it be that the API call is not returning the dict value consistently?

We call the script from a for loop that runs:

os.system('python3 createvirtdisk.py -ip x.x.x.x -u root -p calvin --create ' + array['controller'] + ' --raid-level ' + str(vd['level']) + ' --disks ' + vd['disks'])
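As a side note for anyone driving the script the same way: the loop can also be written with subprocess.run and an argument list, which avoids shell quoting problems and stops at the first failed run instead of silently continuing. This is only a sketch. The script name and flags are the ones quoted above; the function names, the "virtual_disks" key and the layout of the array dict are assumptions made for illustration.

```python
import subprocess
import sys


def build_command(ip, user, password, controller, vd, script="createvirtdisk.py"):
    """Build the argument list for one virtual disk (flags as quoted above)."""
    return [
        sys.executable, script,
        "-ip", ip, "-u", user, "-p", password,
        "--create", controller,
        "--raid-level", str(vd["level"]),
        "--disks", vd["disks"],
    ]


def create_all(ip, user, password, array, run=subprocess.run):
    """Create every virtual disk in `array`, stopping at the first failure.

    `array` is assumed to look like
    {"controller": "...", "virtual_disks": [{"level": 0, "disks": "..."}]};
    the "virtual_disks" key is hypothetical.
    """
    for vd in array["virtual_disks"]:
        cmd = build_command(ip, user, password, array["controller"], vd)
        # check=True raises CalledProcessError on a non-zero exit status,
        # e.g. when the script dies with an uncaught KeyError.
        run(cmd, check=True)
```

The `run` parameter exists only so the loop can be exercised without touching a real iDRAC.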

Output snippet: the script successfully created 5 virtual disks and then crashed on the 6th.

('@odata.context', '/redfish/v1/$metadata#DellJob.DellJob')
('@odata.id', '/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs/JID_618081473830')
('@odata.type', '#DellJob.v1_2_0.DellJob')
('ActualRunningStartTime', '2022-08-29T16:22:28')
('ActualRunningStopTime', '2022-08-29T16:24:07')
('CompletionTime', '2022-08-29T16:24:07')
('Description', 'Job Instance')
('EndTime', 'TIME_NA')
('Id', 'JID_618081473830')
('JobState', 'Completed')
('JobType', 'RealTimeNoRebootConfiguration')
('Message', 'Job completed successfully.')
('MessageArgs', [])
('MessageArgs@odata.count', 0)
('MessageId', 'PR19')
('Name', 'Configure: RAID.Slot.3-1')
('PercentComplete', 100)
('StartTime', '2022-08-29T16:22:27')
('TargetSettingsURI', None)

texroemer commented 2 years ago

Hi @dbf1234

I was able to reproduce the issue you hit with iDRAC 5.10.30. It looks to be an intermittent timing issue in the iDRAC firmware, not a script issue, so I'll need to escalate it to the iDRAC team internally.

I went ahead and updated the script to sleep for 10 seconds after the POST command passes, before checking the job type in the JSON response. I also added retry logic for the case where the JSON output does not return JobType. I looped the updated script overnight and wasn't able to hit the issue anymore.
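For readers following along, the sleep-and-retry approach described above might look roughly like the sketch below. It is not the code from the repository. `get_job_type` and `fetch_job` are hypothetical names, `fetch_job` is assumed to return the parsed JSON body of a GET on the job URI as a dict, and the retry count of 5 is an arbitrary choice; only the 10-second delay comes from the description above.

```python
import time


def get_job_type(fetch_job, retries=5, delay=10.0, sleep=time.sleep):
    """Return the job's JobType, retrying while the key is missing.

    fetch_job is a hypothetical callable returning the job's JSON as a dict.
    """
    for _ in range(retries):
        sleep(delay)  # give the iDRAC time to populate the job entry
        job = fetch_job()
        job_type = job.get("JobType")  # .get() avoids the KeyError
        if job_type is not None:
            return job_type
    raise RuntimeError(f"JobType missing after {retries} attempts")
```

Reading the key with `.get()` turns the intermittent missing field into a condition the loop can retry on, and the final exception keeps a persistent failure visible instead of looping forever.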

Can you go ahead and test with the latest updated script? If you still see failures, please let me know.

Thanks Tex

dbf1234 commented 2 years ago

Thank you Tex.

Will download latest updated script and test. Will let you know.

Thanks! Doug


gint914 commented 1 year ago

The first test with the new script to create 6 VDs worked great. I need to delete 13 disks on some other servers to give it a full test. But looks great so far! Thank you.

gint914 commented 1 year ago

I see the DeleteVirtualDiskREDFISH.py script has this new retry logic added. I'm going to have to try that because the previous version had the same issue.

texroemer commented 1 year ago

Yes, I also made the same changes to the DeleteVirtualDiskREDFISH.py and InitializeVirtualDiskREDFISH.py scripts, since they both use the same logic as the create VD script.

Let me know how your testing goes with the delete VD script.

Thanks Tex

gint914 commented 1 year ago

Thank you VERY much. Both create and delete scripts work great.

I tested the create script on 5 servers at the same time and got consistent results, run once and done.

The delete script outputs the VD name that it's deleting. It would be great if the create script did something similar. It doesn't affect automation usage, but when running to test and develop this would be a great feature.

dbf1234 commented 1 year ago

Same result as gint914. 13 virtual disks created without issue on the test I ran last night. thanks!

texroemer commented 1 year ago

Hi @gint914

I implemented your suggestion and uploaded a new script that reports the FQDD of the newly created VD. See the example below (the new log message is the "INFO, new VD FQDD created" line):

C:\Python39>CreateVirtualDiskREDFISH.py -ip 192.168.0.120 -u root -p calvin --get-virtualdisks RAID.SL.3-1

Disk.Virtual.0:RAID.SL.3-1, Volume type: NonRedundant
Disk.Virtual.2:RAID.SL.3-1, Volume type: NonRedundant

C:\Python39>CreateVirtualDiskREDFISH.py -ip 192.168.0.120 -u root -p calvin --create RAID.SL.3-1 --disks Disk.Bay.2:Enclosure.Internal.0-1:RAID.SL.3-1 --raid-level 0

--- PASS, Final Detailed Job Status Results ---

('@odata.context', '/redfish/v1/$metadata#DellJob.DellJob')
('@odata.id', '/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell/Jobs/JID_620519710365')
('@odata.type', '#DellJob.v1_2_0.DellJob')
('ActualRunningStartTime', '2022-09-01T12:06:12')
('ActualRunningStopTime', '2022-09-01T12:07:32')
('CompletionTime', '2022-09-01T12:07:32')
('Description', 'Job Instance')
('EndTime', 'TIME_NA')
('Id', 'JID_620519710365')
('JobState', 'Completed')
('JobType', 'RealTimeNoRebootConfiguration')
('Message', 'Job completed successfully.')
('MessageArgs', [])
('MessageArgs@odata.count', 0)
('MessageId', 'PR19')
('Name', 'Configure: RAID.SL.3-1')
('PercentComplete', 100)
('StartTime', '2022-09-01T12:06:11')
('TargetSettingsURI', None)

- INFO, new VD FQDD created: Disk.Virtual.1:RAID.SL.3-1

C:\Python39>CreateVirtualDiskREDFISH.py -ip 192.168.0.120 -u root -p calvin --get-virtualdisks RAID.SL.3-1

Disk.Virtual.0:RAID.SL.3-1, Volume type: NonRedundant
Disk.Virtual.2:RAID.SL.3-1, Volume type: NonRedundant
Disk.Virtual.1:RAID.SL.3-1, Volume type: NonRedundant
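If you want the same information from automation without relying on the INFO line, one option is to compare the --get-virtualdisks listing taken before and after the create call. The sketch below shows that idea only; it is not how the script itself determines the new FQDD. `parse_fqdds` and `new_virtual_disks` are hypothetical helpers, and they assume the listing has been captured as text with one "FQDD, Volume type: ..." entry per line.

```python
def parse_fqdds(output):
    """Extract VD FQDDs from lines such as
    'Disk.Virtual.0:RAID.SL.3-1, Volume type: NonRedundant'."""
    fqdds = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Disk.Virtual."):
            # Everything before the first comma is the FQDD.
            fqdds.append(line.split(",", 1)[0])
    return fqdds


def new_virtual_disks(before, after):
    """Return FQDDs present in `after` but not in `before`, in listing order."""
    seen = set(parse_fqdds(before))
    return [fqdd for fqdd in parse_fqdds(after) if fqdd not in seen]
```

With the two listings from the example above as input, the only FQDD that appears in the second listing and not the first is Disk.Virtual.1:RAID.SL.3-1, which matches the INFO line.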

gint914 commented 1 year ago

I'll try that out. I wanted to do another test run anyway. Thank you very much sir!