bb-Ricardo / check_redfish

A monitoring/inventory plugin to check components and health status of systems which support Redfish. It will also create an inventory of all components of a system.
MIT License

unexpected --storage output on R730 and R740 #91

Closed 2bjc466 closed 1 year ago

2bjc466 commented 2 years ago

Hello,

Unsure if this is expected or not, but I'm getting results on two systems that aren't what I expected. First is an R730:

$ check_redfish.py --storage --nosession -H -f auth
[OK]: One or more storage components report an issue

I'm guessing this is because none of the storage elements reported a status of 'OK'. This is the tail end of the output when -d -v are used:

[OK]: One or more storage components report an issue
[OK]: C610/X99 series chipset sSATA Controller [AHCI mode] (FW: None) status is: None
[OK]: Physical Drive Solid State Disk 0:0 (SSDSC2KB240G8R / SSD / SATA) 240.06GiB status: None
[OK]: Physical Drive Solid State Disk 1:1 (THNSF8120CCSE / SSD / SATA) 120.03GiB status: None
[OK]: Logical Drive Solid State Disk 0:0 (Solid State Disk 0:0) 240GiB (RawDevice) status: None
[OK]: Logical Drive Solid State Disk 1:1 (Solid State Disk 1:1) 120GiB (RawDevice) status: None
[OK]: C610/X99 series chipset 6-Port SATA Controller [AHCI mode] (FW: None) status is: None

Those are all of the components I would like to monitor. It just looks like Dell isn't telling us anything interesting about their state.

On an R740 I see:

[OK]: All storage controllers (3), volumes (68) and disk drives (70) are in good condition
[OK]: Dell 12Gbps HBA (FW: 16.17.01.00) status is: OK
[OK]: StorageEnclosure H4060-J 0:0 (Power: On) status: OK
[OK]: C620 Series Chipset Family SATA Controller [AHCI mode] (FW: None) status is: None
[OK]: C620 Series Chipset Family SSATA Controller [AHCI mode] (FW: None) status is: None
[OK]: Physical Drive Solid State Disk 0:1:0 (MTFDDAK480TDT / SSD / SATA) 480.10GiB status: None
[OK]: Physical Drive Solid State Disk 0:1:1 (MTFDDAK480TDT / SSD / SATA) 480.10GiB status: None
[OK]: Logical Drive Solid State Disk 0:1:0 (Disk 0 on Embedded AHCI Controller 1) 480GiB (RawDevice) status: None
[OK]: Logical Drive Solid State Disk 0:1:1 (Disk 1 on Embedded AHCI Controller 1) 480GiB (RawDevice) status: None
[OK]: Dell 12Gbps HBA (FW: 16.17.01.00) status is: OK
[OK]: StorageEnclosure H4060-J 0:0 (Power: On) status: OK

This system is attached to four fully populated external JBODs (HGST 4060), so I don't know why it found 68 of 240 external disks. Ideally I would like to monitor the health of all controllers and the internal disks, but I don't see a way to specify which devices I do/don't care about. The R730 is also attached to the same class and number of JBODs; I'm not sure why they didn't show up there.
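One way to see at a glance which components came back without a health value is to post-process the plugin output. The following is a minimal sketch, assuming the detail lines keep the `[STATE]: <component> status: <value>` or `status is: <value>` shape shown in the output above. The function name `components_without_health` is made up for illustration and is not part of check_redfish.

```python
import re
from typing import List

# Matches detail lines such as
#   "[OK]: Physical Drive ... 240.06GiB status: None"
#   "[OK]: Dell 12Gbps HBA (FW: 16.17.01.00) status is: OK"
# Summary lines without a "status:" suffix do not match.
LINE_RE = re.compile(
    r"^\[(?P<state>[A-Z]+)\]: (?P<component>.+?) status(?: is)?: (?P<health>\S+)\s*$"
)


def components_without_health(output: str) -> List[str]:
    """Return the components whose reported status is the literal 'None'.

    Hypothetical helper: it only parses text in the shape shown in this
    thread and makes no claim about how the plugin itself evaluates health.
    """
    missing = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("health") == "None":
            missing.append(match.group("component"))
    return missing
```

Feeding it the `-d -v` output from either system would list the chipset SATA controllers and the drives behind them, which makes it easier to tell "healthy" apart from "no health reported".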

Just getting started using this plugin and liking it so far! Let me know if this is expected behavior, and if not, what information I can provide that would be useful.

Thank you!

bb-Ricardo commented 2 years ago

Hi,

I'm not sure why the data doesn't show up here. For some reason it is either not exposed via Redfish or exposed under a different URL path.

Are you using the latest iDRAC versions? R730 is fairly old as I remember.

2bjc466 commented 2 years ago

Hello,

We're running iDRAC firmware 2.83.83.83, which appears to still be the latest. This R730 is about four years old. It doesn't seem like we've had it that long, but I guess R750s are already out.

bb-Ricardo commented 2 years ago

If you want, you can create a Redfish mockup with https://github.com/DMTF/Redfish-Mockup-Creator and send me the whole folder. Or you can have a look yourself and see if you can find the missing drives.

If this is an OOB controller then you need to run a daemon on your server which then communicates between iDRAC and your RAID controller.
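For anyone who wants to look for the missing drives themselves, the walk can be sketched roughly as follows. This is an illustration only, not how check_redfish does it: it assumes the standard Redfish layout (`/redfish/v1/Systems`, then each system's `Storage` collection, then each storage resource's `Drives` links), and the `fetch` callable and the name `iter_drive_health` are hypothetical. `fetch` takes a resource path and returns the decoded JSON as a dict, so it can be backed by a live BMC or by a saved mockup.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple


def iter_drive_health(
    fetch: Callable[[str], Dict],
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (drive name, Status.Health) for every drive linked from Storage.

    `fetch` is a hypothetical, caller-supplied function mapping a Redfish
    resource path to its decoded JSON body.
    """
    systems = fetch("/redfish/v1/Systems")
    for system_ref in systems.get("Members", []):
        system = fetch(system_ref["@odata.id"])
        storage_link = system.get("Storage", {}).get("@odata.id")
        if not storage_link:
            continue  # this system exposes no Storage collection
        for storage_ref in fetch(storage_link).get("Members", []):
            storage = fetch(storage_ref["@odata.id"])
            for drive_ref in storage.get("Drives", []):
                drive = fetch(drive_ref["@odata.id"])
                status = drive.get("Status") or {}
                # Health is None when the BMC does not report it
                yield drive.get("Name", drive_ref["@odata.id"]), status.get("Health")
```

Comparing the number of drives this yields with the number physically attached shows whether the BMC exposes them at all, and the health column shows which ones come back without a value.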

bb-Ricardo commented 1 year ago

Hi,

Is this still relevant or can I close this issue?