jenningsloy318 / redfish_exporter

Exporter to get metrics from Redfish-based hardware such as Lenovo/Dell/Supermicro servers
Apache License 2.0

got no metrics from Dell IDRAC due to duplicate devices #61

Closed ulikl closed 1 year ago

ulikl commented 1 year ago

Hi, as in https://github.com/jenningsloy318/redfish_exporter/issues/57, with your latest version and iDRAC firmware version 6.00.02.00 we get no metrics, only errors:

An error has occurred while serving metrics:

14 error(s) occurred:
* [from Gatherer #2] collected metric "redfish_system_pcie_device_state" { label:<name:"hostname" value:"" > label:<name:"pcie_device" value:"LPe31002-M6-D 2-Port 16Gb Fibre Channel Adapter" > label:<name:"pcie_device_id" value:"96-0" > label:<name:"pcie_device_partnumber" value:"0RXNT1" > label:<name:"pcie_device_type" value:"MultiFunction," > label:<name:"pcie_serial_number" value:"...." > label:<name:"resource" value:"pcie_device" > gauge:<value:1 > } was collected before with the same name and label values
* [from Gatherer #2] collected metric "redfish_system_pcie_device_health_state" { label:<name:"hostname" value:"" > label:<name:"pcie_device" value:"LPe31002-M6-D 2-Port 16Gb Fibre Channel Adapter" > label:<name:"pcie_device_id" value:"96-0" > label:<name:"pcie_device_partnumber" value:"0RXNT1" > label:<name:"pcie_device_type" value:"MultiFunction," > label:<name:"pcie_serial_number" value:"...." > label:<name:"resource" value:"pcie_device" > gauge:<value:1 > } was collected before with the same name and label values
...

Our PCIe device list contains duplicate entries (note that PCIeDevices/96-0 appears twice):

curl -k -u ... https://.../redfish/v1/Systems/System.Embedded.1
{
...
"Model":"PowerEdge R740",
"Name":"System",
"NetworkInterfaces":{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/NetworkInterfaces"},
"Oem":{"Dell":{"@odata.type":"#DellOem.v1_3_0.DellOemResources",
"DellSystem":{"BIOSReleaseDate":"12/13/2021",
....
"@odata.type":"#DellSystem.v1_3_0.DellSystem",
"@odata.id":"/redfish/v1/Systems/System.Embedded.1/Oem/Dell/DellSystem/System.Embedded.1"}}},
"PCIeDevices":[{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/0-31"},
{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/96-0"},
{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/96-0"},
{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/94-0"},
...

Would it be possible to remove the duplicates from the device list?
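
For illustration, a client-side workaround could drop repeated links before any metrics are collected. The following is a minimal sketch in Go, not the exporter's actual code: the `link` and `system` types and the `dedupeLinks` helper are hypothetical names introduced here, and the payload is a trimmed-down version of the response shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// link mirrors a Redfish reference object such as {"@odata.id": "/redfish/v1/..."}.
// (Hypothetical type for this sketch.)
type link struct {
	ODataID string `json:"@odata.id"`
}

// system holds only the field of the system payload that matters here.
type system struct {
	PCIeDevices []link `json:"PCIeDevices"`
}

// dedupeLinks returns the links with repeated @odata.id values removed,
// keeping the first occurrence and preserving the original order.
func dedupeLinks(links []link) []link {
	seen := make(map[string]struct{}, len(links))
	out := make([]link, 0, len(links))
	for _, l := range links {
		if _, dup := seen[l.ODataID]; dup {
			continue // already seen this device link, skip it
		}
		seen[l.ODataID] = struct{}{}
		out = append(out, l)
	}
	return out
}

func main() {
	// Trimmed-down sample of the response from the report above.
	payload := []byte(`{"PCIeDevices":[
		{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/0-31"},
		{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/96-0"},
		{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/96-0"},
		{"@odata.id":"/redfish/v1/Systems/System.Embedded.1/PCIeDevices/94-0"}]}`)

	var sys system
	if err := json.Unmarshal(payload, &sys); err != nil {
		panic(err)
	}
	for _, l := range dedupeLinks(sys.PCIeDevices) {
		fmt.Println(l.ODataID)
	}
}
```

With each device link visited only once, each metric would be emitted once per device, which should avoid the "was collected before with the same name and label values" error. Whether this belongs in the exporter or in the firmware is a separate question.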

stmcginnis commented 1 year ago

> Would it be possible to remove the duplicates from the device list?

This issue should be reported to Dell. The iDRAC firmware should not be reporting the same linked device multiple times.

ulikl commented 1 year ago

Update: I opened a case with Dell; they are still looking into it.

P.S.: The latest iDRAC version, 6.00.30.00, released on Nov. 11th, still does not fix the duplicates. The most recent version with which the exporter works on my servers is iDRAC 5.00.10.20.

ulikl commented 1 year ago

I got an update on my Dell case: they were able to reproduce the problem, and it will be fixed in iDRAC release 6.10.00.00, which is scheduled for February 2023.

ulikl commented 1 year ago

Update: iDRAC release 6.10.00.00 is now available and fixes the issue.