jenningsloy318 / redfish_exporter

exporter to get metrics from redfish based hardware such as lenovo/dell/superc servers
Apache License 2.0

Error on HP Proliant G9 servers #29

Closed 80dB closed 2 years ago

80dB commented 3 years ago

The exporter throws this error on HP Proliant G9 servers:

An error has occurred while serving metrics:

4 error(s) occurred:
* [from Gatherer #2] collected metric "redfish_chassis_power_powersupply_state" { label:<name:"chassis_id" value:"1" > label:<name:"power_supply" value:"HpServerPowerSupply" > label:<name:"power_supply_id" value:"" > label:<name:"resource" value:"power_supply" > gauge:<value:1 > } was collected before with the same name and label values
* [from Gatherer #2] collected metric "redfish_chassis_power_powersupply_health" { label:<name:"chassis_id" value:"1" > label:<name:"power_supply" value:"HpServerPowerSupply" > label:<name:"power_supply_id" value:"" > label:<name:"resource" value:"power_supply" > gauge:<value:1 > } was collected before with the same name and label values
* [from Gatherer #2] collected metric "redfish_chassis_power_powersupply_last_power_output_watts" { label:<name:"chassis_id" value:"1" > label:<name:"power_supply" value:"HpServerPowerSupply" > label:<name:"power_supply_id" value:"" > label:<name:"resource" value:"power_supply" > gauge:<value:122 > } was collected before with the same name and label values
* [from Gatherer #2] collected metric "redfish_chassis_power_powersupply_power_capacity_watts" { label:<name:"chassis_id" value:"1" > label:<name:"power_supply" value:"HpServerPowerSupply" > label:<name:"power_supply_id" value:"" > label:<name:"resource" value:"power_supply" > gauge:<value:500 > } was collected before with the same name and label values

It seems the exporter is not reading the power supply ID properly. Is there any way to debug this?

jenningsloy318 commented 3 years ago

Can you get the power supply output via the API? I'm not sure whether this is related to missing labels or something else.

Judging from the error, your server would need to expose more attributes to distinguish these metrics; we should figure out which ones.

If this is the case, we will need to open an issue on the upstream project, https://github.com/stmcginnis/gofish.

jyokako commented 3 years ago

I got the same error on HPE server.


jenningsloy318 commented 3 years ago

Can you get the power supply output via the API? I'm not sure whether this is related to missing labels or something else.

jyokako commented 3 years ago

I could get the power info via the API. The test server model is ProLiant DL380 Gen9. This was also reported in https://github.com/jenningsloy318/redfish_exporter/issues/14. According to the error, the state, health, last power output watts, and power capacity watts metrics were each collected before with the same name and label values. Can you tell me how to exclude those four metrics?

https://xxxxxx/redfish/v1/Chassis/1/Power/
{
    "@odata.context": "/redfish/v1/$metadata#Chassis/Members/1/Power$entity",
    "@odata.id": "/redfish/v1/Chassis/1/Power/",
    "@odata.type": "#Power.1.0.1.Power",
    "Id": "Power",
    "Name": "PowerMetrics",
    "Oem": {
        "Hp": {
            "@odata.type": "#HpPowerMetricsExt.1.2.0.HpPowerMetricsExt",
            "SNMPPowerThresholdAlert": {
                "DurationInMin": 0,
                "ThresholdWatts": 0,
                "Trigger": "Disabled"
            },
            "Type": "HpPowerMetricsExt.1.2.0",
            "links": {
                "FastPowerMeter": {
                    "href": "/redfish/v1/Chassis/1/Power/FastPowerMeter/"
                },
                "FederatedGroupCapping": {
                    "href": "/redfish/v1/Chassis/1/Power/FederatedGroupCapping/"
                },
                "PowerMeter": {
                    "href": "/redfish/v1/Chassis/1/Power/PowerMeter/"
                }
            }
        }
    },
    "PowerCapacityWatts": 1600,
    ....

jenningsloy318 commented 3 years ago

@jyokako I see the root cause: in my code there is a label named power_supply_id, which is retrieved from PowerSupply.MemberID, but your server doesn't provide such an ID. Please make sure your server is compatible with https://github.com/stmcginnis/gofish/blob/main/redfish/power.go

Can you also post the full output? Then we can see which field could serve as a replacement for MemberID.
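To illustrate why an empty power_supply_id triggers the "was collected before with the same name and label values" error, here is a minimal sketch using only the Go standard library. It is not the exporter's or the Prometheus client's actual code; `labelKey` and `findDuplicates` are hypothetical helpers that mimic the idea that a series is identified by its metric name plus its full label set.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labelKey (hypothetical helper) builds a canonical string from a metric name
// and its labels, so identical name/label combinations map to the same key.
func labelKey(name string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort for stability
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return name + "{" + strings.Join(parts, ",") + "}"
}

// findDuplicates (hypothetical helper) returns every key that occurs more
// than once, in the order the second occurrence was seen.
func findDuplicates(keys []string) []string {
	seen := map[string]int{}
	var dups []string
	for _, k := range keys {
		seen[k]++
		if seen[k] == 2 {
			dups = append(dups, k)
		}
	}
	return dups
}

func main() {
	// Two physical power supplies that both report an empty MemberId end up
	// with exactly the same label set.
	var keys []string
	for i := 0; i < 2; i++ {
		keys = append(keys, labelKey("redfish_chassis_power_powersupply_state", map[string]string{
			"chassis_id":      "1",
			"power_supply":    "HpServerPowerSupply",
			"power_supply_id": "",
			"resource":        "power_supply",
		}))
	}
	for _, d := range findDuplicates(keys) {
		fmt.Println("duplicate series:", d)
	}
}
```

As soon as each supply gets a distinct power_supply_id value, the keys differ and no duplicate is reported.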

jenningsloy318 commented 3 years ago

Hi @jyokako, per https://github.com/stmcginnis/gofish/issues/136#issuecomment-817302887, you can try upgrading your firmware to the latest version and then try again.

MartinRoenneburg commented 3 years ago

I ran into the same issue.

I updated to the latest iLO FW 2.78 (May 2021) ... and the issue is still present.

... any workaround would be wonderful because I still have some Gen9 servers running ...

MartinRoenneburg commented 3 years ago

/redfish/v1/Chassis/1/Power with iLO FW 2.78

{
    "@odata.context": "/redfish/v1/$metadata#Chassis/Members/1/Power$entity",
    "@odata.id": "/redfish/v1/Chassis/1/Power/",
    "@odata.type": "#Power.1.0.1.Power",
    "Id": "Power",
    "Name": "PowerMetrics",
    "Oem": {
        "Hp": {
            "@odata.type": "#HpPowerMetricsExt.1.2.0.HpPowerMetricsExt",
            "SNMPPowerThresholdAlert": {
                "DurationInMin": 0,
                "ThresholdWatts": 0,
                "Trigger": "Disabled"
            },
            "Type": "HpPowerMetricsExt.1.2.0",
            "links": {
                "FastPowerMeter": {
                    "href": "/redfish/v1/Chassis/1/Power/FastPowerMeter/"
                },
                "FederatedGroupCapping": {
                    "href": "/redfish/v1/Chassis/1/Power/FederatedGroupCapping/"
                },
                "PowerMeter": {
                    "href": "/redfish/v1/Chassis/1/Power/PowerMeter/"
                }
            }
        }
    },
    "PowerCapacityWatts": 1600,
    "PowerConsumedWatts": 93,
    "PowerControl": [
        {
            "PowerCapacityWatts": 1600,
            "PowerConsumedWatts": 93,
            "PowerLimit": {
                "LimitInWatts": null
            },
            "PowerMetrics": {
                "AverageConsumedWatts": 92,
                "IntervalInMin": 20,
                "MaxConsumedWatts": 110,
                "MinConsumedWatts": 91
            }
        }
    ],
    "PowerLimit": {
        "LimitInWatts": null
    },
    "PowerMetrics": {
        "AverageConsumedWatts": 92,
        "IntervalInMin": 20,
        "MaxConsumedWatts": 110,
        "MinConsumedWatts": 91
    },
    "PowerSupplies": [
        {
            "FirmwareVersion": "1.00",
            "LastPowerOutputWatts": 48,
            "LineInputVoltage": 235,
            "LineInputVoltageType": "ACHighLine",
            "Model": "720479-B21",
            "Name": "HpServerPowerSupply",
            "Oem": {
                "Hp": {
                    "@odata.type": "#HpServerPowerSupply.1.0.0.HpServerPowerSupply",
                    "AveragePowerOutputWatts": 48,
                    "BayNumber": 1,
                    "HotplugCapable": true,
                    "MaxPowerOutputWatts": 51,
                    "Mismatched": false,
                    "PowerSupplyStatus": {
                        "State": "Ok"
                    },
                    "Type": "HpServerPowerSupply.1.0.0",
                    "iPDUCapable": false
                }
            },
            "PowerCapacityWatts": 800,
            "PowerSupplyType": "AC",
            "SerialNumber": "some-serial",
            "SparePartNumber": "754381-001",
            "Status": {
                "Health": "OK",
                "State": "Enabled"
            }
        },
        {
            "FirmwareVersion": "1.00",
            "LastPowerOutputWatts": 45,
            "LineInputVoltage": 230,
            "LineInputVoltageType": "ACHighLine",
            "Model": "720479-B21",
            "Name": "HpServerPowerSupply",
            "Oem": {
                "Hp": {
                    "@odata.type": "#HpServerPowerSupply.1.0.0.HpServerPowerSupply",
                    "AveragePowerOutputWatts": 45,
                    "BayNumber": 2,
                    "HotplugCapable": true,
                    "MaxPowerOutputWatts": 48,
                    "Mismatched": false,
                    "PowerSupplyStatus": {
                        "State": "Ok"
                    },
                    "Type": "HpServerPowerSupply.1.0.0",
                    "iPDUCapable": false
                }
            },
            "PowerCapacityWatts": 800,
            "PowerSupplyType": "AC",
            "SerialNumber": "some-serial",
            "SparePartNumber": "754381-001",
            "Status": {
                "Health": "OK",
                "State": "Enabled"
            }
        }
    ],
    "Redundancy": [
        {
            "MaxNumSupported": 2,
            "MemberId": "0",
            "MinNumNeeded": 2,
            "Mode": "Failover",
            "Name": "PowerSupply Redundancy Group 1",
            "RedundancySet": [
                {
                    "@odata.id": "/redfish/v1/Chassis/1/Power#/PowerSupplies/0"
                },
                {
                    "@odata.id": "/redfish/v1/Chassis/1/Power#/PowerSupplies/1"
                }
            ]
        }
    ],
    "Type": "PowerMetrics.0.11.0",
    "links": {
        "self": {
            "href": "/redfish/v1/Chassis/1/Power/"
        }
    }
}

stmcginnis commented 3 years ago

It looks like you will have to take this up with HPE. They are not following the Redfish specification.

Current Power schema version is v1.7.0, but even going all the way back to the v1.0.0 version of the specification (quite old at this point) MemberId was always part of the PowerSupply specification:

http://redfish.dmtf.org/schemas/v1/Power.v1_0_0.json

They didn't include the required properties in those older schema files, but looking at more recent ones, HPE is not including either of the two required properties for PowerSupply objects:

            "required": [
                "@odata.id",
                "MemberId"
            ],

From above, it looks like they are using the v1.0.1 version of the Power object that contains the PowerSupply.

"@odata.type": "#Power.1.0.1.Power",

That is one of the versions where the above required properties are not explicitly called out in the schema files, but I believe they were still required, even though that part of the spec was not captured in the file.

Without even the base required @odata.id included in these objects, I don't think either gofish or redfish_exporter can do much in this case. HPE will need to address this in their iLO firmware.
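For anyone who wants to check their own firmware against the two required properties quoted above, a quick standard-library sketch follows. It is not part of gofish or redfish_exporter; `requiredProps` and `missingRequired` are hypothetical names, and the sample payload in `main` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// requiredProps lists the properties that the newer Power schema files mark
// as required for PowerSupply objects.
var requiredProps = []string{"@odata.id", "MemberId"}

// missingRequired (hypothetical helper) returns, for each PowerSupplies entry,
// the required properties that are absent from the JSON object.
func missingRequired(payload []byte) ([][]string, error) {
	var doc struct {
		PowerSupplies []map[string]json.RawMessage `json:"PowerSupplies"`
	}
	if err := json.Unmarshal(payload, &doc); err != nil {
		return nil, err
	}
	report := make([][]string, 0, len(doc.PowerSupplies))
	for _, ps := range doc.PowerSupplies {
		var missing []string
		for _, prop := range requiredProps {
			if _, ok := ps[prop]; !ok {
				missing = append(missing, prop)
			}
		}
		report = append(report, missing)
	}
	return report, nil
}

func main() {
	// Illustrative payload: the first entry resembles the Gen9 output (no
	// @odata.id, no MemberId); the second has both properties.
	payload := []byte(`{"PowerSupplies":[
		{"Name":"HpServerPowerSupply"},
		{"@odata.id":"/redfish/v1/Chassis/1/Power#/PowerSupplies/1","MemberId":"1"}]}`)
	report, err := missingRequired(payload)
	if err != nil {
		panic(err)
	}
	for i, missing := range report {
		fmt.Printf("PowerSupplies[%d] missing: %v\n", i, missing)
	}
}
```

Feeding it the full /redfish/v1/Chassis/1/Power response saved from the BMC should show quickly whether a firmware update changed anything.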

deepankersharmaa commented 2 years ago

Hi, can you please help me create a Docker image? Please share the proper steps.

Complete!
fatal: not a git repository (or any parent up to mount point /go/src/github.com/jenningsloy318)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /go/src/github.com/jenningsloy318)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
building binaries

rfpronk commented 1 year ago

I know HPE should just fix their stuff, but I need it to work on Gen9s, so I created this ugly workaround: https://github.com/rfpronk/redfish_exporter/pull/1/commits/e9ef4dfaa42b406bfa478f08b7e043486ef39928