ulikl closed this issue 1 year ago
I checked the code: we have redfish_chassis_temperature_celsius and redfish_chassis_temperature_sensor_state, but we don't have redfish_chassis_temperature_sensor_health. I will check whether we can add redfish_chassis_temperature_sensor_health.
@ulikl the latest commit adds this metric. Please build and test it, since I don't have a device.
@jenningsloy318, thank you very much. It's working:
# HELP redfish_chassis_temperature_celsius celsius of temperature on this chassis component
# TYPE redfish_chassis_temperature_celsius gauge
redfish_chassis_temperature_celsius{chassis_id="System.Embedded.1",resource="temperature",sensor="CPU1 Temp",sensor_id="0"} 36
redfish_chassis_temperature_celsius{chassis_id="System.Embedded.1",resource="temperature",sensor="CPU2 Temp",sensor_id="1"} 36
redfish_chassis_temperature_celsius{chassis_id="System.Embedded.1",resource="temperature",sensor="System Board Exhaust Temp",sensor_id="3"} 37
redfish_chassis_temperature_celsius{chassis_id="System.Embedded.1",resource="temperature",sensor="System Board Inlet Temp",sensor_id="2"} 27
# HELP redfish_chassis_temperature_sensor_health status health of temperature on this chassis component,1(Enabled),2(Disabled),3(StandbyOffinline),4(StandbySpare),5(InTest),6(Starting),7(Absent),8(UnavailableOffline),9(Deferring),10(Quiesced),11(Updating)
# TYPE redfish_chassis_temperature_sensor_health gauge
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="CPU1 Temp",sensor_id="0"} 1
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="CPU2 Temp",sensor_id="1"} 1
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="System Board Exhaust Temp",sensor_id="3"} 1
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="System Board Inlet Temp",sensor_id="2"} 1
With the inlet temperature over the warning threshold:
# TYPE redfish_chassis_temperature_sensor_health gauge
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="CPU1 Temp",sensor_id="0"} 1
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="CPU2 Temp",sensor_id="1"} 1
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="System Board Exhaust Temp",sensor_id="3"} 1
redfish_chassis_temperature_sensor_health{chassis_id="System.Embedded.1",resource="temperature",sensor="System Board Inlet Temp",sensor_id="2"} 2
If "2" means Warning, the HELP text is wrong: it should be CommonHealthHelp instead of CommonStateHelp, no?
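To make the mismatch concrete, here is a minimal sketch of what a health-specific HELP suffix could look like. Redfish defines Health as OK, Warning or Critical; the numeric values 1, 2 and 3 are an assumption inferred from the output above (1 while healthy, 2 once the warning threshold is crossed), and the names HEALTH_VALUES, health_help_suffix and health_name are hypothetical, not taken from the exporter.

```python
# Assumed mapping: Redfish Health strings to metric values.
# The numbers are inferred from the sample output, not from exporter code.
HEALTH_VALUES = {"OK": 1, "Warning": 2, "Critical": 3}


def health_help_suffix(values=HEALTH_VALUES):
    """Build a HELP suffix in the same 'number(name)' style as the state text."""
    ordered = sorted(values.items(), key=lambda item: item[1])
    return ",".join(f"{number}({name})" for name, number in ordered)


def health_name(value, values=HEALTH_VALUES):
    """Translate a metric value back to its Redfish health string."""
    for name, number in values.items():
        if number == int(value):
            return name
    return "unknown"


print(health_help_suffix())
print(health_name(2))
```

Under this assumed mapping, the value 2 reported for "System Board Inlet Temp" reads as Warning rather than Disabled, which is what the current HELP text suggests.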
Hi,
The current temperature metrics look like this:
Note: for the test I set the Warning threshold for sensor "System Board Inlet Temp" to 17. The only state/health metrics > 1 in this case are:
So in this case we can only get an unspecific chassis alert, or we need to define an alert on redfish_chassis_temperature_celsius using separate thresholds in the alert definition, which might not match the server configuration.
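For comparison, a minimal sketch of how a check could work without any thresholds of its own if a per-sensor health metric were exposed. It assumes a metric named redfish_chassis_temperature_sensor_health whose value is 1 when the sensor is healthy and higher otherwise; the function name unhealthy_sensors is hypothetical, and the label parsing is deliberately simple (it does not handle escaped quotes in label values).

```python
import re

# One sample line: metric name, label block in braces, numeric value.
SAMPLE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def unhealthy_sensors(exposition_text,
                      metric="redfish_chassis_temperature_sensor_health"):
    """Return (sensor, value) pairs whose health value is above 1."""
    result = []
    for line in exposition_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match or match.group(1) != metric:
            continue  # skips comments, blank lines and other metrics
        labels = dict(LABEL_RE.findall(match.group(2)))
        value = float(match.group(3))
        if value > 1:  # assumption: 1 means healthy
            result.append((labels.get("sensor"), value))
    return result
```

The point of the sketch is that the server's own threshold configuration decides when a sensor is unhealthy, so the alert side only has to compare against 1.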
But at least for our Dell servers, a Health value is also provided via: https:///redfish/v1/Chassis/System.Embedded.1/Sensors/SystemBoardInletTemp
e.g. for
Can the redfish_exporter be extended by such a temperature health metric?
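As a rough illustration of where such a value would come from, here is a minimal sketch that reads Status.Health from a Sensor resource. The payload below is hypothetical and trimmed to the fields used; it was not captured from a real server, and sensor_health is an illustrative name rather than an exporter function.

```python
import json


def sensor_health(payload):
    """Return (name, health) from a Redfish Sensor resource given as a JSON string."""
    resource = json.loads(payload)
    status = resource.get("Status") or {}  # Status may be missing or null
    return resource.get("Name"), status.get("Health")


# Hypothetical payload for illustration only.
example = (
    '{"Id": "SystemBoardInletTemp", "Name": "System Board Inlet Temp", '
    '"Reading": 27, "Status": {"State": "Enabled", "Health": "Warning"}}'
)
print(sensor_health(example))
```

A sensor without a Status object yields None for the health, which an exporter would probably have to skip or map to a separate value.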