Open scotts-tp opened 6 months ago
The error seems to come from here: https://github.com/prometheus/procfs/blob/69fc8f61debb3bd7efca3a9a1c295d4012022830/sysfs/class_thermal.go#L73 / https://github.com/prometheus/procfs/blob/69fc8f61debb3bd7efca3a9a1c295d4012022830/sysfs/class_thermal.go#L52. Maybe there should be a check there for whether the error is os.ErrInvalid, and then either return an empty ClassThermalZoneStats{} or ignore the zone. Another option would be to check the mode for ‘disabled’ first in parseClassThermalZone() and return early.
I'm not sure how to achieve this directly from node_exporter.
@Kylea650 Checking the mode for disabled sounds like a good option. If anyone wants to submit a PR to sysfs, feel free to ping me there.
@discordianfish Happy to raise a new issue mentioning this one and open a PR over in sysfs this week. Cheers!
Is this issue still open?
Host operating system:
Linux 5.10.104-tegra #18 SMP PREEMPT aarch64 aarch64 aarch64 GNU/Linux
node_exporter version:
1.7.0
node_exporter command line flags:
--path.rootfs=/host
node_exporter log output
Are you running node_exporter in Docker?
Yes
What did you do that produced an error?
Running node_exporter in a docker container on a custom embedded device.
What did you expect to see?
Disabled thermal zones as either being ignored or optionally being filtered out.
What did you see instead?
The entire thermal_zone collector fails for all thermal_zones.
When a thermal zone is disabled, which can be determined via /sys/class/thermal/thermal_zone10/mode, it would be nice for node_exporter to handle it gracefully, whether natively or via a flag, or to allow specific files/devices to be filtered out manually instead of an entire class of devices.
My temporary workaround has been to use the Pushgateway with a curl container in my docker compose file, like so:
With this pushgateway-thermal-zones.sh script: