DRuggeri / nut_exporter

Network UPS Tools Prometheus Exporter

UPS Status Metric #2

Closed demokom-gmail-com closed 3 years ago

demokom-gmail-com commented 3 years ago

Hi, this is great work, but I think it would be even better if you added the UPS status metric to the exporter's output, especially since the exporter already reads this variable as input. Thank you.

DRuggeri commented 3 years ago

Hi, @demokom-gmail-com I'm happy to entertain enhancements, but I'm not sure I follow the request. Can you help me understand what status metric you have in mind?

demokom-gmail-com commented 3 years ago

Thank you for your answer! Here are the details. When I was setting up the application environment, I used this command line:

/usr/local/bin/nut_exporter --nut.vars_enable=""

This should return all available metrics, including the UPS status (online, on battery, charging, overload, etc.). When I run the exporter from the command line with log.level set to debug, it shows that the variable is read correctly:

DEBU[0001] ups.status: source="nut_collector.go:112"
DEBU[0001] Value: 'OL CHRG' source="nut_collector.go:113"
DEBU[0001] Type: STRING source="nut_collector.go:114"
DEBU[0001] Description: 'UPS status' source="nut_collector.go:115"
DEBU[0001] Writeable: false source="nut_collector.go:116"
DEBU[0001] MaximumLength: 0 source="nut_collector.go:117"
DEBU[0001] OriginalType: NUMBER source="nut_collector.go:118"
DEBU[0001] Export the variable? true source="nut_collector.go:127"

But when I scrape the exporter with Prometheus, the metric is missing:

# HELP network_ups_tools_ups_load Value of the ups.load variable from Network UPS Tools
# TYPE network_ups_tools_ups_load gauge
network_ups_tools_ups_load 9
# HELP network_ups_tools_ups_productid Value of the ups.productid variable from Network UPS Tools
# TYPE network_ups_tools_ups_productid gauge
network_ups_tools_ups_productid 5
# HELP network_ups_tools_ups_temperature Value of the ups.temperature variable from Network UPS Tools
# TYPE network_ups_tools_ups_temperature gauge
network_ups_tools_ups_temperature 24.9
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1

Hope it helps and thank you for your efforts!

DRuggeri commented 3 years ago

Ah, OK, yeah - now I understand. ups.status was not showing up in the output because NUT returns its value as a STRING type, and Prometheus only allows numeric sample values. Since ups.status is such an important variable, I've added a bit of handling in c908cf5d6bc306007f0770a9de6bbbd4b2954ea9 and released it as v1.1.0. Have a look at the README for what each integer means.

As far as I know, only three statuses exist for the ups.status variable... but if you come across another, let me know.
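To illustrate why a value such as 'OL CHRG' drops out of the scrape, here is a minimal Go sketch. It is not the exporter's actual code: the helper name `exportable` is hypothetical, and the sketch only assumes that a Prometheus sample value must be a float64, so any NUT value that does not parse as a number cannot be emitted as-is.

```go
package main

import (
	"fmt"
	"strconv"
)

// exportable is a hypothetical helper (not part of nut_exporter). It reports
// whether a NUT variable value can be used directly as a Prometheus sample
// value, which must be a float64.
func exportable(value string) (float64, bool) {
	f, err := strconv.ParseFloat(value, 64)
	if err != nil {
		// Non-numeric strings such as "OL CHRG" end up here.
		return 0, false
	}
	return f, true
}

func main() {
	// Values taken from the debug log and scrape output in this thread.
	for _, v := range []string{"9", "24.9", "OL CHRG"} {
		if f, ok := exportable(v); ok {
			fmt.Printf("%q -> exported as %v\n", v, f)
		} else {
			fmt.Printf("%q -> skipped (not numeric)\n", v)
		}
	}
}
```

A string variable like ups.status therefore needs some explicit translation to a number before it can appear in the output.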

sshaikh commented 3 years ago

There's actually a whole bunch:

https://github.com/p404/nut_exporter/blob/3704ee7fd0b605eed1672bae450d10f534a19402/main.go#L196

But personally I just use OL (or rather, its absence).

I've commented on the commit, but just to track it here - it would be useful to have this metric in the default list.

DRuggeri commented 3 years ago

OK, great - I'll add those later today and will plan for them to be in an upcoming release. Thanks for the pointer! Will also plan to add the status in the defaults

sshaikh commented 3 years ago

I feel like I'm spamming you now but I've created a new issue discussing why we might not want a "default metric list" in the first place: https://github.com/DRuggeri/nut_exporter/issues/4

sshaikh commented 3 years ago

Oh, and you should be aware that some UPSes are obnoxious enough to provide a composite status (e.g. OL TRIM) - I'm not sure whether the Go NUT library handles this gracefully or not, and I haven't seen it yet.

That said, I think it's one of those things to be aware of rather than preemptively fix.

sshaikh commented 3 years ago

So yes - I'm getting repeated failures as my UPS reports OL TRIM.

Not entirely sure what the solution would be here: either another status code, or a reduction to OL or TRIM.

sshaikh commented 3 years ago

So... it turns out that "status" isn't a status, but a set of flags (at least for the USB driver - I'd assume it's the case elsewhere too):

https://github.com/networkupstools/nut/blob/2b4a105038723da0f93859029b665f44e6dc860b/drivers/usbhid-ups.c#L182

And the NUT clients know this:

https://github.com/networkupstools/nut/blob/master/clients/upsmon.c

So, yeah. Not sure what the best approach is. Technically the Go NUT client should handle these as separate flags too, but it might just be easier to set the status based on the string somehow.

I'll send this upstream to see what they think.
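One way to treat the status as a set of flags is sketched below in Go. This is only an illustration under stated assumptions, not what the exporter or the Go NUT client does: `knownFlags` is an illustrative subset of status tokens, and the function name `statusFlags` and the metric name `ups_status` are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// knownFlags is an illustrative, incomplete subset of ups.status tokens.
// The authoritative list lives in the NUT sources linked above.
var knownFlags = []string{"OL", "OB", "LB", "CHRG", "TRIM", "OVER"}

// statusFlags is a hypothetical helper. It splits a composite ups.status
// value such as "OL TRIM" on whitespace and returns 1 for every flag that is
// set and 0 for every known flag that is not.
func statusFlags(status string) map[string]float64 {
	flags := make(map[string]float64, len(knownFlags))
	for _, f := range knownFlags {
		flags[f] = 0
	}
	for _, tok := range strings.Fields(status) {
		flags[tok] = 1 // tokens outside knownFlags are kept, not dropped
	}
	return flags
}

func main() {
	flags := statusFlags("OL TRIM")
	// Print one sample per known flag; the metric name is an assumption.
	for _, f := range knownFlags {
		fmt.Printf("ups_status{flag=%q} %v\n", f, flags[f])
	}
}
```

With one 0/1 sample per flag, a composite value like OL TRIM needs no extra status code, and a check for the absence of OL becomes a comparison against the OL sample alone.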

DRuggeri commented 3 years ago

Wow - thanks for the research. That's far more complicated than I had anticipated.

I've opened a separate issue (#5) to continue the discussion on this topic since the rabbit hole goes deeper than planned.