Closed jangrewe closed 4 years ago
@mdlayher Any chance you could have a look at this PR? I'm sorry that it got a bit out of hand, but i think it's worth it! 😄
Thank you for the PR and I appreciate the effort you've put in, but this PR has a lot of unrelated changes that make it difficult to review quickly.
My free time is rather limited this week. I'll try to get to this early next week.
Just a note to say that I have tested from the @jangrewe repo, and it works for me:
apcupsd_apcupsd_nominal_power_watts 390
apcupsd_battery_charge_percent 100
apcupsd_battery_cumulative_time_on_seconds_total 0
apcupsd_battery_nominal_volts 12
apcupsd_battery_number_transfers_total 0
apcupsd_battery_time_left_seconds 4602
apcupsd_battery_time_on_seconds 0
apcupsd_battery_volts 13.4
apcupsd_line_nominal_volts 230
apcupsd_line_volts 240
apcupsd_ups_info{hostname="nuc2",model="Back-UPS XS 700U",ups_name="bx700"} 1
apcupsd_ups_load_percent 3
apcupsd_ups_status{status="BOOST"} 0
apcupsd_ups_status{status="CAL"} 0
apcupsd_ups_status{status="COMMLOST"} 0
apcupsd_ups_status{status="LOWBATT"} 0
apcupsd_ups_status{status="NOBATT"} 0
apcupsd_ups_status{status="ONBATT"} 0
apcupsd_ups_status{status="ONLINE"} 1
apcupsd_ups_status{status="OVERLOAD"} 0
apcupsd_ups_status{status="REPLACEBATT"} 0
apcupsd_ups_status{status="SHUTTING DOWN"} 0
apcupsd_ups_status{status="SLAVE"} 0
apcupsd_ups_status{status="SLAVEDOWN"} 0
apcupsd_ups_status{status="TRIM"} 0
I am ambivalent, however, about the loss of metadata from the individual metrics.
Including metadata in every metric is how the official snmp_exporter works. Example:
# HELP ifHCInOctets The total number of octets received on the interface, including framing characters - 1.3.6.1.2.1.31.1.1.1.
6
# TYPE ifHCInOctets counter
ifHCInOctets{ifAlias="",ifDescr="bridge",ifIndex="14",ifName="bridge"} 6.343343949e+09
ifHCInOctets{ifAlias="",ifDescr="ether1",ifIndex="16",ifName="ether1"} 2.4808641625e+10
ifHCInOctets{ifAlias="",ifDescr="ether2",ifIndex="2",ifName="ether2"} 0
ifHCInOctets{ifAlias="",ifDescr="ether3",ifIndex="3",ifName="ether3"} 6.083273145e+09
ifHCInOctets{ifAlias="",ifDescr="ether4",ifIndex="4",ifName="ether4"} 6.79593826e+08
ifHCInOctets{ifAlias="",ifDescr="ether5",ifIndex="5",ifName="ether5"} 2.91243382e+08
ifHCInOctets{ifAlias="",ifDescr="pppoe-out1",ifIndex="17",ifName="pppoe-out1"} 2.4255138238e+10
...
# HELP ifHCOutOctets The total number of octets transmitted out of the interface, including framing characters - 1.3.6.1.2.1.31.1.1.1.10
# TYPE ifHCOutOctets counter
ifHCOutOctets{ifAlias="",ifDescr="bridge",ifIndex="14",ifName="bridge"} 2.5131203193e+10
ifHCOutOctets{ifAlias="",ifDescr="ether1",ifIndex="16",ifName="ether1"} 5.096503509e+09
ifHCOutOctets{ifAlias="",ifDescr="ether2",ifIndex="2",ifName="ether2"} 0
ifHCOutOctets{ifAlias="",ifDescr="ether3",ifIndex="3",ifName="ether3"} 1.0343428154e+10
ifHCOutOctets{ifAlias="",ifDescr="ether4",ifIndex="4",ifName="ether4"} 1.4271999034e+10
ifHCOutOctets{ifAlias="",ifDescr="ether5",ifIndex="5",ifName="ether5"} 4.11163912e+09
ifHCOutOctets{ifAlias="",ifDescr="pppoe-out1",ifIndex="17",ifName="pppoe-out1"} 4.764235347e+09
This in turn makes it very easy to use in grafana dashboards, with all of the graph lines labelled with the available metadata, and easy to filter on any particular piece of metadata.
If they changed it to work the way proposed here, then it would look more like this:
ifIndex{ifAlias="",ifDescr="bridge",ifIndex="14",ifName="bridge"} 1
ifIndex{ifAlias="",ifDescr="ether1",ifIndex="16",ifName="ether1"} 1
ifIndex{ifAlias="",ifDescr="ether2",ifIndex="2",ifName="ether2"} 1
ifIndex{ifAlias="",ifDescr="ether3",ifIndex="3",ifName="ether3"} 1
ifIndex{ifAlias="",ifDescr="ether4",ifIndex="4",ifName="ether4"} 1
ifIndex{ifAlias="",ifDescr="ether5",ifIndex="5",ifName="ether5"} 1
ifIndex{ifAlias="",ifDescr="pppoe-out1",ifIndex="17",ifName="pppoe-out1"} 1
...
ifHCInOctets{ifIndex="14"} 6.343343949e+09
ifHCInOctets{ifIndex="16"} 2.4808641625e+10
ifHCInOctets{ifIndex="2"} 0
ifHCInOctets{ifIndex="3"} 6.083273145e+09
ifHCInOctets{ifIndex="4"} 6.79593826e+08
ifHCInOctets{ifIndex="5"} 2.91243382e+08
ifHCInOctets{ifIndex="17"} 2.4255138238e+10
...
ifHCOutOctets{ifIndex="14"} 2.5131203193e+10
ifHCOutOctets{ifIndex="16"} 5.096503509e+09
ifHCOutOctets{ifIndex="2"} 0
ifHCOutOctets{ifIndex="3"} 1.0343428154e+10
ifHCOutOctets{ifIndex="4"} 1.4271999034e+10
ifHCOutOctets{ifIndex="5"} 4.11163912e+09
ifHCOutOctets{ifIndex="17",ifName="pppoe-out1"} 4.764235347e+09
This has some advantages with regards to continuity of timeseries if metadata changes, but seems to be a lot harder to work with in grafana.
Let's consider the simpler case of apcupsd with this proposed patch applied. Suppose you have multiple UPSes, and are showing all the "time remaining" in a single graph with a separate line for each UPS. Once prometheus has added its own scraping labels, you would get metrics like this:
apcupsd_ups_info{job="apcupsd",instance="192.0.2.1",hostname="nuc1",model="Back-UPS XS 700U",ups_name="bx700-1"} 1
apcupsd_battery_time_left_seconds{job="apcupsd",instance="192.0.2.1",status="ONLINE"} 4062
...
apcupsd_ups_info{job="apcupsd",instance="192.0.2.2",hostname="nuc2",model="Back-UPS XS 700U",ups_name="bx700-2"} 1
apcupsd_battery_time_left_seconds{job="apcupsd",instance="192.0.2.2",status="ONLINE"} 3950
If you want your graphs labelled with the UPS name or hostname, you need to invoke some prometheus join magic. According to this article I think that adding the ups_name
to the apcupsd_battery_time_left_seconds
metric might look something like this:
apcupsd_battery_time_left_seconds * on (instance, job) group_left(ups_name) apcupsd_ups_info
I don't pretend to understand this, and it may be wrong. In any case it would have to be copy-pasted into every grafana query in the dashboard.
Similarly: suppose you want a drop-down menu to select between UPSes. If you don't mind just showing the 'instance' value this is straightforward:
label_values(apcupsd_ups_info, instance)
apcupsd_battery_time_left_seconds{instance="$instance"}
But if you want the dropdown to select on 'ups_name', it gets tricky.
label_values(apcupsd_ups_info, ups_name)
apcupsd_battery_time_left_seconds * on (instance, job) group_left(ups_name) apcupsd_ups_info{ups_name="$ups_name"}
???I am ambivalent, however, about the loss of metadata from the individual metrics.
Including metadata in every metric is how the official snmp_exporter works.
For reference, discussion about it at https://github.com/prometheus/snmp_exporter/issues/180 and https://github.com/prometheus/snmp_exporter/pull/339
I ended up fixing things up, thanks for your time.
Can this be revisited? It would be very useful to have status info in the metrics, or as a label in apcupsd_info
. I made an issue pertaining to it:
https://github.com/mdlayher/apcupsd_exporter/issues/19
FWIW, I changed to https://github.com/damomurf/apcupsd-exporter a while ago.
Fair enough, but I need both status metrics and the metrics that this exporter provides... Both exporters use the same command line tool so it seems a bit silly to be installing two exporters for the same tool 😅
This adds multiple
apcupsd_ups_status
metrics, each containing the labelstatus
with all possible values theSTATUS:
line can contain. (Label values taken from https://www.systutorials.com/docs/linux/man/8-apcaccess/)This also add the
apcupsd_ups_info
metric and moves all the redundant info labels from the other metrics to this one (as suggested by you in #6).And lastly this PR adds an example Grafana dashboard and the respective screenhot.
Added bonus: a Dockerfile (image size: 8MB), with the image available at https://hub.docker.com/r/jangrewe/apcupsd-exporter