Open runningman84 opened 4 years ago
It does:
# TYPE mikrotik_health_temperature gauge
mikrotik_health_temperature{address="xx.yy.zz.10",name="xx-yy-zz-1"} 41
mikrotik_health_temperature{address="aa.bb.cc",name="aa-bb-cc-2"} 45
# HELP mikrotik_health_voltage Input voltage to the RouterOS board, in volts
# TYPE mikrotik_health_voltage gauge
mikrotik_health_voltage{address="xx.yy.zz.10",name="xx-yy-zz-1"} 23
mikrotik_health_voltage{address="aa.bb.cc",name="aa-bb-cc-2"} 26.6
However, those values are not present on hardware that does not support them, such as off-the-shelf x86 hardware running RouterOS. I think there's also a bug on some CRS devices with SNMP and temperature, so perhaps that is related.
Were you trying this with a particular piece of hardware?
I am using a hEX
Are you invoking with -with-health or including health: true in the features: section of your config file?
For reference this does work for me on the following devices: crs112 rb4011 rb962
I can't seem to get this to work. I have the following in my config file:
devices:
- name: router
address: xxx
user: xxx
password: xxx
log-level: debug
features:
BGP: true
DHCP: true
DHCPL: true
Health: true
Firmware: true
#routes: true
Optics: true
POE: true
Monitor: true
I also tried with lowercase keys. It just doesn't seem to be pulling this in; in particular, I can't see the health data. Not sure where I'm going wrong :( I'm using Docker.
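One quick way to check whether the health collector is emitting anything is to scan the exporter's metrics output for names starting with mikrotik_health_. The sketch below assumes the exposition text has already been fetched from the exporter; the function name health_metric_names and the some_other_metric line are made up for illustration.

```python
def health_metric_names(exposition_text, prefix="mikrotik_health_"):
    """Return the sorted metric names in the text that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


sample = """\
# TYPE mikrotik_health_temperature gauge
mikrotik_health_temperature{address="xx.yy.zz.10",name="xx-yy-zz-1"} 41
# TYPE mikrotik_health_voltage gauge
mikrotik_health_voltage{address="xx.yy.zz.10",name="xx-yy-zz-1"} 23
some_other_metric 1
"""

print(health_metric_names(sample))
# ['mikrotik_health_temperature', 'mikrotik_health_voltage']
```

An empty list while other metrics are present would suggest the health feature is not enabled or the device returns no health data.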
I have the CRS312-4C+8XG and I don't get any health exports with the latest commit d3285ba and using -with-health. I don't even get an error message on the console. Shouldn't this produce an error in case fetching the metrics fails?
Metrics available on the switch/router (at least via its SSH interface):
cpu-temperature
psu1-state
psu2-state
fan1-speed
fan2-speed
fan3-speed
fan4-speed
@pklaus what is the path in the OS that you are using to get those metrics? I only get :
> /system health print
voltage: 24.1V
temperature: 32C
This is on a RB3011UiAS-RM.
@nshttpd
On the CRS312-4C+8XG, the command /system health print reveals the following:
[admin@switch10g] > /system health print
cpu-temperature: 45C
psu1-state: ok
psu2-state: fail
fan1-speed: 3120RPM
fan2-speed: 3195RPM
fan3-speed: 3300RPM
fan4-speed: 3315RPM
With the latest commit 26d4264, I do get the CPU temperature now:
# HELP mikrotik_health_cpu_temperature Temperature of RouterOS CPU, in degrees Celsius
# TYPE mikrotik_health_cpu_temperature gauge
mikrotik_health_cpu_temperature{address="10.2.1.5",name="switch10g"} 46
@pklaus ok thanks. I'll take a look. Kind of annoying that the API doesn't just return a list for something like psu-state or fan-speed to get all of the data without having to specify each individual item.
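Because the CRS312 output uses numbered names (fan1-speed through fan4-speed, psu1-state, psu2-state), a fixed list of property names can't cover every device. As an illustration only, here is a sketch that parses the text form of /system health print shown above into name, value, and unit. It works on the console text, not on the API reply, and parse_health_print is a hypothetical helper, not part of the exporter.

```python
import re

# A leading number followed by an optional unit, e.g. "3120RPM" or "24.1V".
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)([A-Za-z]*)$")


def parse_health_print(text):
    """Parse `key: value` lines into {key: (value, unit)}.

    Numeric readings become floats; anything else (e.g. "ok") is kept
    as a string with an empty unit.
    """
    readings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the echoed prompt line, and lines without a colon.
        if not line or line.startswith("[") or ":" not in line:
            continue
        key, raw = (part.strip() for part in line.split(":", 1))
        match = _VALUE_RE.match(raw)
        if match:
            readings[key] = (float(match.group(1)), match.group(2))
        else:
            readings[key] = (raw, "")
    return readings


console_text = """\
[admin@switch10g] > /system health print
cpu-temperature: 45C
psu1-state: ok
psu2-state: fail
fan1-speed: 3120RPM
fan2-speed: 3195RPM
fan3-speed: 3300RPM
fan4-speed: 3315RPM
"""

readings = parse_health_print(console_text)
print(readings["cpu-temperature"])  # (45.0, 'C')
print(readings["psu2-state"])       # ('fail', '')
```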
Just chiming in: health metrics are 'broken' and don't return anything on RouterOS 7+.
[admin@MikroTik] > /system health print
Columns: NAME, VALUE, TYPE
# NAME VALU T
0 voltage 23.8 V
1 temperature 42 C
RB4011iGS+ running RouterOS 7.1beta5.
EDIT: exporter was built from source (6f53244308040a85a37798472a82a7d6d7a65821) at the time of reporting the issue.
I don't have a 7.0 device to test this on. I wonder if they changed something in the response.
Can you pull and build from the nshttpd/debug-health branch? Run that with -log-level debug and paste in the output. It may look something like:
{"level":"debug","msg":"!re @ [{`voltage` `24.1`} {`temperature` `34`}]\n!done @ []","time":"2021-05-05T23:05:55-04:00"}
Here's the requested output:
root@host:/# /usr/local/bin/mikrotik-exporter -device RB4011 -address 10.0.1.1 -user <redacted> -password <redacted> -with-health -log-level debug
{"level":"info","msg":"setting up collector for devices","numDevices":1,"time":"2021-05-06T08:47:03+02:00"}
{"level":"info","msg":"Listening on :9436","time":"2021-05-06T08:47:03+02:00"}
{"device":"RB4011","level":"debug","msg":"trying to Dial","time":"2021-05-06T08:47:10+02:00"}
{"device":"RB4011","level":"debug","msg":"done dialing","time":"2021-05-06T08:47:10+02:00"}
{"device":"RB4011","level":"debug","msg":"got client","time":"2021-05-06T08:47:10+02:00"}
{"device":"RB4011","level":"debug","msg":"trying to login","time":"2021-05-06T08:47:10+02:00"}
{"level":"debug","msg":"!re @ []\n!re @ []\n!done @ []","time":"2021-05-06T08:47:10+02:00"}
{"level":"debug","msg":"OK: RB4011 collector succeeded after 0.028125s.","time":"2021-05-06T08:47:10+02:00"}
Thanks for looking into this 🎉.
Interesting, so that device and os version isn't returning any data for the health call :
{"level":"debug","msg":"!re @ []\n!re @ []\n!done @ []","time":"2021-05-06T08:47:10+02:00"}
I just pushed to that same branch an update that removes the property list that's being requested. So if it returns anything it should do it now and not just the specific properties that are being requested. Can you grab and build it again and see what the output is? If it's still empty then it's something over on the OS side. Assuming that other metrics are returned and this is the only one that's empty.
Here it goes:
{"level":"debug","msg":"!re @ [{`.id` `*D`} {`name` `voltage`} {`value` `23.9`} {`type` `V`}]\n!re @ [{`.id` `*E`} {`name` `temperature`} {`value` `42`} {`type` `C`}]\n!done @ []","time":"2021-05-07T09:22:25+02:00"}
EDIT: you're right, all other metrics are returned and this is the only one that's empty. Only "broke" when I upgraded RouterOS to version 7+, worked perfectly fine before.
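One plausible reading of the two debug outputs above is that the requested property list no longer matches anything on RouterOS 7. The older reply is a single row keyed by sensor name, while the RouterOS 7 reply is one row per sensor with name, value, and type keys, so filtering for voltage and temperature leaves two empty rows. The sketch below only models that filtering in plain Python; apply_proplist is a made-up helper and the exact list the exporter requests is an assumption.

```python
def apply_proplist(rows, proplist=None):
    """Keep only the requested keys in each row; None keeps everything."""
    if proplist is None:
        return [dict(row) for row in rows]
    return [{k: v for k, v in row.items() if k in proplist} for row in rows]


# Older reply shape: one row whose keys are the sensor names.
v6_rows = [{"voltage": "24.1", "temperature": "34"}]

# RouterOS 7 reply shape, copied from the debug output above.
v7_rows = [
    {".id": "*D", "name": "voltage", "value": "23.9", "type": "V"},
    {".id": "*E", "name": "temperature", "value": "42", "type": "C"},
]

wanted = ["voltage", "temperature"]  # assumed property list

print(apply_proplist(v6_rows, wanted))  # [{'voltage': '24.1', 'temperature': '34'}]
print(apply_proplist(v7_rows, wanted))  # [{}, {}]
print(len(apply_proplist(v7_rows)))     # 2
```

The two empty rows in the middle case match the shape of the empty reply logged before the property list was removed.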
@barnumbirr thanks. I'm going to have to figure out what I have that might be able to run 7.0. Looks like they have changed a bunch of stuff.
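Supporting both OS generations would mean accepting either reply shape. A minimal sketch of that idea, assuming the reply has already been decoded into a list of dicts (normalize_health is a hypothetical name, and this is not how the exporter itself is written):

```python
def normalize_health(rows):
    """Return {sensor_name: (value, unit)} for either reply shape."""
    readings = {}
    for row in rows:
        if "name" in row and "value" in row:
            # RouterOS 7 shape: one row per sensor, unit carried in `type`.
            readings[row["name"]] = (row["value"], row.get("type", ""))
        else:
            # Older shape: sensor names are the keys and no unit is given.
            for key, value in row.items():
                if not key.startswith("."):
                    readings[key] = (value, "")
    return readings


old_reply = [{"voltage": "24.1", "temperature": "34"}]
new_reply = [
    {".id": "*D", "name": "voltage", "value": "23.9", "type": "V"},
    {".id": "*E", "name": "temperature", "value": "42", "type": "C"},
]

print(normalize_health(old_reply))
# {'voltage': ('24.1', ''), 'temperature': ('34', '')}
print(normalize_health(new_reply))
# {'voltage': ('23.9', 'V'), 'temperature': ('42', 'C')}
```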
I checked the output of branch nshttpd/debug-health for my device (CRS312-4C+8XG) running firmware version 7.1 beta6 (development):
{"level":"debug","msg":"!re @ [{`.id` `*11`} {`name` `cpu-temperature`} {`value` `38`} {`type` `C`}]\n!re @ [{`.id` `*34`} {`name` `phy-temperature`} {`value` `39`} {`type` `C`}]\n!re @ [{`.id` `*1B59`} {`name` `fan1-speed`} {`value` `3165`} {`type` `RPM`}]\n!re @ [{`.id` `*1B5A`} {`name` `fan2-speed`} {`value` `3225`} {`type` `RPM`}]\n!re @ [{`.id` `*1B5B`} {`name` `fan3-speed`} {`value` `3315`} {`type` `RPM`}]\n!re @ [{`.id` `*1B5C`} {`name` `fan4-speed`} {`value` `3270`} {`type` `RPM`}]\n!re @ [{`.id` `*1CE9`} {`name` `psu1-state`} {`value` `ok`} {`type` ``}]\n!re @ [{`.id` `*1CEA`} {`name` `psu2-state`} {`value` `fail`} {`type` ``}]\n!done @ []","time":"2021-05-26T18:38:28+02:00"}
I hope that it helps.
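For reference, rows like these could be turned into Prometheus text lines roughly as follows. This is a sketch under assumptions: only mikrotik_health_cpu_temperature is a metric name seen earlier in this thread, the other names are simply derived from the sensor name, and encoding a PSU state as 1 for ok and 0 otherwise is one possible choice.

```python
def to_exposition(rows, labels):
    """Render RouterOS 7 style health rows as Prometheus text lines."""
    label_text = ",".join(f'{k}="{v}"' for k, v in labels.items())
    lines = []
    for row in rows:
        # Assumption: metric name is derived directly from the sensor name.
        name = "mikrotik_health_" + row["name"].replace("-", "_")
        raw = row["value"]
        if row["name"].endswith("-state"):
            # Assumption: 1 means "ok", 0 means any other state.
            value = 1.0 if raw == "ok" else 0.0
        else:
            try:
                value = float(raw)
            except ValueError:
                continue  # Skip readings that are not numeric.
        # Print whole numbers without a trailing ".0".
        text = str(int(value)) if value.is_integer() else str(value)
        lines.append(f"{name}{{{label_text}}} {text}")
    return lines


rows = [
    {".id": "*11", "name": "cpu-temperature", "value": "38", "type": "C"},
    {".id": "*1B59", "name": "fan1-speed", "value": "3165", "type": "RPM"},
    {".id": "*1CE9", "name": "psu1-state", "value": "ok", "type": ""},
    {".id": "*1CEA", "name": "psu2-state", "value": "fail", "type": ""},
]

for line in to_exposition(rows, {"address": "10.2.1.5", "name": "switch10g"}):
    print(line)
```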
@pklaus and @barnumbirr: it looks like MikroTik implemented a REST API in v7.1beta4. Yay!
I'll try to carve out some time in the next couple of weeks to figure out a way to integrate the v7 stuff into the exporter.
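A rough idea of what reading health data over REST might look like, using only the Python standard library. The /rest/system/health path, HTTP basic auth, and a JSON array with the same name, value, and type fields as the API reply are all assumptions here and should be checked against MikroTik's REST documentation. The sketch builds the request and parses a sample body but does not send anything.

```python
import base64
import json
import urllib.request


def build_health_request(host, user, password):
    """Build a GET request for the (assumed) REST health endpoint."""
    url = f"https://{host}/rest/system/health"  # assumed path
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def parse_health_json(body):
    """Turn a JSON array of {name, value, type} objects into a dict."""
    return {
        item["name"]: (item["value"], item.get("type", ""))
        for item in json.loads(body)
    }


# Sample body shaped like the RouterOS 7 API reply earlier in this thread.
sample_body = (
    '[{".id": "*D", "name": "voltage", "value": "23.9", "type": "V"},'
    ' {".id": "*E", "name": "temperature", "value": "42", "type": "C"}]'
)

request = build_health_request("10.0.1.1", "u", "p")
print(request.full_url)  # https://10.0.1.1/rest/system/health
print(parse_health_json(sample_body))
# {'voltage': ('23.9', 'V'), 'temperature': ('42', 'C')}
```

Sending the request would be a call to urllib.request.urlopen(request), with whatever TLS settings the router's certificate requires.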
@nshttpd any update on this issue? Anything we can do to help?
Who has a better Grafana dashboard? Can anyone share a better, more up-to-date Grafana dashboard than the three-year-old ones?
Is anyone willing to share a more up-to-date Grafana dashboard? I'd appreciate it, thanks.
I saw that a health collector fix for RouterOS 7 was implemented in e1b06c6ebe6e71a5661326b3a33afe2fd741283d and have been testing it for a couple of days now. It works a treat on v7.1.4.
I'd say this issue can probably be marked as resolved.
It would be great to get the system temperature and voltage.