louislam / uptime-kuma

A fancy self-hosted monitoring tool
https://uptime.kuma.pet
MIT License
55.1k stars 4.95k forks

SNMP Value Retrieval #1675

Closed wglenn01 closed 1 month ago

wglenn01 commented 2 years ago

⚠️ Please verify that this feature request has NOT been suggested before.

🏷️ Feature Request Type

New Monitor

🔖 Feature description

I would love to be able to enter custom SNMP OIDs per device, monitor the value of each OID with every update, and display the values in the table. This would be SO helpful.

✔️ Solution

Add SNMP monitoring for custom OIDs per device.

❓ Alternatives

No response

📝 Additional Context

No response

mattv8 commented 4 months ago

Hey @owlysk I've started working on implementing this feature. It's nowhere near complete, but I'm dropping my work thus far into PR #4717. I've laid the framework for your control_value logic. I'll continue to work on this next week (I'm assuming I won't have time over the weekend but ya never know).

mattv8 commented 4 months ago

@owlysk I'm wondering if you'd be able to take a look. I'm basically stuck with a strange timeout issue. See my PR, specifically, #issuecomment-2083823715

camdenyoung commented 4 months ago

Apologies @owlysk, I've been away the last week. I'm back in the office now if you still need me to send through an snmpwalk?

camdenyoung commented 4 months ago

@owlysk @mattv8 below is the result for RSSI (signal strength) from a Siklu EH-710 microwave link as an example

snmpget -v 2c -c public 10.27.2.82 .1.3.6.1.4.1.31926.2.1.1.19.1
SNMPv2-SMI::enterprises.31926.2.1.1.19.1 = INTEGER: -26

Setting thresholds is where it matters with sensors such as these. A link that is powered up, but has no wireless connection will return a value of -128 (dBm).

A connection anywhere between -20 dBm and -55 dBm is generally a healthy link, but what counts as healthy depends on the physical distance the link spans. So, for the example unit above, which is reporting an RSSI of -26, I might set a warning threshold at -30 and a critical threshold if the value equals or passes -40.
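For illustration, the thresholds described above could be expressed roughly like this. It is only a sketch: the function name `classifyRssi`, the state names, and the choice to treat a reading equal to the warning threshold as a warning are assumptions, and the default thresholds are just the example values from this comment.

```javascript
// Hypothetical helper: names and defaults are illustrative only.
// -128 is the value the link reports when it is powered up but has no wireless connection.
const NO_LINK_RSSI = -128;

function classifyRssi(rssi, { warning = -30, critical = -40 } = {}) {
    if (rssi === NO_LINK_RSSI) {
        return "down";
    }
    // RSSI is negative, so a "worse" signal is a lower (more negative) number.
    if (rssi <= critical) {
        return "critical";
    }
    if (rssi <= warning) {
        return "warning";
    }
    return "ok";
}

console.log(classifyRssi(-26));  // ok
console.log(classifyRssi(-35));  // warning
console.log(classifyRssi(-40));  // critical
console.log(classifyRssi(-128)); // down
```

The thresholds are passed in rather than hard-coded because, as noted, a healthy value depends on the distance each link covers.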

mattv8 commented 4 months ago

@camdenyoung I'm essentially implementing @owlysk's suggested logic for checking SNMP:

> - check integer >=
> - check integer =<
> - check integer <=
> - check integer ==
> - check integer & string contains
> - check if string is in OID value

However, one potential issue I foresee is that OIDs will often report values in binary, or as an OctetString (more on that here). So we need to have logic in place to handle those situations.
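As a rough sketch of what that logic might look like (this is not the code from the PR), the checks could be reduced to one function that first normalizes the value. The names `normalizeValue` and `checkValue` and the operator strings are hypothetical, and the sketch assumes OctetString values arrive as a Node.js `Buffer`.

```javascript
// Hypothetical sketch: function names and operator strings are illustrative only.

// Assumption: OctetString values arrive as a Buffer, so decode them to text first.
function normalizeValue(value) {
    return Buffer.isBuffer(value) ? value.toString("utf8") : value;
}

function checkValue(rawValue, operator, expected) {
    const value = normalizeValue(rawValue);
    switch (operator) {
        case ">=":
            return Number(value) >= Number(expected);
        case "<=":
            return Number(value) <= Number(expected);
        case "==":
            // Compare as strings so integers and text can share one operator.
            return String(value) === String(expected);
        case "contains":
            return String(value).includes(String(expected));
        default:
            throw new Error(`Unsupported operator: ${operator}`);
    }
}

console.log(checkValue(-26, ">=", -30));                            // true
console.log(checkValue(Buffer.from("Link Up"), "contains", "Up")); // true
```

Non-numeric text compared with `>=` or `<=` becomes `NaN` and therefore fails the check, which seems like a reasonable default for a monitor.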

In my PR I was originally going to use snmp-native, but I am switching to net-snmp because it supports SNMP walking. While working on this feature, I wrote an SNMP walking script, which I used to try to parse my Cisco SG-300 switch. I let the walk run for 30 minutes or so and it reported back 211,468 unique OIDs parsable in Cisco's default RO library... So I think implementing an SNMP walk for this feature is impractical. Which is frustrating because unless you already know the OID you want to monitor, there isn't some central directory to easily look up, for example, CPU usage on my Cisco switch...

Posting my SNMP walking script for the record:

```javascript
const fs = require('fs');
const snmp = require('net-snmp');

// Configuration
const monitor = {
    ipAddress: 'changeme',
    port: 161,
    communityString: 'public',
    snmpVersion: '2c',
    rootOid: '1.3.6.1', // This is pretty high level
};

// SNMP options
const options = {
    port: monitor.port,
    retries: 1,
    timeout: 5000,
    version: snmp.Version2c,
};

// Create SNMP session
const session = snmp.createSession(monitor.ipAddress, monitor.communityString, options);

// Create writable stream for CSV
const csvStream = fs.createWriteStream('snmp_data.csv');

// Function to handle SNMP walk feed callback
function feedCb(varbinds) {
    for (const varbind of varbinds) {
        if (!snmp.isVarbindError(varbind) && varbind.value !== null && varbind.value !== undefined) {
            console.log(`OID: ${varbind.oid}, Type: ${findType(varbind.type)}, Value: ${varbind.value}`);
            const csvRow = `${varbind.oid},${findType(varbind.type)},${varbind.value}\n`;
            csvStream.write(csvRow);
        }
    }
}

// Function to handle SNMP walk done callback
function doneCb(error) {
    if (error) {
        console.error(error.toString());
    } else {
        console.log('SNMP walk completed.');
    }
    // Close the session and flush the CSV file so the process can exit
    session.close();
    csvStream.end();
}

// Start SNMP walk from the specified OID (20 = max repetitions per request)
session.walk(monitor.rootOid, 20, feedCb, doneCb);

// Function to find the corresponding type name for a given value
function findType(value) {
    const typeName = Object.entries(snmp.ObjectType).find(([key, val]) => val === value);
    return typeName ? typeName[0] : 'Unknown';
}
```

camdenyoung commented 4 months ago

As you've just discovered, I wouldn't worry about implementing any OID scanning functionality. There are thousands of tools out there that the user can utilise to find the right OID for the sensor / status to be monitored. Leave it to the user to know what OID to enter; then the only functionality required from the Uptime Kuma perspective is the logic behind the alert. Is the value true or false, or is the threshold met or exceeded, etc.

On a side note though, it might be worth having a look at the LibreNMS GitHub repository. LibreNMS is a much higher-level (and as a result, significantly more complicated) monitoring solution, but it might help explain how they have implemented the scanning functionality as well as the monitoring side.

mattv8 commented 4 months ago

Thanks for the LibreNMS tip, I'll take a look. I'll also continue working on the PR tomorrow morning, but I was able to get basic functionality working at least, so I can sleep tonight...

mattv8 commented 4 months ago

Alright I finished this monitor and have requested review of the PR. For now, I'm not going to implement any sort of OID discovery-- it's just too complicated. I'm going to assume people know the OID they want to monitor already and can paste it in.

If anyone wants to check out this feature, you can clone my PR #4717 and test for yourself. It's working well so far in my dev environment. The PR can be tested via:

docker run --rm -it -p 3000:3000 -p 3001:3001 --pull always -e 'UPTIME_KUMA_GH_REPO=mattv8:snmp-monitor' louislam/uptime-kuma:pr-test2

owlysk commented 4 months ago

@camdenyoung Thank you for the snmpwalk result. @mattv8 was quicker with the implementation than me :)

@mattv8 I'm testing your PR, but I always get the error "Request timed out". I'm using SNMPv1.

mattv8 commented 4 months ago

Hey @owlysk I got lucky and had some spare time. 🙂

I'm currently working through the error you're seeing. There is an issue with how my code is parsing strings: SNMP Error: toString() radix argument must be between 2 and 36. Working on this now, should have a resolution in an hour or so.
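For anyone curious about that message: one common way to hit it in JavaScript (not necessarily what happened in the PR) is calling `toString()` with an encoding on a value that turns out to be a number rather than a `Buffer`. A number treats the argument as a radix. The helper name `valueToString` below is hypothetical.

```javascript
// Assumption: this shows one plausible cause of the error, not the confirmed bug in the PR.
function valueToString(value) {
    // Only Buffers accept an encoding argument; for numbers the argument
    // to toString() is interpreted as a radix, so it must not be passed.
    return Buffer.isBuffer(value) ? value.toString("utf8") : String(value);
}

try {
    // An INTEGER varbind value is a plain number, so "utf8" is read as a radix.
    (-26).toString("utf8");
} catch (error) {
    console.log(error instanceof RangeError); // true
}

console.log(valueToString(-26));                 // -26
console.log(valueToString(Buffer.from("eth0"))); // eth0
```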

mattv8 commented 4 months ago

I found the issue and will submit my fix before the end of the day.

mattv8 commented 4 months ago

@owlysk I have committed my code fixing the issue you were encountering. Please re-pull the updated docker image with my latest commits.

If you're still seeing Request timed out it is likely that you need to turn on the SNMP service on your SNMP enabled device. Verify you have configured a community string-- this is essentially the password for your SNMP queries.

FWIW I'm testing on my Cisco SG-300 switch. I had to go into the device's web interface and turn on the SNMP server, and configure a community string as well as a "view" which turns on a specific set of OIDs.

iamk3 commented 1 month ago

@mattv8 Where can I pull the container to do some testing myself?

mattv8 commented 1 month ago

Rather than pull it into a Docker container, it's IMO easiest to clone the PR to a local directory and spin up the server from there. Here are the commands to do that:

git clone https://github.com/louislam/uptime-kuma.git # Assuming you have git installed
cd uptime-kuma # Change into the directory you just cloned
git fetch origin pull/4717/head:pr-4717 && git checkout pr-4717 # Check out this PR
npm ci # Assuming you have node and npm installed
npm run dev # This will start the server on http://localhost:3000