librenms / librenms-agent

LibreNMS Agent & Scripts
GNU General Public License v2.0

32bit restriction with application wireguard? #445

Open efelon opened 1 year ago

efelon commented 1 year ago

The problem

I have a long-running WireGuard instance on a (still 32-bit) Raspberry Pi 4 running Debian buster: Linux 5.10.103-v7l+ #1529 SMP Tue Mar 8 12:24:00 GMT 2022 armv7l GNU/Linux. LibreNMS is running on 32-bit Debian buster as well.

Some of the peers have high values for the send/receive transfer counters, e.g.:

peer: ***  mobile_mw
  preshared key: (hidden)
  endpoint: *:*
  allowed ips: *
  latest handshake: 1 minute, 56 seconds ago
  transfer: 761.16 MiB received, 3.83 GiB sent

Although the absolute byte values output by wg show all dump are not at the limit:

4113015344 (value by wg)
4294967296 (2^32)

the graphs in the web UI show NaN for these values. As a test I modified wireguard.py as follows, which "fixes" the display problem in the web UI:

[...]
        bytes_rcvd = long(line_parsed[6]) / 1000
        bytes_sent = long(line_parsed[7]) / 1000
[...]
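
For context, the counter quoted above can be compared against the usual 32-bit limits. This is only a sketch of the arithmetic: it shows that the value is below 2^32 but above the signed 32-bit maximum, and that dividing by 1000 brings it back into signed range. Which component (if any) applies a signed limit is an assumption at this point in the thread, and `fits_signed_32` is a hypothetical helper name.

```python
# Compare the counter reported by `wg show all dump` with common 32-bit limits.
WG_VALUE = 4113015344      # value quoted in the report above
UINT32_LIMIT = 2**32       # 4294967296, size of the unsigned 32-bit range
INT32_MAX = 2**31 - 1      # 2147483647, largest signed 32-bit integer


def fits_signed_32(value):
    """Return True if value fits in a signed 32-bit integer."""
    return -2**31 <= value <= INT32_MAX


print(WG_VALUE < UINT32_LIMIT)            # True: below the unsigned limit
print(fits_signed_32(WG_VALUE))           # False: above the signed limit
print(fits_signed_32(WG_VALUE // 1000))   # True: the /1000 workaround fits again
```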

I'm pretty sure the 32-bit OS/PHP/Python is the problem somehow, but I wanted to report this behavior in case there is a solution other than "32-bit is not supported any more".

Opened here by request from librenms/librenms#14688

Output of ./validate.php

===========================================
Component | Version
--------- | -------
LibreNMS  | 22.11.0 (2022-11-24T07:01:26+01:00)
DB Schema | 2022_08_15_084507_add_rrd_type_to_wireless_sensors_table (248)
PHP       | 8.1.12
Python    | 3.7.3
Database  | MariaDB 10.3.36-MariaDB-0+deb10u2
RRDTool   | 1.7.1
SNMP      | 5.7.3
===========================================

[OK]    Composer Version: 2.4.4
[OK]    Dependencies up-to-date.
[OK]    Database connection successful
[OK]    Database Schema is current
[OK]    SQL Server meets minimum requirements
[OK]    lower_case_table_names is enabled
[OK]    MySQL engine is optimal
[OK]
[OK]    Database schema correct
[OK]    MySQl and PHP time match
[OK]    Active pollers found
[OK]    Dispatcher Service not detected
[OK]    Locks are functional
[OK]    Python poller wrapper is polling
[WARN]  Using database for locking, you should set CACHE_DRIVER=redis
[OK]    rrd_dir is writable
[OK]    rrdtool version ok

What was the last working version of LibreNMS?

No response

Anything in the logs that might be useful for us?

No response

bnerickson commented 1 year ago

@efelon As a test, would you remove the "long" casting and, assuming you're running python3 on your Pi, would you print the output of python3 -c "import sys; print(sys.maxsize)" here? If you're running python2, I believe the command is python -c "import sys; print(sys.maxint)"

efelon commented 1 year ago

Hello @bnerickson, my default is python2 for this installation, but I have python3 installed as well:

# python3 -c "import sys; print(sys.maxsize)"
2147483647
# python2 -c "import sys; print(sys.maxint)"
2147483647
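
A side note on these numbers: in Python 3, sys.maxsize reflects the platform's index size, not a cap on integer values, since int has arbitrary precision (Python 2 similarly promotes oversized int results to long). The short check below, using the counter from the original report, is a sketch of that behavior and not part of wireguard.py.

```python
import sys

raw = "4113015344"     # counter from the original report, as text
value = int(raw)       # Python 3 ints grow as needed

# On a 32-bit build sys.maxsize is 2147483647, yet the conversion still succeeds.
print(sys.maxsize)
print(value)
print(value > 2**31 - 1)
```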

One of the devices has these values now:

{"mobile_mw": {"minutes_since_last_handshake": 1, "bytes_rcvd": 999921456, "bytes_sent": 5157117812}

wg dump:

wg0 **= **= *.*.*.*:39881   10.5.0.2/32,fd23:42::2/128  1670059337  1001515004  5203107064  off

looking like this (with the original wireguard.py): [image: graph]
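
For readers following along, here is a minimal sketch of how the dump line above maps to the counters. The field positions 6 and 7 come from the wireguard.py snippet quoted earlier in this issue; `parse_peer_line` is a hypothetical helper, not the script's actual code, and position 5 is assumed to be the latest-handshake timestamp.

```python
# Sample peer line from `wg show all dump` as quoted above (keys redacted).
# Real output is tab-separated; split() with no argument handles tabs and spaces.
SAMPLE = (
    "wg0 **= **= *.*.*.*:39881 10.5.0.2/32,fd23:42::2/128 "
    "1670059337 1001515004 5203107064 off"
)


def parse_peer_line(line):
    """Split one peer line and return its transfer counters as integers."""
    fields = line.split()
    return {
        "interface": fields[0],
        "latest_handshake": int(fields[5]),  # assumed: Unix timestamp
        "bytes_rcvd": int(fields[6]),
        "bytes_sent": int(fields[7]),
    }


peer = parse_peer_line(SAMPLE)
print(peer["bytes_rcvd"], peer["bytes_sent"])
```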

bnerickson commented 1 year ago

@efelon thanks. Would you paste the relevant output when running the /etc/snmp/wireguard.py command under two conditions?:

  1. With bytes_sent/bytes_rcvd cast to the long type without division:
    [...]
        bytes_rcvd = long(line_parsed[6])
        bytes_sent = long(line_parsed[7])
    [...]
  2. With bytes_sent/bytes_rcvd original:
    [...]
        bytes_rcvd = int(line_parsed[6])
        bytes_sent = int(line_parsed[7])
    [...]

efelon commented 1 year ago

@bnerickson, of course. But the output is the same:

# int()
"mobile_mw": {"minutes_since_last_handshake": 0, "bytes_sent": 5790460872, "bytes_rcvd": 1030401704},
# long()
"mobile_mw": {"minutes_since_last_handshake": 0, "bytes_sent": 5790460904, "bytes_rcvd": 1030401704},

bnerickson commented 1 year ago

It might be unrelated since the max value I supplied is larger than yours, but I screwed up by setting a maximum for the bytes received/sent here: https://github.com/librenms/librenms/blob/49abee372268d2d49448f9557e00b6cb8a54521e/includes/polling/applications/wireguard.inc.php#L24

Can you change those two lines on your LibreNMS install to the following and report if you start seeing data on the graph without the divide by 1000 and long casting?

    ->addDataset('bytes_rcvd', 'DERIVE', 0)
    ->addDataset('bytes_sent', 'DERIVE', 0)

You might have to delete the RRD and re-poll for the changes to take effect. In either case, I need to submit a PR with those changes.
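
As background for this change: an RRD DERIVE data source stores the rate of change between two readings, and rrdtool records a rate that falls outside the data source's min/max bounds as unknown. The sketch below is a simplified Python 3 model of that rule, not rrdtool's implementation. `derive_rate`, its parameters, the 300-second interval, and the low maximum are all hypothetical and chosen for illustration only.

```python
import math


def derive_rate(prev, cur, seconds, ds_min=0, ds_max=None):
    """Simplified model of an RRD DERIVE update with min/max bounds.

    Returns the per-second rate, or NaN when the rate is out of range.
    """
    rate = (cur - prev) / seconds
    if ds_min is not None and rate < ds_min:
        return math.nan
    if ds_max is not None and rate > ds_max:
        return math.nan
    return rate


# Two counter values quoted in this thread, assumed here to be 300 s apart.
print(derive_rate(5157117812, 5203107064, 300))              # no upper bound
print(derive_rate(5157117812, 5203107064, 300, ds_max=100))  # low max gives nan
```

With no maximum, only a negative rate (for example after a counter reset) is discarded.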

efelon commented 1 year ago

The high values still don't show up. I deleted the rrd files as requested and waited several polls.

bnerickson commented 1 year ago

Thanks. On your LibreNMS server (I assume it is separate from your RaspPi), what is the wireguard-specific output when you run the following?:

snmpwalk -v2c -c <snmp_community> <rasppi_ip_address> NET-SNMP-EXTEND-MIB::nsExtendOutput2Table | grep wireguard

Replace <snmp_community> and <rasppi_ip_address> with your SNMP community and RaspPi IP address or hostname, respectively.

efelon commented 1 year ago

I use only v3 so I changed the snmpwalk command accordingly, but the output is most likely not what you expect:

[...]
Did not find 'nsExtensions' in module NET-SNMP-AGENT-MIB (/usr/share/snmp/mibs/NET-SNMP-EXTEND-MIB.txt)
Did not find 'DisplayString' in module #-1 (/usr/share/snmp/mibs/NET-SNMP-EXTEND-MIB.txt)
Did not find 'RowStatus' in module #-1 (/usr/share/snmp/mibs/NET-SNMP-EXTEND-MIB.txt)
Did not find 'StorageType' in module #-1 (/usr/share/snmp/mibs/NET-SNMP-EXTEND-MIB.txt)
Unlinked OID in NET-SNMP-EXTEND-MIB: nsExtendGroups ::= { nsExtensions 3 }
Undefined identifier: nsExtensions near line 39 of /usr/share/snmp/mibs/NET-SNMP-EXTEND-MIB.txt
Unlinked OID in NET-SNMP-EXTEND-MIB: nsExtendObjects ::= { nsExtensions 2 }
Undefined identifier: nsExtensions near line 38 of /usr/share/snmp/mibs/NET-SNMP-EXTEND-MIB.txt
Unlinked OID in NET-SNMP-EXTEND-MIB: netSnmpExtendMIB ::= { nsExtensions 1 }
Undefined identifier: nsExtensions near line 19 of /usr/share/snmp/mibs/NET-SNMP-EXTEND-MIB.txt
NET-SNMP-EXTEND-MIB::nsExtendOutput2Table: Unknown Object Identifier

bnerickson commented 1 year ago

Ah, yes. Let's try running the command with the OID:

snmpwalk -v2c -c <snmp_community> <rasppi_ip_address> .1.3.6.1.4.1.8072.1.3.2.4 | grep wireguard

efelon commented 1 year ago

I grepped for wg0, as there is no "wireguard" in the output, and skipped the other clients:

./walk.sh .1.3.6.1.4.1.8072.1.3.2.4 | grep wg0
iso.3.6.1.4.1.8072.1.3.2.4.1.2.9.119.105.114.101.103.117.97.114.100.1 = STRING: "{\"errorString\": \"\", \"error\": 0, \"version\": 1, \"data\": {\"wg0\": {\"mobile_mw\": {\"minutes_since_last_handshake\": 0, \"bytes_rcvd\": 1055315724, \"bytes_sent\": 6103999272}, [...]
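
A note on why grepping for "wireguard" finds nothing here: without the MIB loaded, snmpwalk prints the table index numerically, and the extend name appears as a length followed by ASCII codes. The sketch below decodes the index portion of the OID above; `decode_extend_index` is a hypothetical helper written for this illustration.

```python
def decode_extend_index(oid_suffix):
    """Decode a length-prefixed string index from dotted decimal form."""
    parts = [int(p) for p in oid_suffix.split(".")]
    length, codes = parts[0], parts[1:]
    return "".join(chr(c) for c in codes[:length])


# Index portion of the OID shown above, following ...8072.1.3.2.4.1.2
print(decode_extend_index("9.119.105.114.101.103.117.97.114.100"))  # wireguard
```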

bnerickson commented 1 year ago

Thanks. Is your LibreNMS installation on a 32-bit or 64-bit kernel?

bnerickson commented 1 year ago

Hm. I tried to reproduce your scenario by creating a dummy sample_guest with 1055315724 bytes_rcvd and 6103999272 bytes_sent, but LibreNMS graphed that successfully:

[image: wg0 graph]

efelon commented 1 year ago

> Thanks. Is your LibreNMS installation on a 32-bit or 64-bit kernel?

Both (the WireGuard Pi and the LibreNMS Pi) are actually on a 32-bit kernel at the moment. Coincidentally, I'm about to move the LibreNMS instance to a 64-bit installation in the next few days. I will report back afterwards.

bnerickson commented 1 year ago

Sounds good. Hope that fixes the issue. My LibreNMS is on a 64-bit kernel FWIW.

efelon commented 1 year ago

I have moved my LibreNMS installation over to a 64-bit system, and the values are back. WireGuard still runs on the 32-bit system: [image]

One thing to note: I couldn't transfer the RRD files directly. When trying to open the "32-bit" RRD files (managed with rrdtool version 1.7.1), I got the following error (rrdtool version 1.7.2): ERROR: reached EOF while loading header rrd->ds_def. This error also showed up in the LibreNMS web UI instead of the graph. I had to rrdtool dump every file to XML, transfer those over to the new machine, and rrdtool restore them back to RRD.
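
The dump/restore round trip described above could be scripted roughly as follows. This is a sketch under assumptions: the file names are hypothetical, the helpers only build the `rrdtool dump` and `rrdtool restore` command lines, and `run` is left for the reader to call on the appropriate host.

```python
import subprocess
from pathlib import Path


def dump_command(rrd_path):
    """Build the command that exports an RRD file to XML (old host)."""
    rrd = Path(rrd_path)
    return ["rrdtool", "dump", str(rrd), str(rrd.with_suffix(".xml"))]


def restore_command(xml_path):
    """Build the command that rebuilds an RRD file from XML (new host)."""
    xml = Path(xml_path)
    return ["rrdtool", "restore", str(xml), str(xml.with_suffix(".rrd"))]


def run(command):
    """Run one rrdtool command and raise if it fails."""
    subprocess.run(command, check=True)


# Example with a hypothetical file name; nothing is executed here.
print(" ".join(dump_command("example.rrd")))
print(" ".join(restore_command("example.xml")))
```

Looping over the files in the RRD directory with `Path.glob("*.rrd")` would cover the "every file" part; the XML files are plain text and can be copied between machines with any tool.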