Open bunnyevans opened 10 months ago
Hi,
There are some troubleshooting steps in the plugin readme, specifically:
1) making sure that telegraf has the right capabilities
2) ensuring wg show shows the devices you think are there
Can you provide the output of the above command?
There does appear to be a scenario where, if enumerating devices through the library we use returns no devices, there are no errors and, as a result, no output. That may be what is happening here.
I decided to try reproducing this, and I got a message after installing wireguard that it was only compatible with FreeBSD 12 and about to be removed. It recommended wireguard-go which I believe works quite differently. Which of these are you running on FreeBSD 14?
Edit: I see, you get bad error messages trying either because it's native to the kernel now and all you need is wireguard-tools. Removing foot from mouth and testing now that I have a wireguard connection.
Turning on debug, I get this message:
2023-12-21T02:57:30Z W! [inputs.wireguard] No Wireguard device found with name wg0
The setcap instruction isn't relevant for FreeBSD. I decided to make sure the telegraf user could look into this.
# sudo -u telegraf ifconfig -v wg0
wg0: flags=10080c1<UP,RUNNING,NOARP,MULTICAST,LOWER_UP> metric 0 mtu 1420
options=80000<LINKSTATE>
inet 10.8.0.5 netmask 0xffffffff
groups: wg
nd6 options=109<PERFORMNUD,IFDISABLED,NO_DAD>
# sudo -u telegraf wg show
interface: wg0
public key: [redacted]
listening port: 22979
peer: [redacted]
endpoint: [redacted]
allowed ips: 0.0.0.0/0
latest handshake: 38 seconds ago
transfer: 1.86 MiB received, 68.71 KiB sent
persistent keepalive: every 30 seconds
So at this point, I started suspecting the wgctrl library you're using, as it hasn't been updated in a year. I wrote a quick program that just prints out client.Device("wg0") and it actually worked fine.
# sudo -u telegraf ./wireguard-freebsd-test
&{wg0 FreeBSD kernel AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA= [redacted] 22979 0 [{[redacted] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA= [redacted] 30s 2023-12-20 21:57:47.0403048 -0500 EST 23418581 1048772 [{0.0.0.0 00000000}] 1}]}
So this does seem to be a problem in Telegraf. Maybe the output on 14 isn't what it expects?
telegraf-1.29.1
Name : telegraf
Version : 1.29.1
Installed on : Wed Dec 20 20:39:34 2023 EST
Origin : net-mgmt/telegraf
Architecture : FreeBSD:14:amd64
I dropped in on the issues board because I'm looking for a way to kill time on Christmas. Maybe I'll look into this.
The version of wgctrl that Telegraf is using is before they added FreeBSD support, so this never worked. It needs the version bumped and some labels tweaked. I'll start testing.
Sorry for over-commenting in the thread in the middle of the night, I got inspired.
I have this working now, however, it will only build with wgctrl's os_freebsd.go module if CGO_ENABLED is set to 1, which is explicitly set to 0 in the Telegraf Makefile. I'm assuming that's there on purpose because of a known problem?
if CGO_ENABLED is set to 1, which is explicitly set to 0 in the Telegraf Makefile. I'm assuming that's there on purpose because of a known problem?
It is really unfortunate that the library started using cgo. We do not support adding cgo dependencies or code in Telegraf. Telegraf produces a static binary and our binaries are cross-built; adding cgo code would prevent that or limit that ability.
The version of wgctrl that Telegraf is using is before they added FreeBSD support, so this never worked.
Is there a version between the one we currently use and the one where they started using cgo that we can update to?
The component that monitors the FreeBSD kernel implementation of wireguard is specifically what requires access to C libraries. If it worked on FreeBSD in the past, it seems Telegraf and the old 2021 wgctrl supported userspace implementations, so wireguard-go probably reported metrics just fine. As it is, it looks like upgrading wgctrl for FreeBSD kernel support isn't a supported change. That's a bummer.
To offer the person originally asking an alternative, I think https://github.com/MindFlavor/prometheus_wireguard_exporter just parses the output of the wg command. I'd have to try it; I use it on Linux and it's fine. Could have telegraf scrape/ship to influx.
Thinking about it, I'll offer an idea, and you can advise whether it's a good one or not.
It could be made optional in the config to gather metrics using the commands from wireguard-tools as opposed to the wgctrl library, parsing the output like Telegraf does with nvme/smartctl/some other things.
Does the CLI have a JSON or other format option? We strongly dislike parsing CLI output due to changes, whitespace, etc., but if there is a parseable output, we are much more likely to add support for that.
Looking into it, no. In their contrib folder, there is a json command that can be built separately, but all it does is what I was thinking about doing anyway (parses the other binary's format): https://github.com/WireGuard/wireguard-tools/blob/master/contrib/json/wg-json
Format hasn't changed as long as I've been using it, but I understand your concern. Too bad it's not a native feature. In the case of Linux/userspace though, in the long run that wgctrl dependency might still need to be dropped or forked due to future golang changes or other discovered issues.
interface: wg0
public key: [redacted]
private key: (hidden)
listening port: 51280
peer: [redacted]=
endpoint: [redacted]
allowed ips: 10.8.0.5/32
latest handshake: 9 hours, 54 minutes, 7 seconds ago
transfer: 85.00 MiB received, 2.11 GiB sent
peer: [redacted]=
endpoint: [redacted]
allowed ips: 10.8.0.7/32, abcd:abcd:abcd::7/128
latest handshake: 11 hours, 18 minutes, 45 seconds ago
transfer: 248.01 MiB received, 11.02 GiB sent
I am not opposed to an opt-in parsing option. I think the fact that the library we were using will not be able to be updated without work can further justify this as well.
Is this something you are interested in contributing?
What I would look for is:
Yeah, was dropping in because I've been using InfluxDB for years on hobby projects and felt like giving back. Do you have a doc or wiki somewhere for general requirements for contributing to Telegraf?
Sweet! We have some guidelines here:
https://github.com/influxdata/telegraf/tree/master/docs/developers
If you have more specific questions, you can ask them here (although I'm about to disappear for the rest of the year) or in our community slack. In general I would say @srebhan and I prefer to see a PR, even if you are not sure about it and we can work with you to resolve any issues!
Thanks, I'll get a draft PR up in a week or so after I've verified it runs right on Linux and FreeBSD for a few days.
Also, apparently there is an argument to have it print tab-separated values with a new peer on every line (I just had to read the man page more carefully), so we shouldn't have to worry about formatting changes.
Dumb question here, not being even remotely understanding of the gubbins of telegraf.
wireguard is now a first-class citizen in freebsd 14 and as a result it shows up in netstat like this:
wg0: flags=10080c1<UP,RUNNING,NOARP,MULTICAST,LOWER_UP> metric 0 mtu 1280
options=80000<LINKSTATE>
inet 172.25.248.35 netmask 0xffff0000
groups: wg
nd6 options=101<PERFORMNUD,NO_DAD>
wg1: flags=10080c1<UP,RUNNING,NOARP,MULTICAST,LOWER_UP> metric 0 mtu 1280
options=80000<LINKSTATE>
inet 10.20.30.64 netmask 0xffffff00
groups: wg
nd6 options=101<PERFORMNUD,NO_DAD>
and
bunny@turbinia:~ % netstat -i -I wg1
Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
wg1 1280 <Link#6> wg1 4497637 0 0 2552503 25927 0
wg1 - 10.20.30.0/24 10.20.30.64 4450794 - - 2471502 - -
Perhaps this might be an easier route? Don't forget to shoot me down if I am completely wrong!
Apologies, this thread became kind of a brain dump of my troubleshooting. I'll break it down a little better.
So the reason Telegraf isn't able to monitor Wireguard with the Go library it ships with is that, although the newer version of that library supports FreeBSD's native kernel implementation, it needs access to certain C libraries to study it. This is ultimately a breaking change in the case of Telegraf, so that's how we wound up at plan B.
ifconfig/netstat don't provide very much of the information we're looking for, but if you install wireguard-tools and run wg show all dump [exclude the "dump" for something easier to read but harder to script], you'll see most of that information. In order to reach parity with the labels the plugin currently outputs, I just have to gather a couple of OS details and that should be everything. I'm hoping to work on this next week.
Hoping for an easier way than duplicating or if'ing all these functions for when wg_path is specified in the config, I decided to take a detour to see if I could eliminate the C dependency in the wgctrl library instead.
Something to note, this cgo dependency only affects builds with GOOS set to freebsd or openbsd. Comment in the code about it:
// Package wgh is an auto-generated package which contains constants and
// types used to access WireGuard information using ioctl calls.
Now it seems they are using sys/unix for most of their functions, but they're initializing things per architecture. The FreeBSD client lives here: https://github.com/WireGuard/wgctrl-go/blob/master/internal/wgfreebsd/client_freebsd.go
I'm now wondering if I should spend some time with FreeBSD's docs around ioctl and see if I can remove these C dependencies and just use Go's sys/unix, but I imagine they probably did it this way because of a limitation with that, and I'd subsequently need to do it for OpenBSD as well. If wgctrl can be changed, all that it would take to fix the wireguard plugin in Telegraf would be to update the go.mod and the deviceTypeNames map to support the additional operating systems in the new version (this is how I got it working on a FreeBSD VM, but I had to enable CGO).
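To make the map change concrete, here is a small self-contained sketch. The DeviceType type and its constants are a local stand-in written so this builds without the wgctrl module; they mirror what I understand the newer wgtypes package to expose, which is an assumption. The label strings for the new entries are also assumptions; only the general idea of adding entries for the extra kernel implementations comes from the description above.

```go
package main

import "fmt"

// DeviceType is a local stand-in for wgtypes.DeviceType (assumption: the
// real package has equivalent constants in newer versions).
type DeviceType int

const (
	Unknown DeviceType = iota
	LinuxKernel
	OpenBSDKernel
	FreeBSDKernel
	WindowsKernel
	Userspace
)

// deviceTypeNames maps a device type to the tag value reported in metrics.
var deviceTypeNames = map[DeviceType]string{
	Unknown:     "unknown",
	LinuxKernel: "linux_kernel",
	Userspace:   "userspace",
	// Proposed additions; these label strings are hypothetical.
	OpenBSDKernel: "openbsd_kernel",
	FreeBSDKernel: "freebsd_kernel",
	WindowsKernel: "windows_kernel",
}

// deviceTypeName falls back to "unknown" for types missing from the map,
// so an unrecognized type never yields an empty tag.
func deviceTypeName(t DeviceType) string {
	if name, ok := deviceTypeNames[t]; ok {
		return name
	}
	return deviceTypeNames[Unknown]
}

func main() {
	fmt.Println(deviceTypeName(FreeBSDKernel))
}
```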
This will delay me a bit, but if it can be done this way that's ultimately better I think.
I am going to mark this as upstream, as the client library we are currently using appears to require cgo for FreeBSD support. It does look like OpenBSD does not require cgo, so that may point to how this could get resolved.
If anyone ends up either updating the upstream library or wants to create a FreeBSD-specific config option, feel free to do so.
Relevant telegraf.conf
Logs from Telegraf
System info
Telegraf 1.21.4, freebsd, FreeBSD 14.0-RELEASE
Docker
No response
Steps to reproduce
create simple config based on: https://github.com/influxdata/telegraf/tree/release-1.21/plugins/inputs/wireguard, above config is in telegraf.conf.TEST here.
/usr/local/bin/telegraf --config=/usr/local/etc/telegraf.conf --config-directory=/usr/local/etc/telegraf.conf.TEST --test --debug
...
Expected behavior
results of some sort as per the documentation.
Actual behavior
silence
Additional info
Both wg0 and wg1 exist, but even removing the "devices =" line produces only silence.