Closed apognu closed 5 years ago
This sounds familiar. I think there was a matching WireGuard change to follow some new netlink convention, but I'm on vacation and traveling for the next week or so. I'd check the commit logs there, but otherwise I can take a look in a couple of weeks.
I logged in to GitHub to open exactly this ticket. I can confirm the bug. The same code works fine with older kernels. The problem is isolated to wgctrl-go (it does not affect WireGuard itself).
> I think there was a matching WireGuard change to follow some new netlink convention, but I'm on vacation and traveling for the next week or so. I'd check the commit logs there, but otherwise I can take a look in a couple of weeks.
I'll try to have a look if I have time. Enjoy your vacation. :smile_cat:
This might be the cause of an issue I found here: https://github.com/costela/wesher/issues/5.
I'm on Linux arch1 5.2.3-arch1-1-ARCH.
I've got an nlmon capture if it's helpful for debugging purposes.
@devinrsmith that would be helpful, thanks.
@mdlayher -> nlmon0.cap
Unfortunately I have some other obligations for the next few weeks and probably won't get to this for a bit, since I'm not running 5.2+ kernels on my machines. Would anybody like to take a crack at this? There's probably some change in the genetlink subsystem that requires things to be more specific.
I don't suppose @DMarby or other Mullvad folks have run into this? Sorry, I'm winding down travels but I still have low bandwidth at the moment.
Haven't seen this so far, since we don't run 5.2+ kernels yet either. I'll keep an eye out and see if I can allocate some time to look at it, but I can't promise anything.
I'm looking at commit https://git.zx2c4.com/WireGuard/commit/src/netlink.c?id=3120425f69003be287cb2d308f89c7a6a0335ff0 and I suspect something here is the cause. I haven't isolated the problem yet though.
Thanks to some investigative work from "ius" on IRC, I was shown this: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae0be8de9a53cda3505865c11826d8ff0640237c
It appears that NLA_F_NESTED is the key on newer kernels! I have a patch locally that seems to work, but I want to make sure I can get everything working on both old and new kernels. I should have something ready soon.
Ever since I upgraded to Linux 5.1.2 (Arch Linux 5.1.2-arch1-1-ARCH), my tool that uses this library fails to set up WireGuard with an `invalid argument` error returned by the `ConfigureDevice` function.
From my testing, the peer configuration is what triggers this error: if I remove all my peer-related code, the error goes away, but of course no peers are added to the WireGuard setup. I checked by running the same version of my tool on another box running 5.0.5, and the issue does not appear there. As far as I can tell, my previous kernel version (5.1.4) did not have this issue either.
Using wg-quick instead of this library works properly on my current kernel.
I'm a bit at a loss as to how to debug this on my end and provide you with more information.