Open skillcoder opened 1 month ago
TestRouting
failed because can't cleanup route 10.0.0.0/8
time="2024-08-06T21:15:16Z" level=info msg="using userspace bind mode"
time="2024-08-06T21:15:16Z" level=info msg="create tun interface"
time="2024-08-06T21:15:16Z" level=info msg="assign addr 100.64.0.1 mask 0xffffff00 to utun100 interface"
time="2024-08-06T21:15:16Z" level=debug msg="adding Wireguard private key"
time="2024-08-06T21:15:16Z" level=debug msg="Route for 0.0.0.0: interface &{2 16384 lo0 up|loopback|multicast|running} nexthop 192.168.0.1, preferred source 192.168.0.1"
time="2024-08-06T21:15:16Z" level=debug msg="Failed to get route for ::: no route found for ::"
time="2024-08-06T21:15:16Z" level=trace msg="route -n add -inet 0.0.0.0/1 -interface utun100: add net 0.0.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n add -inet 128.0.0.0/1 -interface utun100: add net 128.0.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n add -inet6 ::/1 -interface utun100: add net ::/1: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n add -inet6 8000::/1 -interface utun100: add net 8000::/1: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="try to add route: 10.0.0.0/8, utun100"
time="2024-08-06T21:15:16Z" level=trace msg="sleep 1000000000"
time="2024-08-06T21:15:16Z" level=warning msg="Skipping adding a new route for network 10.0.0.0/8 because it already exists"
time="2024-08-06T21:15:16Z" level=trace msg="try to add route: 10.10.0.0/24, utun100"
time="2024-08-06T21:15:16Z" level=debug msg="Route for 0.0.0.0: interface &{2 16384 lo0 up|loopback|multicast|running} nexthop 192.168.0.1, preferred source 192.168.0.1"
time="2024-08-06T21:15:16Z" level=debug msg="Skipping adding a new route for gateway 192.168.0.1 because it is not in the network 10.10.0.0/24"
time="2024-08-06T21:15:16Z" level=trace msg="route -n add -inet 10.10.0.0/24 -interface utun100: add net 10.10.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="try to add route: 127.0.10.0/24, utun100"
time="2024-08-06T21:15:16Z" level=trace msg="route -n add -inet 127.0.10.0/24 -interface utun100: add net 127.0.10.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="try to add route: 172.16.0.0/12, utun100"
time="2024-08-06T21:15:16Z" level=trace msg="route -n add -inet 172.16.0.0/12 -interface utun100: add net 172.16.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet 172.16.0.0/12 -interface utun100: delete net 172.16.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet 127.0.10.0/24 -interface utun100: delete net 127.0.10.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet 10.10.0.0/24 -interface utun100: delete net 10.10.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet 10.0.0.0/8 -interface utun100: delete net 10.0.0.0: gateway utun100 fib 0: not in table\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet 0.0.0.0/1 -interface utun100: delete net 0.0.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet 128.0.0.0/1 -interface utun100: delete net 128.0.0.0: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet6 ::/1 -interface utun100: delete net ::/1: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="route -n delete -inet6 8000::/1 -interface utun100: delete net 8000::/1: gateway utun100\n"
time="2024-08-06T21:15:16Z" level=trace msg="accept unix /var/run/wireguard/utun100.sock: use of closed network connection"
time="2024-08-06T21:15:16Z" level=error msg="failed to close uapi listener: bad file descriptor"
--- FAIL: TestRouting (0.06s)
--- FAIL: TestRouting/To_external_host_without_custom_dialer_via_vpn (0.06s)
systemops_generic_test.go:368:
Error Trace: /home/runner/work/netbird/netbird/client/internal/routemanager/systemops/systemops_generic_test.go:368
/usr/local/go121/src/testing/testing.go:1169
/usr/local/go121/src/testing/testing.go:1347
/usr/local/go121/src/testing/testing.go:1589
Error: Received unexpected error:
failed to delete route for 10.0.0.0/8: route cmd retry failed: exit status 1
Test: TestRouting/To_external_host_without_custom_dialer_via_vpn
Messages: removeVPNRoute should not return err: 10.0.0.0/8 utun100
FAIL
FAIL github.com/netbirdio/netbird/client/internal/routemanager/systemops 2.017s
FAIL
But it strange, it can't add the route to utun100
since it already somehow appeared on lo0
I've spend 3h and did not find the source of this, probably I should setup remote debugger to be able to run tests under debugger on remote FreeBSD machine. With my current setup it is too time consuming.
TestAddRemoveRoutes
time="2024-08-06T21:26:48Z" level=info msg="using userspace bind mode"
time="2024-08-06T21:26:48Z" level=info msg="create tun interface"
time="2024-08-06T21:26:48Z" level=info msg="assign addr 100.65.75.2 mask 0xffffff00 to utun530 interface"
--- FAIL: TestAddRemoveRoutes (0.02s)
--- FAIL: TestAddRemoveRoutes/Should_Add_And_Remove_Route_100.66.120.0/24 (0.02s)
systemops_generic_test.go:83:
Error Trace: /home/runner/work/netbird/netbird/client/internal/routemanager/systemops/systemops_generic_test.go:416
/home/runner/work/netbird/netbird/client/internal/routemanager/systemops/systemops_generic_test.go:83
Error: Not equal:
expected: "100.65.75.2"
actual : "invalid IP"
Diff:
--- Expected
+++ Actual
@@ -1 +1 @@
-100.65.75.2
+invalid IP
Test: TestAddRemoveRoutes/Should_Add_And_Remove_Route_100.66.120.0/24
Messages: route should point to wireguard interface IP
FAIL
FAIL github.com/netbirdio/netbird/client/internal/routemanager/systemops 1.989s
Found the reason of failed TestAddRemoveRoutes
https://github.com/netbirdio/netbird/blob/bcce1bf18474e68744d81a6b05b8a32e901ae771/client/internal/routemanager/systemops/systemops_generic.go#L381
It was added in https://github.com/skillcoder/netbird/commit/370eecfa908c208f8306fafd7c3800598cb6de75 to fix routing on FreeBSD, without it routing was not working properly, but during refactoring it 2 return parameters replaced with Nexthop{}
, I'm not sure it is right.
But now without this workaround tests passed.
And seems TestGetNextHop
also fixed after these changes.
Only TestRouting
left, and it's still failed with the same logs.
Error: Received unexpected error:
failed to delete route for 10.0.0.0/8: route cmd retry failed: exit status 1
Test: TestRouting/To_external_host_without_custom_dialer_via_vpn
Messages: removeVPNRoute should not return err: 10.0.0.0/8 utun100
Important! You can't run our netbird test in the environments with subnet 10.0.0.0/8, 192.168.0.0/24, 192.168.1.0/24. Because we use this subnets in test, tests just will failed, so make sure you setup you FreeBSD host on some different subnet, for example 192.168.91.0/24
BTW, here is a settings and commands to debug FreeBSD tests via delve.
It seems currently dlv dap
does not supported
https://github.com/golang/vscode-go/issues/2698
launch.json
{
"version": "0.2.0",
"configurations": [
{
"name": "FreeBSD tests",
"type": "go",
"debugAdapter": "dlv-dap",
"mode": "remote",
"request": "attach",
"substitutePath": [
{ "from": "${workspaceFolder}", "to": "/usr/home/REMOTEUSER/github/netbirdio/netbird" },
{ "from": "/usr/lib/go", "to": "/usr/local/go121" }
],
"host": "192.168.91.91",
"port": 12345,
"showLog": true,
"hideSystemGoroutines": true
}
]
}
sudo pkg install delve
cd ~/github/netbirdio/netbird/client/internal/routemanager/systemops
sudo dlv test --accept-multiclient --api-version 2 --headless --listen=192.168.91.91:12345 --log
Related links: https://go.microsoft.com/fwlink/?linkid=830387 https://github.com/golang/vscode-go/wiki/debugging https://github.com/golang/vscode-go/wiki/debugging#connect-to-delve-dap-with-target-specified-at-client-start-up https://forums.freebsd.org/threads/vscode-and-go-on-freebsd.75905/ https://code.visualstudio.com/docs/editor/variables-reference
I found a reason of this workaround
If remove it netbird
will create a following route
192.168.91.223 192.168.91.91 UGHS em0
And it means route everything to local IP. For reference full route table
Internet:
Destination Gateway Flags Netif Expire
default 192.168.91.223 UGS em0
127.0.0.1 link#2 UH lo0
192.168.91.0/24 link#1 U em0
192.168.91.91 link#1 UHS lo0
192.168.91.223 192.168.91.91 UGHS em0
Where 192.168.91.91 is local IP
TestAddExistAndRemoveRoute
can detect this problem
And it disconnect host from any external connection since default route broke.
This case broken
{
name: "Should Not Add Route if overlaps with default gateway",
prefix: netip.MustParsePrefix(defaultNexthop.IP.String() + "/31"),
shouldAddRoute: false,
},
These workaround with returning only interface name without IP fix problem because of this code https://github.com/netbirdio/netbird/blob/af1b42e538e74fe01b3f4398f95b4521d8ac3298/client/internal/routemanager/systemops/systemops_unix.go#L50
To be able to properly check TestAddExistAndRemoveRoute
shouldAddRoute: false
cases I will try to replace unreliable logs-parsing-based test
https://github.com/netbirdio/netbird/blob/4bbedb5193f13f1e033acd4657a9c4f41d048bde/client/internal/routemanager/systemops/systemops_generic_test.go#L256
with OpenTelemetry spans and attributes.
Describe the problem Currently under FreeBSD some tests are skipped, which could lead to undetected bugs. Specifically tests related to routing disabled with
Following tests affected
TestAddRemoveRoutes
,TestGetNextHop
,TestRouting
To Reproduce
To find all affected tests - grep codebase by
freebsd
substring.Expected behavior
We should not skip any tests.