netbirdio / netbird

Connect your devices into a secure WireGuard®-based overlay network with SSO, MFA and granular access controls.
https://netbird.io
BSD 3-Clause "New" or "Revised" License

High Availability network routes not working #1263

Closed: lfarkas closed this issue 1 year ago

lfarkas commented 1 year ago

I have HA network routes for the 192.168.0.0/16 network via 2 peers, both with the same metric of 9999.

My NetBird client reaches this network through fox, and that works. I then ssh to fox and shut it down. The result can be seen here:

netbird status -d
...
 kvm.netbird.cloud:
  NetBird IP: 100.76.54.238
  Public key: hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=
  Status: Connected
  -- detail --
  Connection type: Relayed
  Direct: false
  ICE candidate (Local/Remote): relay/srflx
  Last connection update: 2023-10-28 18:52:46

 fox.netbird.cloud:
  NetBird IP: 100.76.171.201
  Public key: FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=
  Status: Disconnected
  -- detail --
  Connection type: P2P
  Direct: false
  ICE candidate (Local/Remote): srflx/prflx
  Last connection update: 2023-10-28 18:59:51

Daemon version: 0.24.2
CLI version: 0.24.2
Management: Connected to https://api.wiretrustee.com:443
Signal: Connected to https://signal.netbird.io:443
FQDN: wolf.netbird.cloud
NetBird IP: 100.76.24.179/16
Interface type: Kernel
Peers count: 2/6 Connected

while the last line in /var/log/netbird/client.log is:

2023-10-28T18:59:51+02:00 WARN client/internal/routemanager/client.go:119: the network 192.168.0.0/16 has not been assigned a routing peer as no peers from the list [FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY= hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] are currently connected

After this, the route is obviously not working...

the strange thing is that I can ssh into kvm through netbird!
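
The mismatch between the status output and the warning can be checked mechanically. Below is a minimal sketch, assuming the status text and the log line are pasted in as strings with the layout shown above; `parse_status`, `listed_peers`, and `connected_but_listed` are hypothetical helper names and are not part of NetBird.

```python
import re

# Peer blocks copied from the `netbird status -d` excerpt above.
STATUS = """\
 kvm.netbird.cloud:
  NetBird IP: 100.76.54.238
  Public key: hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=
  Status: Connected

 fox.netbird.cloud:
  NetBird IP: 100.76.171.201
  Public key: FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=
  Status: Disconnected
"""

# The warning copied from client.log above.
LOG_LINE = (
    "2023-10-28T18:59:51+02:00 WARN client/internal/routemanager/client.go:119: "
    "the network 192.168.0.0/16 has not been assigned a routing peer as no peers "
    "from the list [FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY= "
    "hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] are currently connected"
)


def parse_status(text):
    """Return {public_key: (fqdn, status)} for each peer block (hypothetical helper)."""
    peers = {}
    fqdn = key = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":") and " " not in line:
            # A bare "name:" line starts a new peer block.
            fqdn, key = line[:-1], None
        elif line.startswith("Public key:"):
            key = line.split(":", 1)[1].strip()
        elif line.startswith("Status:") and fqdn and key:
            peers[key] = (fqdn, line.split(":", 1)[1].strip())
    return peers


def listed_peers(log_line):
    """Extract the public keys from the bracketed list in the warning."""
    match = re.search(r"no peers from the list \[([^\]]*)\]", log_line)
    return match.group(1).split() if match else []


def connected_but_listed(status_text, log_line):
    """Names of peers the warning lists although status reports them as Connected."""
    peers = parse_status(status_text)
    return [
        peers[key][0]
        for key in listed_peers(log_line)
        if key in peers and peers[key][1] == "Connected"
    ]


print(connected_but_listed(STATUS, LOG_LINE))  # ['kvm.netbird.cloud']
```

With the data from this report it prints kvm.netbird.cloud: the route manager lists kvm's key among the peers it considers not connected, while the status output shows that peer as Connected.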

A systemctl restart netbird.service solves the problem, but there is still one such line in the log:

2023-10-28T19:10:43+02:00 WARN client/internal/routemanager/client.go:119: the network 192.168.0.0/16 has not been assigned a routing peer as no peers from the list [FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY= hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] are currently connected

and a little later these:

2023-10-28T19:10:46+02:00 INFO client/internal/routemanager/client.go:122: new chosen route is ci2dej2t2r9s73e43sh0 with peer +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI= with score 2
2023-10-28T19:10:46+02:00 INFO client/internal/routemanager/client.go:122: new chosen route is ckt224qfic3c739igj60 with peer +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI= with score 2

For me, though, these 2 lines are of no use, since I can't identify any of these routes from the IDs and keys shown.
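
As a workaround, the peer keys in such lines can at least be translated into names. Below is a minimal sketch, assuming a key-to-name table built by hand from `netbird status -d`; `PEERS`, `KEY_RE`, and `annotate` are hypothetical names, not NetBird features. It does not resolve the route IDs, which would need a lookup on the management side.

```python
import re

# Hand-built table from the `netbird status -d` excerpt in this report.
PEERS = {
    "hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=": "kvm.netbird.cloud",
    "FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=": "fox.netbird.cloud",
}

# A WireGuard public key is 32 bytes, i.e. 43 base64 characters plus one '='.
KEY_RE = re.compile(r"[A-Za-z0-9+/]{43}=")


def annotate(line, peers=PEERS):
    """Append the peer name (or 'unknown peer') after every public key in a log line."""
    def label(match):
        key = match.group(0)
        return f"{key} ({peers.get(key, 'unknown peer')})"
    return KEY_RE.sub(label, line)


line = (
    "2023-10-28T19:10:46+02:00 INFO client/internal/routemanager/client.go:122: "
    "new chosen route is ci2dej2t2r9s73e43sh0 with peer "
    "+i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI= with score 2"
)
print(annotate(line))
```

The key in the "new chosen route" lines does not appear in the status excerpt above, so with this table it is labeled as an unknown peer. The table would have to be filled from the full status output to name it.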

mlsmaycon commented 1 year ago

Hello @lfarkas can you run the client with debug logs?

You can do that by running the following commands:

sudo netbird service stop
sudo netbird up -F -l debug | tee /tmp/netbird.debug.log

After it has run for 60s, you can share the logs so we can check them.

lfarkas commented 1 year ago

I cannot reproduce this error now, and it's rather complicated to shut down one of the HA peers and watch the switch. Let's just close this bug for now; if I can reproduce it again, I will reopen it with this log.