netbirdio / netbird

Connect your devices into a single secure private WireGuard®-based mesh network with SSO/MFA and simple access controls.
https://netbird.io
BSD 3-Clause "New" or "Revised" License
9.87k stars, 428 forks

Unable to connect to turn server. Stun server works #2051

Closed: alex-ritter closed this issue 1 month ago

alex-ritter commented 1 month ago

Describe the problem

When connecting to NetBird from an external network, I'm unable to ping any peers. When connected from the local network, everything works fine. According to netbird status -d, the STUN server is reachable, but the TURN relay is rejected with a 401:

[stun:netbird..com:3478] is Available
[turn:netbird..com:3478?transport=udp] is Unavailable, reason: allocate: Allocate error response (error 401: Unauthorized)
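For anyone triaging similar output, here is a minimal Python sketch that pulls the relay entries out of the status text and reports which ones are unavailable and why. It assumes the relay lines keep the "[uri] is Available / Unavailable, reason: ..." shape shown above; the function name parse_relays is hypothetical and not part of NetBird.

```python
import re

# One relay entry, e.g. "[turn:host:3478?transport=udp] is Unavailable, reason: ..."
ENTRY_RE = re.compile(
    r"\[((?:stun|turn|turns):[^\]]+)\] is (Available|Unavailable)(?:, reason: (.*))?"
)


def parse_relays(text):
    """Return a list of dicts describing each relay found in status output."""
    # Split just before every "[stun:" / "[turn:" so that output flattened
    # onto a single line is handled the same way as one entry per line.
    chunks = re.split(r"(?=\[(?:stun|turns?):)", text)
    relays = []
    for chunk in chunks:
        match = ENTRY_RE.match(chunk.strip())
        if match:
            uri, state, reason = match.groups()
            relays.append({
                "uri": uri,
                "available": state == "Available",
                "reason": reason.strip() if reason else None,
            })
    return relays


if __name__ == "__main__":
    sample = (
        "[stun:netbird..com:3478] is Available "
        "[turn:netbird..com:3478?transport=udp] is Unavailable, reason: "
        "allocate: Allocate error response (error 401: Unauthorized)"
    )
    for relay in parse_relays(sample):
        if not relay["available"]:
            print(relay["uri"], "->", relay["reason"])
```

A 401 in the reason field points at credentials rather than reachability, since the server answered the Allocate request.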

The management, dashboard, and API logs all look fine. The only backend error I'm seeing is in the Coturn service. Running docker compose logs coturn shows the following error:

coturn-1 | 133802: (12): ERROR: check_stun_auth: user self credentials are incorrect
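The Coturn error above suggests that the password presented for the user self does not match what Coturn has on file. Below is a minimal sketch of how the two sides could be compared. It assumes, and this is not confirmed anywhere in this issue, that the management config stores static TURN credentials under TURNConfig -> Turns -> Username / Password and that Coturn uses a static user=name:password line in turnserver.conf. All function names are hypothetical. If time-based credentials are enabled instead, this comparison does not apply.

```python
import json
import re


def turn_password_from_management(management_json, username="self"):
    """Look up the TURN password for `username` in a management.json string.

    The key names used here are an assumption about the config layout.
    """
    config = json.loads(management_json)
    for turn in config.get("TURNConfig", {}).get("Turns", []):
        if turn.get("Username") == username:
            return turn.get("Password")
    return None


def turn_password_from_coturn(turnserver_conf, username="self"):
    """Look up the static password for `username` in turnserver.conf text."""
    for line in turnserver_conf.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip commented-out settings
        match = re.fullmatch(r"user=([^:]+):(.*)", line)
        if match and match.group(1) == username:
            return match.group(2)
    return None


def credentials_match(management_json, turnserver_conf, username="self"):
    """True only if both files define the same password for `username`."""
    expected = turn_password_from_coturn(turnserver_conf, username)
    presented = turn_password_from_management(management_json, username)
    return expected is not None and expected == presented
```

Reading both files from disk and calling credentials_match on their contents would show whether the two configs drifted apart, for example after regenerating one of them without restarting the other service.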

To Reproduce

This is most likely a configuration / networking issue. To reproduce this on my side:

  1. Install a new peer using the latest NetBird agent (0.27.7 in my case)
  2. Connect to NetBird successfully
  3. Try to ping another peer, either by its NetBird IP 100.xx.xx.xx or by its host name xxxxx.netbird.example

Expected behavior

I should be able to ping the host name xxxx.netbird.example. This is a default NetBird instance, so all routes are set up by default and should be enabled.

Are you using NetBird Cloud?

No, this is self hosted

NetBird version

0.27.7

NetBird status -d output:

OS: windows/amd64
Daemon version: 0.27.7
CLI version: 0.27.7
Management: Connected to https://netbird.****.com:443
Signal: Connected to https://netbird.****.com:443
Relays:
  [stun:netbird..com:3478] is Available
  [turn:netbird..com:3478?transport=udp] is Unavailable, reason: allocate: Allocate error response (error 401: Unauthorized)
Nameservers:
FQDN: **netbird.***
NetBird IP: 100.../
Interface type: Userspace
Quantum resistance: false
Routes: -
Peers count: 0/3 Connected

Additional context

Most likely this is a configuration / networking issue. This is how my NetBird instance is hosted: Domain -> Cloudflare (non-proxied) -> Reverse Proxy (nginx) -> NetBird.

I use UniFi and the UniFi firewall. For now I'm setting everything to allow all just to get this to work. This is my current firewall config:

I have a VLAN for monitoring, a VLAN for external services, and a VLAN for my personal devices. The monitoring VLAN has access to all other VLANs, and the external VLAN has access to all other VLANs.

The nginx reverse proxy is on the external VLAN and has ports 80 and 443 exposed. NetBird is on the monitoring VLAN and has ports 3478, 49152-65535, 33073, and 10000 exposed.
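To check from outside the network that UDP port 3478 is actually forwarded, a plain STUN Binding request is enough. The sketch below builds one by hand following the RFC 5389 header layout (message type, length, magic cookie, 12-byte transaction ID) and waits for a success response. The function names are hypothetical. It only tests reachability of the STUN/TURN port; it does not exercise TURN authentication, so it cannot reproduce the 401 by itself.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request(txn_id=None):
    """Return a 20-byte STUN Binding request with no attributes."""
    if txn_id is None:
        txn_id = os.urandom(12)
    # Header: type (2 bytes), length (2 bytes), magic cookie (4 bytes), txn id (12 bytes)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + txn_id


def is_binding_success(response, txn_id):
    """Check that `response` is a Binding success reply to our transaction."""
    if len(response) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", response[:8])
    return (
        msg_type == BINDING_SUCCESS
        and cookie == MAGIC_COOKIE
        and response[8:20] == txn_id
    )


def probe_stun(host, port=3478, timeout=3.0):
    """Send one Binding request over UDP; True if a valid reply arrives in time."""
    request = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(request, (host, port))
            response, _addr = sock.recvfrom(2048)
        except OSError:  # covers timeouts, DNS failures, and unreachable hosts
            return False
    return is_binding_success(response, request[8:20])
```

Calling probe_stun with your own server's host name from an external connection should return True if the port forward works. A False result over UDP with a single packet can also be plain packet loss, so it is worth retrying a few times before drawing conclusions.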

The only reason I think this is a networking issue is that I can connect to peers locally but not externally. That might just be the difference between the STUN and TURN servers, though, since a lot of this is new to me. I'd appreciate opinions, as I'm looking at this again later today. I'll post any fixes if I find them.

alex-ritter commented 1 month ago

I created a Slack message for debugging purposes and for some general questions I have. For documentation purposes, I'll add the link to the message here.

https://netbirdio.slack.com/archives/C05T5K65X7U/p1716567332360629

alex-ritter commented 1 month ago

This problem kind of just fixed itself...

I enabled verbose logging and ran the process again, and it worked fine. NetBird worked completely end to end.