Closed da-wilky closed 3 months ago
I have the same issue and would appreciate any suggestions.
Do you have any proxy in front that cuts the grpc connection after some duration? Could be traefik itself
You are completely right, thank you for the hint! I had done some research in the direction of Traefik before but didn't find anything, so I thought the problem was in the signaling service. It is in fact a Traefik misconfiguration and not related to NetBird.
Traefik has a default readTimeout of 60s (https://doc.traefik.io/traefik/routing/entrypoints/#respondingtimeouts). This is to protect Traefik from DoS attacks; see the following resources:
- Default Read Timeout got introduced
- Disable Read Timeout makes traefik vulnerable to CVE-2024-28869
- CVE-2024-28869
Currently it's not possible to set this timeout on a per-router/service level with Traefik (resource 1, resource 2). Hopefully they plan to implement this feature.
So the current possibilities are:
- Live with the logs. It's still working fine and reconnects immediately after Traefik cuts the connection. You might also increase the default duration by a few seconds to reduce the logs. Whether that is worth it is your decision; as said, it works fine and reconnects automatically, you would just get a few fewer logs.
- Disable readTimeout by setting it to 0, with the huge drawback of exposing your Traefik to the vulnerability.
- Switch from Traefik to e.g. NGINX, which supports setting a custom read timeout per server/location.
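For the first option, raising the timeout instead of disabling it could look roughly like the sketch below. It assumes a Traefik static configuration file with an entry point named https; both the entry point name and the 90s value are placeholders, not recommendations.

```yaml
# Sketch only: entry point name "https" and the 90s value are assumptions.
entryPoints:
  https:
    address: ":443"
    transport:
      respondingTimeouts:
        # Traefik's default is 60s; a larger value means fewer forced
        # reconnects, while 0 would disable the timeout entirely.
        readTimeout: 90s
```

Because the setting lives on the entry point, it applies to every router attached to that entry point, not only to the NetBird signal service.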
Hope that helps somebody. Thanks to @lixmal for pointing me in the right direction!
Cheers
Following @da-wilky's suggestion, I also updated the Traefik config as follows.
entryPoints:
  http:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: https
          scheme: https
  https:
    address: ":443"
    transport:
      respondingTimeouts:
        readTimeout: 0
After restarting the services, the peers reconnecting every minute was gone! The corresponding risk comes with it, of course; that trade-off is one I have to weigh myself. Such a great reminder, I appreciate the guidance from @da-wilky and @lixmal!
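For setups that configure Traefik through container arguments instead of a static file, the same entry point settings can probably be expressed as CLI flags. The fragment below is a sketch under assumptions: the service name traefik and the image tag are hypothetical, and the provider and certificate options a real deployment needs are omitted.

```yaml
# Hypothetical docker-compose fragment; service name and image tag are assumptions.
services:
  traefik:
    image: traefik:v3.0
    command:
      # Same entry points as in the file-based config above.
      - "--entrypoints.http.address=:80"
      - "--entrypoints.http.http.redirections.entrypoint.to=https"
      - "--entrypoints.http.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.https.address=:443"
      # 0 disables the read timeout, with the security drawback discussed above.
      - "--entrypoints.https.transport.respondingtimeouts.readtimeout=0"
```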
Hello,
Describe the problem
I'm using Docker to host my own NetBird management service behind a Traefik reverse proxy. Everything is working, but my signal service is permanently logging reconnects of the peers. I'm curious whether this is the expected behavior.
Example logs
On the clients the following logs are produced:
So it reconnects every minute because of an rpc error, but just by googling I can't figure out why. Again, everything is still working fine; at least I can't see any errors apart from those logs.
To Reproduce
Steps to reproduce the behavior:
Host your own netbird with traefik proxy.
volumes:
  netbird-mgmt:
  netbird-signal:
networks:
  traefik_net:
    external: true
Peers detail:
peer1.netbird.selfhosted:
NetBird IP:/32
Public key: BJjacf7HSPwCzkxFkUlFlWMBbXiFQyLzhc2z6YrBd3E=
Status: Disconnected
-- detail --
Connection type: P2P
Direct: false
ICE candidate (Local/Remote): host/srflx
ICE candidate endpoints (Local/Remote): :51820/:62082
Last connection update: 20 minutes, 7 seconds ago
Last WireGuard handshake: 2 minutes, 26 seconds ago
Transfer status (received/sent) 2.1 KiB/3.8 KiB
Quantum resistance: false
Routes: -
Latency: 0s
peer2.netbird.selfhosted:
NetBird IP:/32
Public key: FGy/FK+e8p3qTFulkif4V+hI81U9I+QFKqttzBx7qE4=
Status: Disconnected
-- detail --
Connection type: P2P
Direct: false
ICE candidate (Local/Remote): host/srflx
ICE candidate endpoints (Local/Remote): :51820/:62082
Last connection update: 20 minutes, 7 seconds ago
Last WireGuard handshake: 2 minutes, 26 seconds ago
Transfer status (received/sent) 2.1 KiB/3.8 KiB
Quantum resistance: false
Routes: -
Latency: 0s
peer3.netbird.selfhosted:
NetBird IP:
Public key: qXy69ZUHM8IQJIapY/5z54AGTa2UyXPSefDISbzyByA=
Status: Connected
-- detail --
Connection type: P2P
Direct: true
ICE candidate (Local/Remote): srflx/srflx
ICE candidate endpoints (Local/Remote): :51820/:51820
Last connection update: 34 minutes, 36 seconds ago
Last WireGuard handshake: 1 minute, 57 seconds ago
Transfer status (received/sent) 3.0 MiB/25.1 MiB
Quantum resistance: false
Routes: -
Latency: 12.297417ms
peer4.netbird.selfhosted:
NetBird IP:
Public key: MvDBVV63GFtK+MdEy3lV9/73Fuw5eUaRmNfQTMEi3DQ=
Status: Connected
-- detail --
Connection type: P2P
Direct: true
ICE candidate (Local/Remote): host/srflx
ICE candidate endpoints (Local/Remote): :51820/:62082
Last connection update: 34 minutes, 38 seconds ago
Last WireGuard handshake: 2 minutes, 26 seconds ago
Transfer status (received/sent) 2.1 KiB/3.8 KiB
Quantum resistance: false
Routes: -
Latency: 12.580678ms
OS: linux/amd64
Daemon version: 0.28.4
CLI version: 0.28.4
Management: Connected to https://mydomain.com:443
Signal: Connected to https://mydomain.com:443
Relays:
[stun:mydomain.com:3478] is Available
[turn:mydomain.com:3478?transport=udp] is Available
Nameservers:
[1.1.1.1:53, 1.0.0.1:53] for [.] is Available
FQDN: peer5.netbird.selfhosted
NetBird IP:/16
Interface type: Kernel
Quantum resistance: false
Routes: -
Peers count: 2/4 Connected