Open rihards-simanovics opened 1 month ago
What I've also noticed after some tinkering is that if the peer is added to a group, and that group is assigned a DNS nameserver of 127.0.0.53:53, it appears to work, though only until the connection to the management server is reset. I don't know whether toggling just resets the networking and that is why it starts working, but after a couple of resets (setting the state of the NS to off and on in the dashboard) it appears to begin working.
Also, any time the connection to the management server is dropped, all (of 10) servers are fine except this one (Ubuntu 22.04.4 server), which hits the usual failure to connect to the root DNS resolution server.
Under normal use the connection would be more stable, but we are testing the resilience of the clients and their ability to communicate with other existing peers even if the management server is down. Fortunately this is the only server affected; unfortunately this server is our load balancer 🥲 for the other vHosts on other servers.
@rihards-simanovics Does your Ubuntu 22.04.4 server run a DNS server? If so, I have the same problem. It looks like NetBird opens a DNS port on its own IP on every peer. If I check on Linux peers using the command sudo lsof -i -P -n | grep :53, every peer shows this result:
netbird 748 root 23u IPv4 21571 0t0 UDP netbird-ip:53
On my DNS server this breaks DNS, and I can't resolve against my own DNS server. I had to add my DNS server to a group and add that group to Disable DNS Management in DNS Settings for my DNS server peers.
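The check described above can be condensed a little. The following is a minimal sketch, not part of NetBird: `port53_listeners` is a hypothetical helper name, and it assumes the `lsof -i -P -n` column layout shown in this thread (command first, then protocol and address last, optionally followed by `(LISTEN)`). It drops outgoing connections (lines containing `->`) and ports that merely start with 53, leaving one line per process, protocol and bound address.

```shell
# Hypothetical helper: summarise which processes are bound to port 53.
# Reads `lsof -i -P -n` output on stdin.
port53_listeners() {
  awk '
    {
      listen = ($NF == "(LISTEN)")
      addr   = listen ? $(NF-1) : $NF      # e.g. 127.0.0.53:53
      proto  = listen ? $(NF-2) : $(NF-1)  # UDP or TCP
      # Skip client connections (a->b) and anything not ending in :53
      if (addr ~ /->/ || addr !~ /:53$/) next
      print $1, proto, addr
    }
  ' | LC_ALL=C sort -u
}

# Usage (needs root to see sockets of other users):
#   sudo lsof -i -P -n | port53_listeners
```

Because the function only reads stdin, it can also be run against a saved copy of the output from a peer that is currently misbehaving.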
Yes, as mentioned in my original issue description, I do indeed have a DNS server on that machine, named to be precise.
I just ran the command sudo lsof -i -P -n | grep :53, and no, it doesn't seem to completely override the named service, which is the DNS server, nor even the root DNS, but that's likely due to the named service already binding to it. I'm sure that on the next VPN management server brownout a similar issue will happen.
Looking at the printout, it looks as though the named service does eventually bind to the root DNS and everything returns to normal, and NetBird settles for 127.0.0.153:53.
/usr/sbin 1176998 amavis 17u IPv4 281883878 0t0 UDP 127.0.0.1:58359->127.0.0.53:53
/usr/sbin 1177005 amavis 17u IPv4 281947707 0t0 UDP 127.0.0.1:35339->127.0.0.53:53
/usr/sbin 1177262 amavis 17u IPv4 282512343 0t0 UDP 127.0.0.1:56948->127.0.0.53:53
netbird 1222403 root 24u IPv4 249471041 0t0 UDP 127.0.0.153:53
netbird 1222403 root 25u IPv4 322688715 0t0 UDP 127.0.0.1:42917->127.0.0.53:53
netbird 1222403 root 31u IPv4 307964250 0t0 UDP *:53335
/usr/sbin 1368247 amavis 17u IPv4 282508753 0t0 UDP 127.0.0.1:26319->127.0.0.53:53
named 1585814 bind 6u IPv4 249458218 0t0 UDP 100.90.79.155:53
named 1585814 bind 25u IPv4 219208391 0t0 UDP 127.0.0.1:53
named 1585814 bind 26u IPv4 219208392 0t0 UDP 127.0.0.1:53
named 1585814 bind 27u IPv4 219208393 0t0 TCP 127.0.0.1:53 (LISTEN)
named 1585814 bind 28u IPv4 219208394 0t0 TCP 127.0.0.1:53 (LISTEN)
named 1585814 bind 29u IPv4 219208395 0t0 UDP ext.ipv4.of.server:53
named 1585814 bind 30u IPv4 219208396 0t0 UDP ext.ipv4.of.server:53
named 1585814 bind 31u IPv4 219208397 0t0 TCP ext.ipv4.of.server:53 (LISTEN)
named 1585814 bind 32u IPv4 219208398 0t0 TCP ext.ipv4.of.server:53 (LISTEN)
named 1585814 bind 33u IPv4 219208399 0t0 UDP 172.17.0.1:53
named 1585814 bind 34u IPv4 219208400 0t0 UDP 172.17.0.1:53
named 1585814 bind 35u IPv4 219208401 0t0 TCP 172.17.0.1:53 (LISTEN)
named 1585814 bind 36u IPv4 219208402 0t0 TCP 172.17.0.1:53 (LISTEN)
named 1585814 bind 41u IPv4 249458219 0t0 UDP 100.90.79.155:53
named 1585814 bind 42u IPv4 249458220 0t0 TCP 100.90.79.155:53 (LISTEN)
named 1585814 bind 43u IPv4 249458221 0t0 TCP 100.90.79.155:53 (LISTEN)
named 1585814 bind 45u IPv6 219208411 0t0 UDP [::1]:53
named 1585814 bind 46u IPv6 219208412 0t0 UDP [::1]:53
named 1585814 bind 49u IPv6 219208415 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 50u IPv6 219208416 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 51u IPv6 219208417 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 52u IPv6 219208418 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 53u IPv6 219208419 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 54u IPv6 219208420 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 55u IPv6 219208421 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 56u IPv6 219208422 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 57u IPv6 219208423 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 58u IPv6 219208424 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 59u IPv6 219208425 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 60u IPv6 219208426 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 61u IPv6 219208427 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 62u IPv6 219208428 0t0 UDP [external:ipv6:of:server]:53
named 1585814 bind 63u IPv6 219208429 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 64u IPv6 219208430 0t0 TCP [external:ipv6:of:server]:53 (LISTEN)
named 1585814 bind 65u IPv6 219208431 0t0 UDP [fe80::1:3cff:feda:2296]:53
named 1585814 bind 66u IPv6 219208432 0t0 UDP [fe80::1:3cff:feda:2296]:53
named 1585814 bind 67u IPv6 219208433 0t0 TCP [fe80::1:3cff:feda:2296]:53 (LISTEN)
named 1585814 bind 68u IPv6 219208434 0t0 TCP [fe80::1:3cff:feda:2296]:53 (LISTEN)
systemd-r 1602236 systemd-resolve 13u IPv4 249770538 0t0 UDP 127.0.0.53:53
systemd-r 1602236 systemd-resolve 14u IPv4 249770539 0t0 TCP 127.0.0.53:53 (LISTEN)
systemd-r 1602236 systemd-resolve 15u IPv6 249770540 0t0 UDP [::1]:53
systemd-r 1602236 systemd-resolve 16u IPv6 249770541 0t0 TCP [::1]:53 (LISTEN)
/usr/sbin 1729524 amavis 17u IPv4 282509853 0t0 UDP 127.0.0.1:15604->127.0.0.53:53
/usr/sbin 1734785 amavis 17u IPv4 282519156 0t0 UDP 127.0.0.1:23851->127.0.0.53:53
/usr/sbin 1743217 amavis 17u IPv4 282555818 0t0 UDP 127.0.0.1:51948->127.0.0.53:53
/usr/sbin 1811148 amavis 17u IPv4 282648189 0t0 UDP 127.0.0.1:40822->127.0.0.53:53
/usr/sbin 1811171 amavis 17u IPv4 282639305 0t0 UDP 127.0.0.1:52933->127.0.0.53:53
/usr/sbin 1811245 amavis 16u IPv4 283277039 0t0 UDP 127.0.0.1:50352->127.0.0.53:53
My main point of using a VPN is to create a secure network layer for my servers to communicate without the need for https, which requires a lot of setup on each server, and since Plesk does it in a few clicks, making the Plesk server a reverse proxy has been my solution. Up until today 1am we've been using the centralised VPN Pritunl, which is based on OpenVPN, but we found that it was way too scary to rely on, as instead of just one load balancer being the point of failure, we also have the VPN server. So, when I heard of overlay VPN and P2P connectivity with WireGuard, as well as almost 0 reliance on the manager server being up, I immediately made the decision to transition. It was really straightforward, thankfully.
The command sudo lsof -i -P -n | grep :53 just checks which process is using port 53. Is the output you posted above from when your DNS is working, or from when it is not?
yes, this is the working state. I'm not going to try and break it now as it's 9am in London and people are waking up but I can test in about 15 hrs to see what it's like when the Root DNS is not working well.
OK then. Just to clarify, do your other 10 servers run a DNS server too?
I don't know whether this will fix your problem, but you can try adding your DNS server to a group such as nodns and then, in the NetBird dashboard, going to DNS > DNS Settings and adding the nodns group there. This will "break NetBird DNS" for that peer but leave your own DNS server running. (This is what I tried, and it kept my DNS server working.)
Yes, there is one other, which we use as a fallback; it also uses the named service. Here is the command output:
named 2966849 bind 6u IPv4 9421747 0t0 UDP ext.ipv4.of.server:53
named 2966849 bind 26u IPv4 9055427 0t0 UDP 127.0.0.1:53
named 2966849 bind 27u IPv4 9055428 0t0 UDP 127.0.0.1:53
named 2966849 bind 28u IPv4 9055429 0t0 TCP 127.0.0.1:53 (LISTEN)
named 2966849 bind 30u IPv4 9055430 0t0 TCP 127.0.0.1:53 (LISTEN)
named 2966849 bind 32u IPv4 9421750 0t0 UDP ext.ipv4.of.server:53
named 2966849 bind 33u IPv4 9421751 0t0 TCP ext.ipv4.of.server:53 (LISTEN)
named 2966849 bind 34u IPv4 9421752 0t0 TCP ext.ipv4.of.server:53 (LISTEN)
named 2966849 bind 35u IPv6 9421784 0t0 UDP [ext:ipv6:of:server]:53
named 2966849 bind 36u IPv6 9421785 0t0 UDP [ext:ipv6:of:server]:53
named 2966849 bind 37u IPv6 9422183 0t0 TCP [ext:ipv6:of:server]:53 (LISTEN)
named 2966849 bind 38u IPv6 9422184 0t0 TCP [ext:ipv6:of:server]:53 (LISTEN)
named 2966849 bind 40u IPv6 9055439 0t0 UDP [::1]:53
named 2966849 bind 41u IPv6 9055440 0t0 UDP [::1]:53
named 2966849 bind 42u IPv6 9055441 0t0 TCP [::1]:53 (LISTEN)
named 2966849 bind 43u IPv6 9055442 0t0 TCP [::1]:53 (LISTEN)
named 2966849 bind 48u IPv6 9055447 0t0 UDP [fe80::250:56ff:fe3d:c4f8]:53
named 2966849 bind 49u IPv6 9055448 0t0 UDP [fe80::250:56ff:fe3d:c4f8]:53
named 2966849 bind 50u IPv6 9055449 0t0 TCP [fe80::250:56ff:fe3d:c4f8]:53 (LISTEN)
named 2966849 bind 51u IPv6 9055450 0t0 TCP [fe80::250:56ff:fe3d:c4f8]:53 (LISTEN)
named 2966849 bind 59u IPv4 9423238 0t0 UDP 100.90.4.66:53
named 2966849 bind 63u IPv4 9423239 0t0 UDP 100.90.4.66:53
named 2966849 bind 64u IPv4 9423240 0t0 TCP 100.90.4.66:53 (LISTEN)
named 2966849 bind 65u IPv4 9423241 0t0 TCP 100.90.4.66:53 (LISTEN)
netbird 3091556 root 22u IPv4 9423420 0t0 UDP 127.0.0.153:53
netbird 3091556 root 29u IPv4 9449030 0t0 UDP *:53437
systemd-r 3091801 systemd-resolve 13u IPv4 9421763 0t0 UDP 127.0.0.53:53
systemd-r 3091801 systemd-resolve 14u IPv4 9421764 0t0 TCP 127.0.0.53:53 (LISTEN)
It's also Ubuntu 22.04.4 LTS, but strangely it doesn't have the same problem. I'll do more testing in 15 hrs on dev servers once traffic reduces to a minimum. It doesn't seem to affect the operation of most services, just some that use DNS to find the IP of DB servers.
Can you share /etc/resolv.conf from your server that runs the DNS server? Is it overwritten by NetBird?
getting this on both servers:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.
nameserver 127.0.0.53
search netbird.selfhosted
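To compare this file across peers without pasting it each time, something like the sketch below may help. Both function names are invented for illustration. The first prints the `nameserver` entries of a resolv.conf-style file, and the second reports (via its exit status) whether the systemd-resolved stub address 127.0.0.53 is among them, as it is in the output above.

```shell
# Hypothetical helpers for inspecting a resolv.conf-style file.
# $1: file path (defaults to /etc/resolv.conf)
resolv_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeeds if the systemd-resolved stub (127.0.0.53) is listed.
uses_stub_resolver() {
  resolv_nameservers "$1" | grep -qx '127\.0\.0\.53'
}

# Usage:
#   resolv_nameservers
#   resolv_nameservers /etc/resolv.conf.original.netbird
#   uses_stub_resolver && echo "queries go through systemd-resolved"
```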
Can you check what is in /etc/resolv.conf.original.netbird? And can you check the NetBird logs in /var/log/netbird/client.log for errors and warnings? Maybe others can solve your problem.
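A quick way to pull the relevant lines out of that log is sketched below. `netbird_log_problems` is a hypothetical helper, and the pattern is a guess: it assumes warnings and errors are marked with text containing "warn" or "erro" in any letter case, which may not match every client version.

```shell
# Hypothetical helper: show the most recent warning/error lines of the client log.
# $1: log file (defaults to the path mentioned in this thread)
# $2: number of matching lines to show (default 20)
netbird_log_problems() {
  grep -iE 'warn|erro' "${1:-/var/log/netbird/client.log}" | tail -n "${2:-20}"
}

# Usage:
#   sudo sh -c "$(declare -f netbird_log_problems 2>/dev/null); netbird_log_problems"  # bash only
#   or simply: sudo grep -iE 'warn|erro' /var/log/netbird/client.log | tail -n 20
```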
I don't think I can solve your problem myself, but you could try the workaround I mentioned earlier: add your DNS server to a group such as nodns, then in the NetBird dashboard go to DNS > DNS Settings and add the nodns group there. This will "break NetBird DNS" for that peer but leave your own DNS server running. (This is what I tried, and it kept my DNS server working.)
I have exactly the same issue: my local (custom) DNS stops resolving after installing the NetBird agent on an Ubuntu machine.
Funnily enough, the issue appeared to "resolve" on its own. I don't know if it's actually resolved or whether something got changed, but I can say with absolute certainty that the management server software hasn't been updated, only the clients. One other thing I did was configure Access Control and completely disable the "All" group. Now the "Servers" group only has access to other servers; everything else is locked down.
@rihards-simanovics - were you able to solve this issue with any workarounds?
The only logical solution is to create a DNS server policy that allows the use of 127.0.0.53:53. Also, I didn't see this affecting more modern OSes such as Ubuntu 24.04; that said, the ones I had are not DNS servers, so I can't say for sure yet.
Describe the problem
All DNS queries to 127.0.0.53:53 fail with a timeout. Related to issue https://github.com/netbirdio/netbird/issues/2186
To Reproduce
Steps to reproduce the behavior:
Expected behavior
when running the nslookup on google.com this should come up:
Actual Behaviour
All queries to root DNS fail with a timeout
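These are not the reporter's reproduction steps, but one way to tell a timeout apart from a normal answer when testing individual resolvers is sketched here. It assumes `dig` is installed (the thread itself uses nslookup); the function names are made up, and the classification relies on the usual wording of dig's status and timeout lines.

```shell
# Hypothetical helpers for probing one resolver at a time.
# $1: resolver address, $2: name to look up
query_resolver() {
  dig @"$1" "$2" +time=2 +tries=1 2>&1
}

# Reads dig output on stdin and prints: timeout, ok, or other.
classify_dig_output() {
  awk '
    /timed out|no servers could be reached/ { timeout = 1 }
    /status: NOERROR/                       { ok = 1 }
    END {
      if (timeout) print "timeout"
      else if (ok) print "ok"
      else print "other"
    }
  '
}

# Usage: compare the systemd-resolved stub with the NetBird listener
#   query_resolver 127.0.0.53  google.com | classify_dig_output
#   query_resolver 127.0.0.153 google.com | classify_dig_output
```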
Are you using NetBird Cloud?
negative - all selfhosted.
NetBird management server version:
unknown - latest as of 26th july 24
NetBird client version:
0.28.6
NetBird status -d output:
Screenshots
No screenshots; please see the outputs above.
Additional context
Running on Ubuntu Server 22.04.4 LTS, with Plesk Obsidian 18.0.62 Update #2 (Web Host Edition) and DNS BIND. Related to issue #2186.