zerotier / ZeroTierOne

A Smart Ethernet Switch for Earth
https://zerotier.com

zerotier network stuck in REQUESTING_CONFIGURATION status or cannot connect to any node in the zt network #1757

Open san3Xian opened 2 years ago

san3Xian commented 2 years ago

Sometimes ZeroTier cannot connect to any node in the ZT network, and stopping and restarting zerotier-one does not restore connectivity. At that point, the output of zerotier-cli info is:

# zerotier-cli info
200 info xxxxxxx 1.10.1 ONLINE
# zerotier-cli info -j
{
 "address": "xxxxxxxx",
 "clock": 1662482290098,
 "config": {
  "settings": {
   "allowSecondaryPort": true,
   "allowTcpFallbackRelay": true,
   "listeningOn": [
    "172.17.0.1/53",
    "192.168.4.5/53",
    "172.17.0.1/80",
    "192.168.4.5/80",
    "172.17.0.1/29009",
    "192.168.4.5/29009",
    "xxxxxxxxxxxxxxxx"
   ],
   "portMappingEnabled": true,
   "primaryPort": 80,
   "secondaryPort": 53,
   "softwareUpdate": "disable",
   "softwareUpdateChannel": "release",
   "tcpFallbackActive": true,
   "tertiaryPort": 29009
  }
 },
 "online": true,
 "planetWorldId": 149604618,
 "planetWorldTimestamp": 1644592324813,
 "publicIdentity": "xxxxxxxxxxxxxx,
 "tcpFallbackActive": false,
 "version": "1.10.1",
 "versionBuild": 0,
 "versionMajor": 1,
 "versionMinor": 10,
 "versionRev": 1
}

However, ZeroTier Central shows the node as offline. I then ran the zerotier-cli leave and join commands to rejoin the network, after which zerotier-cli listnetworks reported REQUESTING_CONFIGURATION. Interestingly, if I deorbit and re-orbit all the moon nodes at this point, everything goes back to normal. Before doing that, I confirmed with zerotier-cli peers that the moon nodes looked healthy (DIRECT, with the correct IP addresses). I wrote a command to do this in bulk:

moon_nodes=($(zerotier-cli peers | awk '/MOON/{print $1}')) && for i in "${moon_nodes[@]}"; do zerotier-cli deorbit "$i"; done && for i in "${moon_nodes[@]}"; do zerotier-cli orbit "$i" "$i"; done

zerotier-one version: Found in both versions 1.10.1 and 1.8.9

mokeyish commented 2 years ago

Hi, you can create a new network and use the new network ID, but I also think this is a bug. Why can't the old network ID connect? It stays in REQUESTING_CONFIGURATION.

mokeyish commented 2 years ago

If all nodes in the network are restarted, they all show the status REQUESTING_CONFIGURATION.

JocPelletier commented 1 year ago

I have the same issue. My node is on the network and I can see it online with an IP on the ZeroTier dashboard, but the client is stuck at REQUESTING_CONFIGURATION.

I created a new ZT network, joined it, and got an IP. I have no special config on either network.

hevisko commented 1 year ago

I'm stuck with the same problem. Is there a way to enable packet debugging on Linux (RPi) and macOS to see why it's not connecting to the controller?

Ke1i commented 1 year ago

Same here for me on Debian.

xxxf@debbie:~$ sudo zerotier-cli listnetworks
200 listnetworks <nwid> <name> <mac> <status> <type> <dev> <ZT assigned ips>

xxxf@debbie:~$ sudo zerotier-cli join xxxxxxxxxxxxxx27
200 join OK

xxxf@debbie:~$ sudo zerotier-cli listnetworks
200 listnetworks <nwid> <name> <mac> <status> <type> <dev> <ZT assigned ips>
200 listnetworks xxxxxxxxxxxxxx27  xx:xx:xx:xx:xx:d7 REQUESTING_CONFIGURATION PRIVATE ztwdjaphiq -

xxxf@debbie:~$ sudo zerotier-cli info
200 info a7a758622c 1.10.2 ONLINE

xxxf@debbie:~$ sudo zerotier-cli peers
200 peers
<ztaddr>   <ver>  <role> <lat> <link> <lastTX> <lastRX> <path>
62f865ae71 -      PLANET    -1 RELAY
778cde7190 -      PLANET    -1 RELAY
cafe04eba9 -      PLANET    -1 RELAY
cafe9efeb9 -      PLANET    -1 RELAY

xxxf@debbie:~$ sudo lsof -i :9993
COMMAND   PID         USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
zerotier- 873 zerotier-one    6u  IPv4  19053      0t0  TCP localhost:9993 (LISTEN)
zerotier- 873 zerotier-one    7u  IPv6  19054      0t0  TCP localhost:9993 (LISTEN)
zerotier- 873 zerotier-one   12u  IPv4  19071      0t0  UDP 192.168.1.2:9993 
zerotier- 873 zerotier-one   13u  IPv4  19072      0t0  TCP 192.168.1.2:9993 (LISTEN)
zerotier- 873 zerotier-one   18u  IPv4  19750      0t0  UDP 172.16.16.1:9993 
zerotier- 873 zerotier-one   19u  IPv4  19751      0t0  TCP 172.16.16.1:9993 (LISTEN)

laduke commented 1 year ago

Hello, it looks like your node can't connect to anything over UDP. This is a common firewall issue and likely not the same issue as the original report. Check your Debian firewall and internet router firewall.
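
For readers following along, the diagnosis above rests on the `zerotier-cli peers` table: every PLANET row shows latency -1 and a RELAY link, with no DIRECT path. A minimal sketch of that check is below. The function names `parse_peers` and `has_direct_root` are hypothetical helpers, and the column layout is assumed to match the output pasted in this thread.

```python
# Sample copied from the `zerotier-cli peers` output earlier in this thread.
SAMPLE = """200 peers
<ztaddr>   <ver>  <role> <lat> <link> <lastTX> <lastRX> <path>
62f865ae71 -      PLANET    -1 RELAY
778cde7190 -      PLANET    -1 RELAY
cafe04eba9 -      PLANET    -1 RELAY
cafe9efeb9 -      PLANET    -1 RELAY
"""


def parse_peers(text):
    """Parse the table printed by `zerotier-cli peers` into a list of dicts.

    Assumes the column order <ztaddr> <ver> <role> <lat> <link> ... shown above.
    """
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the "200 peers" status line, the <header> row, and blank lines.
        if len(fields) < 5 or fields[0].startswith("<"):
            continue
        peers.append({
            "address": fields[0],
            "role": fields[2],
            "latency": int(fields[3]),
            "link": fields[4],
        })
    return peers


def has_direct_root(peers):
    """Return True if at least one PLANET peer has a DIRECT link."""
    return any(p["role"] == "PLANET" and p["link"] == "DIRECT" for p in peers)


# False for this sample: all four roots are reached via RELAY only.
print(has_direct_root(parse_peers(SAMPLE)))
```

On a live node, the same functions could be fed the captured standard output of `zerotier-cli peers` instead of the sample string.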

Ke1i commented 1 year ago

Hello, it looks like your node can't connect to anything over UDP. This is a common firewall issue and likely not the same issue as the original report. Check your Debian firewall and internet router firewall.

I don't have any firewalls installed. My phone (Android) on the same network can connect to the ZT network without issue. Only the Debian machine and a Linux Mint laptop (LMDE 5) can't. I'm getting the same "REQUESTING_CONFIGURATION PRIVATE.." error on both of them.

joseph-henry commented 1 year ago

In OP's case, the TCP fallback relay was active, indicating that UDP traffic was blocked by something. The relay state reported by your node indicates the same.
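
As a hedged aid for checking this on your own node: in the `zerotier-cli info -j` output pasted in the original report, `tcpFallbackActive` appears in two places (under `config.settings` and at the top level), and the two values differ there. The sketch below simply reads both rather than assuming which one reflects the live state. `tcp_fallback_flags` is a hypothetical helper name.

```python
import json


def tcp_fallback_flags(info_json):
    """Extract both tcpFallbackActive values from `zerotier-cli info -j` output.

    Returns None for a field that is absent.
    """
    info = json.loads(info_json)
    settings = info.get("config", {}).get("settings", {})
    return {
        "top_level": info.get("tcpFallbackActive"),
        "settings": settings.get("tcpFallbackActive"),
    }


# Trimmed-down version of the JSON from the original report.
example = """
{
 "config": {"settings": {"tcpFallbackActive": true}},
 "online": true,
 "tcpFallbackActive": false
}
"""
print(tcp_fallback_flags(example))
```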

Can you run an iperf test between two machines on your LAN to ensure that UDP is passing unobstructed?

iperf3 --server
iperf3 --udp --client xxx.xxx.xxx.xxx --bitrate 1M

That will only catch issues between devices on your network, but there may still be something blocking UDP to/from your network.

Ke1i commented 1 year ago

From the Debian machine to the Linux Mint laptop:

xxxf@debbie:~$ iperf3 --udp --client 192.168.1.109 --bitrate 1M
Connecting to host 192.168.1.109, port 5201
[  5] local 192.168.1.2 port 39364 connected to 192.168.1.109 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   1.00-2.00   sec   122 KBytes   997 Kbits/sec  86  
[  5]   2.00-3.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   3.00-4.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   4.00-5.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   5.00-6.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   6.00-7.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   7.00-8.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   8.00-9.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec  87  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.19 MBytes  1.00 Mbits/sec  0.000 ms  0/864 (0%)  sender
[  5]   0.00-10.04  sec  1.19 MBytes   996 Kbits/sec  0.502 ms  0/864 (0%)  receiver

iperf Done.

I also tried a docker image but still got the dreaded "REQUESTING_CONFIGURATION PRIVATE.." error despite getting a couple of ip addresses and a LEAF under "zerotier-cli peers". Below is the config I used:

sudo podman run -dt --name=zerotier --device=/dev/net/tun \
    --net=host \
    --cap-add=NET_ADMIN \
    --cap-add=SYS_ADMIN \
    -v /containers/zerotier:/var/lib/zerotier-one \
    -e NETWORK_ID=xxxxxxxxxxxxxx27 \
    docker.io/spikhalskiy/zerotier

Anyway I gave up and switched to Nebula.

joseph-henry commented 1 year ago

@hevisko @mokeyish @JocPelletier @Ke1i I've found one of my own nodes exhibiting similar issues and I believe our upcoming version 1.10.3 contains a patch to fix this, specifically relating to duplicate paths.

If anyone still has a node unable to get configuration I'd love to know if you see duplicate entries in the output of:

zerotier-cli -j listpeers
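
For anyone who wants to scan that JSON rather than read it by eye, here is a minimal sketch. It assumes "duplicate entries" means the same path `address` listed more than once under a single peer, which is an interpretation and not something confirmed in this thread. `duplicate_paths` is a hypothetical helper name.

```python
import json
from collections import Counter


def duplicate_paths(listpeers_json):
    """Map each peer address to the path addresses that appear more than once.

    Input is the JSON array printed by `zerotier-cli -j listpeers`.
    Peers without repeated paths are omitted from the result.
    """
    dupes = {}
    for peer in json.loads(listpeers_json):
        counts = Counter(path["address"] for path in peer.get("paths", []))
        repeated = sorted(addr for addr, n in counts.items() if n > 1)
        if repeated:
            dupes[peer["address"]] = repeated
    return dupes
```

An empty dict means no peer lists the same path address twice under this assumption.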

Ke1i commented 1 year ago

Because I've uninstalled zt from my machines, I was able to test the above via the docker image I mentioned earlier. I created a test network and did some testing:

/ # zerotier-cli info
200 info cd71ad3c0c 1.10.2 ONLINE

/ # zerotier-cli listpeers
200 listpeers <ztaddr> <path> <latency> <version> <role>
200 listpeers 62f865ae71 - -1 - PLANET
200 listpeers 778cde7190 103.195.103.66/9993;6795;161594 326 - PLANET
200 listpeers cafe04eba9 84.17.53.155/9993;334;161627 293 - PLANET
200 listpeers cafe9efeb9 104.194.8.134/9993;6795;161555 365 - PLANET

/ # zerotier-cli listnetworks
200 listnetworks <nwid> <name> <mac> <status> <type> <dev> <ZT assigned ips>
200 listnetworks 17d709436c56d5fb  fa:18:27:c1:7f:05 REQUESTING_CONFIGURATION PRIVATE ztks5v7fux -

/ # zerotier-cli -j listpeers
[
 {
  "address": "62f865ae71",
  "isBonded": false,
  "latency": -1,
  "paths": [],
  "role": "PLANET",
  "version": "-1.-1.-1",
  "versionMajor": -1,
  "versionMinor": -1,
  "versionRev": -1
 },
 {
  "address": "778cde7190",
  "isBonded": false,
  "latency": 326,
  "paths": [
   {
    "active": true,
    "address": "103.195.103.66/9993",
    "expired": false,
    "lastReceive": 1676440793475,
    "lastSend": 1676440993415,
    "localSocket": 140294304266864,
    "preferred": true,
    "trustedPathId": 0
   }
  ],
  "role": "PLANET",
  "version": "-1.-1.-1",
  "versionMajor": -1,
  "versionMinor": -1,
  "versionRev": -1
 },
 {
  "address": "cafe04eba9",
  "isBonded": false,
  "latency": 293,
  "paths": [
   {
    "active": true,
    "address": "84.17.53.155/9993",
    "expired": false,
    "lastReceive": 1676440793442,
    "lastSend": 1676440998873,
    "localSocket": 140294304266864,
    "preferred": true,
    "trustedPathId": 0
   }
  ],
  "role": "PLANET",
  "version": "-1.-1.-1",
  "versionMajor": -1,
  "versionMinor": -1,
  "versionRev": -1
 },
 {
  "address": "cafe9efeb9",
  "isBonded": false,
  "latency": 365,
  "paths": [
   {
    "active": true,
    "address": "104.194.8.134/9993",
    "expired": false,
    "lastReceive": 1676440793514,
    "lastSend": 1676440993415,
    "localSocket": 140294303200304,
    "preferred": true,
    "trustedPathId": 0
   }
  ],
  "role": "PLANET",
  "version": "-1.-1.-1",
  "versionMajor": -1,
  "versionMinor": -1,
  "versionRev": -1
 }
]

Again, no joy.

[Screenshot: zerotier_no_members]

I tried adding the node manually:

[Screenshot: zerotier_manual_join]

jjqtony commented 1 year ago

Same here for me on Ubuntu 22.04.

jjqtony commented 1 year ago

I have fixed it by deleting the managed ip in my.zerotier.com

mokeyish commented 1 year ago

Hi, you can create a new network and use the new network ID, but I also think this is a bug. Why can't the old network ID connect? It stays in REQUESTING_CONFIGURATION.

This happens because the network controller node is offline. You can see the connection failure with zerotier-cli peers. Even if your devices complete NAT traversal successfully, they still cannot ping each other. My earlier workaround was to create a new network, but reconfiguring everything was a hassle. Recently I found that turning one of my own devices into a network controller and keeping it online avoids the problem.

Everyone, you can try the ZeroTier controller web UI I recently developed, ZeroTier Edge (lightweight, no Docker/Node.js, less than 5 MB), which makes it easy to manage a network controller on your own devices. The UI is currently similar to my.zerotier.com, and it may be adjusted in the future if there is a better solution.

webgtx commented 1 year ago

I have the same issue on Fedora 37.

I'm pretty sure the problem is not on the firewall side, because a Windows machine on the same network works just fine.

Firewall Rules:

Status: active

To                         Action      From
--                         ------      ----
SSH                        ALLOW       Anywhere                  
224.0.0.251 mDNS           ALLOW       Anywhere                  
80                         ALLOW       Anywhere                  
22                         ALLOW       Anywhere                   # SSH
9993/tcp                   ALLOW       Anywhere                  
9993/udp                   ALLOW       Anywhere                  
SSH (v6)                   ALLOW       Anywhere (v6)             
ff02::fb mDNS              ALLOW       Anywhere (v6)             
80 (v6)                    ALLOW       Anywhere (v6)             
22 (v6)                    ALLOW       Anywhere (v6)              # SSH
9993/tcp (v6)              ALLOW       Anywhere (v6)             
9993/udp (v6)              ALLOW       Anywhere (v6)             

webgtx commented 1 year ago

I tried creating a new network as @mokeyish suggested above, but it didn't help.

zerotier-cli listnetworks

200 listnetworks <nwid> <name> <mac> <status> <type> <dev> <ZT assigned ips>
200 listnetworks $MAC  $MAC2 REQUESTING_CONFIGURATION PRIVATE $NETID -
200 listnetworks $MAC  $MAC2 REQUESTING_CONFIGURATION PRIVATE $NETID -

mokeyish commented 1 year ago

I tried creating a new network as @mokeyish suggested above, but it didn't help.

zerotier-cli listnetworks

200 listnetworks <nwid> <name> <mac> <status> <type> <dev> <ZT assigned ips>
200 listnetworks $MAC  $MAC2 REQUESTING_CONFIGURATION PRIVATE $NETID -
200 listnetworks $MAC  $MAC2 REQUESTING_CONFIGURATION PRIVATE $NETID -

First, at least one planet should be connected. Use zerotier-cli peers to check the connection.

Then, all devices must be able to reach the controller node that hosts the network ID. If they can't, the IP of the controller node for the network you created may be blocked by your ISP.
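
A rough way to script the second check: a ZeroTier network ID is 16 hex digits, and its first 10 digits are the controller's ZeroTier address. The sketch below derives that address and looks for it in the first column of `zerotier-cli peers` output. `controller_address` and `controller_in_peers` are hypothetical helper names, and treating the controller's absence from the peer list as a sign of trouble is an assumption, not a definitive test.

```python
def controller_address(network_id):
    """Return the controller's 10-digit ZeroTier address for a 16-digit network ID."""
    nwid = network_id.strip().lower()
    if len(nwid) != 16 or any(c not in "0123456789abcdef" for c in nwid):
        raise ValueError("network ID must be 16 hex digits")
    return nwid[:10]


def controller_in_peers(network_id, peers_text):
    """Check whether the controller appears in `zerotier-cli peers` output.

    Only the first column (<ztaddr>) of each row is compared.
    """
    ctrl = controller_address(network_id)
    return any(line.split()[:1] == [ctrl] for line in peers_text.splitlines())


# Network ID taken from the listnetworks output earlier in this thread.
print(controller_address("17d709436c56d5fb"))  # 17d709436c
```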

Use zerotier-edge https://github.com/mokeyish/zerotier-edge/releases/tag/0.1.2 to turn the ZeroTier instance on your device into a self-hosted controller node.

mokeyish commented 1 year ago

I have encountered this problem twice (last year and this year). I don't want to create a new network anymore, and reconfiguring it is quite troublesome. I looked for other ZeroTier UIs and found that they were all deployed with Docker/Node.js. They were too heavy and took up a lot of resources, so I took the time to write zerotier-edge using Rust and SolidJS. Now I deploy it on my router to manage my private network of several devices.

andrewgdotcom commented 1 year ago

This has suddenly started happening for us in the last two weeks, and several people have reported identical problems. Upgrading/reinstalling ZT and/or rebooting the machine sometimes helps, but not always.