zerotier / ZeroTierOne

A Smart Ethernet Switch for Earth
https://zerotier.com

1.10.2 stops networking (Debian arm64/amd64, macOS Ventura 13.1, Windows 10 - 22H2) - not connecting to "control panel" #1859

Open hvisage opened 1 year ago

hvisage commented 1 year ago

If you still want to file a Bug Report

Please let us know

Working networking and nodes connecting to network, control panel, etc. with moons to assist in the remote setup

1.10.2 nodes stop communicating and don't allow network flows, even when peers are listed and traffic is flowing on port 9993; they still don't show up on the controller interface.

Install 1.10.2. Sometimes it stops working, and then that PC/VM/RPi doesn't connect back; even removing, reinstalling, and clearing old keys doesn't help. I've orbited with moons, and I've deorbited and reorbited, but that node just doesn't want to connect.

Reverted to 1.8.9/1.8.10 and the nodes are back online

I've asked, but haven't seen or found any way to enable debugging information that I could report to the developers to assist.

The latest 1.10.2 is the failing version, on: Windows 10 22H2 (Parallels VM), macOS 13.1 Ventura, and Debian 11.

hvisage commented 1 year ago

Before filing a Bug Report

I'll comment on these: Using these will ensure you get quicker support, and make this space available for code-related issues. Thank you!

If you are having a connection issue, it's much easier to diagnose through the discussion forum or the ticket system.

laduke commented 1 year ago

Thanks for writing. Which thread is yours on the forum?

hvisage commented 1 year ago

Threads I believe are all related to this issue:

https://discuss.zerotier.com/t/correct-moon-update-procedure/11086/2
https://discuss.zerotier.com/t/zero-tier-1-10-1-machines-are-getting-disconnected-from-the-zerotier/11344/2
https://discuss.zerotier.com/t/node-loses-connectivity-but-can-see-peers-moons-and-data-flowing-to-fro-9993/11201
https://discuss.zerotier.com/t/zt-latest-pi-zerotier-cli-ssh-disconnects-on-interface-after-seconds/10972/2
https://discuss.zerotier.com/t/zerotier-cli-says-that-my-device-is-offline-after-successfully-joining/11118/2

There are some more threads about this issue, but I got silenced on the forum for highlighting this.

abirvepete commented 1 year ago

I have the same issue too, but I found more information, or possibly a different problem. ZeroTier One (1.10) stops shortly after I start the zerotierone service; the service only works for about 3 seconds. During that time my planet receives my Windows machine's information (IP, ZeroTier version, and latency), and Windows shows the same as the planet does. But netstat -aon |findstr "9993" shows something wrong, like 127.0.0.1:xxxx 127.0.0.1:9993 SYN_SENT xxxx, during the working time, and then it stops. The network adapters list shows that the ZeroTier adapter failed.
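A SYN_SENT entry toward 127.0.0.1:9993 means a local client tried to open a TCP connection to that port and the handshake has not completed. For readers who want to repeat that check on any of the affected platforms, here is a minimal sketch. It assumes the local ZeroTier service is expected to accept TCP connections on 127.0.0.1:9993; the helper name `tcp_port_open` is hypothetical and not part of ZeroTier.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) completes within `timeout` seconds."""
    try:
        # create_connection performs the full TCP handshake; a refused or
        # timed-out attempt raises OSError (socket.timeout is a subclass of it).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `tcp_port_open("127.0.0.1", 9993)` while the service is supposed to be running gives a quick yes/no answer. A `False` result only shows that nothing accepted the connection in time; it does not say why.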

joseph-henry commented 1 year ago

@hvisage Thanks for your persistence. Can you try clearing the contents of your peers.d before starting ZeroTier? This forces ZeroTier to restart the peer discovery process. In the past we've known about caching issues and I'm wondering if there's another we don't know about. This isn't a solution but it's something that could help diagnose.

Also, can you send us a zerotier-cli dump of two peers that cannot talk when they should be talking?
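For readers following along, the peers.d step suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not an official procedure: the function name `clear_peers_cache` is hypothetical, the caller supplies the ZeroTier home directory (commonly /var/lib/zerotier-one on Linux, but verify this for your install), the service has already been stopped, and the script runs with sufficient privileges.

```python
import shutil
from pathlib import Path


def clear_peers_cache(home: Path) -> int:
    """Delete every entry inside <home>/peers.d and return how many were removed.

    Only the contents of peers.d are touched; the directory itself and other
    files in the home directory (identity, moons.d, ...) are left alone.
    """
    peers_dir = Path(home) / "peers.d"
    if not peers_dir.is_dir():
        return 0  # nothing cached, or wrong home directory
    removed = 0
    for entry in peers_dir.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed
```

A typical sequence would be: stop the ZeroTier service, call `clear_peers_cache(Path("/var/lib/zerotier-one"))` (path assumed, see above), then start the service again so that peer discovery restarts from scratch.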

hvisage commented 1 year ago

Hi Joseph,

Thank you for getting back.

At present I've reverted to 1.8.9/10 on all the units, and they've been stable and are being actively used.

I'll now have to wait for a quieter time to reload 1.10.2 and get back to you.

What I CAN tell you is that it's not just two units unable to communicate with each other; it's units that can't communicate with the "controller", i.e. they are not shown on the control panel and thus can't talk to the network. I suspect it is much more a controller-related problem than an inter-peer problem, as the peer isn't shown on the control panel when there are no comms.

I believe I've deorbited and cleared peers, and I've even totally scrapped and reinstalled ZeroTier, with similar results. Sometimes the units just stopped functioning sooner, if they came up at all.

On 27 Jan 2023, at 23:24, Joseph Henry wrote:

@hvisage Thanks for your persistence. Can you try clearing the contents of your peers.d before starting ZeroTier? This forces ZeroTier to restart the peer discovery process. In the past we've known about caching issues and I'm wondering if there's another we don't know about. This isn't a solution but it's something that could help diagnose. Also, can you send us a zerotier-cli dump of two peers that cannot talk when they should be talking?

hevisko commented 1 year ago

Can you try clearing the contents of your peers.d before starting ZeroTier? This forces ZeroTier to restart the peer discovery process. In the past we've known about caching issues and I'm wondering if there's another we don't know about. This isn't a solution but it's something that could help diagnose.

@joseph-henry current tests show that to be a way to "fix" things with 1.10.2: I upgraded 1.8.9 to 1.10.2 and it failed; I cleared peers.d/* and restarted, and it "came right".

On my macAir (Intel), 1.10.2 "connected" to the controller, but no traffic was flowing. Removing peers.d/* and moons.d/* didn't immediately make a difference, but the moment I orbited the moons I've set up, traffic started to flow.

Should I still upload some zerotier dumps?

joseph-henry commented 1 year ago

I'm glad to hear that this allowed traffic to flow again, but I'm still very interested in getting to the bottom of this, so yes, dumps of two peers when they can't communicate would be helpful. Please send them via the ticketing system I linked above. It's confidential, so please leave the IP addresses in the dump. Thanks for helping out.

esimotas commented 1 year ago

I have uploaded dumps from a Debian 11 node running 1.10.2 with the issue, in ticket ZT-4831 (one taken while offline, and another after deleting peers.d and restarting the service, at which point it became 'online').

hvisage commented 1 year ago

@joseph-henry Ticket created and dumps uploaded: ZT-4890