juanfont / headscale

An open source, self-hosted implementation of the Tailscale control server
BSD 3-Clause "New" or "Revised" License

[BUG] Lose Update to 0.23.0 (lost mtu) #2250

Open pstvasko opened 23 hours ago

pstvasko commented 23 hours ago

Is this a support request?

Is there an existing issue for this?

Current Behavior

The tunnel breaks on gate1.

Expected Behavior

Traffic should not be lost.

Steps To Reproduce

Hi. After updating to version 0.23.0, there is an issue in the Tailscale network. I have a complex network connecting two Tailscale installations: 100.64.0.0 - headscale1 - gate1 - gate2 - headscale2 - 100.80.0.0

When I download between 100.64.0.0 and 100.80.0.0, the speed peaks at 2 Gbps and then problems start: packets stop flowing on the 100.64.0.0 - headscale1 segment (although pings work if I reduce the MTU to 932).
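
For anyone trying to reproduce the MTU observation, here is a minimal sketch of the arithmetic behind a "don't fragment" ping test. It assumes IPv4 with no IP options (20-byte header) and the standard 8-byte ICMP echo header, and it assumes Linux iputils `ping` (`-M do`, `-s`, `-c`). The target `100.80.0.2` is taken from the logs in this report; 1280 is Tailscale's default interface MTU and is included only for comparison. The function names are illustrative, not part of any tool.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented packet of size `mtu`."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return payload


def ping_command(host: str, mtu: int, count: int = 3) -> list[str]:
    """Build a Linux iputils ping command that forbids fragmentation (-M do)."""
    return ["ping", "-M", "do", "-s", str(icmp_payload_for_mtu(mtu)), "-c", str(count), host]


if __name__ == "__main__":
    # Compare the default Tailscale MTU with the reduced value from this report.
    for mtu in (1280, 932):
        print(mtu, " ".join(ping_command("100.80.0.2", mtu)))
```

If the command built for 932 gets replies while larger sizes do not, that would suggest the usable path MTU on that segment is below what the clients assume, which narrows the search to the path rather than the control server.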

There are about 500 clients in the network. Could you advise in which direction I should investigate?

Environment

- OS: AlmaLinux8
- Headscale version: 0.23.0
- Tailscale version: 1.76.6

Runtime environment

Anything else?

Headscale


```
2024-11-21T21:31:33Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:35Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:37Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:38Z ERR Failed to fetch node from the database with node key: nodekey:10715a5defd407c11146b436449e3fdc771d8e4adc68b8dac0077e5e3d64d370 handler=NoisePollNetMap
2024-11-21T21:31:39Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:40Z INF home/runner/work/headscale/headscale/hscontrol/auth_noise.go:44 > unsupported client connected client_version=58 min_version=61
2024-11-21T21:31:41Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:41Z INF home/runner/work/headscale/headscale/hscontrol/auth_noise.go:44 > unsupported client connected client_version=58 min_version=61
2024-11-21T21:31:42Z INF home/runner/work/headscale/headscale/hscontrol/auth.go:28 > Successfully sent auth url: https://headscale.*****/oidc/register/mkey:ad30ca2d2f62ca426624930d6455211e40554add598c1a99420ffc8e6a2d8c0c expiry=-62135596800 followup=https://headscale.*****/oidc/register/mkey:ad30ca2d2f62ca426624930d6455211e40554add598c1a99420ffc8e6a2d8c0c machine_key=[rTDKL] node=vm-po4 node_key=[QxxVm] node_key_old=[bYjMr]
2024-11-21T21:31:43Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:45Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
```
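
With about 500 clients, it may help to check whether the "update not sent" errors are confined to one node or spread across many. The following is a rough sketch that counts those errors per `node.id`. It assumes the log line layout shown in the excerpt above; the function name and the sample list are illustrative only, and the sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Matches the "node.id=<number>" field as it appears in the excerpt above.
NODE_ID = re.compile(r"\bnode\.id=(\d+)")


def count_unsent_updates(lines):
    """Count 'update not sent' errors per node ID in Headscale log lines."""
    counts = Counter()
    for line in lines:
        if "update not sent" not in line:
            continue
        match = NODE_ID.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    '2024-11-21T21:31:33Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771',
    '2024-11-21T21:31:35Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771',
    "2024-11-21T21:31:40Z INF home/runner/work/headscale/headscale/hscontrol/auth_noise.go:44 > unsupported client connected client_version=58 min_version=61",
]

if __name__ == "__main__":
    for node_id, hits in count_unsent_updates(sample).most_common():
        print(f"node.id={node_id}: {hits} unsent updates")
```

In the excerpt, every such error refers to node 771, so running this over the full log would show whether other nodes are affected as well.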

tailscale:

```
22 00:31:10 tailscaled[1378041]: wgengine: idle peer [Jpvng] now active, reconfiguring WireGuard
22 00:31:10 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 70/459 peers)
22 00:31:20 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 69/459 peers)
22 00:31:35 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:44438 => 100.80.0.2:9188) got RST by peer
22 00:31:38 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:55990 => 100.80.0.2:9187) got RST by peer
22 00:31:38 tailscaled[1378041]: control: NetInfo: NetInfo{varies=false hairpin= ipv6=false ipv6os=false udp=true icmpv4=false derp=#999 portmap= link="" firewallmode="ipt-default"}
22 00:31:53 tailscaled[1378041]: wgengine: idle peer [Qif6f] now active, reconfiguring WireGuard
22 00:31:53 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 69/459 peers)
22 00:32:05 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:36450 => 100.80.0.2:9188) got RST by peer
22 00:32:08 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:43332 => 100.80.0.2:9187) got RST by peer
22 00:32:10 tailscaled[1378041]: wgengine: idle peer [qoqFH] now active, reconfiguring WireGuard
22 00:32:10 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 70/459 peers)
```

kradalby commented 10 hours ago

Have you changed the Tailscale version on these nodes recently? I am not ruling out that some parameter changed in Headscale, but I would be surprised if such a change had any real impact on the client side. If I had to guess, a change on the client is the more likely cause.

Can you test several Tailscale versions against several Headscale versions?