projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

Intermittent connection timeouts with wireguard #9496

Open iheng opened 3 days ago

iheng commented 3 days ago

Hi, we're seeing intermittent connection timeouts with Calico (3.28.2) when WireGuard is enabled. All worker nodes are affected when WireGuard is enabled. We only use the WireGuard feature and have no network policy in place. We have set wireguardHostEncryptionEnabled: true.

We ran tcpdump on the source worker node during a timeout:

 1952 273.760089 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 22000 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517018028 TSecr=0 WS=128
 1953 275.780198 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 22000 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517020048 TSecr=0 WS=128
 1954 276.853040 10.101.183.71 → 10.101.105.134 TCP 76 13915 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517021121 TSecr=0 WS=128
 1955 277.860001 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 13915 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517022128 TSecr=0 WS=128
 1956 279.872015 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 13915 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517024140 TSecr=0 WS=128
 1957 280.970319 10.101.183.71 → 10.101.105.134 TCP 76 56769 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517025238 TSecr=0 WS=128
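To make the pattern in the capture easier to read, here is a minimal Python sketch that groups the SYN packets by flow and counts attempts. It assumes the summary-line format shown above; the names `LINE_RE`, `CAPTURE`, and `summarize_syns` are hypothetical helpers, not part of Calico or tcpdump.

```python
import re
from collections import defaultdict

# Matches the summary lines from the capture above (format assumed from this issue).
LINE_RE = re.compile(
    r"^\s*\d+\s+(?P<time>\d+\.\d+)\s+(?P<src>\S+)\s+→\s+(?P<dst>\S+)\s+TCP\s+\d+\s+"
    r"(?P<retrans>\[TCP Retransmission\]\s+)?"
    r"(?P<sport>\d+)\s+→\s+(?P<dport>\d+)\s+\[(?P<flags>[^\]]+)\]"
)

# The six lines posted in this issue, copied verbatim.
CAPTURE = """
 1952 273.760089 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 22000 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517018028 TSecr=0 WS=128
 1953 275.780198 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 22000 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517020048 TSecr=0 WS=128
 1954 276.853040 10.101.183.71 → 10.101.105.134 TCP 76 13915 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517021121 TSecr=0 WS=128
 1955 277.860001 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 13915 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517022128 TSecr=0 WS=128
 1956 279.872015 10.101.183.71 → 10.101.105.134 TCP 76 [TCP Retransmission] 13915 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517024140 TSecr=0 WS=128
 1957 280.970319 10.101.183.71 → 10.101.105.134 TCP 76 56769 → 8080 [SYN] Seq=0 Win=62307 Len=0 MSS=8901 SACK_PERM=1 TSval=3517025238 TSecr=0 WS=128
"""


def summarize_syns(lines):
    """Count SYNs and retransmitted SYNs per (src, sport, dst, dport) flow."""
    flows = defaultdict(lambda: {"syns": 0, "retransmissions": 0, "answered": False})
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank or unrecognized lines
        flags = {f.strip() for f in m["flags"].split(",")}
        key = (m["src"], int(m["sport"]), m["dst"], int(m["dport"]))
        if flags == {"SYN"}:
            flows[key]["syns"] += 1
            if m["retrans"]:
                flows[key]["retransmissions"] += 1
        elif flags == {"SYN", "ACK"}:
            # A SYN-ACK travels in the reverse direction of the flow it answers.
            reverse = (m["dst"], int(m["dport"]), m["src"], int(m["sport"]))
            flows[reverse]["answered"] = True
    return dict(flows)


if __name__ == "__main__":
    for flow, stats in summarize_syns(CAPTURE.splitlines()).items():
        print(flow, stats)
```

On the lines posted here, every flow to port 8080 consists only of SYNs with no SYN-ACK in the excerpt, which matches the reported timeouts. A longer capture would be needed to say anything about how often this happens.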

Expected Behavior

No connection timeouts when WireGuard is enabled.

Current Behavior

Random connection timeouts when WireGuard is enabled.

Possible Solution

Disabling WireGuard with the following command resolved the issue:

 calicoctl patch felixconfiguration default --type='merge' -p '{"spec":{"wireguardEnabled":false}}'
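For anyone toggling this setting repeatedly while testing, a small Python sketch can build the same calicoctl invocation for either state. The function name `wireguard_patch_command` is hypothetical. The arguments mirror the command above and nothing else is assumed about calicoctl.

```python
import json


def wireguard_patch_command(enabled):
    """Build the calicoctl argv that sets spec.wireguardEnabled (hypothetical helper)."""
    # Compact separators reproduce the JSON exactly as written in the issue.
    patch = json.dumps({"spec": {"wireguardEnabled": enabled}}, separators=(",", ":"))
    return ["calicoctl", "patch", "felixconfiguration", "default",
            "--type=merge", "-p", patch]


if __name__ == "__main__":
    # Print the command instead of running it; nothing is changed on the cluster.
    print(" ".join(wireguard_patch_command(False)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a host where calicoctl is already configured for the cluster.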

Steps to Reproduce (for bugs)

Not sure how to reproduce it.

Context

Your Environment

mazdakn commented 2 days ago

@iheng can you check if you are having the same issue as this one: https://github.com/projectcalico/calico/issues/9223?

iheng commented 2 days ago

@mazdakn It seems not. In my case, CPU usage is low, not exceeding 180kps.
