WireGuard / wireguard-vyatta-ubnt

WireGuard for Ubiquiti Devices
https://www.wireguard.com/
GNU General Public License v3.0

Packets getting lost #75

Open mLupine opened 3 years ago

mLupine commented 3 years ago

Hi,

I want to move from IPsec to WireGuard for connecting my home network to external VMs. The diagram below shows how I connected everything, followed by my configuration for the Ubiquiti EdgeRouter X and for WireGuard on the two Debian machines, "VPS 1" and "VPS 2".

VPS 1:

[Interface]
Address = 172.20.10.1/24
ListenPort = 51194
PrivateKey = vps1_priv

[Peer] # VPS 2
PublicKey = vps2_pub
AllowedIPs = 172.20.10.2/32
Persistentkeepalive = 25

[Peer] # EdgeRouter
PublicKey = erx_pub
AllowedIPs = 172.20.10.0/24,192.168.0.0/16
Persistentkeepalive = 25

VPS 2:

[Interface]
PrivateKey = vps2_priv
Address = 172.20.10.2/32

[Peer]
PublicKey = vps1_pub
AllowedIPs = 172.20.10.0/24,192.168.0.0/16
Endpoint = 192.0.15.1:51194
Persistentkeepalive = 25

EdgeRouter X:

wireguard wg0 {
    address 172.20.10.3/24
    listen-port 51194
    mtu 1420
    peer vps1_pub {
        allowed-ips 172.20.10.0/24
        endpoint 192.0.15.1:51194
        persistent-keepalive 25
    }
    private-key /config/auth/privatekey
    route-allowed-ips true
}
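
For readers checking the overlapping AllowedIPs entries on VPS 1: WireGuard chooses the outgoing peer by matching the packet's destination against every peer's AllowedIPs and using the most specific (longest-prefix) entry. The sketch below reproduces that selection for the VPS 1 peers. It is only an illustration of the matching rule, not WireGuard's own code, and the helper name `select_peer` is made up for this example.

```python
import ipaddress

# AllowedIPs per peer, copied from the VPS 1 config above.
VPS1_PEERS = {
    "VPS 2": ["172.20.10.2/32"],
    "EdgeRouter": ["172.20.10.0/24", "192.168.0.0/16"],
}


def select_peer(peers, destination):
    """Return the peer whose AllowedIPs entry matches with the longest prefix.

    Hypothetical helper: returns None when no entry covers the destination.
    """
    dest = ipaddress.ip_address(destination)
    best_peer, best_len = None, -1
    for peer, networks in peers.items():
        for cidr in networks:
            net = ipaddress.ip_network(cidr, strict=False)
            if dest in net and net.prefixlen > best_len:
                best_peer, best_len = peer, net.prefixlen
    return best_peer


if __name__ == "__main__":
    for dst in ("172.20.10.2", "172.20.10.3", "192.168.10.11", "10.0.0.1"):
        print(dst, "->", select_peer(VPS1_PEERS, dst))
```

Under this rule the /32 for VPS 2 wins over the EdgeRouter's /24 for 172.20.10.2, so the overlap by itself should not misroute traffic on VPS 1.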

And, unfortunately, it doesn't work 😕. Here's what the connection from the home network to VPS 1 looks like:

PING 172.20.10.1 (172.20.10.1): 56 data bytes
64 bytes from 172.20.10.1: icmp_seq=0 ttl=63 time=18.288 ms
64 bytes from 172.20.10.1: icmp_seq=1 ttl=63 time=19.161 ms
Request timeout for icmp_seq 2
64 bytes from 172.20.10.1: icmp_seq=3 ttl=63 time=17.327 ms
64 bytes from 172.20.10.1: icmp_seq=4 ttl=63 time=25.630 ms
Request timeout for icmp_seq 5
64 bytes from 172.20.10.1: icmp_seq=6 ttl=63 time=17.390 ms
64 bytes from 172.20.10.1: icmp_seq=7 ttl=63 time=25.332 ms
64 bytes from 172.20.10.1: icmp_seq=8 ttl=63 time=17.289 ms
64 bytes from 172.20.10.1: icmp_seq=9 ttl=63 time=25.707 ms
64 bytes from 172.20.10.1: icmp_seq=10 ttl=63 time=21.837 ms
64 bytes from 172.20.10.1: icmp_seq=11 ttl=63 time=17.601 ms
64 bytes from 172.20.10.1: icmp_seq=12 ttl=63 time=24.578 ms
Request timeout for icmp_seq 13
Request timeout for icmp_seq 14
64 bytes from 172.20.10.1: icmp_seq=15 ttl=63 time=16.912 ms
--- 172.20.10.1 ping statistics ---
16 packets transmitted, 12 packets received, 25.0% packet loss
round-trip min/avg/max/stddev = 16.912/20.588/25.707/3.572 ms

And it's the same for VPS 2. Every few packets, one of them is dropped.
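
To make the loss figure easy to reproduce on longer runs, here is a minimal sketch that recomputes it from the per-packet lines of the ping output above. It assumes the macOS/BSD-style wording shown there ("bytes from" and "Request timeout for icmp_seq N"); the function name `ping_loss` is hypothetical.

```python
import re

# Per-packet lines copied from the ping output above.
SAMPLE = """\
64 bytes from 172.20.10.1: icmp_seq=0 ttl=63 time=18.288 ms
64 bytes from 172.20.10.1: icmp_seq=1 ttl=63 time=19.161 ms
Request timeout for icmp_seq 2
64 bytes from 172.20.10.1: icmp_seq=3 ttl=63 time=17.327 ms
64 bytes from 172.20.10.1: icmp_seq=4 ttl=63 time=25.630 ms
Request timeout for icmp_seq 5
64 bytes from 172.20.10.1: icmp_seq=6 ttl=63 time=17.390 ms
64 bytes from 172.20.10.1: icmp_seq=7 ttl=63 time=25.332 ms
64 bytes from 172.20.10.1: icmp_seq=8 ttl=63 time=17.289 ms
64 bytes from 172.20.10.1: icmp_seq=9 ttl=63 time=25.707 ms
64 bytes from 172.20.10.1: icmp_seq=10 ttl=63 time=21.837 ms
64 bytes from 172.20.10.1: icmp_seq=11 ttl=63 time=17.601 ms
64 bytes from 172.20.10.1: icmp_seq=12 ttl=63 time=24.578 ms
Request timeout for icmp_seq 13
Request timeout for icmp_seq 14
64 bytes from 172.20.10.1: icmp_seq=15 ttl=63 time=16.912 ms
"""


def ping_loss(output):
    """Return (sent, received, loss_percent) from per-packet ping lines."""
    received, lost = set(), set()
    for line in output.splitlines():
        match = re.search(r"icmp_seq[= ](\d+)", line)
        if not match:
            continue  # skip header and summary lines
        seq = int(match.group(1))
        if "bytes from" in line:
            received.add(seq)
        elif "timeout" in line.lower():
            lost.add(seq)
    sent = len(received | lost)
    loss = 100.0 * len(lost) / sent if sent else 0.0
    return sent, len(received), loss


if __name__ == "__main__":
    print(ping_loss(SAMPLE))  # (16, 12, 25.0)
```

The result matches the summary line printed by ping: 4 of 16 probes timed out.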

However, things look different when connecting from VPS 1 or 2 to the home network:

 Host                          Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 172.20.10.1                 0.0%    46   48.8  49.0  48.4  58.1   1.4
 2. 172.20.10.3                97.7%    45   65.0  65.0  65.0  65.0   0.0
 3. 192.168.10.11              97.7%    45   62.7  62.7  62.7  62.7   0.0

Only about one packet in a few dozen gets through.

Routing is configured correctly (I think) and I also can't see the packets getting blocked anywhere on any firewall.

Here's a tcpdump of wg0 on the EdgeRouter X while pinging 172.20.10.2 from the local 192.168.10.10:

19:42:16.991708 IP (tos 0x0, ttl 63, id 57141, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 0, length 64
19:42:17.005659 IP (tos 0x0, ttl 64, id 9275, offset 0, flags [DF], proto ICMP (1), length 84)
    172.20.10.1 > 172.20.10.3: ICMP echo request, id 26534, seq 1, length 64
19:42:17.005901 IP (tos 0x0, ttl 64, id 59125, offset 0, flags [none], proto ICMP (1), length 84)
    172.20.10.3 > 172.20.10.1: ICMP echo reply, id 26534, seq 1, length 64
19:42:17.005964 IP (tos 0x0, ttl 64, id 9321, offset 0, flags [DF], proto ICMP (1), length 84)
    172.20.10.1 > 172.20.10.3: ICMP echo request, id 26534, seq 2, length 64
19:42:17.006107 IP (tos 0x0, ttl 64, id 59126, offset 0, flags [none], proto ICMP (1), length 84)
    172.20.10.3 > 172.20.10.1: ICMP echo reply, id 26534, seq 2, length 64
19:42:17.006159 IP (tos 0x0, ttl 64, id 9361, offset 0, flags [DF], proto ICMP (1), length 84)
    172.20.10.1 > 172.20.10.3: ICMP echo request, id 26534, seq 3, length 64
19:42:17.991840 IP (tos 0x0, ttl 63, id 9891, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 1, length 64
19:42:19.008138 IP (tos 0x0, ttl 63, id 11277, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 2, length 64
19:42:20.060420 IP (tos 0x0, ttl 63, id 58558, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 3, length 64
19:42:21.001030 IP (tos 0x0, ttl 63, id 2573, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 4, length 64

Barely a single reply for every dozen requests. Also, here's the tcpdump from VPS 2 while executing the very same ping:

19:46:35.267235 IP (tos 0x0, ttl 63, id 11957, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 37735, seq 8, length 64
19:46:35.267302 IP (tos 0x0, ttl 64, id 20090, offset 0, flags [none], proto ICMP (1), length 84)
    172.20.10.2 > 192.168.10.10: ICMP echo reply, id 37735, seq 8, length 64
19:46:37.273036 IP (tos 0x0, ttl 63, id 581, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 37735, seq 10, length 64
19:46:37.273146 IP (tos 0x0, ttl 64, id 20187, offset 0, flags [none], proto ICMP (1), length 84)
    172.20.10.2 > 192.168.10.10: ICMP echo reply, id 37735, seq 10, length 64
19:46:39.283058 IP (tos 0x0, ttl 63, id 11822, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 37735, seq 12, length 64
19:46:39.283109 IP (tos 0x0, ttl 64, id 20297, offset 0, flags [none], proto ICMP (1), length 84)
    172.20.10.2 > 192.168.10.10: ICMP echo reply, id 37735, seq 12, length 64
19:46:41.291610 IP (tos 0x0, ttl 63, id 50413, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 37735, seq 14, length 64
19:46:41.291679 IP (tos 0x0, ttl 64, id 20465, offset 0, flags [none], proto ICMP (1), length 84)
    172.20.10.2 > 192.168.10.10: ICMP echo reply, id 37735, seq 14, length 64
19:46:42.297711 IP (tos 0x0, ttl 63, id 14016, offset 0, flags [none], proto ICMP (1), length 84)
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 37735, seq 15, length 64
19:46:42.297787 IP (tos 0x0, ttl 64, id 20547, offset 0, flags [none], proto ICMP (1), length 84)
    172.20.10.2 > 192.168.10.10: ICMP echo reply, id 37735, seq 15, length 64
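
One way to compare the two captures is to pair echo requests with echo replies by ICMP id and sequence number. The sketch below does that for verbose tcpdump text like the excerpts above. It assumes the `ICMP echo request|reply, id N, seq M` wording shown there; the function name `unanswered` is made up for this example, and the sample holds only the ICMP lines from the EdgeRouter X capture.

```python
import re
from collections import defaultdict

# ICMP lines copied from the EdgeRouter X capture above.
ERX_DUMP = """\
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 0, length 64
    172.20.10.1 > 172.20.10.3: ICMP echo request, id 26534, seq 1, length 64
    172.20.10.3 > 172.20.10.1: ICMP echo reply, id 26534, seq 1, length 64
    172.20.10.1 > 172.20.10.3: ICMP echo request, id 26534, seq 2, length 64
    172.20.10.3 > 172.20.10.1: ICMP echo reply, id 26534, seq 2, length 64
    172.20.10.1 > 172.20.10.3: ICMP echo request, id 26534, seq 3, length 64
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 1, length 64
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 2, length 64
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 3, length 64
    192.168.10.10 > 172.20.10.2: ICMP echo request, id 47973, seq 4, length 64
"""

ICMP_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered(dump):
    """Map each ICMP id to the sorted seqs whose request has no reply in the capture."""
    requests = defaultdict(set)
    replies = defaultdict(set)
    for match in ICMP_RE.finditer(dump):
        icmp_id = int(match["id"])
        seq = int(match["seq"])
        target = requests if match["kind"] == "request" else replies
        target[icmp_id].add(seq)
    result = {}
    for icmp_id, seqs in requests.items():
        missing = sorted(seqs - replies[icmp_id])
        if missing:
            result[icmp_id] = missing
    return result


if __name__ == "__main__":
    print(unanswered(ERX_DUMP))
```

On the EdgeRouter X excerpt, every request with id 47973 is unanswered, while the same function returns an empty dict for the VPS 2 excerpt, where each captured request has a reply. The single leftover request for id 26534 may simply be a reply that falls outside the pasted excerpt.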

Do you have any suggestions on what I'm doing wrong?

Any help will be greatly appreciated.

M.

luisvd commented 1 year ago

I can confirm this is a recurring issue. It happens while a client is performing its handshake with the router.

When more than one client is connected to the EdgeRouter, the router services the handshake from one client and drops packets during that time. Once the handshake completes, it serves both clients again. Later, client 2 sees packet loss, WireGuard tries to reconnect, and the cycle repeats.

The result is packet loss for both clients. From what I have seen so far, only one client can effectively be connected to the router at a time.

My setup is the EdgeRouter with 2 VPSs connecting to the network. The configuration is a little different from the one in this issue, but the packet loss happens during the handshake from one of the VPSs.
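
To check whether the loss lines up with handshakes, one option is to sample `wg show wg0 latest-handshakes` on the router while a ping is running. That subcommand prints one line per peer with the public key and the Unix time of the last handshake (0 if there has been none). The sketch below parses that text and flags peers whose last handshake is older than a threshold. The 180 second default is an assumption, chosen because WireGuard normally rekeys roughly every two minutes while traffic flows; the function name `stale_peers` and the sample timestamps are made up for illustration.

```python
def stale_peers(latest_handshakes, now, max_age=180):
    """Return {peer: age_in_seconds or None} for peers with an old or missing handshake.

    latest_handshakes: text in the format of `wg show <interface> latest-handshakes`.
    now: current Unix time in seconds.
    max_age: assumed threshold in seconds, not a WireGuard constant.
    """
    stale = {}
    for line in latest_handshakes.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        peer, timestamp = parts[0], int(parts[1])
        age = None if timestamp == 0 else now - timestamp
        if age is None or age > max_age:
            stale[peer] = age
    return stale


if __name__ == "__main__":
    # Made-up sample using the placeholder key names from this issue.
    sample = "vps1_pub\t1000\nvps2_pub\t0\nerx_pub\t700\n"
    print(stale_peers(sample, now=1010))
```

In practice `now` would come from `time.time()` and the text from running the `wg` command on the device being checked.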