vnxme closed this issue 4 months ago
until I reboot the server/client/both.
Does it really take rebooting the whole system to fix it? What about only restarting the swgp-go client?
Could you please think whether there is any mechanism in the code prohibiting the tunnel from re-establishing in the circumstances I described?
It's a really simple protocol and I can't think of anything that would have caused this.
Or are there any debug steps I could follow to tell you more?
You could use tcpdump to check if the server actually received any packets from the client.
I suspect there are external factors at play here. You mentioned that you need to wait for a few minutes before starting the server again. During the downtime, the server system likely responds to client packets with ICMP destination port unreachable messages. Hypothetically, there could be some firewall on the path that blocks the client after seeing a certain number of such messages. But I've never seen any setup like this IRL, so I'm not really sure if this is even a thing network administrators do.
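To make the ICMP point concrete, here is a minimal sketch (not part of swgp-go) that sends one UDP datagram to a port and reports how the destination reacts. It assumes a Linux-like stack, where an ICMP port unreachable message is surfaced on a connected UDP socket as ECONNREFUSED. The helper names `probe_udp_port` and `find_unused_udp_port` are made up for illustration.

```python
import socket


def probe_udp_port(host: str, port: int, timeout: float = 1.0) -> str:
    """Send one datagram and classify the reaction of the destination.

    "refused": an ICMP port unreachable came back (nothing is listening).
    "reply":   some datagram was received in response.
    "silent":  nothing arrived before the timeout. This is also what a
               listener that drops unknown packets, or a firewall that
               discards traffic, looks like from the outside.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # A connected UDP socket lets the kernel report ICMP errors to us.
        sock.connect((host, port))
        try:
            sock.send(b"probe")
            sock.recv(2048)
            return "reply"
        except ConnectionRefusedError:
            return "refused"
        except socket.timeout:
            return "silent"


def find_unused_udp_port() -> int:
    """Ask the kernel for a free UDP port on loopback, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    # Probing a port that nobody listens on mimics the server downtime.
    print(probe_udp_port("127.0.0.1", find_unused_udp_port()))
```

Run from the client against the server's address while the container is stopped, a "refused" result would suggest the ICMP messages do travel back along the path. A "silent" result does not distinguish between a firewall and a listener that ignores the probe.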
Thank you for a swift reply!
Does it really take rebooting the whole system to fix it? What about only restarting the swgp-go client?
Actually I restart only a docker instance of swgp-go, not the whole system.
You could use tcpdump to check if the server actually received any packets from the client.
I used tcpdump on the server host system, and I can see the incoming proxied packets from the client swgp-go instance. I don't have a docker image with both tcpdump and swgp-go to check whether the packets make it inside the container. But thank you for the idea.
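One possible way around the missing image, sketched under assumptions: the host's own tcpdump can be run inside the container's network namespace with nsenter, so nothing has to be installed in the container. The snippet below only builds that command line. The PID and port are placeholders; the real PID could be looked up with `docker inspect --format '{{.State.Pid}}' <container>`, and the function name `netns_tcpdump_command` is made up for illustration.

```python
import shlex
from typing import List


def netns_tcpdump_command(container_pid: int, udp_port: int) -> List[str]:
    """Build a command that runs the host's tcpdump in a container's netns.

    nsenter -t PID -n enters the network namespace of the given process;
    tcpdump then sees exactly the interfaces the container sees.
    """
    return [
        "nsenter", "-t", str(container_pid), "-n",
        "tcpdump", "-n", "-i", "any",
        "udp", "and", "port", str(udp_port),
    ]


if __name__ == "__main__":
    # 12345 and 20220 are placeholder values, not taken from the thread.
    print(shlex.join(netns_tcpdump_command(12345, 20220)))
```

The printed command needs root on the host. If packets show up on the host interface but not in this capture, they are being lost between the host's NAT rules and the container.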
You mentioned that you need to wait for a few minutes before starting the server again.
Yes, it takes some time when I terminate docker containers and recreate them (downloading layers, etc.). I experimented with a simple container reboot - it is fast and I haven't managed to reproduce the problem.
Hypothetically, there could be some firewall on the path that blocks the client after seeing a certain number of such messages.
My instances are basically Linux VPSes, and I have a src-nat rule for outgoing traffic and a dst-nat rule to forward certain ports to the swgp-go containers, nothing else relevant to the proxied wireguard traffic. If there were any firewall rules prohibiting the traffic, the tunnels wouldn't re-establish after a simple reboot.
Well, I'm not very familiar with container networking and custom NAT rules, but it might still be helpful if you post the related configurations so more people can help with this.
I continued my experiments and noticed that the problem occurs when, after container recreation, some proxied WireGuard packets forwarded by the dst-nat rule on the host can't reach the swgp-go container because of a failed/invalid connection state. I don't yet know what causes that state.
So you were right to suppose it's a host/firewall problem. Thank you for your help!
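For anyone hitting the same symptom, here is a rough sketch of how such a connection state could be spotted. It assumes (this is not confirmed in the thread) that the cause is a leftover conntrack entry that keeps steering the client's UDP flow to the address of the old container. The code parses text in the style of `conntrack -L` output and flags UDP entries for the listen port whose NAT target differs from the container's current IP. All addresses, the port, and the function names are illustrative.

```python
import re
from typing import Dict, List, Optional

# Each conntrack entry carries two tuples: original direction, then reply.
_TUPLE = re.compile(r"src=(\S+) dst=(\S+) sport=(\d+) dport=(\d+)")


def parse_udp_entry(line: str) -> Optional[Dict[str, object]]:
    """Parse one UDP line of conntrack-style output, or return None."""
    parts = line.split()
    if not parts or parts[0] != "udp":
        return None
    tuples = _TUPLE.findall(line)
    if len(tuples) != 2:
        return None
    orig, reply = tuples
    return {
        "client": f"{orig[0]}:{orig[2]}",
        "original_dst": f"{orig[1]}:{orig[3]}",
        # The source of the reply tuple is where DNAT delivers the packets.
        "nat_target": f"{reply[0]}:{reply[2]}",
        "unreplied": "[UNREPLIED]" in parts,
    }


def stale_entries(output: str, listen_port: int, current_ip: str) -> List[Dict[str, object]]:
    """Return entries for listen_port whose NAT target is not current_ip."""
    found = []
    for line in output.splitlines():
        entry = parse_udp_entry(line)
        if entry is None:
            continue
        if not str(entry["original_dst"]).endswith(f":{listen_port}"):
            continue
        target_ip = str(entry["nat_target"]).rsplit(":", 1)[0]
        if target_ip != current_ip:
            found.append(entry)
    return found


# Illustrative sample, not captured from a real system.
SAMPLE = (
    "udp      17 28 src=203.0.113.7 dst=198.51.100.2 sport=51820 dport=20220 "
    "[UNREPLIED] src=172.17.0.2 dst=203.0.113.7 sport=20220 dport=51820 mark=0 use=1\n"
    "udp      17 170 src=203.0.113.9 dst=198.51.100.2 sport=40000 dport=20220 "
    "src=172.17.0.3 dst=203.0.113.9 sport=20220 dport=40000 [ASSURED] mark=0 use=1\n"
)

if __name__ == "__main__":
    for entry in stale_entries(SAMPLE, 20220, "172.17.0.3"):
        print(entry)
```

If entries like this do turn up, deleting them with `conntrack -D -p udp --dport <port>` is one thing to try. A client that keeps sending would otherwise keep refreshing the entry, which would fit the tunnel never recovering on its own.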
Hi @database64128,
I've been using swgp-go for 8 months now. I have 5+ nodes with multiple proxied WireGuard tunnels between them. Sometimes the same node acts as both client and server. I'm quite happy with how swgp-go generally works, but I've recently noticed the following problem:
proxyListen) with the same config file in place (read: I switch it off, wait for a few minutes, then switch it on), the client node (i.e. the one having wgListen) pointing to that server node seems to keep sending proxied packets to the server node, but the tunnel won't re-establish (the destination WG interface of the server node receives no packets) until I reboot the server/client/both.
Could you please think whether there is any mechanism in the code prohibiting the tunnel from re-establishing in the circumstances I described? Or are there any debug steps I could follow to tell you more?
Regards,