dndx / phantun

Transforms UDP stream into (fake) TCP streams that can go through Layer 3 & Layer 4 (NAPT) firewalls/NATs.
Apache License 2.0

Performance discussion #76

Closed Jimmy-Z closed 2 years ago

Jimmy-Z commented 2 years ago

First, I want to thank you for sharing such a useful tool; it makes a night-and-day difference in my use case.

That said, I'm very curious about its performance. In my test, single-connection performance (throughput and CPU usage) is roughly the same as udp2raw, while wireguard (inherently at about the same throughput, with marginal differences) consumes significantly less CPU. One might argue that wireguard works in the kernel, so there's less userspace-kernel overhead, but I also compared boringtun, which is Cloudflare's userspace wireguard implementation, and it still consumes significantly less CPU than phantun/udp2raw.

That's quite surprising to me since wireguard also does encryption/decryption while phantun doesn't.

One thing I could think of: fake-tcp has to maintain fake TCP connections statefully, which wireguard doesn't have or need. If this is indeed the main CPU hog, we could possibly use a simple-fake-tcp method: do not fake connections, just send packets statelessly. Obviously this would need both sides to do stateless NAT, which makes it harder to deploy, but it might be beneficial in some use cases.

What do you think?
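To make the stateless idea concrete, here is a minimal sketch, assuming IPv4 and a fixed 20-byte TCP header with constant sequence and acknowledgment numbers. The names `wrap_stateless` and `unwrap_stateless` are hypothetical and not part of phantun or udp2raw; the sketch only builds and parses the bytes, and actually sending them would still need a raw socket or TUN device.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header that the TCP checksum is computed over."""
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, socket.IPPROTO_TCP, tcp_len)


def wrap_stateless(payload: bytes, src_ip: str, dst_ip: str,
                   src_port: int, dst_port: int) -> bytes:
    """Prepend a TCP header whose fields never change between packets."""
    seq, ack = 0, 0                    # assumption: constants, nothing tracked
    offset_flags = (5 << 12) | 0x018   # 5 words (20 bytes), flags PSH+ACK
    header = struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                         offset_flags, 0xFFFF, 0, 0)  # window, csum=0, urg=0
    csum = checksum16(pseudo_header(src_ip, dst_ip, len(header) + len(payload))
                      + header + payload)
    # The checksum field occupies bytes 16..17 of the TCP header.
    return header[:16] + struct.pack("!H", csum) + header[18:] + payload


def unwrap_stateless(segment: bytes) -> bytes:
    """Strip the TCP header using its data-offset field; no state consulted."""
    data_offset = (segment[12] >> 4) * 4
    return segment[data_offset:]
```

Since no handshake is ever performed, connection-tracking firewalls and NATs would likely not recognize such packets as part of an established flow, which is the deployment cost mentioned above.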

dndx commented 2 years ago

Hello,

TCP obfuscation will inherently have performance penalties, so running it with WireGuard will always be slower than pure WireGuard, even for userspace implementations like BoringTun.

The goal of Phantun is to scale incredibly well across multiple cores by removing all resource contention, so you will always be able to saturate all CPU cores on a fast connection. Unfortunately, TCP masquerade will always be stateful, as there are sequence numbers and so on that need to be kept track of.
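As a rough illustration of the state mentioned above, here is a minimal sketch of per-flow sequence and acknowledgment tracking. It is not Phantun's actual implementation: the `FakeTcpFlow` class and its method names are hypothetical, and the handshake (where SYN also consumes one sequence number) is left out.

```python
from dataclasses import dataclass
from typing import Tuple

SEQ_MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class FakeTcpFlow:
    """Hypothetical per-connection state a fake-TCP endpoint has to keep."""
    seq: int       # next sequence number we will send
    ack: int = 0   # next sequence number we expect from the peer

    def on_send(self, payload_len: int) -> Tuple[int, int]:
        """Return (seq, ack) to stamp on the outgoing segment, then advance seq."""
        stamped = (self.seq, self.ack)
        self.seq = (self.seq + payload_len) % SEQ_MOD
        return stamped

    def on_receive(self, peer_seq: int, payload_len: int) -> None:
        """Record how far the peer's byte stream has advanced."""
        self.ack = (peer_seq + payload_len) % SEQ_MOD
```

Every packet in either direction reads and updates this record, so an implementation has to look up and mutate per-flow state on the hot path.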

Jimmy-Z commented 2 years ago

Thanks for the reply.

I didn't mean phantun + wireguard vs. wireguard alone; I meant phantun by itself vs. wireguard. When the two are working together, the phantun process alone uses more CPU than wireguard does.

"TCP masquerade will always be stateful, as there are sequence numbers and so on that need to be kept track of." - that holds if we want to fake TCP properly, but I'm suggesting the other way.

dndx commented 2 years ago

@Jimmy-Z If that is the case then maybe you can even just use some eBPF magic to modify the L4 headers directly. But this is not something Phantun wants to solve as we do care about NAT traversal as the primary use case.