bgins commented 10 months ago

Summary

Problem

We filter out private addresses because they are unreachable on the public Internet. This means Homestar nodes behind a NAT or VPN are not reachable.
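
For illustration, the kind of filtering described above can be sketched with the standard library alone. This is not Homestar's actual filter: the function names `is_publicly_routable` and `filter_public` are hypothetical, it works on plain `IpAddr` values rather than libp2p multiaddrs, and it only checks the most common non-routable ranges.

```rust
use std::net::IpAddr;

/// Returns true when `ip` is plausibly routable on the public Internet.
/// Assumption: only the most common non-routable ranges are checked.
fn is_publicly_routable(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            !(v4.is_private() || v4.is_loopback() || v4.is_link_local() || v4.is_unspecified())
        }
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            let unique_local = (first & 0xfe00) == 0xfc00; // fc00::/7
            let link_local = (first & 0xffc0) == 0xfe80; // fe80::/10
            !(v6.is_loopback() || v6.is_unspecified() || unique_local || link_local)
        }
    }
}

/// Keeps only the addresses that pass the check above.
fn filter_public(addrs: &[IpAddr]) -> Vec<IpAddr> {
    addrs.iter().copied().filter(is_publicly_routable).collect()
}
```

A node whose only addresses are, say, `192.168.1.10` and `127.0.0.1` ends up with an empty list after this filter, which is why it cannot be dialed by other peers.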

Impact

Limits where you can run a Homestar node.

Solution

Add AutoNAT behavior to probe for externally reachable addresses: https://libp2p.github.io/rust-libp2p/libp2p/autonat/index.html. If other peers cannot dial the node back, it is most likely behind a NAT.

We still want to connect to nodes that are behind a NAT. We can attempt hole punching using DCUtR and use circuit relay as a fallback when hole punching fails.

Tasks

[ ] Add AutoNAT behavior
[ ] Add circuit relay
[ ] Add DCUtR and integrate with circuit relay fallback

bgins commented 10 months ago

Related. libp2p tracking for NAT traversal: https://github.com/libp2p/rust-libp2p/issues/2052

bgins commented 4 months ago

Adding some design notes for how the pieces fit together.

Our goal is to improve connectivity in Homestar by implementing libp2p AutoNAT, Circuit Relay, and DCUtR hole punching.

Homestar supports a few types of connections today:

Public to public connections
Private to public connections

We would like to extend this to include private to private connections. These connections can be made through relay or directly after hole punching.

AutoNAT

The first step is to determine whether a node is publicly reachable. If a node is publicly reachable, we can make a direct connection without a relay or NAT traversal. In general, this applies to nodes on the public Internet, but some routers may be configured to pass traffic directly to a private node.

If a node cannot be reached publicly, it is a private node and we need relay or hole punching.
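
As a rough sketch of the classification step, the model below turns a stream of dial-back probe results into a reachability status. It is a simplified illustration, not the rust-libp2p AutoNAT API: `Reachability`, `ReachabilityTracker`, and the `threshold` parameter are hypothetical names, and the rule of requiring several consecutive agreeing probes before changing status is an assumption made for this example.

```rust
/// Hypothetical reachability status for the local node.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reachability {
    Unknown,
    Public,
    Private,
}

/// Tracks probe results and changes status only after `threshold`
/// consecutive probes agree (an assumption made for this sketch).
struct ReachabilityTracker {
    status: Reachability,
    threshold: u32,
    streak: u32,
    last: Option<bool>,
}

impl ReachabilityTracker {
    fn new(threshold: u32) -> Self {
        Self { status: Reachability::Unknown, threshold, streak: 0, last: None }
    }

    /// Records one probe: `true` if a remote peer managed to dial us back.
    fn record(&mut self, dial_back_succeeded: bool) -> Reachability {
        if self.last == Some(dial_back_succeeded) {
            self.streak += 1;
        } else {
            // The result differs from the previous probe: restart the streak.
            self.last = Some(dial_back_succeeded);
            self.streak = 1;
        }
        if self.streak >= self.threshold {
            self.status = if dial_back_succeeded {
                Reachability::Public
            } else {
                Reachability::Private
            };
        }
        self.status
    }
}
```

Requiring agreement across several probes keeps one flaky dial-back from flipping a node between public and private.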

Circuit Relay

Relay is the easiest and most reliable way to get traffic from one private node to another, but may result in low-bandwidth, high-latency connections.

DCUtR/Hole punching

We would like to establish a direct connection between private nodes, traversing NATs where possible. This gives us a faster connection with less load on the relay server.

Note that this won't always work out. Trautwein reports a 70-80% success rate for NAT hole punching in libp2p: https://www.youtube.com/watch?v=bzL7Y1wYth8. When we are unable to traverse NATs, we will fall back to circuit relay.
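
The fallback order described here can be sketched as a small decision function. The names `Transport` and `connect` are hypothetical, and the hole punching attempt is passed in as a closure so the example stays self-contained; a real implementation would drive libp2p behaviours instead.

```rust
/// Hypothetical label for how a connection ended up being carried.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Transport {
    Direct,
    HolePunched,
    Relayed,
}

/// Chooses a transport for dialing a peer.
/// `try_hole_punch` stands in for a DCUtR attempt and returns true on success.
fn connect(target_is_public: bool, try_hole_punch: impl FnOnce() -> bool) -> Transport {
    if target_is_public {
        // A publicly reachable target needs neither relay nor NAT traversal.
        return Transport::Direct;
    }
    // Private target: attempt hole punching, and fall back to the relay
    // when the attempt fails.
    if try_hole_punch() {
        Transport::HolePunched
    } else {
        Transport::Relayed
    }
}
```

For example, `connect(false, || false)` yields `Transport::Relayed`: the target is private and the hole punch failed, so traffic stays on the relay.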

Hole punching is generally considered unreliable. For example, the Iroh Hole Punching write-up states its goal as:

Get as close to 100% connection rate as possible (even if that means falling back to a relayed connection)

The Tailscale NAT Traversal post mentions:

This is a good time to have the awkward part of our chat: what happens when we empty our entire bag of tricks, and we still can’t get through? A lot of NAT traversal code out there gives up and declares connectivity impossible. That’s obviously not acceptable for us; Tailscale is nothing without the connectivity.

We could use a relay that both sides can talk to unimpeded, and have it shuffle packets back and forth. But wait, isn’t that terrible?

Sort of. It’s certainly not as good as a direct connection, but if the relay is “near enough” to the network path your direct connection would have taken, and has enough bandwidth, the impact on your connection quality isn’t huge. There will be a bit more latency, maybe less bandwidth. That’s still much better than no connection at all, which is where we were heading.

ipvm-wg / homestar

Networking: Add AutoNAT and NAT Traversal #398