Open stevefan1999-personal opened 1 year ago
Actually... I realized that this is very similar to a tool that I used to use, called flannel: flannel-io/flannel: flannel is a network fabric for containers, designed for Kubernetes (github.com)
As a Kubernetes engineer myself, I obviously know that CNI plugins are built on various tunneling technologies (Calico uses a mix of VXLAN and BGP to make EVPN), but I was never told what the specific implementation details are, and what flannel does is actually quite similar to my idea.
In flannel, etcd is indeed used, much like a DHT, to share peering details, and I couldn't believe that what I had in mind was already done by others a long time ago.
Maybe I should close this issue for now; let me take some time to sort out my thoughts.
I'd rather not add additional complexity to fastd - it is meant to be small with a reduced feature set to work on the cheapest embedded Linux hardware, not to cover all possible use cases of VPN tunnels (for a long time OpenWrt devices with 4MiB storage and 32MiB RAM for the whole system were fastd's primary target, although unfortunately OpenWrt has outgrown that class of device by now...)
In any case, C would not be my programming language of choice for less constrained environments and a lot of the feature ideas you mention.
Something similar has been implemented in OpenWrt on top of WireGuard https://forum.openwrt.org/t/new-wireguard-based-openwrt-vpn-implementation-unetd/136028 https://openwrt.org/docs/techref/unetd
Coming from a distributed computing perspective, I wonder why we can't use a DHT to store all the peers' information -- for example, using Kademlia or even Raft to advertise each peer's IP addresses and available points of contact. Then we could build a full mesh from this kind of dynamic configuration rather than statically configuring peers ahead of time.
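To make the Kademlia idea a bit more concrete, here is a minimal sketch of its XOR distance metric, which decides which nodes are "closest" to a key and therefore store it. The `node_id` hashing scheme and the function names are my own assumptions for illustration; none of this is part of fastd.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 160-bit ID from a name (hypothetical scheme, SHA-1 as in classic Kademlia)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(key: int, nodes: list[int], k: int = 3) -> list[int]:
    """Return the k node IDs closest to `key` under the XOR metric."""
    return sorted(nodes, key=lambda n: xor_distance(n, key))[:k]
```

A peer would publish its endpoint record under `node_id(its_public_key)`, and anyone looking for that peer would query the `k` closest nodes for it.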
In fact, I'm about to experiment with this using a FUSE filesystem through which peers could exchange their network information among themselves, combined with the include peers feature of fastd to reload them dynamically. Then I would try a routing protocol such as BATMAN-adv, OSPF or even BGP to calculate network paths -- to achieve high availability and fault tolerance during unusual network conditions.
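As a rough sketch of the dynamic part, assuming the peer information has already been fetched from somewhere, something like the following could render it into fastd peer files. The helper names and the directory layout are hypothetical, and only IP-literal endpoints are handled; the `key` and `remote` lines follow the peer file syntax as I understand it from the fastd documentation.

```python
import ipaddress
from pathlib import Path


def render_peer(public_key: str, endpoints: list[tuple[str, int]]) -> str:
    """Render one fastd peer file body from a public key and (ip, port) pairs."""
    lines = [f'key "{public_key}";']
    for host, port in endpoints:
        ip = ipaddress.ip_address(host)  # raises ValueError for hostnames
        # IPv6 literals need brackets so the port separator is unambiguous.
        addr = f"[{ip}]" if ip.version == 6 else str(ip)
        lines.append(f"remote {addr}:{port};")
    return "\n".join(lines) + "\n"


def write_peer(peer_dir: Path, name: str, body: str) -> Path:
    """Write the peer file into the directory that fastd includes peers from."""
    peer_dir.mkdir(parents=True, exist_ok=True)
    target = peer_dir / name
    target.write_text(body)
    return target
```

After the files change, fastd still has to be told to re-read the directory; as far as I know that is done by sending it SIGHUP, but I have not verified how that behaves with a FUSE-backed directory.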
The only problem left is how the nodes would self-test their available endpoints. Some of my systems are behind NAT without any port forwarding, so some hole punching technique such as WebRTC, STUN or TURN may be needed. This would complicate routing though, as the node information is likely transparent to the control plane at this point. What about having a gossip protocol?
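For the gossip idea, a minimal sketch of the core merge step could look like this, assuming each node keeps a table of versioned endpoint records and periodically swaps tables with a random neighbour. The `PeerRecord` type and its fields are made up for illustration, and NAT traversal itself is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerRecord:
    endpoint: str  # e.g. "192.0.2.1:10000", as observed or self-reported
    version: int   # bumped by the owning peer whenever its endpoint changes


def merge(local: dict[str, PeerRecord], remote: dict[str, PeerRecord]) -> dict[str, PeerRecord]:
    """Combine two peer tables, keeping the record with the higher version per peer."""
    merged = dict(local)
    for peer, record in remote.items():
        if peer not in merged or record.version > merged[peer].version:
            merged[peer] = record
    return merged
```

Because the merge result is the same regardless of the order in which tables are exchanged, all nodes should converge on the same view as long as the gossip keeps spreading.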