FR: Support Subnet Router Load Balancing

HQarroum commented 1 year ago

What are you trying to do?

Context

We have set up a Subnet Router cluster of 3 EC2 instances (one per AZ) within an AWS VPC that routes traffic to that VPC's CIDR. We use Subnet router failover for high availability, so that another instance takes over if one of the subnet routers fails.

What we want to do

We have a use case where the subnet routers will handle a lot of traffic (>10Gbps) from edge devices to the VPC in the cloud. We'd like to load balance the incoming edge-to-cloud traffic across our cluster of subnet routers in the VPC. Today the traffic flows as follows:

Edge Device part of Tailnet -> AWS Internet Gateway -> Subnet Router part of Tailnet in AWS VPC (Public Subnet) -> ECS Cluster in VPC (Private Subnet)

Our understanding is that today, with Subnet router failover, the traffic flows through one of the (predetermined?) subnet routers, and failover kicks in only in case of a failure. Our goal, in addition to getting HA, is to be able to partition our traffic across multiple subnet routers to increase the overall bandwidth to VPC services.

How should we solve this?

There are different ways we could see this working (this is worth discussing in more detail to align with Tailscale's long-term vision):

Implement a simple load-balancing strategy on the client side: the client would know all subnet routers for each CIDR and distribute TCP/UDP connections across them using a round-robin, random, or more advanced strategy.
Allow a subnet router to provide the IP address of a network entry point. Today, Tailscale clients exchange their IP addresses with other peers over the DERP side-channel. What if a client would like to explicitly specify an IP as the entry point for reaching that client? I see 2 use cases for this: 1. Allow subnet routers to be hosted behind a network load balancer. 2. Allow Tailscale clients in general to be reached behind anycast IP addresses (think CloudFlare Warp networks or AWS Global Accelerator), which would give an "optimized" peer-to-peer connection to that client by having packets flow as much as possible through a cloud provider's network instead of the public internet. This second option is more of a "mad-science" thing for now and admittedly probably needs its own feature request.
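
To make the first option concrete, here is a minimal sketch of a client-side picker. It is not Tailscale code: the class name `SubnetRouterBalancer`, the router names, and the `10.0.0.0/16` CIDR are all hypothetical, chosen only for illustration. New connections are handed out round-robin, and each connection stays pinned to the router it was first given.

```python
import ipaddress
import itertools


class SubnetRouterBalancer:
    """Hypothetical client-side picker; not part of Tailscale."""

    def __init__(self, routers_by_cidr):
        # Map each advertised CIDR to a round-robin cycle over its routers.
        self._cycles = {
            ipaddress.ip_network(cidr): itertools.cycle(routers)
            for cidr, routers in routers_by_cidr.items()
        }
        # Remember the router chosen for each flow so a connection
        # never switches routers mid-stream.
        self._flows = {}

    def pick(self, proto, src_port, dst_ip, dst_port):
        flow = (proto, src_port, dst_ip, dst_port)
        if flow in self._flows:
            return self._flows[flow]
        addr = ipaddress.ip_address(dst_ip)
        # Among the CIDRs containing the destination, use the most specific.
        matches = [net for net in self._cycles if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        router = next(self._cycles[best])
        self._flows[flow] = router
        return router


# Assumed example: three routers (one per AZ) advertising the same VPC CIDR.
balancer = SubnetRouterBalancer({
    "10.0.0.0/16": ["router-az-a", "router-az-b", "router-az-c"],
})
picks = [balancer.pick("tcp", 50000 + i, "10.0.1.20", 443) for i in range(4)]
repeat = balancer.pick("tcp", 50000, "10.0.1.20", 443)
print(picks, repeat)
```

Pinning per flow is one possible design choice: it avoids reordering packets within a connection, at the cost of uneven load when a few connections carry most of the traffic.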

What is the impact of not solving this?

Customers might be limited in the bandwidth they can send to a VPC from edge devices (vertical scaling is probably not the best way to tackle this from a cost and performance perspective).

Anything else?

Tailscale is awesome.

DentonGentry commented 1 year ago

It is true that if two subnet routers offer the same route, they are treated as a high availability group and only one is active at a time. We are likely to provide more options in the future, but right now a single router is active.

A way to work around this to get the behavior sought is to offer different routes, for example:

one router might offer the full route, 192.168.0.0/16
another might offer 192.168.128.0/17
a third might offer 192.168.12.0/24, if that is a particularly busy set of nodes
etc
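
As a rough illustration of how such a split divides traffic, the sketch below assumes that the most specific advertised route wins, as in conventional longest-prefix-match routing. The prefixes are illustrative (a /16, its upper /17 half, and one /24), and the router names and the `router_for` helper are hypothetical.

```python
import ipaddress

# Illustrative assignment of advertised routes to hypothetical router names.
ROUTES = {
    "192.168.0.0/16": "router-a",    # full range, catch-all
    "192.168.128.0/17": "router-b",  # upper half of the /16
    "192.168.12.0/24": "router-c",   # one particularly busy subnet
}


def router_for(dst_ip):
    """Return the router whose route most specifically covers dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    candidates = [
        (ipaddress.ip_network(cidr), router)
        for cidr, router in ROUTES.items()
        if addr in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None
    # Most specific (longest) prefix wins.
    return max(candidates, key=lambda c: c[0].prefixlen)[1]


for ip in ("192.168.1.5", "192.168.200.9", "192.168.12.40"):
    print(ip, "->", router_for(ip))
```

With these routes, addresses in the lower half of the /16 (other than 192.168.12.0/24) go through the catch-all router, so how evenly the load spreads depends on where the busy hosts sit in the address space.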

HQarroum commented 1 year ago

Hi Denton,

Thanks for your suggestion. Yes, this will work, and I will try it out in the meantime. But since accurate load balancing this way requires in-depth knowledge of the target network topology and of the nodes deployed in it, I propose to leave this FR open until it can be addressed by you guys.

Thanks!

DentonGentry commented 1 year ago

Related: Tailscale 1.36 substantially improved the throughput of the Linux client. https://tailscale.com/blog/throughput-improvements/

DentonGentry commented 1 year ago

The 1.40 release contained additional throughput improvements, described in "Surpassing 10Gb/s over Tailscale".
