net4people / bbs

Forum for discussing Internet censorship circumvention

Running a load-balanced Tor bridge #103

Open wkrp opened 2 years ago

wkrp commented 2 years ago

This post is about running multiple tor processes on one bridge, for better scaling on bridges that handle a lot of traffic. It is not a completely supported configuration, and requires a few workarounds. Most bridges do not need this. This setup is what is now running on the Snowflake bridge.

The usual way to run a pluggable transport bridge is to run a single tor process, with the ServerTransportPlugin option set to the path of a pluggable transport executable. The tor process is responsible for running and managing the pluggable transport process. This is how we ran the Snowflake bridge until a few weeks ago. In Snowflake, the pluggable transport executable is snowflake-server; it receives WebSocket connections from Snowflake proxies and forwards them to tor.
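For reference, the torrc for this usual single-process setup looks roughly like the following. This is only a sketch: the executable path and the public listen address are illustrative placeholders, not the Snowflake bridge's actual values.

    # torrc sketch: one tor process runs and manages the pluggable transport itself.
    # (The path and listen address below are illustrative placeholders.)
    BridgeRelay 1
    ORPort auto                    # a bridge still needs an ORPort, even though clients arrive via the transport
    ExtORPort auto                 # lets the transport report per-connection metadata to tor
    ServerTransportPlugin snowflake exec /usr/local/bin/snowflake-server
    ServerTransportListenAddr snowflake 0.0.0.0:443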

The number of Snowflake users rapidly increased after the partial blocking of Tor in Russia in December 2021, which increased the load on the Snowflake bridge. Eventually it reached a point where the tor process became a performance bottleneck. Because tor is single-threaded, once it reaches 100% of one CPU, that's the limit. Adding more CPUs or increasing the speed of the network connection will not increase overall performance.

For technical reasons relating to Tor, it's not currently possible to run multiple independent bridges and, say, have Snowflake proxies choose one at random. The basic reason is that a Tor client expects to connect to a bridge with a certain identity key, and will cancel the connection if the key is not as expected.

We brainstormed options in a thread on the tor-relays mailing list:

https://forum.torproject.net/t/tor-relays-how-to-reduce-tor-cpu-load-on-a-single-bridge/1483

The design we settled on is to run multiple tor processes (currently 4), all with the same identity key. They are technically distinct bridges, but they can all substitute for one another in terms of authenticating to clients. Instead of snowflake-server being run and managed by tor, it runs independently, as a normal system daemon managed by systemd. snowflake-server connects to the multiple instances of tor through a load balancer (we are using HAProxy, though we also prototyped successfully with Nginx). For the purposes of metrics, each instance of tor is paired with another small component called extor-static-cookie, explained further below.
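To make the division of responsibilities concrete, here is a sketch of what the torrc for one of the tor instances might contain. The data directory and port numbers are made-up placeholders (each instance gets its own), and this is not the bridge's actual configuration. The important points are that there is no ServerTransportPlugin line, because snowflake-server is no longer started by tor, and that each instance's ExtORPort is what the rest of the chain ultimately connects to.

    # torrc sketch for one of the four tor instances (paths and ports are placeholders).
    BridgeRelay 1
    DataDirectory /var/lib/tor-instances/snowflake1   # distinct per instance
    ORPort 127.0.0.1:auto                             # the ORPort is not meant to be reachable from outside
    AssumeReachable 1                                 # skip the ORPort reachability self-test
    ExtORPort 127.0.0.1:20001                         # distinct per instance; this instance's extor-static-cookie connects here
    SocksPort 0
    # No ServerTransportPlugin line: snowflake-server runs as its own systemd service.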

The whole configuration looks like this:

[Diagram of the load-balanced bridge configuration, showing snowflake-server, haproxy, and four instances of tor+extor-static-cookie]
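A corresponding HAProxy configuration might look something like the following sketch. The listen address and the backend ports (assumed here to be where the four extor-static-cookie shims listen) are placeholders, not the real bridge's values. snowflake-server is pointed at the frontend address, and HAProxy spreads incoming connections across the four instances.

    # haproxy.cfg sketch (addresses and ports are placeholders)
    defaults
        mode tcp
        timeout connect 10s
        timeout client  10m
        timeout server  10m

    frontend snowflake_in
        bind 127.0.0.1:10000            # snowflake-server connects here
        default_backend tor_instances

    backend tor_instances
        balance roundrobin
        server tor1 127.0.0.1:10001     # extor-static-cookie for tor instance 1
        server tor2 127.0.0.1:10002
        server tor3 127.0.0.1:10003
        server tor4 127.0.0.1:10004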

Detailed installation instructions:

https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Survival-Guides/Snowflake-Bridge-Installation-Guide?version_id=6de6facbb0fd047de978a561213c59224511445f

There are a couple of awkward details to deal with. The first is onion key rotation. Besides its long-term identity key, each tor bridge has an onion key that is used for circuit encryption. The onion key is changed every four weeks, so even if the multiple tor instances all start with the same onion keys, they will eventually diverge. As a workaround, we set filesystem permissions to prevent tor from rewriting its onion key files.

The second detail is ExtORPort authentication. Extended ORPort (ExtORPort) is a protocol for attaching pluggable transport metadata to an incoming tor connection. It's the source of data for graphs like "Bridge users by transport" and "Bridge users by country". The problem is that connecting to the ExtORPort requires authenticating with a secret key, and every instance of tor regenerates the key every time it is restarted. snowflake-server would not know which ExtORPort authentication key to use through the load balancer. Our workaround for this is a shim called extor-static-cookie that presents an ExtORPort with a shared, predictable authentication key to snowflake-server, then re-authenticates using the authentication key of its particular instance of tor.
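As a rough illustration of the first workaround (onion key rotation), the idea is to copy one instance's onion keys to the others and then remove write permission so tor cannot replace them when rotation time comes. The paths below are placeholders that depend on how the instances' data directories are laid out; this is a sketch of the idea, not the exact commands from the installation guide.

    # Sketch: share onion keys across instances, then make them read-only.
    # Data directory paths are placeholders.
    for n in 2 3 4; do
        cp /var/lib/tor-instances/snowflake1/keys/secret_onion_key \
           /var/lib/tor-instances/snowflake1/keys/secret_onion_key_ntor \
           /var/lib/tor-instances/snowflake$n/keys/
    done
    # Remove write permission so tor cannot rewrite the files when it tries to rotate them.
    chmod a-w /var/lib/tor-instances/snowflake*/keys/secret_onion_key \
              /var/lib/tor-instances/snowflake*/keys/secret_onion_key_ntor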

Currently, on the Snowflake bridge, all the above components run on the same host. But the decoupling of tor and snowflake-server creates more options for future expansion. For example, it would be possible to run snowflake-server on one host, and all the instances of tor on another, nearby host. The next big hurdle will be when snowflake-server outgrows the resources of a single host, since it manages a lot of session state that is not trivial to distribute.

wkrp commented 2 years ago

An unfortunate side effect of multiple instances of tor sharing the same identity key is that it messes up metrics graphs, namely the Relay Search and "Users by transport" graphs. The metrics processing effectively only looks at one of the instances per day. Since there are now 4 instances, the numbers are about 1/4 of what they should be.

All the necessary data are still being collected, however. By making our own graphs, we can see what the numbers should be, and see the effect that load-balancing the bridge had. Significant dates are marked by vertical lines:

[Graph: Snowflake bandwidth history, combined instances]

[Graph: Snowflake clients, combined instances]

Source code for graphs