Code name: Turbo Tunnel
Designing circumvention protocols for speed, flexibility, and robustness

In working on circumvention protocols, I have repeatedly felt the need for a piece that is missing from our current designs. This document summarizes the problems I perceive, and how I propose to solve them.

In short, I think that every circumvention transport should incorporate some kind of session/reliability protocol—even the ones built on reliable channels that seemingly don't need it. It solves all kinds of problems related to performance and robustness. By session/reliability protocol, I mean something that offers a reliable stream abstraction, with sequence numbers and retransmissions, like QUIC or SCTP. Instead of a raw unstructured data stream, the obfuscation layer will carry encoded datagrams that are provided by the session/reliability layer.

When I say that circumvention transports should incorporate something like QUIC, for example, I don't mean that QUIC UDP packets are what we should send on the wire. No: I am not proposing a new outer layer, but an additional inner layer. We take the datagrams provided by the session/reliability layer, and encode them as appropriate for whatever obfuscation layer we happen to be using. So with meek, for example, instead of sending an unstructured blob of data in each HTTP request/response, we would send a handful of QUIC packets, encoded into the HTTP body. The receiving side would decode the packets and feed them into a local QUIC engine, which would reassemble them and output the original stream. A way to think about it is that the sequencing/reliability layer is the "TCP" to the obfuscation layer's "IP". The obfuscation layer just needs to deliver chunks of data, on a best-effort basis, without getting blocked by a censor. The sequencing/reliability layer builds a reliable data stream atop that foundation.

I believe this design can improve existing transports, as well as enable new transports that are not possible now, such as those built on unreliable channels. Here is a list of selected problems with existing or potential transports, and how a sequencing/reliability layer helps solve them:

Problem: Censors can disrupt obfs4 by terminating long-lived TCP connections, as Iran did in 2013, killing connections after 60 seconds.
- This problem exists because the obfs4 session is coupled with the TCP connection. The obfs4 session begins and ends exactly when the TCP connection does. We need an additional layer of abstraction, a virtual session that exists independently of any particular TCP connection. That way, if a TCP connection is terminated, it doesn't destroy all the state of the obfs4 session—you can open another TCP connection and resume where you left off, without needing to re-bootstrap Tor or your VPN or whatever was using the channel.
Problem: The performance of meek is limited because it is half-duplex: it never sends and receives at the same time. This is because, while the bytes in a single HTTP request arrive in order, the ordering of multiple simultaneous requests is not guaranteed. Therefore, the client sends a request, then waits for the server's response before sending another, resulting in a delay of an RTT between successive sends.
- The session/reliability layer provides sequence numbers and reordering. Both sides can send data whenever is convenient, or as needed for traffic shaping, and any unordered data will be put back in order at the other end. A client could even split its traffic over two or more CDNs, with different latency characteristics, and know that the server will buffer and reorder the encoded packets to correctly recover the data stream.
Problem: A Snowflake client can only use one proxy at a time, and that proxy may be slow or unreliable. Finding a working proxy is slow because each non-working one must time out in succession before trying another one.
- The problem exists because even though each WebRTC DataChannel is reliable (DataChannel uses SCTP internally), there's no ordering between multiple simultaneous DataChannels on separate Snowflake proxies. Furthermore, if and when a proxy goes offline, we cannot tell whether the last piece of data we sent was received by the bridge or not—the SCTP ACK information is not accessible to us higher in the stack—so even if we reconnect to the bridge through another proxy, we don't know whether we need to retransmit the last piece of data or not. All we can do is tear down the entire session and start it up again from scratch. As in the obfs4 case, this problem is solved by having an independent virtual session that persists across transient WebRTC sessions. An added bonus is the opportunity to use more than one proxy at once, to increase bandwidth or as a hedge against one of them disappearing.
Problem: DNS over HTTPS is an unreliable channel: it is reliable TCP up to the DoH server, but after that, recursive resolutions are plain old unreliable UDP. And as with meek, the ordering of simultaneous DNS-over-HTTPS requests is not guaranteed.
- Solved by retransmission in the session layer. There's no DNS pluggable transport yet, but I think some kind of retransmission layer will be a requirement for it. Existing DNS tunnel software uses various ad-hoc sequencing/retransmission protocols. I think that a proper user-space reliability layer is the "right" way to do it.
Problem: Shadowsocks opens a separate encrypted TCP connection for every connection made to the proxy. If a web page loads resources from 5 third parties, then the Shadowsocks client makes 5 parallel connections to the proxy.
- This problem is really about multiplexing, not session/reliability, but several candidate session/reliability protocols additionally offer multiplexing, for example streams in QUIC, streams in SCTP, or smux for KCP. Tor does not have this problem, because Tor already is a multiplexing protocol, with multiple virtual circuits and streams in one TCP/TLS connection. But every system could benefit from adding multiplexing at some level. Shadowsocks, for example, could open up one long-lived connection, and each new connection to the proxy would only open up a new stream inside the long-lived connection. And if the long-lived connection were killed, all the stream state would still exist at both endpoints and could be resumed on a new connection.
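
To make the reordering idea in the list above concrete, here is a toy sketch in Go of what the receiving side of a session/reliability layer does with packets that arrive out of order, possibly over different connections. The packet type, the reorderBuffer type, and the per-packet sequence numbers are all hypothetical; real protocols like QUIC and KCP are considerably more involved (ACKs, retransmission timers, sequence number wraparound), and in practice you would use one of those libraries rather than write this yourself.

```go
package main

import "fmt"

// packet is a hypothetical unit handed up by the obfuscation layer:
// a sequence number plus a chunk of the stream.
type packet struct {
	seq  uint32
	data []byte
}

// reorderBuffer releases payloads strictly in sequence order, no matter
// what order the packets arrive in or which connection carried them.
type reorderBuffer struct {
	next    uint32            // next sequence number to deliver
	pending map[uint32][]byte // packets that arrived early
}

func newReorderBuffer() *reorderBuffer {
	return &reorderBuffer{pending: make(map[uint32][]byte)}
}

// receive accepts one packet and returns whatever contiguous run of
// stream data has become deliverable as a result. Duplicates (for
// example, retransmissions of data already received) are dropped.
func (b *reorderBuffer) receive(p packet) []byte {
	if p.seq < b.next {
		return nil // already delivered
	}
	if _, dup := b.pending[p.seq]; dup {
		return nil // already buffered
	}
	b.pending[p.seq] = p.data
	var out []byte
	for {
		data, ok := b.pending[b.next]
		if !ok {
			return out
		}
		out = append(out, data...)
		delete(b.pending, b.next)
		b.next++
	}
}

func main() {
	b := newReorderBuffer()
	// Packets 0..3 arrive out of order, as they might when split across
	// two HTTP requests or two proxies; packet 1 arrives twice.
	arrivals := []packet{
		{2, []byte("lo ")}, {0, []byte("he")}, {1, []byte("l")},
		{1, []byte("l")}, {3, []byte("world")},
	}
	var stream []byte
	for _, p := range arrivals {
		stream = append(stream, b.receive(p)...)
	}
	fmt.Printf("%q\n", stream)
}
```

The point is that nothing in receive cares which TCP connection, HTTP request, or proxy delivered a packet, which is what lets the session outlive any one of them.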

As an illustration of what I'm proposing, here's the protocol layering of meek (which sends chunks of the Tor TLS stream inside HTTP bodies), and where the new session/reliability layer would be inserted. Tor can remain oblivious to what's happening: just as before it didn't "know" that it was being carried over HTTP, it doesn't now need to know that it is being carried over QUIC-in-HTTP (for example).

[TLS]
[HTTP]
[session/reliability layer] ⇐ 🆕
[Tor]
[application data]

I've done a little survey and identified some suitable candidate protocols that also seem to have good Go packages:

QUIC with quic-go
KCP with kcp-go
SCTP with pion/sctp

I plan to evaluate at least these three candidates and develop some small proofs of concept. The overall goal of my proposal is to liberate the circumvention context from particular network connections and IP addresses.

Related work

The need for a session and sequencing layer has been felt, and dealt with, repeatedly in many different projects. It has not yet, I think, been treated systematically or recognized as a common need. Systems typically implement some form of TCP-like SEQ and ACK numbers. The ones that don't are usually built on the assumption of one long-lived TCP connection, and therefore are really using the operating system's sequencing and reliability functions behind the scenes.

Here are a few examples:

Code Talker Tunnel (a.k.a. SkypeMorph) uses SEQ and ACK numbers and mentions selective ACK as a possible extension. I think it uses the UDP 4-tuple to distinguish sessions, but I'm not sure.
OSS used SEQ and ACK numbers and a random ID to distinguish sessions.
I wasted time in the early development of meek grappling with sequencing, before punting by strictly serializing requests, sacrificing performance for simplicity. meek uses an X-Session-Id HTTP header to distinguish sessions.
DNS tunnels all tend to do their own idiosyncratic thing. dnscat2, one of the better-thought-out ones, uses explicit SEQ and ACK numbers.

My position is that SEQ/ACK schemes are subtle enough and independent enough that they should be treated as a separate layer, not as an underspecified and undertested component of some specific system.

Psiphon can use obfuscated QUIC as a transport. It's directly using QUIC UDP on the wire, except that each UDP datagram is additionally obfuscated before being sent. You can view my proposal as an extension of this design: instead of always sending QUIC packets as single UDP datagrams, we allow them to be encoded/encapsulated into a variety of carriers.

MASQUE tunnels over HTTPS and can use QUIC, but is not really an example of the kind of design I'm talking about. It leverages the multiplexing provided by HTTP/2 (over TLS/TCP) or HTTP/3 (over QUIC/UDP). In HTTP/2 mode it does not introduce its own session or reliability layer (instead using that of the underlying TCP connection); and in HTTP/3 mode it directly exposes the QUIC packets on the network as UDP datagrams, instead of encapsulating them as an inner layer. That is, it's using QUIC as a carrier for HTTP, rather than HTTP as a carrier for QUIC. The main similarity I spot in the MASQUE draft is the envisioned connection migration which frees the circumvention session from specific endpoint IP addresses.

Mike Perry wrote a detailed summary of considerations for migrating Tor to use end-to-end QUIC between the client and the exit. What Mike describes is similar to what is proposed here—especially the subtlety regarding protocol layering. The idea is not to use QUIC hop-by-hop, replacing the TLS/TCP that is used today, but to encapsulate QUIC packets end-to-end, and use some other unreliable protocol to carry them hop-by-hop between relays. Tor would not be using QUIC as the network transport, but would use features of the QUIC protocol.

Anticipated questions

Q: Why don't VPNs like Wireguard have to worry about all this?
- A: Because they are implemented in kernel space, not user space, they are, in effect, using the operating system's own sequencing and reliability features. Wireguard just sees IP packets; it's the kernel's responsibility to notice, for example, when a TCP segment needs to be retransmitted, and retransmit it. We should do in user space what kernel-space VPNs have been doing all along!
Q: You're proposing, in some cases, to run a reliable protocol inside another reliable protocol (e.g. QUIC-in-obfs4-in-TCP). What about the reputed inefficiency of TCP-in-TCP?
- A: Short answer: don't worry about it. I think it would be premature optimization to consider at this point. The fact that the need for a session/reliability layer has been felt for so long by so many systems indicates that we should start experimenting, at least. There's contradictory information online as to whether TCP-in-TCP is as bad as they say, and anyway there are all kinds of performance settings we may tweak if it turns out to be a problem. But again: let's avoid premature optimization and not allow imagined obstacles to prevent us from trying.
Q: QUIC is not just a reliability protocol; it also has its own authentication and encryption based on TLS. Do we need that extra complexity if the underlying protocol (e.g. Tor) is already encrypted and authenticated independently?
- A: The transport and TLS parts of QUIC are specified separately (draft-ietf-quic-transport and draft-ietf-quic-tls), so in principle they are separable and we could just use the transport part without encryption or authentication, as if it were SCTP or some other plaintext protocol. In practice, quic-go assumes right in the API that you'll be using TLS, so separating them may be more trouble than it's worth. Let's start with simple layering that is clearly correct, and only later start breaking abstraction for better performance if we need to.
Q: What about traffic fingerprinting? If you simply obfuscate and send each QUIC packet as it is produced, you will leak traffic features through their size and timing, especially when you consider retransmissions and ACKs.
- A: Don't do that, then. There's no essential reason why the traffic pattern of the obfuscation layer needs to match that of the sequencing/reliability layer. Part of the job of the obfuscation layer is to erase such traffic features, if doing so is a security requirement. That implies, at least, not simply sending each entire QUIC packet as soon as it is produced, but padding, splitting, and delaying as appropriate. An ideal traffic scheduler would be independent of the underlying stream—its implementation would not even know how much actual data is queued to send at any time. But it's likely we don't have to be quite that rigorous in implementation, at least at this point.

This document is also posted at https://www.bamsoftware.com/sec/turbotunnel.html.

I think this is a great idea, and something we could definitely use in TapDance/Refraction Networking schemes. We generally multiplex long-lived sessions over many short-lived decoy connections, which works okay but is pretty hacky and can be unreliable (and potentially attackable).

I generally agree with the last question about traffic fingerprinting, but have a few notes, with the understanding that pretty much everyone in practice has punted on dealing with packet sizes/timings. But thinking into the future, it might also make sense to consider splitting the obfuscation layer into two sub-layers, one that deals with the wire protocol (e.g. we look like TLS as in TapDance or we encrypt everything as in obfs4...), and one that deals with the timing/packet sizes (like Slitheen does to mimic realistic web traffic). I think the timing/packet size layer could become tricky if the next layer protocol is this new packet-based session/reliability protocol, because the choice of the session/reliability protocol layer dictates at least a few constraints on the timing/packet size layer.

For example, if the session/reliability layer generates a 100 byte QUIC packet, but the timing/packet size layer determines it can only send 25 bytes next, it either has to A) send padding and hope that it will be able to send the 100 byte QUIC packet later or B) fragment the QUIC packet. Option A is tricky because now the timing/packet size layer has to be mindful of what kinds of packets the session/reliability layer spits out. Option B is tricky because now we need at least some kind of fragmentation/defragmentation logic. This is also complicated if the timing/packet size layer tries to play with timings, and your reliability layer has some constraints on retransmissions/ACK timings or tries to do things with RTT estimation.

These challenges and constraints are probably more a function of having a layer that tries to obfuscate packet size/timings, and aren't specific to the session/reliability layer; any protocol you tried to put on top of a packet size/timing-modifying layer might have to consider these. Still, I think it might be good to keep in mind what protocol layer we want to (eventually) be underneath Turbo Tunnel.

I generally agree with the last question about traffic fingerprinting, but have a few notes, with the understanding that pretty much everyone in practice has punted on dealing with packet sizes/timings. But thinking into the future, it might also make sense to consider splitting the obfuscation layer into two sub-layers, one that deals with the wire protocol (e.g. we look like TLS as in TapDance or we encrypt everything as in obfs4...), and one that deals with the timing/packet sizes (like Slitheen does to mimic realistic web traffic). I think the timing/packet size layer could become tricky if the next layer protocol is this new packet-based session/reliability protocol, because the choice of the session/reliability protocol layer dictates at least a few constraints on the timing/packet size layer.

I think you're right that a separate timing/size protocol is needed. Potentially a lightweight one, though the details may depend on the obfuscation scheme.

I am picturing something like this. Let every run of data within some discrete container (HTTP body, UDP datagram, TLS application record, etc.) be prefixed with a tag indicating whether it's padding or real data, and a length. The key is that the length is represented using a variable-size encoding, so every run length is possible, with no minimum. A strawman proposal is

first byte:  pcxxxxxx
later bytes: cxxxxxxx

The p bit is 1 for data or 0 for padding. The c (continuation) bit is 1 if there are more bytes in the tag, or 0 if this is the last byte in the tag. The x bits get concatenated to form the length. This would allow sending a run of any byte size—containing padding if nothing else, but possibly data if there is enough room. For example,

0 bytes
    (just don't send anything)
1 byte of padding (1 byte of tag prefix followed by 0 bytes)
    00000000
2 bytes of padding (1 byte of tag prefix followed by 1 byte)
    00000001 [1 arbitrary byte]
64 bytes of padding (1 byte of tag prefix followed by 63 bytes)
    00111111 [63 arbitrary bytes]
65 bytes of padding (2 bytes of tag prefix followed by 63 bytes)
    01000000 00111111 [63 arbitrary bytes]
1500 bytes of padding (2 bytes of tag prefix followed by 1498=10111011010₂ bytes)
    01001011 01011010 [1498 arbitrary bytes]
1000+500 bytes of padding
    01000111 01100110 [998 bytes] 01000011 01110010 [498 bytes]

100 bytes of data (total 102 bytes)
    11000000 01100100 [100 data bytes]
100 bytes of data + 26 bytes of padding (total 128 bytes)
    11000000 01100100 [100 data bytes] 00011001 [25 arbitrary bytes]
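
Here is a rough sketch of the strawman tag encoding in Go, to check it against the examples above. The function names (encodeTag, decodeTag, paddingTag) are made up for illustration, and the cap of 3 tag bytes is an assumption of this sketch; the scheme itself allows tags of any length. Note that padding to an exact total sometimes needs a tag that is wider than the length strictly requires, as in the 65-byte example.

```go
package main

import (
	"errors"
	"fmt"
)

// encodeTag builds the prefix for a run of length n using exactly width
// bytes (1 to 3). The first byte holds the p bit, the c bit, and the top
// 6 length bits; each later byte holds a c bit and 7 more length bits.
func encodeTag(isData bool, n, width int) ([]byte, error) {
	if width < 1 || width > 3 {
		return nil, errors.New("width must be 1, 2, or 3")
	}
	bits := uint(6 + 7*(width-1))
	if n < 0 || n >= 1<<bits {
		return nil, errors.New("length does not fit in this tag width")
	}
	tag := make([]byte, width)
	// Fill from the last byte backward, 7 bits at a time.
	for i := width - 1; i > 0; i-- {
		tag[i] = byte(n & 0x7f)
		if i < width-1 {
			tag[i] |= 0x80 // c bit: more tag bytes follow
		}
		n >>= 7
	}
	tag[0] = byte(n & 0x3f)
	if width > 1 {
		tag[0] |= 0x40 // c bit of the first byte
	}
	if isData {
		tag[0] |= 0x80 // p bit: 1 for data, 0 for padding
	}
	return tag, nil
}

// decodeTag parses a tag at the start of buf. It reports whether the run
// is data, the run length, and how many bytes the tag itself occupies.
func decodeTag(buf []byte) (isData bool, n, tagLen int, err error) {
	if len(buf) == 0 {
		return false, 0, 0, errors.New("empty input")
	}
	isData = buf[0]&0x80 != 0
	n = int(buf[0] & 0x3f)
	more := buf[0]&0x40 != 0
	tagLen = 1
	for more {
		if tagLen >= 3 {
			return false, 0, 0, errors.New("tag longer than 3 bytes")
		}
		if tagLen >= len(buf) {
			return false, 0, 0, errors.New("truncated tag")
		}
		b := buf[tagLen]
		n = n<<7 | int(b&0x7f)
		more = b&0x80 != 0
		tagLen++
	}
	return isData, n, tagLen, nil
}

// paddingTag returns the tag for a padding run that occupies exactly
// total bytes in the container, tag included. It picks the smallest tag
// width that works, which may be wider than the minimal encoding.
func paddingTag(total int) ([]byte, error) {
	for width := 1; width <= 3; width++ {
		if tag, err := encodeTag(false, total-width, width); err == nil {
			return tag, nil
		}
	}
	return nil, errors.New("no tag width fits this total")
}

func main() {
	for _, total := range []int{1, 64, 65, 1500} {
		tag, _ := paddingTag(total)
		fmt.Printf("%4d bytes of padding: tag %08b + %d arbitrary bytes\n",
			total, tag, total-len(tag))
	}
	tag, _ := encodeTag(true, 100, 2)
	isData, n, tagLen, _ := decodeTag(tag)
	fmt.Printf("data tag %08b: isData=%v n=%d tagLen=%d\n", tag, isData, n, tagLen)
}
```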

If your channel is continuous (not chopped up into discrete containers), then you don't even need a variable-length encoding like this. For example, obfs4's frame format is completely adequate for shaping to arbitrary sizes with byte-level precision, despite having a fixed 21-byte prefix on each frame. That's because obfs4 is carried inside a continuous stream, so you can send only a part of the prefix now, and the rest later.

For example, if the session/reliability layer generates a 100 byte QUIC packet, but the timing/packet size layer determines it can only send 25 bytes next, it either has to A) send padding and hope that it will be able to send the 100 byte QUIC packet later or B) fragment the QUIC packet. Option A is tricky because now the timing/packet size layer has to be mindful of what kinds of packets the session/reliability layer spits out. Option B is tricky because now we need at least some kind of fragmentation/defragmentation logic. This is also complicated if the timing/packet size layer tries to play with timings, and your reliability layer has some constraints on retransmissions/ACK timings or tries to do things with RTT estimation.

To me, option A sounds like the right choice here. Send an entire data packet when the packet size scheduler gives you enough room to do so, otherwise just send padding. It may be we can cajole the sequencing/reliability layer to give us a packet of no larger than a specific size on demand, like, "give me a packet whose total size is at most 80 bytes, or else return an error." The libraries I've looked at so far seem not to be architected that way; you don't "pull" packets from them, rather they "push" one on you via a callback, but maybe it's possible to achieve something similar by manipulating the MTU.

A case where option A would not work is when the carrier protocol never produces packets big enough to allow forward progress. I'm thinking here about DNS, which has a maximum upstream payload of about 140 bytes, and the handshake of QUIC, which demands a packet of at least 1200 bytes as an anti-amplification measure. Some kind of simple fragmentation scheme may be required in this case. An alternative would be to use a patched version of QUIC that does not enforce the 1200 bytes requirement, because reflected UDP amplification attacks would not apply in this context.

There's also complexity in how the packet size scheduler interacts with the obfuscation layer. Suppose we are using TLS for obfuscation, and the scheduler calls for a packet of 400 bytes. That doesn't mean that we can just send 400 payload bytes, because the QUIC header will add overhead. But neither can we just send a QUIC packet (+ padding) of 400 bytes including the header, because the TLS record protocol will add its own overhead. For a given packet size, we have to kind of reverse-engineer how much payload to send, that will result in a packet of the desired size. Arguably this requirement, though awkward, is good for obfuscation purposes, because an actual non-circumvention application protocol also does not have direct control over TLS record sizes: it sends chunks of data according to its own logic, and leaves the rest to the TLS library. Just so, a circumvention protocol that wants to avoid dead-parrot distinguishers and therefore tunnels through a real TLS library also should only exert control as far as the sizes of chunks it gives to the TLS library, and let the library take over from there. It's just tricky for us, because the censor is going to be working off of packet sizes from a pcap, which is several abstractions away from the layer which we program.

And I agree we will probably have to tinker with settings such as the default packet retransmit timeout. Interpacket time shaping is likely to give the sequencing/reliability layer the impression that it's running on a pathological network.

Q: You're proposing, in some cases, to run a reliable protocol inside another reliable protocol (e.g. QUIC-in-obfs4-in-TCP). What about the reputed inefficiency of TCP-in-TCP?

A: Short answer: don't worry about it. I think it would be premature optimization to consider at this point. The fact that the need for a session/reliability layer has been felt for so long by so many systems indicates that we should start experimenting, at least. There's contradictory information online as to whether TCP-in-TCP is as bad as they say, and anyway there are all kinds of performance settings we may tweak if it turns out to be a problem. But again: let's avoid premature optimization and not allow imagined obstacles to prevent us from trying.

I found an IETF Transport Area Working Group draft, draft-pauly-tsvwg-tcp-encapsulation, that touches on this subject, particularly in Section 5.

https://datatracker.ietf.org/doc/html/draft-pauly-tsvwg-tcp-encapsulation-00#section-5

The use of an outer TCP context may cause signals from the network to be hidden from the inner TCP contexts. ... Generally, the main areas of concern are signals that inform loss recovery, Bufferbloat and delay avoidance, and head of line blocking between streams.

https://datatracker.ietf.org/doc/html/draft-pauly-tsvwg-tcp-encapsulation-00#section-5.1.2

Generally, TCP congestion controls and loss recovery algorithms are capable of recovering from loss events very efficiently, and the inner TCP contexts observe brief periods of added delay without much penalty.

A TCP congestion control should be selected and tuned to be able to gracefully handle extremely variable RTT values, which may already the case for some congestion controls, as RTT variance is often greatly increased in mobile and cellular networks.

Additionally, use of a TCP congestion control that considers delay to be a sign of congestion may help the coordination between inner and outer TCP contexts. LEDBAT [RFC6817] and BBR [I-D.cardwell-iccrg-bbr-congestion-control] are two examples of delay based congestion control that an inner TCP context could use to properly interpret loss events experienced by the outer TCP context.

There are further considerations for encapsulated segments that are then meant to be unencapsulated and forwarded verbatim outside the tunnel. These considerations do not apply in the model I envision, because the sequencing/reliability layer is strictly between the two obfuscation endpoints and its details are hidden from whatever protocol that makes use of it.

Care must be taken to ensure that any TCP congestion control in use is also appropriate for an inner context to use on any network segments that are traversed outside of the encapsulation.

Since any losses will be handled by the outer TCP context, it might seem reasonable to modify the the inner TCP contexts' loss recovery algorithms to prevent retransmissions, there are often network segments outside of the encapsulated segments that still rely on the inner contexts' loss recovery algorithms. Instead, spurious retransmissions can be reduced by ensuring that RTO values are tuned such that the outer TCP context will fully time out before any inner TCP contexts.

I've done a little survey and identified some suitable candidate protocols that also seem to have good Go packages:

QUIC with quic-go

KCP with kcp-go

SCTP with pion/sctp

My evaluation of these three candidate protocol implementations is in #14.

Are you aware of https://tools.ietf.org/html/draft-schinazi-masque-01?

It's effectively VPN over QUIC, without doubled guarantees. I believe it addresses some of your concerns.

Are you aware of https://tools.ietf.org/html/draft-schinazi-masque-01?

It's effectively VPN over QUIC, without doubled guarantees. I believe it addresses some of your concerns.

I know about MASQUE; it's actually linked to in the writeup above. But MASQUE is not really an example of what I'm talking about.

MASQUE observes that QUIC is likely to make a good cover protocol, because of its default encryption and anticipated wide use in HTTP/3. Furthermore, QUIC provides features that are convenient for implementing a proxy, such as multiplexing multiple streams into one connection. I think these observations are correct and I have nothing against the MASQUE proposal.

What I am proposing is orthogonal. Perhaps the best way to explain the difference is that MASQUE puts QUIC on the outside of the protocol stack; i.e., the part exposed to observation by the censor is actual UDP-encapsulated QUIC packets. In contrast, I'm saying that it's a good idea to use QUIC, or something like it, as an inner layer in a circumvention stack, regardless of what the outermost obfuscation layer may be. I am not proposing a specific new circumvention protocol based on QUIC, but a general design principle, that the features of a session/reliability protocol are useful for circumvention purposes, independent of the external tunnel that provides covertness. QUIC is one of these session/reliability protocols, and a promising one, but if QUIC is used in a turbo tunnel design, it will not be in the form of UDP datagrams exposed on the wire, but as packets encoded or encapsulated inside the covert tunnel. See here for an example of QUIC packets encapsulated in HTTP requests, and here for an example of QUIC packets encapsulated in an obfs4 stream.

MASQUE found a single protocol, UDP-based QUIC, that (1) is good for obfuscation; and (2) provides nice features for implementing a proxy, such as operation over lossy channels, stream multiplexing, and connection migration. One way to understand a turbo tunnel design is that it separates these two functions: you have one layer for obfuscation (obfs4, Shadowsocks, meek, etc.), and another layer, more or less independent, that provides those other features. You combine the flexibility to swap out different forms of obfuscation, with the speed and robustness benefits of a session/reliability protocol. UDP-based QUIC suffices if you want your obfuscation layer to look like HTTP/3, but it doesn't help if you want to build a DNS-based tunnel, or a WebRTC-based tunnel, or one that uses HTTP/1.1, for example.

UDP-based QUIC suffices if you want your obfuscation layer to look like HTTP/3, but it doesn't help if you want to build a DNS-based tunnel, or a WebRTC-based tunnel, or one that uses HTTP/1.1, for example.

This explanation made your proposal a lot clearer to me. Thanks!

I am picturing something like this. Let every run of data within some discrete container (HTTP body, UDP datagram, TLS application record, etc.) be prefixed with a tag indicating whether it's padding or real data, and a length. The key is that the length is represented using a variable-size encoding, so every run length is possible, with no minimum.

I've got an implementation of this scheme in the meek turbotunnel branch.

The public API is

func WriteData(w io.Writer, data []byte) (int, error)
func WritePadding(w io.Writer, n int) (int, error)
func ReadData(r io.Reader) ([]byte, error)

WriteData writes a data block of n bytes and returns the number of bytes written, which will be n+1, n+2, or n+3 because of the length prefix. WritePadding writes exactly n bytes of padding, including the length prefix as part of the padding. ReadData skips over padding and returns the first data block.

Although the length prefix encoding would allow for integers of any size, I decided to limit the range to [0, 0xfffff], which means that a length prefix will never consist of more than 3 bytes, and that decoded lengths will fit in a uint32.

There's another public function

func MaxDataForSize(n int) int

MaxDataForSize returns the largest size a data block passed to WriteData can be, without exceeding n bytes in encoded length. The intent here is to allow you to fill a buffer to an exact size by putting in as many data packets as possible, then padding to fill up the rest. Something like:

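var buffer bytes.Buffer
requestedSize := 500
var size int
var err error
for err == nil && size < requestedSize {
    // Largest data block that still fits in the remaining space.
    max := encapsulation.MaxDataForSize(requestedSize - size)
    var n int
    if max > 0 {
        // Read at most max bytes, so the encoded block cannot overshoot.
        p := make([]byte, max)
        n, err = source.Read(p)
        if err == nil {
            // bytes.Buffer implements io.Writer on its pointer type.
            n, err = encapsulation.WriteData(&buffer, p[:n])
        }
    } else {
        // No room left for data: fill the rest with padding.
        n, err = encapsulation.WritePadding(&buffer, requestedSize-size)
    }
    size += n
}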

However I think the MaxDataForSize design needs a rethink, because while it's fine when you can Read up to a maximum number of bytes, it's more awkward when you are dealing with discrete packets of determined sizes. See here in meek-client. What you want to do is keep reading packets and doing WriteData into the buffer until you read a packet that's bigger than MaxDataForSize on the remaining space in the buffer. Then "unread" that packet so it will be available again in the future, WritePadding up to the remaining size, and send your burst. In the meek branch I'm reading packets from a channel, and you can't "unread" a channel. You would need to build a data structure around the channel that yields a packet only if it doesn't exceed a given size.
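
A minimal sketch of such a data structure, in Go, might look like this. The names packetQueue and next are hypothetical and not what the meek branch uses. It holds on to at most one packet that was taken off the channel but turned out not to fit, and offers it again on the following call. In the demo, the size limit is just the remaining byte budget; in real use it would be MaxDataForSize of the remaining space, to account for the length prefix.

```go
package main

import "fmt"

// packetQueue wraps a channel of packets and remembers at most one packet
// that was received but did not fit, so that it can be offered again.
type packetQueue struct {
	ch      <-chan []byte
	stashed []byte
	has     bool // whether stashed holds a packet (it may be empty)
}

// next returns the packet at the head of the queue only if it is at most
// max bytes long. It never blocks: if no packet is ready, or the head
// packet is too big, it returns ok == false. An oversize packet stays
// stashed and is considered again on the next call.
func (q *packetQueue) next(max int) (p []byte, ok bool) {
	if !q.has {
		select {
		case pkt, open := <-q.ch:
			if !open {
				return nil, false
			}
			q.stashed, q.has = pkt, true
		default:
			return nil, false
		}
	}
	if len(q.stashed) > max {
		return nil, false
	}
	p = q.stashed
	q.stashed, q.has = nil, false
	return p, true
}

func main() {
	ch := make(chan []byte, 3)
	for _, size := range []int{200, 250, 100} {
		ch <- make([]byte, size)
	}
	q := &packetQueue{ch: ch}

	// Fill one 500-byte burst: take packets while they fit, then pad.
	remaining := 500
	for {
		p, ok := q.next(remaining)
		if !ok {
			break
		}
		remaining -= len(p)
		fmt.Printf("sent a %d-byte packet\n", len(p))
	}
	fmt.Printf("padding the last %d bytes\n", remaining)

	// The packet that did not fit is first in line for the next burst.
	if p, ok := q.next(500); ok {
		fmt.Printf("next burst starts with the %d-byte packet\n", len(p))
	}
}
```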

FOCI paper on Turbo Tunnel

I've had a paper on Turbo Tunnel accepted at the upcoming Free and Open Communications on the Internet (FOCI) workshop. I've written a draft that I will be revising over the next week. I invite your comments. The final revision is due 2020-07-28.

Here is the current draft. This will also be the home of an HTML version of the paper, when I find time to do that. https://www.bamsoftware.com/papers/turbotunnel/

I'll tag the people who I know have GitHub accounts and are mentioned in the acknowledgements. If I acknowledged you, it probably means you wrote something, here or elsewhere, that I found useful. @cohosh @arlolra @ValdikSS @fortuna @studentmain @ewust @sergeyfrolov @xtaci

Hi there! Just to respond to the points about MASQUE above, I think that right now MASQUE is focusing on enabling proxying over HTTP - which makes it a good candidate for a pluggable transport. However, I think that it would be possible to use MASQUE as a Turbo Tunnel session layer as well - the connection migration property of QUIC allows you to migrate a MASQUE connection across underlying pluggable transports without loss of connectivity. We mentioned this briefly in the Onion Routing section of the MASQUE Obfuscation doc.

If TCP-in-TCP performance is not a concern, why not Tor over TCP/IP over HTTP? This setup seems to satisfy the described requirements and there are good userspace TCP/IP stacks already.

I think the reason QUIC came to be is to take control of the reliability layer back from the kernel, because they wanted to do something interesting at this level but found it impossible to co-exist with existing reliable protocols in the kernel and also chose not to do it in the TCP-in-TCP way. H/2 is already an attempt at reinventing 1/2 of TCP (and smux mentioned above reinvents the same part). It would be quite hard to create something different from QUIC and functionally as good as QUIC.

If TCP-in-TCP performance is not a concern, why not Tor over TCP/IP over HTTP? This setup seems to satisfy the described requirements and there are good userspace TCP/IP stacks already.

Yes, userspace TCP would work fine as an inner session/reliability layer. I am trying to emphasize that the specific choice of an inner session protocol does not matter so much--the really important idea is decoupling the session from the outer network connection. I did a lot of testing with KCP and QUIC, but there are probably many more equivalent options.

I think I considered userspace TCP early on, but did not find a good Go package for it, and for practical reasons I wanted the session protocol to be written in Go. Stream multiplexing turns out to be a really nice feature, though that could equally be accomplished with smux-in-TCP as with smux-in-KCP.

The "Tor over TCP/IP over HTTP" layering is basically what Turbo Tunnel in meek implements.

I think I considered userspace TCP early on, but did not find a good Go package for it

Assuming you aren't aware, gvisor/netstack (in-use by tailscale, outline-go-tun2socks, libsagernet, xjasonlyu's tun2socks, and firestack that I co-develop) is a pretty flexible golang TCP/IP implementation.

net4people / bbs

Turbo Tunnel: let's include a sequencing/reliability layer in our circumvention protocols #9
