getlantern / browsersunbounded

Interoperable browser-based P2P proxies for censorship circumvention
GNU General Public License v3.0

Clientside WebSocket keepalive and the inevitability of a new protocol layer #127

Open noahlevenson opened 1 year ago

noahlevenson commented 1 year ago

The challenge of WebSocket keepalive has yet again illustrated why a new protocol layer to implement Broflake concepts seems inevitable. It's worth discussing the background:

To reduce latency for censored end users, uncensored clients should be able to open WebSocket connections to the egress server long before they know they have any bytes requiring transportation. This means that uncensored clients may create as-yet-unused WebSocket connections which appear idle to middleboxes. We observe middleboxes closing these connections, which results in discon/recon loops: uncensored clients create new WebSocket connections, detect their closure, and reconnect, oscillating every 60 seconds or so.

This is easily mitigated with a WebSocket keepalive, and the built-in WebSocket ping/pong mechanism is the natural way to accomplish it. Ideally, we'd implement ping on the clientside, so as to distribute the work of keepalive across connected clients rather than centralizing it at the egress server.

However, browser clients cannot send WebSocket pings, since ping/pong frames are not exposed by the browser's JavaScript API. This leaves us with several possible solutions:

  1. Ping from the server instead of the client
  2. Try to send an unnecessary QUIC frame or some other garbage over the WebSocket link as a keepalive
  3. Roll our own ping/pong protocol
  4. Do not allow pre-allocation of WebSocket connections, at the cost of increased latency for censored end users
  5. Do nothing, and tolerate the discon/recon loops

We have currently opted for solution number 1, since we already had sufficiently low-level access to WebSocket reads and writes to implement a relatively optimized serverside keepalive in just a few lines of code.
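For reference, here's a minimal sketch of what a serverside keepalive along these lines can look like, assuming gorilla/websocket (Broflake's actual implementation works at a lower level, so the details differ):

```go
// Sketch of a serverside keepalive: ping each connected client on an
// interval, and tear down the connection if nothing (pong or data)
// arrives before the read deadline.
package egress

import (
	"time"

	"github.com/gorilla/websocket"
)

const (
	pingInterval = 20 * time.Second
	pongTimeout  = 60 * time.Second // comfortably under observed middlebox idle timeouts
)

func keepalive(conn *websocket.Conn, done <-chan struct{}) {
	// The pong handler fires during reads, so this assumes the server is
	// already running a read loop on conn (as it must be, to relay data).
	conn.SetReadDeadline(time.Now().Add(pongTimeout))
	conn.SetPongHandler(func(string) error {
		return conn.SetReadDeadline(time.Now().Add(pongTimeout))
	})

	ticker := time.NewTicker(pingInterval)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			deadline := time.Now().Add(5 * time.Second)
			if err := conn.WriteControl(websocket.PingMessage, nil, deadline); err != nil {
				conn.Close()
				return
			}
		case <-done:
			return
		}
	}
}
```

Note that even in this compact form, the server holds a timer and a deadline for every connected client.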

But for optimal scalability, we really ought to move this logic to the client. This means rolling our own ping/pong protocol.

Rolling our own ping/pong protocol means introducing Broflake control frames and a Broflake header, which requires a new protocol layer between WebSocket and QUIC that must be demuxed at the egress server. This layer is also where we'd implement the Broflake handshake (for version compatibility enforcement and future extensibility), and it's where we'd implement a solution for the deferred problem of backrouting in a multi-hop network.
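None of this layer exists yet, but to make the idea concrete, here's a hypothetical sketch of the framing: a 1-byte type prefix demuxed at the egress server, with all frame names (frameData, framePing, etc.) invented for illustration:

```go
// Hypothetical Broflake framing layer between WebSocket and QUIC.
// A 1-byte type prefix distinguishes control frames from data frames
// carrying opaque QUIC packets. All names here are invented.
package broflake

import "errors"

type frameType byte

const (
	frameData      frameType = iota // payload is an opaque QUIC datagram
	framePing                       // keepalive request, no payload
	framePong                       // keepalive response, no payload
	frameHandshake                  // payload carries a protocol version for compatibility enforcement
)

// encode prepends the Broflake header to a payload.
func encode(t frameType, payload []byte) []byte {
	return append([]byte{byte(t)}, payload...)
}

// demux strips the header at the egress server, which then forwards
// data frames to QUIC, answers pings with pongs, and validates handshakes.
func demux(msg []byte) (frameType, []byte, error) {
	if len(msg) < 1 {
		return 0, nil, errors.New("short frame")
	}
	return frameType(msg[0]), msg[1:], nil
}
```

A single header byte keeps per-message overhead negligible relative to the QUIC payloads it wraps, while leaving room for the handshake and future control frames.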

See also:

https://github.com/getlantern/product/issues/37

https://github.com/getlantern/broflake/issues/16

myleshorton commented 10 months ago

> But for optimal scalability, we really ought to move this logic to the client.

What would make that more scalable?

noahlevenson commented 10 months ago

@myleshorton With the logic in the server, the server is responsible for maintaining state and sending network requests for N keepalives, where N grows with the number of connected clients. Implementation-wise, it's just a small timeout check on the last received data for each connected client. But distributing that logic to the clients would remove the burden from the server entirely.
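To make the tradeoff concrete, the clientside half might look something like this, reusing the hypothetical framePing and encode from the sketch above (write stands in for whatever send primitive the client holds):

```go
// Hypothetical clientside keepalive, reusing framePing and encode from
// the sketch above. Each client runs its own timer, so the server's
// keepalive role shrinks to answering pings: no per-client timers or
// deadline state on the egress side.
func clientKeepalive(write func([]byte) error, done <-chan struct{}) {
	ticker := time.NewTicker(20 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			if err := write(encode(framePing, nil)); err != nil {
				return // connection is dead; the caller reconnects
			}
		case <-done:
			return
		}
	}
}
```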

myleshorton commented 10 months ago

Got it. Is that actually measured to be a performance bottleneck though?

noahlevenson commented 10 months ago

Nah, it's relatively minor compared to our other scalability concerns.