noahlevenson opened this issue 1 year ago
But for optimal scalability, we really ought to move this logic to the client.
What would make that more scalable?
@myleshorton With the logic on the server, the server is responsible for maintaining state and sending keepalive traffic for N connections, where N grows with the number of connected clients. Implementation-wise, it's just a little timeout check on last received data that's associated with each connected client. But distributing that logic to the clients would remove the burden from the server entirely.
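To make that concrete, here's a minimal sketch of that per-connection check, assuming a gorilla/websocket-style connection; the names, intervals, and the `lastRecv` counter are illustrative, not the actual Broflake code:

```go
package keepalive

import (
	"sync/atomic"
	"time"

	"github.com/gorilla/websocket"
)

// Illustrative intervals only.
const (
	checkInterval = 15 * time.Second
	idleThreshold = 30 * time.Second
)

// monitor pings a connection whenever it has been idle for longer than
// idleThreshold, so middleboxes don't classify it as dead and close it.
// The read loop is assumed to store time.Now().UnixNano() into lastRecv
// after every successful read.
func monitor(conn *websocket.Conn, lastRecv *atomic.Int64) {
	ticker := time.NewTicker(checkInterval)
	defer ticker.Stop()

	for range ticker.C {
		if time.Since(time.Unix(0, lastRecv.Load())) < idleThreshold {
			continue // recent traffic, no ping needed
		}
		deadline := time.Now().Add(5 * time.Second)
		if err := conn.WriteControl(websocket.PingMessage, nil, deadline); err != nil {
			conn.Close() // the connection is gone; stop tracking it
			return
		}
	}
}
```

The server keeps one such goroutine (plus a timestamp) per connected client, which is the per-client state and work described above.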
Got it. Is that actually measured to be a performance bottleneck though?
Nah, it's a relatively minor concern compared to our other scalability concerns.
The challenge of WebSocket keepalive has yet again illustrated why a new protocol layer to implement Broflake concepts seems inevitable. It's worth discussing the background:
To reduce latency for censored end users, uncensored clients should be able to open WebSocket connections to the egress server long before they know they have any bytes requiring transportation. This means that uncensored clients may create yet-unused WebSocket connections which appear to middleboxes as idle. We observe middleboxes closing these connections. This results in discon/recon loops, as uncensored clients create new WebSocket connections, detect their closure, and reconnect, oscillating every 60 seconds or so.
This is easily mitigated with a WebSocket keepalive. The built-in WebSocket ping/pong frames are the natural way to accomplish this, and ideally we'd implement ping on the clientside, so as to distribute the work of keepalive across connected clients rather than centralizing it at the egress server.
However, browser clients do not support WebSocket ping, since it's not part of the JavaScript API. This leaves us with several possible solutions:
We have currently opted for solution number 1, since we already had sufficiently low-level access to WebSocket reads and writes to implement a relatively optimized serverside keepalive in just a few LOC.
But for optimal scalability, we really ought to move this logic to the client. This means rolling our own ping/pong protocol.
Rolling our own ping/pong protocol means introducing Broflake control frames and a Broflake header, which requires a new protocol layer between WebSocket and QUIC that must be demuxed at the egress server. This layer is also where we'd implement the Broflake handshake (for version compatibility enforcement and future extensibility), and it's where we'd implement a solution for the deferred problem of backrouting in a multi-hop network.
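As a rough illustration of what that layer might look like — a hypothetical sketch, not a spec or the actual Broflake design — each WebSocket message could carry a small Broflake header with a version byte (for the handshake's compatibility check) and a frame type, so the egress server can demux control frames from data frames that wrap the QUIC payload:

```go
package broflakeframe

import (
	"encoding/binary"
	"errors"
)

// Hypothetical frame types; nothing here is an agreed-upon wire format.
const (
	frameHello uint8 = iota // handshake: carries protocol version for compatibility enforcement
	framePing               // client-initiated keepalive
	framePong               // keepalive response
	frameData               // wraps a chunk of the QUIC byte stream
)

const headerLen = 4 // version (1) + type (1) + payload length (2)

// Frame is one Broflake-layer message carried over the WebSocket.
type Frame struct {
	Version uint8
	Type    uint8
	Payload []byte
}

// Marshal encodes the header followed by the payload.
func (f Frame) Marshal() []byte {
	buf := make([]byte, headerLen+len(f.Payload))
	buf[0] = f.Version
	buf[1] = f.Type
	binary.BigEndian.PutUint16(buf[2:4], uint16(len(f.Payload)))
	copy(buf[headerLen:], f.Payload)
	return buf
}

// Unmarshal is what the egress server's demux step would call: hello/ping/pong
// frames are handled by the Broflake layer, data frames are passed to QUIC.
func Unmarshal(buf []byte) (Frame, error) {
	if len(buf) < headerLen {
		return Frame{}, errors.New("short frame")
	}
	n := int(binary.BigEndian.Uint16(buf[2:4]))
	if len(buf) < headerLen+n {
		return Frame{}, errors.New("truncated payload")
	}
	return Frame{Version: buf[0], Type: buf[1], Payload: buf[headerLen : headerLen+n]}, nil
}
```

Under a scheme like this, the keepalive becomes a client-sent ping frame, the hello frame is where version compatibility gets enforced, and future frame types are where routing metadata for multi-hop backrouting could live.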
See also:
https://github.com/getlantern/product/issues/37
https://github.com/getlantern/broflake/issues/16