Closed: tegefaulkes closed this issue 11 months ago
@tegefaulkes I think you're also missing the backpressure problem. That should also be done here.
Relevant commentary https://github.com/MatrixAI/Polykey/pull/525#discussion_r1272026292 on websocket backpressure.
Backpressure needs to be properly applied to the readable streams. I may need to make a new issue for this. This needs to be further specced out.
https://chat.openai.com/share/8cb1acc0-88e9-4846-89f3-2bd3840dba6f
Yes, you're correct. The process to establish a WebSocket connection starts with an HTTP request, specifically an HTTP GET request that includes an "Upgrade: websocket" header. This is often referred to as the WebSocket handshake request.
The client sends this initial HTTP request to the server, requesting to upgrade the connection to a WebSocket. The request might look something like this:
```http
GET /mychat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com
```
If the server supports the WebSocket protocol, it agrees to the upgrade and replies with an HTTP 101 status code (Switching Protocols), something like this:
```http
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat
```
After this handshake process, the connection is upgraded from HTTP to a WebSocket connection, and data can be sent as WebSocket frames, in both directions, between client and server.
This handshake process is necessary because it allows servers to handle both WebSocket and HTTP traffic on the same URL or port, which is especially useful if you want to add real-time features (which WebSockets are great for) to an existing HTTP-based application. It's also an integral part of the WebSocket protocol as defined in its specification.
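As an aside, the `Sec-WebSocket-Accept` value in the 101 response above is derived deterministically from the client's `Sec-WebSocket-Key` as per RFC 6455: append a fixed GUID, SHA-1 hash it, and base64-encode the digest. A minimal sketch in Node:

```typescript
import { createHash } from 'node:crypto';

// RFC 6455: the server proves it understood the websocket handshake by
// hashing the client's key together with this fixed GUID
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

function secWebSocketAccept(key: string): string {
  return createHash('sha1').update(key + WS_GUID).digest('base64');
}

// Using the Sec-WebSocket-Key from the request above:
console.log(secWebSocketAccept('x3JJHMbDL1EzLkh9GBhXDw=='));
// → HSmrc0sMlYUkAGmm5OPpG2HaGWk=
```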
I'm working on the PKE reactivity system, and I found that the websocket transport actually means there are 3 possible layers of authentication.
In relation to our RPC system, which is transport agnostic, it does make sense that our authentication at the RPC layer would have to be in the layer 3 payload protocol, because it's not aware of what the transport is. Same in QUIC, where it's just a QUIC stream, but the QUIC connection may have done mTLS earlier.
It does mean that when you're doing RPC authentication, the need to set up a "basic token" format as per the discussion on the PR is sort of unnecessary, since in that case we would be simulating an HTTP request header, which is no longer required.
I think in the case of using websockets as a transport layer, we would skip layer 2, because QUIC doesn't have it as a common denominator, but there's always layer 1.
Did you ensure that our current PK clients communicate over WSS to the PK agent server?
Putting our client service auth into layer 3 means that the transport layer already establishes a websocket connection from TCP to HTTP to websockets. Note that websocket frames are always binary.
If they fail authentication we need to close the stream gracefully.
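A sketch of what layer-3 authentication with a graceful close could look like over the web stream abstraction (the `RpcMessage` shape, `authenticateStream`, and `verifyToken` here are hypothetical illustrations, not the actual Polykey RPC API):

```typescript
// Hypothetical layer-3 (RPC payload) message carrying an auth token
interface RpcMessage {
  metadata?: { authorization?: string };
  params?: unknown;
}

// Reads the first message, checks its token, and on failure closes the
// stream gracefully (finish the writable side, cancel the readable side)
// rather than aborting the whole connection.
async function authenticateStream(
  readable: ReadableStream<RpcMessage>,
  writable: WritableStream<RpcMessage>,
  verifyToken: (token: string) => boolean,
): Promise<boolean> {
  const reader = readable.getReader();
  const writer = writable.getWriter();
  const first = await reader.read();
  const token = first.done ? undefined : first.value.metadata?.authorization;
  if (token === undefined || !verifyToken(token)) {
    await writer.close();
    await reader.cancel();
    return false;
  }
  // Authenticated: release the locks so the RPC handler can take over
  reader.releaseLock();
  writer.releaseLock();
  return true;
}
```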
Regarding graceful websocket closing on the server side. I see a pattern like this (this is taken out of PKE's websocket handler):
```ts
// A SocketStream is a Duplex stream with the `socket` object
wsHandler: (conn: SocketStream, request) => {
  // This enables the websocket handler on
  // `enterprise.polykey.com/`
  // AND also on
  // `org1.org2.enterprise.polykey.com/`.
  // At some point we need to do authentication,
  // and this is just a demonstration for now
  logger.info('Websocket Connection Received');
  // If this is false, the stream is ended automatically for us,
  // but if it is true, we need to call end ourselves!
  logger.info(`Websocket Half Open ${conn.allowHalfOpen}`);
  conn.on('drain', () => {
    logger.info(`Resume websocket conn`);
    conn.resume();
  });
  conn.on('data', (msg: Buffer) => {
    console.log('INPUT DATA', JSON.stringify(msg.toString('utf-8')));
    // If this is true, it is fully drained, otherwise backpressure!
    const drained = conn.write(
      JSON.stringify({
        message: 'hello world'
      }) + '\n'
    );
    logger.info(`Wrote to websocket conn, do we need to drain ${drained}, buffer size: ${conn.socket.bufferedAmount}`);
    if (!drained) {
      logger.info(`Pausing websocket conn`);
      conn.pause();
    }
  });
  conn.on('end', () => {
    // This is only necessary if the socket has "allowHalfOpen"
    logger.info('Websocket Connection Ended');
    // We must end the connection!
    conn.end();
  });
}
```
Notice that upon receiving `end`, we call `end`; this is because `allowHalfOpen` defaults to true.
We dealt with these situations earlier when we were working on `network/Proxy.ts`. At the same time, when we are ending, like `conn.end()`, we are supposed to wait for the other side to also end for a proper termination handshake. However, we can "timeout" this wait and close/destroy the connection if they don't send us back that `end`. The point is that FIN packets are supposed to be exchanged at all times.
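The end-then-timeout pattern can be sketched against Node's `Duplex` like this (a sketch only, not the actual `network/Proxy.ts` code; the timeout value is arbitrary):

```typescript
import type { Duplex } from 'node:stream';

// Sketch: end our side, then wait for the peer's FIN (which surfaces as
// the 'close' event); if it never arrives within `ms`, force-destroy.
function endWithTimeout(conn: Duplex, ms: number = 5_000): void {
  conn.end();
  const timer = setTimeout(() => {
    // The peer never sent its end; tear the connection down forcefully
    conn.destroy();
  }, ms);
  conn.once('close', () => clearTimeout(timer));
}
```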
Notice that on the server side, handling backpressure is quite easy. It's just a matter of writing, checking the `drained` boolean: if it is false, pause the connection's readable side, and when the `drain` event fires, resume the readable side.
This is because this is just an echo handler, where the writable data is being generated in response to the readable data.
If the writable data isn't being generated in response to readable data, then the backpressure has to push back all the way to whoever is calling the writing function. That's usually done just by providing a promise that the caller should be awaiting on.
So the above example is a better demonstration of how to actually apply backpressure when reading from the websocket stream.
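The promise-based push-back mentioned above is just the standard Node pattern of awaiting the `drain` event, roughly:

```typescript
import { once } from 'node:events';
import type { Duplex } from 'node:stream';

// Sketch: propagate backpressure to the caller via a promise. If `write`
// reports a full buffer (returns false), the returned promise resolves
// only once the stream emits 'drain', so an awaiting caller slows down.
async function writeWithBackpressure(
  conn: Duplex,
  chunk: string | Buffer,
): Promise<void> {
  if (!conn.write(chunk)) {
    await once(conn, 'drain');
  }
}
```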
However, I believe you should already have this done when converting websocket connections to our web stream abstraction, right?
I haven't had the time to fully review your entire websocket transport implementation and the corresponding quic integration yet, so we're just moving ahead to the 6th deployment of testnet and seeing if any issues pop up in production. But these things should have been detailed in the spec during implementation too.
This may lead to factoring out both `rpc` to js-rpc and `ws` to js-ws.
As per discussion in PKE, js-ws might just be a wrapper around `ws` combined with a web stream IO abstraction around the websocket object. This enables us to re-use both js-ws and js-rpc inside PKE.

In the future, the underlying `ws` can be swapped out in js-ws for a runtimeless implementation that can be compatible with mobile operating systems.
Right now neither `ws` nor `uws` natively supports websocket over HTTP/2.

Without this, every websocket connection requires an independent TCP socket, an HTTP request, and an HTTP 101 response before the websocket connection is established. A WSS connection also requires TLS.

This adds quite significant overhead to creating new websocket connections, especially since all of our RPC transactions are one-to-one with websocket connections.
Now it turns out that Node.js actually supports the necessary protocol behaviour for HTTP/2 to support this. The Chrome browser also supports native websocket over HTTP/2. However, Firefox appears to have disabled it.
But the main problem is `ws`, which would be used for now for PK CLI to PK Agent, or any usage of `PolykeyClient`.

This is unlikely to ever be done by `ws`, as you can see in this thread: https://github.com/websockets/ws/issues/1458#issuecomment-1661582162
However I found https://github.com/szmarczak/http2-wrapper/blob/master/examples/ws/index.js which appears to do it. It uses Node's http2 module, and wraps it in such a way that websockets will use HTTP/2 streams. If this hack is sufficient, it would be possible to encapsulate it inside js-ws when we extract the websocket transport out.
It's also possible that by the time `ws` gets around to doing it over HTTP/2, WS over HTTP/3 will be the main thing: https://www.rfc-editor.org/rfc/rfc9220.html.
For js-ws, which we intend to use in Polykey, the primary focus will be the Node runtime, meaning the server side. On the browser side it doesn't make sense to provide anything except an abstraction of the WebSocket to a web stream; that is, we would want to create a `WebSocketStream` class to replicate `QUICStream` in js-quic. That way js-ws could still be used in the browser, but the underlying websocket implementation would just be whatever the browser provides.
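Such a `WebSocketStream`-style wrapper could be sketched roughly as below; `WebSocketLike` is a hypothetical minimal interface for illustration, and a real implementation would also need error propagation and writable-side backpressure (e.g. via `bufferedAmount`):

```typescript
// Hypothetical minimal websocket interface (stand-in for the browser
// WebSocket or `ws`), just enough to demonstrate the stream wrapping
interface WebSocketLike {
  send(data: Uint8Array): void;
  close(): void;
  onmessage: ((ev: { data: Uint8Array }) => void) | null;
  onclose: (() => void) | null;
}

// Wrap a websocket-like object into a web stream pair: incoming frames
// are enqueued onto the readable, writes are forwarded to `send`
function toWebStreams(ws: WebSocketLike): {
  readable: ReadableStream<Uint8Array>;
  writable: WritableStream<Uint8Array>;
} {
  const readable = new ReadableStream<Uint8Array>({
    start(controller) {
      ws.onmessage = (ev) => controller.enqueue(ev.data);
      ws.onclose = () => controller.close();
    },
  });
  const writable = new WritableStream<Uint8Array>({
    write(chunk) { ws.send(chunk); },
    close() { ws.close(); },
  });
  return { readable, writable };
}
```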
Full alignment is not possible until https://github.com/MatrixAI/js-ws/issues/2 is achieved.
For now, `WebSocketStream` can be done later. Maybe we can do it this month, not sure.
Instead this epic will focus only on what's left here. Further WS work will occur in js-ws.
Since you removed `clientMaxReadableStreamBytes` in MatrixAI/Polykey#535, does that mean backpressure has been tested?
It's a bit tricky to test since it's an internal implementation detail. But I did see back-pressure happening while testing. I can double check and attempt to make a test for it.
A test should be made to check whether it applies on both ends. @amydevs can do this during the WS extraction.
@amydevs please have this and #4 fixed by #5, and please comment here on how this was solved based on today's discussion.
The backpressure has been revised, specced, and implemented in #5. Instead of calling `.pause()` on `ws.WebSocket`, we are doing backpressure per stream and forgoing backpressure on the entire connection. The reason for this is that 'pausing' the entire `WebSocketConnection` would unnecessarily throttle all `WebSocketStream`s, when backpressure should only need to be applied per stream.
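The per-stream idea can be sketched like this — a hypothetical sketch, not the actual js-ws implementation; `sendPause`/`sendResume` stand in for whatever stream-level control frames the real protocol uses:

```typescript
type StreamId = number;

// Sketch: each muxed stream gets its own ReadableStream with its own
// buffer; when that buffer fills (desiredSize <= 0), only this stream
// is paused, leaving the other streams on the connection untouched.
class MuxedReadable {
  readonly readable: ReadableStream<Uint8Array>;
  private controller!: ReadableStreamDefaultController<Uint8Array>;

  constructor(
    private id: StreamId,
    private sendPause: (id: StreamId) => void,   // hypothetical control frame
    private sendResume: (id: StreamId) => void,  // hypothetical control frame
  ) {
    this.readable = new ReadableStream<Uint8Array>(
      {
        start: (c) => { this.controller = c; },
        // pull fires whenever the reader wants more data
        pull: () => { this.sendResume(this.id); },
      },
      new CountQueuingStrategy({ highWaterMark: 16 }),
    );
  }

  // Called when a data frame for this stream arrives on the connection
  onFrame(data: Uint8Array): void {
    this.controller.enqueue(data);
    if ((this.controller.desiredSize ?? 1) <= 0) {
      this.sendPause(this.id); // pause only this stream
    }
  }
}
```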
Done with #5.
Specification
This issue is an epic for tracking the pending work for the Websockets code. Specifically, an extraction of the behaviour to js-ws, which behaves similarly to js-quic.

Additional context

Tasks

- [ ] `WebSocketStream` exists and we can mux and demux on top of a single `WebSocketConnection`
- [ ] `WebSocketConnection` equivalent to `QUICConnection`, but as a 1 to 1 to `WebSocket`