MatrixAI / js-ws

Websocket Transport Library for TypeScript/JavaScript Applications
https://matrixai.github.io/js-ws/
Apache License 2.0
2 stars 0 forks source link

Extract src/websockets from Polykey to js-ws #5

Closed amydevs closed 1 year ago

amydevs commented 1 year ago

Description

This PR moves src/websockets from https://github.com/MatrixAI/Polykey to this repository. It also implements backpressure and stream multiplexing according to the spec in #2.

Issues Fixed

Tasks

Final checklist

ghost commented 1 year ago
👇 Click on the image for a new way to code review #### [![Review these changes using an interactive CodeSee Map](https://s3.us-east-2.amazonaws.com/maps.codesee.io/images/github/MatrixAI/js-ws/5/aeb63b1e/518e74468ec49064b9a389e8e390e564a9132082.svg)](https://app.codesee.io/r/reviews?pr=5&src=https%3A%2F%2Fgithub.com%2FMatrixAI%2Fjs-ws) #### Legend CodeSee Map legend
CMCDragonkai commented 1 year ago

Can this fix #2 and #4 at the same time?

tegefaulkes commented 1 year ago

We shouldn't kitchen sink PRs, Try do do them in a new PR if they're not required by changes in this one.

amydevs commented 1 year ago

When we are in a node environment, we are using the ws library. In these cases, the user is able to provide a verifyCallback to verify that the server-supplied peerCert is as expected. This is done by adding an event listener to the upgrade event on a ws.WebSocket event target. This facilitates connecting the PKE backend to PolyKey nodes (using self-signed certs).

However, this event is not exposed in the window.WebSocket object that exists within browsers. There is no way to allow a self-signed certificate without user intervention within browsers, hence we should throw an error is a verifyCallback if provided in a browser environment. This facilitates connecting the PKE frontend to the PKE backend (using certs issued by a valid CA).

CMCDragonkai commented 1 year ago

Yea the browser will be considered a deficient environment. The API should look the same but we should be strict on creation where we throw errors for unimplemented features because of browser limitations.

amydevs commented 1 year ago

There will need to be new classes created, I'm in the process of doing so at the moment. Mostly WSSocket and WSConnection.

The reason for the first being that in QUIC, a QUICSocket has a one to many relationship with QUICConnection. Meanwhile, a WebSocketConnection has a one to one relationship with a ws.WebSocket or window.WebSocket. Hence, a WSSocket class needs to be created to wrap and group ws.WebSockets, emulating how a QUICSocket can have multiple connections.

This will mean that the interfaces for WSServer and WSClient need to change. These classes originally contained WebSocketStream instances, which were one-to-one with ws.WebSocket. Essentially WebSocketStream is what really the WSConnection should be, barring the ReadableWritablePair, which should be sectioned off to WSStream instead.

WSConnection will have a one-to-many relationship with WSStream by implementing muxing.

amydevs commented 1 year ago

WSSocket will be simply a container for WSConnections and ws.WebSockets.

Furthermore, the creation of WSServer should no longer be asynchronous. The start should be the real binding of the https.Server instead. Instead of creating a WebSocketStream on connectionHandler, it should be now WSConnection instead. WSConnection should parse the the ws.WebSocket received messages for the creation of WSStreams

tegefaulkes commented 1 year ago

Feel free to ask me any questions about how the WebSockets and WebStreams work, I've worked with them a fair bit.

amydevs commented 1 year ago

Only the server should be pinging, and the client responds with pongs. We do this as ping/ponging is not available on the window.WebSocket instance. However, browsers will automatically respond with pongs to server-sent pings, rather than exposing the feature to the user. Since we cannot control this behaviour, we have to implement similar behaviour on our WebSocketClient in a nodejs context. Or we can implement application-level ping/ponging.

CMCDragonkai commented 1 year ago

Ignore the browser environment for a moment.

The websocket spec already defines how ping/pong works. I'm not actually sure which side is supposed to be sending the pings. Can you find out? Either way, keep alive mechanism must exist on both sides so that if the other side stops sending activity, we end up timing out.

According to a quick search, browser websockets might actually be doing the keep alive internally already, so when using browser websockets, you just throw an exception when the parameter is set. Similar to what we talked about already about configuration not available on the browser.

CMCDragonkai commented 1 year ago

@amydevs update the spec above with all the changes that is required. Including the spec on how the ping will work between client and server as drawn on the whiteboard.

amydevs commented 1 year ago

for client - server ping-pong keep alive here's what should happen: client and server both start pinging initially if the client receives a ping, stop pinging if the server receives a ping, keep pinging if the server receives two pings, stop pinging

this could be expanded upon by using the keepalive interval timer. for example, if the keepalive interval timer on a server is higher than on a client, the client can eagerly ping to establish keepalive responsibility

CMCDragonkai commented 1 year ago

Regarding benchmarking:

iperf -s # server side
iperf -c 127.0.0.1 -l 1024 # client side

image

Do this on your local system to test your localhost. Use the -l 1024 to use a message buffer size of 1024 bytes or 1 KiB. (We always use KiB, MiB, GiB, never KB, MB, GB).

On the Linux OS, the kernel will set the underlying TCP buffer size. We don't need to change this.

Get a baseline for just regular TCP doing the same thing, you can compare against iperf3. That gives you the overhead of JS.

Then use ws in binary mode and do the same thing with 1024 bytes, now you know the overhead of ws.

Then use js-ws and benchmark only the connection, not the stream. Now you now the overhead of js-ws's connection.

Then use js-ws and benchmark a single stream on single connection. Now you know the overhead over the muxing/demuxing. In this mode, you now have 1 additional buffer to control, and that's the per-stream buffer, keep this consistent and try different per-stream buffer sizes, doubling each time.

amydevs commented 1 year ago

my iperf3 results performed with

iperf3 -s # server side
iperf3 -c 127.0.0.1 -l 1024 # client side

image

CMCDragonkai commented 1 year ago

Browser support in a different PR. Until that is done, it won't be useful in PKE frontend. however PKE backend and PK can make use of this.

Make sure to go through this with @addievo so you 2 get the same architecture.

CMCDragonkai commented 1 year ago

So on the your laptop it's roughly 10 MiB/s throughput using 1 KiB message sizes for a websocketstream.

Further optimisation can definitely close the gap between websocket stream to websocket connection which is roughly 20 MiB/s for ws (which is something we cannot change atm).

So max theoretical performance will be 20 MiB/s for your laptop.

CMCDragonkai commented 1 year ago

This PR will require some review regarding the usage of events just so we are aligned here.

CMCDragonkai commented 1 year ago

Had to update some benchmark running code to incorporate this fix https://github.com/MatrixAI/Polykey/commit/8dd41f72ad377c90a6c2dab2b1371f9eb39a461d @amydevs

CMCDragonkai commented 1 year ago

Please change to using generateTLSConfig instead: https://github.com/MatrixAI/js-quic/pull/53#issuecomment-1721343197.

CMCDragonkai commented 1 year ago

Reason for why I couldn't import testsUtils: https://github.com/MatrixAI/js-quic/pull/53#issuecomment-1721340830

Basically use fast-check in tests/utils.ts. Not @fast-check/jest. But use @fast-check/jest in the actual tests.

CMCDragonkai commented 1 year ago

Please finish with TLS tests (create tasks above too):

Important to ensure that the TLS system is the same as js-quic. Make sure you can supply client certificates and server certificates optionally. Note that unlike js-quic, server certs is not mandatory. Make sure the verifyPeer option works the same way as js-quic though!!

Some challenges:

If we cannot get auto-downgrade working, then defaulting to HTTP on client without certs is a bad idea. Therefore... one could instead always use HTTPS. But I believe if this is the case, then it is mandatory for certs on server side. Which would match js-quic behaviour.

Also https://github.com/MatrixAI/js-quic/pull/53#issuecomment-1722666768 - remember that verifyCallback should be PromiseLike<void>.

CMCDragonkai commented 1 year ago

You don't have unidirectional streams so your stream id can just increment by the bidi rules? I think only client and server distinguished.

amydevs commented 1 year ago

You don't have unidirectional streams so your stream id can just increment by the bidi rules? I think only client and server distinguished.

yeah, i've changed it so that ids only increment at a rate of 2 rather than 4 since uni directional doesn't exist

CMCDragonkai commented 1 year ago

You only need 1 bit then? Not 2 bits like in QUIC?

CMCDragonkai commented 1 year ago

3.2.0 of js-events will default to undefined not null for the detail property if not specified. It's being built and released by the CI/CD atm.

amydevs commented 1 year ago

You only need 1 bit then? Not 2 bits like in QUIC?

1 bit for?

CMCDragonkai commented 1 year ago

So wss is prepended automatically inside WebSocketClient, so that way we are always defaulting to HTTPS. Going to drop HTTP support anyway.

CMCDragonkai commented 1 year ago

Standardise TLS between WS and QUIC: https://github.com/MatrixAI/Polykey/issues/551#issuecomment-1730732555

CMCDragonkai commented 1 year ago

@amydevs can you write final set of tasks above? And squash and merge!

CMCDragonkai commented 1 year ago

Good idea to document how your TLS behaves btw with all the potential failure scenarios.

amydevs commented 1 year ago

Good idea to document how your TLS behaves btw with all the potential failure scenarios.

i think i'm going to have to change how the TLS errors to be more specific rather than just internal, though the problem is that i can't really tell a socket error is because of a peer TLS failure in the case where verifyCallback is null, because ws simply errors that the server has dropped the socket

amydevs commented 1 year ago

i'll probably standardize the TLS errors like QUIC first and then document them

CMCDragonkai commented 1 year ago

Some things to port over from js-quic:

amydevs commented 1 year ago

I've added inline documentation to WebSocketConnection:

/**
 * Note that on TLS verification failure, {@link events.EventWebSocketConnectionError} is emitted with the following `event.detail`:
 * - If we failed to verify the peer, the `event.detail` will be an instance of {@link errors.ErrorWebSocketConnectionLocalTLS}.
 * - If the peer failed to verify us, the `event.detail` will be an instance of {@link errors.ErrorWebSocketConnectionPeer} with an error code of {@link utils.ConnectionErrorCode.AbnormalClosure}.
 *
 * The reason for this is that when the peer fails to verify us, Node only tells us that the TCP socket has been reset, but not why.
 */
CMCDragonkai commented 1 year ago

Hooray!