MatrixAI / js-rpc

Stream-based JSON RPC for JavaScript/TypeScript Applications
https://polykey.com
Apache License 2.0

RPC Raw Streams for Binary Data #2

Closed tegefaulkes closed 1 year ago

tegefaulkes commented 1 year ago

Specification

Some RPC methods need to send binary data, such as secrets data for the client RPC or Git pack data for the agent RPC. These methods need to make use of the raw streams in the RPC system.

Raw streams take a header message and then act as a normal binary stream. The header can include any metadata you need, after which the binary data can be streamed. To ensure there are no problems with endianness we can chunk the data into protobuf messages.

Some things to note

  1. Raw streams don't have a return header message/metadata currently. It could be implemented.
  2. Raw streams don't support middleware.
  3. If we use protobuf messages then we need a standard way of delimiting messages in the stream (see the sketch below).
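
For instance, a length prefix would work. A minimal sketch, assuming a 4-byte big-endian length header per message (an illustration, not a committed wire format):

```ts
// Frame a message with a 4-byte big-endian length prefix (illustrative format).
function frameMessage(payload: Uint8Array): Uint8Array {
  const framed = new Uint8Array(4 + payload.length);
  new DataView(framed.buffer).setUint32(0, payload.length, false); // false = big-endian
  framed.set(payload, 4);
  return framed;
}

// Parse one framed message out of a buffer; returns the payload and the remainder,
// or undefined if the buffer doesn't yet contain a complete message.
function unframeMessage(buffer: Uint8Array): [Uint8Array, Uint8Array] | undefined {
  if (buffer.length < 4) return undefined; // incomplete length prefix
  const length = new DataView(buffer.buffer, buffer.byteOffset).getUint32(0, false);
  if (buffer.length < 4 + length) return undefined; // incomplete payload
  return [buffer.subarray(4, 4 + length), buffer.subarray(4 + length)];
}
```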

See https://github.com/MatrixAI/js-rpc/issues/2 for clarification on the usage of binary data formats like CBOR or Protobuf. It's only needed in limited use cases, and can be done as part of the RPC caller and handler.

When dropping to raw streams there should still be an exchange protocol between client and server.

The client sends a "header" message, and then should expect the server to send back a "header" message. Both should be in JSON. Only then can the client drop down to binary. However this is just one of the ways of using it. This primarily applies to streaming: client streaming, server streaming and duplex streaming.

For example, a hypothetical exchange over a duplex stream (the method name and params are illustrative only):
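
```
client -> server: {"jsonrpc":"2.0","method":"vaultsSecretsGet","params":{"secretName":"my-secret"},"id":0}
server -> client: {"jsonrpc":"2.0","result":{},"id":0}
both directions:  ...raw binary data from here on...
```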

It seems we should enforce that there be a header message sent, even if it is empty, like {}.

Additional context

Tasks

  1. Spec out a standard way of sending raw file/binary data.
  2. Update all relevant handlers to use raw streaming for sending this data.
tegefaulkes commented 1 year ago

As just discussed, we need to make some changes to the RPC system here.

  1. For raw duplex streams we need to support the server responding with leading metadata as a JSONRPC message.
  2. Nice to have but not immediately needed: we can support middleware for the leading metadata messages of raw duplex streams.
  3. Trailing metadata is optional, but we do need a standard way of serialising binary data for the raw stream. We may need a standard way of delimiting messages as well.

Any of these points might become new issues. TBD.

CMCDragonkai commented 1 year ago

Should this one also be done before extracting out to Polykey CLI?

tegefaulkes commented 1 year ago

I'm only going to get to this at the trailing end of the agent migration. In fact, I may have to complete the agent migration PR before the CLI change just so we have a clean history to start from. But maybe not; the CLI is reasonably separate from the agent migration.

CMCDragonkai commented 1 year ago

Some notes for binary data formats for encapsulation:

The protobuf-es library appears to support dynamic messages that don't require creating a proto file.

The creation of a proto file is sort of unnecessary in most of our API due to our usage of JSON RPC.

It would be rare for us to need protobuf. As per the comment in https://github.com/MatrixAI/Polykey/issues/495#issuecomment-1635143457, protobuf is only needed for mixed messages (and in most cases we will use JSON with base64url) and/or chunked processing (that isn't already governed by underlying protocols like git and the filesystem).

Furthermore, CBOR is pure JS, so it runs in the browser too. CBOR is likely the better solution since it would be used in the DB with https://github.com/MatrixAI/js-db/issues/58 and then the same dependency can be shared. CBOR is a less popular format though... so third party clients may find it more difficult.
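
As a rough sketch of what CBOR encapsulation could look like, assuming the `cbor` npm package (the message shape here is illustrative):

```ts
import * as cbor from 'cbor';

// Encode a binary chunk together with some metadata into a single CBOR message.
const message = cbor.encode({ seq: 0, chunk: Buffer.from([0x01, 0x02, 0x03]) });

// Decode the first complete CBOR message from the received bytes.
const decoded = cbor.decodeFirstSync(message); // { seq: 0, chunk: <Buffer 01 02 03> }
```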

CMCDragonkai commented 1 year ago

It's not clear that endianness actually matters here, because we would just be sending bytes.

See the discussion here: https://chat.openai.com/share/8786aa4f-53c2-4b25-95a0-c29e7223da12


This basically means for file transfers, we should not need to care about endianness. This should apply to when we are doing vault cloning and pulling, and uploading/downloading files to the vault.

It would only matter if we are sending "structured" binary data that we expect to be interpreted by a program. But even then, this structured binary data would have to be as if we were to just serialise a C struct (literally raw memory) and send it over. Most binary formats now specify a specific endianness.

CMCDragonkai commented 1 year ago

So yes, if we were to send 1 as ASCII or UTF-8 text, it would be fine. If we were to send 1 as a uint64_t, then endianness would be a problem. So to avoid the problem we would send 1 either as UTF-8 text or within a protobuf/CBOR message.
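
A quick illustration using only the standard DataView API:

```ts
// '1' as UTF-8 text is a single byte, identical on every machine:
const asText = new TextEncoder().encode('1'); // Uint8Array [ 0x31 ]

// 1 as a 64-bit integer forces a byte-order decision:
const buf = new ArrayBuffer(8);
new DataView(buf).setBigUint64(0, 1n, false); // big-endian:    00 00 00 00 00 00 00 01
new DataView(buf).setBigUint64(0, 1n, true);  // little-endian: 01 00 00 00 00 00 00 00
```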

tegefaulkes commented 1 year ago

Maybe the terminology is a bit weird? We don't technically downgrade the stream. What we do is take the raw binary stream and pipe it into a JSON message parser and read out a single message. We then cancel the parser stream, allowing the raw stream to be used again. That raw stream is then passed to the raw handler along with the initial message that was read.
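
A minimal sketch of this "peel off the leading message" idea using web streams. For simplicity it assumes the header is newline-delimited JSON; the actual implementation reads exactly one JSON message with an incremental parser rather than relying on a delimiter:

```ts
// Read one leading JSON header off a raw binary stream, then hand back the
// remainder of the stream untouched for the raw handler to consume.
async function readLeadingMessage(
  stream: ReadableStream<Uint8Array>,
): Promise<{ header: unknown; rest: ReadableStream<Uint8Array> }> {
  const reader = stream.getReader();
  const chunks: Array<Uint8Array> = [];
  let leftover: Uint8Array = new Uint8Array(0);
  // Accumulate bytes until the delimiter marks the end of the header.
  while (true) {
    const { done, value } = await reader.read();
    if (done) throw new Error('stream ended before a header message was received');
    const newlineIndex = value!.indexOf(0x0a); // '\n'
    if (newlineIndex === -1) {
      chunks.push(value!);
    } else {
      chunks.push(value!.subarray(0, newlineIndex));
      leftover = value!.subarray(newlineIndex + 1);
      break;
    }
  }
  const header = JSON.parse(new TextDecoder().decode(concat(chunks)));
  // Re-expose the remaining bytes as a fresh raw stream.
  const rest = new ReadableStream<Uint8Array>({
    start(controller) {
      if (leftover.length > 0) controller.enqueue(leftover);
    },
    async pull(controller) {
      const { done, value } = await reader.read();
      if (done) controller.close();
      else controller.enqueue(value!);
    },
    cancel(reason) {
      return reader.cancel(reason);
    },
  });
  return { header, rest };
}

function concat(chunks: Array<Uint8Array>): Uint8Array {
  const out = new Uint8Array(chunks.reduce((n, c) => n + c.length, 0));
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}
```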

Note that we only have 1 kind of raw stream which is a duplex stream. Client, server or unary behaviour has to be re-implemented within the raw handler.

All the normal-style handlers are built on top of the raw handler. The wrapper takes the raw stream and pipes it through the middleware and parsing streams before providing it to the handlers, re-inserting the initial message into the stream while doing so.

We need to add the feature where a raw stream can respond with leading metadata. Ideally the middleware is used for leading metadata. In all cases the raw handler should respond immediately with leading metadata before using the stream. This may have to be done in callback style.

CMCDragonkai commented 1 year ago

Oh I must have forgotten then. Only the duplex stream is raw.

Well in that case, duplex stream should allow the client and server to both send a header message.

However I think the leading header message should be required... before you continue using it in binary style.

One question: I assume that ordering constraints don't need to exist. That is, the client can immediately start sending binary data after sending the header without waiting for the response, or it can wait for the response. It's up to the client.

The question is whether we should force the existence of a leading JSON message from both ends.

CMCDragonkai commented 1 year ago

Ok we can force the server stream to send a JSON message as the leading message before it can send anything else. If it has no header to send, it can send {}. But if it has nothing at all, then it can just close the stream.

CMCDragonkai commented 1 year ago

One issue with this idea is that if an error occurs and we are already at the raw stream for duplex, then there's no way to propagate errors anymore.

Unless there was also a possibility of trailing messages. But that's more difficult to do, since it's not possible to know when the raw stream data finishes and the trailing message starts: as a raw stream, all possible bytes are legal. It would only work if the raw stream data were encapsulated in CBOR/protobuf or other message framing constructs (e.g. the length-prefix framing sketched earlier).

CMCDragonkai commented 1 year ago

Regarding errors.

For regular JSON streams, errors are captured and converted to JSON and the other side will get it as an exception.

For raw streams, errors are not captured and converted to JSON. Instead, how these errors are propagated depends on the underlying stream transport, such as QUICStream or websockets. This means information might get lost, and the other side gets a "lower level" stream error.

So I think we can go with this. But it is necessary to document and test the error behaviour of raw streams vs JSON streams for both QUIC streams and WebSocket streams. In particular I'd like to have some common wrapper for stream errors that allows us to distinguish errors that occur at the RPC level vs errors that are on the underlying stream (possibly caused by using raw streams).

CMCDragonkai commented 1 year ago

Further notes: when using a stream, the underlying stream errors can only carry a code. Well, in the case of web sockets there can also be a data buffer, but to align with the lowest common denominator between QUIC streams and web socket connections, it would have to be just a regular code.

Then we need to distinguish between stream level errors and RPC level errors. That can be done with a different exception class.

Then we need to consider UI behaviour. Suppose a user runs pk secrets cp. The expectation is that the file gets fully copied or it doesn't; there shouldn't be a partial copy. This means if a CTRL+C gets sent, that should propagate into the caller, there should be a reason that can be understood, and then the caller needs to decide whether to throw an error into the stream, or to close the stream gracefully. If it closes gracefully, that would behave like a pipe. If it closes with an error, that would make the other side think the operation should be discarded. So the behaviour would be handler dependent.

@tegefaulkes can you add some context on what needs to be changed for streams to deal with this. I think the websocket connection currently doesn't actually encode any error code. And in the case of quic stream, you have to use reasonToCode and codeToReason. I believe these 2 should be unified.

The spec for this should be updated above!

tegefaulkes commented 1 year ago

I think we can copy the reasonToCode and codeToReason methods and add them to the websocket. The behaviour can be the same for websockets and QUIC. So long as codeToReason creates a distinct error then we can distinguish between RPC level errors and underlying stream errors.
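
A sketch of that idea (class names and signatures here are illustrative, not the actual js-rpc/js-quic API):

```ts
// Distinct classes let callers tell the two failure domains apart.
class ErrorRPC extends Error {}
class ErrorStreamTransport extends Error {
  constructor(public readonly code: number, message?: string) {
    super(message);
  }
}

// Serialise an error reason to a numeric code for the transport...
function reasonToCode(reason?: unknown): number {
  return reason instanceof ErrorRPC ? 2 : 1;
}

// ...and reconstruct a distinct error class on the other side, so
// `instanceof` checks can separate RPC errors from stream errors.
function codeToReason(code: number): Error {
  return code === 2
    ? new ErrorRPC('RPC-level error propagated from peer')
    : new ErrorStreamTransport(code, 'underlying stream error');
}
```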

CMCDragonkai commented 1 year ago

Can you reference the part of the code that needs to be changed to support the error codes for web sockets?

CMCDragonkai commented 1 year ago

I think we need further elaboration on what the shared types will be for all of this.

tegefaulkes commented 1 year ago

The only handlers that would make use of this currently are the two agent git handlers vaultsGitInfoGet and vaultsGitPackGet. These will be updated in agent migration phase 2.

CMCDragonkai commented 1 year ago

There will also be uploading/downloading files from filesystem into the Polykey Agent. That will be necessary for client service.

CMCDragonkai commented 1 year ago

@tegefaulkes while playing around with websocat I noticed that it's possible to have 2 modes for websockets, binary and text mode. This apparently changes the encoding of the payload? Not sure.

But if we are using JSON RPC, then we should use text mode. Is it possible to then switch to binary mode when we drop down to raw streams?

websocat -b # binary mode, see websocat --help
websocat -t # text mode, this is the default
CMCDragonkai commented 1 year ago

I asked chatgpt this, and the ws library switches between text and binary mode depending on the type of data you are writing to the connection. So it is possible to always use binary mode by just wrapping any text as a utf-8 encoded buffer.
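
So forcing binary mode could be as simple as the following sketch (using the `ws` package, whose send emits a text frame for strings and a binary frame for buffers):

```ts
import WebSocket from 'ws';

// Always send binary frames by converting strings to UTF-8 buffers first.
function sendAlwaysBinary(socket: WebSocket, message: string | Uint8Array): void {
  const data = typeof message === 'string' ? Buffer.from(message, 'utf-8') : message;
  socket.send(data, { binary: true }); // force a binary frame explicitly
}
```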


CMCDragonkai commented 1 year ago

We should make sure that we are always using binary mode for simplicity then.

CMCDragonkai commented 1 year ago

I think the only thing that needs to be done in PK right now is to ensure that there are leading messages from both ends even when dropping to binary protocol.

However further development of this depends on the error management between connection-errors, stream-errors and RPC-call-errors: MatrixAI/js-rpc#3. This extra development would make sense to be done in js-rpc after RPC is factored out.

CMCDragonkai commented 1 year ago

Because of that, I'm moving this to js-rpc, and instead a new issue should be created just to ensure that leading messages occur. That will be part of agent migration phase 2 in terms of fixing vault/git related RPC calls @tegefaulkes.

CMCDragonkai commented 1 year ago

Vault RPC was updated to use the proper raw streaming.

Which means header message protocol is updated.

However there aren't any tests on edge cases and corner cases where {} or some other random JSON value is sent as the initial message.

Any JSON value is a valid header message.

From the handler's point of view any valid JSON can be set to the params: ... property.

On the stream level, the header messages are always JSONRPC records, with the initial one being a JSONRPC request and the reverse one a JSONRPC response.
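
Roughly, the header records follow standard JSON-RPC 2.0 shapes (a sketch; not necessarily js-rpc's exact type definitions):

```ts
type JSONRPCRequest = {
  jsonrpc: '2.0';
  method: string;
  params?: unknown; // any valid JSON value is accepted here
  id: string | number | null;
};

type JSONRPCResponse = {
  jsonrpc: '2.0';
  result?: unknown; // present on success
  error?: { code: number; message: string; data?: unknown }; // present on failure
  id: string | number | null;
};
```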

CMCDragonkai commented 1 year ago

Currently we don't have tests that test badly implemented clients/servers that don't send a proper JSONRPC message. These need to be tested, and the result should be a simple connection or stream error depending on the abstraction level that is available.

@amydevs there should be tests added to this repo, so that we get automatic closure of the stream in case of a bad message.

If the application level protocol is not followed, it should be an application protocol error. That might just be an error code plus message on the stream abstraction provided by the transport level.

Alternatively we could provide a JSONRPC response back depending on the situation. But what if the server sends back something wrong? Early termination at the stream level is probably sufficient.

CMCDragonkai commented 1 year ago

This is connected to #3 and https://github.com/MatrixAI/js-ws/issues/4.

CMCDragonkai commented 1 year ago

More handlers in Polykey-CLI need to be checked to see if they are using raw streams. For example, uploading secrets from local disk to the PK agent via the PK CLI should be using raw streams; that can be reviewed soon on the PK CLI side.

CMCDragonkai commented 1 year ago

This was done in https://github.com/MatrixAI/Polykey/pull/535.