As just discussed, we need to add some changes to the RPC system here.
`JSONRPC` message. Either of these points might become a new issue. TBD.
Should we also do this one before extracting out to Polykey CLI?
I'm only going to get to this at the trailing end of the agent migration. In fact, I may have to complete the agent migration PR before the CLI change just so we have a clean history to start from. But maybe not; the CLI is reasonably separate from the agent migration.
Some notes on binary data formats for encapsulation:
protobuf-es appears to support dynamic messages that don't require creating a proto file.
The creation of a proto file is sort of unnecessary in most of our API due to our usage of JSON RPC.
It would be rare for us to need protobuf. As per the comment in https://github.com/MatrixAI/Polykey/issues/495#issuecomment-1635143457 the protobuf is only needed for mixed messages (and in most cases we will use JSON with base64url) and/or chunked processing (that isn't already governed by underlying protocols like git and filesystem).
Furthermore, CBOR implementations are pure JS and run in the browser too. So CBOR is likely the better solution, since it would be used in the DB with https://github.com/MatrixAI/js-db/issues/58 and then the same dependency can be shared. CBOR though is a less popular format... so third party clients may find it more difficult.
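As a quick illustration, here is a sketch assuming the `cbor-x` npm package (any pure-JS CBOR codec would work similarly), showing mixed text and binary data in one message without base64url overhead:

```ts
import { encode, decode } from 'cbor-x';

// Mixed message: raw bytes ride alongside text, no base64url needed
const message = {
  name: 'secret.txt',
  data: new Uint8Array([0xde, 0xad, 0xbe, 0xef]),
};

const buf = encode(message); // CBOR binary (a Buffer in Node)
const roundTripped = decode(buf); // { name: 'secret.txt', data: <bytes> }
```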
It's not clear that endianness actually matters here, because we would just be sending bytes.
See the discussion here: https://chat.openai.com/share/8786aa4f-53c2-4b25-95a0-c29e7223da12
This basically means for file transfers, we should not need to care about endianness. This should apply to when we are doing vault cloning and pulling, and uploading/downloading files to the vault.
It would only matter if we are sending "structured" binary data that we expect to be interpreted by a program. But even then, this structured binary data would have to be as if we were to just serialise a C struct (literally raw memory) and send it over. Most binary formats now specify a specific endianness.
So yes, if we were to send `1` as ASCII or UTF-8 text, it would be fine. If we were to send `1` as a `uint64_t`, then endianness would be a problem. So to avoid the problem we would only ever send `1` either as UTF-8 text or within a protobuf/CBOR message.
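To make the distinction concrete, a small sketch (plain typed arrays, nothing Polykey-specific) showing why a raw integer is byte-order sensitive while UTF-8 text is not:

```ts
const buf = new ArrayBuffer(8);
const view = new DataView(buf);

// The same uint64 value `1` has two different byte layouts:
view.setBigUint64(0, 1n, true);  // little-endian: 01 00 00 00 00 00 00 00
view.setBigUint64(0, 1n, false); // big-endian:    00 00 00 00 00 00 00 01

// Whereas UTF-8 text has no byte-order ambiguity:
const text = new TextEncoder().encode('1'); // always [0x31]
```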
Maybe the terminology is a bit weird? We don't technically downgrade the stream. What we do is take the raw binary stream and pipe it into a JSON message parser and read out a single message. We then cancel the parser stream, allowing the raw stream to be used again. That raw stream is then passed to the raw handler along with the initial message that was read.
Note that we only have one kind of raw stream, which is a duplex stream. Client, server or unary behaviour has to be re-implemented within the raw handler.
All the normal style handlers are built on top of the raw handler. It takes the raw stream and pipes it through the middleware and parsing streams before providing it to the handlers, re-inserting the initial message into the stream while doing so.
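A minimal sketch of that handoff, assuming Web Streams; the accumulation logic is simplified (a real implementation would use an incremental JSON parser that reports how many bytes it consumed, whereas here the head message is assumed to arrive in its own chunks):

```ts
function concat(a: Uint8Array, b: Uint8Array): Uint8Array {
  const out = new Uint8Array(a.length + b.length);
  out.set(a, 0);
  out.set(b, a.length);
  return out;
}

async function readHeadMessage(
  stream: ReadableStream<Uint8Array>,
): Promise<{ head: unknown; rest: ReadableStream<Uint8Array> }> {
  const reader = stream.getReader();
  let buffer = new Uint8Array(0);
  while (true) {
    const { value, done } = await reader.read();
    if (done) throw new Error('stream ended before a head message was read');
    buffer = concat(buffer, value);
    try {
      // Try to parse one complete JSON document from the accumulated bytes
      const head = JSON.parse(new TextDecoder().decode(buffer));
      reader.releaseLock(); // the remaining bytes go to the raw handler
      return { head, rest: stream };
    } catch {
      // Incomplete JSON so far; keep reading
    }
  }
}
```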
We need to add the feature where a raw stream can respond with leading metadata. Ideally the middleware is used for leading metadata. In all cases the raw handler should respond immediately with leading metadata before using the stream. This may have to be done callback style.
Oh I must have forgotten then. Only the duplex stream is raw.
Well in that case, the duplex stream should allow the client and server to both send a header message.
However I think the leading header message should be required... before you continue using it in binary style.
One question: I assume that ordering constraints don't need to exist. That is, the client can immediately start sending binary data after sending the header without waiting for the response, or the client can wait for the response. It's up to the client.
The question is whether we should force the existence of the leading JSON message first from both ends?
Ok, we can force the server stream to send a JSON message as the leading message before it can send anything else. If it has no header to send, it can send `{}`. But if it has nothing at all, then it can just close the stream.
One issue with this idea is that if an error occurs and we are already at the raw stream for duplex, then there's no way to propagate errors anymore.
Unless there was also a possibility of trailing messages. But that's more difficult to do, since it's not possible to know where the raw stream data finishes and the trailing message starts: as a raw stream, all possible bytes are legal. Unless the raw stream data were to be encapsulated in CBOR/protobuf or other message framing constructs.
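For reference, a sketch of one such framing construct, simple length-prefixing (illustrative only, not something the RPC system currently does), which would make a trailing message distinguishable from raw payload bytes:

```ts
// Prefix each chunk with a 4-byte big-endian length
function frame(payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(4 + payload.length);
  new DataView(out.buffer).setUint32(0, payload.length, false);
  out.set(payload, 4);
  return out;
}

// Read the length prefix back and slice out the payload
function unframe(buf: Uint8Array): Uint8Array {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const len = view.getUint32(0, false);
  return buf.subarray(4, 4 + len);
}
```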
Regarding errors.
For regular JSON streams, errors are captured and converted to JSON and the other side will get it as an exception.
For raw streams, errors are not captured and converted to JSON. Instead it depends on the underlying stream transport, like `QUICStream` or websockets, how these errors are propagated. This means information might get lost, and the other side gets a "lower level" stream error.
So I think we can go with this. But it is necessary to document and test the error behaviour of raw streams vs JSON streams for both QUIC streams and WebSocket streams. In particular I'd like to have some common wrapper for stream errors that allows us to distinguish errors that occur at the RPC level vs errors that are on the underlying stream (possibly caused by using raw streams).
Further notes: when using a stream, the underlying stream errors can only carry a code. Well, in the case of web sockets there can also be a data buffer, but to align on the lowest common denominator between QUIC streams and web socket connections, it would have to be just a regular code.
Then we need to distinguish between stream level errors and RPC level errors. That can be done with a different exception class.
Then we need to consider UI behaviour. Suppose a user runs `pk secrets cp`. The expectation is that the file gets fully copied or it doesn't; there shouldn't be a partial copy. This means if a `CTRL+C` gets sent, that should propagate into the caller, there should be a reason that can be understood, and then the caller needs to decide whether to throw an error into the stream, or to close the stream gracefully. If it closes gracefully, that would behave like a pipe. If it closes with an error, that would make the other side think the operation should be discarded. So the behaviour would be handler dependent.
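A hedged sketch of those two choices using the Web Streams API (the SIGINT wiring and the `transferWritable` name are assumptions for illustration):

```ts
declare const transferWritable: WritableStream<Uint8Array>; // illustrative

async function onInterrupt(
  writable: WritableStream<Uint8Array>,
  discard: boolean,
): Promise<void> {
  const writer = writable.getWriter();
  if (discard) {
    // Error the stream: the other side treats the transfer as failed
    await writer.abort(new Error('transfer interrupted by user'));
  } else {
    // Close gracefully: behaves like a pipe reaching EOF
    await writer.close();
  }
}

process.once('SIGINT', () => {
  void onInterrupt(transferWritable, true);
});
```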
@tegefaulkes can you add some context on what needs to be changed for streams to deal with this. I think the websocket connection currently doesn't actually encode any error code. And in the case of QUIC streams, you have to use `reasonToCode` and `codeToReason`. I believe these two should be unified.
The spec for this should be updated above!
I think we can copy the `reasonToCode` and `codeToReason` methods and add them to the websocket. The behaviour can be the same for websockets and QUIC. So long as `codeToReason` creates a distinct error, then we can distinguish between RPC level errors and underlying stream errors.
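A minimal sketch of what that unified pair could look like (the signatures are modelled loosely on js-quic's; the exact types and error class names here are assumptions):

```ts
class ErrorRPCRemote extends Error {}
class ErrorStreamRemote extends Error {
  constructor(public code: number) {
    super(`stream closed with code ${code}`);
  }
}

// Map a local failure reason to a numeric code for the wire
const reasonToCode = (reason?: unknown): number =>
  reason instanceof ErrorRPCRemote ? 1 : 0;

// Map a received code back to a distinct error class, so callers can
// tell RPC level errors apart from underlying stream errors
const codeToReason = (code: number): Error =>
  code === 1
    ? new ErrorRPCRemote('peer sent an RPC level error')
    : new ErrorStreamRemote(code);
```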
Can you reference the part of the code that needs to be changed to support the error codes for web sockets?
I think we need further elaboration on what the shared types will be for all of this.
The only handlers that would make use of this currently are the two agent git handlers `vaultsGitInfoGet` and `vaultsGitPackGet`. These will be updated in agent migration phase 2.
There will also be uploading/downloading files from the filesystem into the Polykey Agent. That will be necessary for the client service.
@tegefaulkes while playing around with `websocat` I noticed that it's possible to have 2 modes for websockets: binary and text mode. This apparently changes the encoding of the payload? Not sure.
But if we are using JSON RPC, then we should use text mode. Is it possible to then switch to binary mode when we drop down to raw streams?
```
websocat -b # binary mode, see websocat --help
websocat -t # text mode, this is the default
```
I asked ChatGPT this, and the `ws` library switches between text and binary mode depending on the type of data you are writing to the connection. So it is possible to always use binary mode by just wrapping any text as a UTF-8 encoded buffer.
We should make sure that we are always using binary mode for simplicity then.
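A small sketch with the `ws` library showing that behaviour, and how wrapping text as a buffer (or passing the `binary` option) keeps every frame binary:

```ts
import WebSocket from 'ws';

const ws = new WebSocket('wss://example.com'); // placeholder URL

ws.on('open', () => {
  ws.send('hello'); // string payload: sent as a text frame
  ws.send(Buffer.from('hello', 'utf-8')); // buffer payload: binary frame
  ws.send('hello', { binary: true }); // force a binary frame explicitly
});
```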
I think the only thing that needs to be done in PK right now is to ensure that there are leading messages from both ends even when dropping to binary protocol.
However further development of this depends on the error management between connection-errors, stream-errors and RPC-call-errors: MatrixAI/js-rpc#3. This extra development would make sense to be done in `js-rpc` after RPC is factored out.
Because of that, I'm moving this to `js-rpc`, and instead a new issue should be created just to ensure leading messages occur. That will be part of agent migration phase 2 in terms of fixing vault/git related RPC calls @tegefaulkes.
Vault RPC was updated to use the proper raw streaming.
Which means the header message protocol is updated.
However there aren't any tests on edge cases and corner cases where `{}` or some other random JSON value is sent as the initial message.
Any JSON value is a valid header message.
From the handler's point of view, any valid JSON can be set to the `params: ...` property.
On the stream level, the header messages are always JSONRPC records, with the initial one being a JSONRPC request and the reverse one being a JSONRPC response.
Currently we don't have tests that test badly implemented clients/servers, where they are not sending a proper `JSONRPC` message. These need to be tested, and it should be a simple connection or stream error depending on the abstraction level that is available.
@amydevs there should be tests added to this repo, so that we get automatic closure of the stream in case of a bad message.
If the application level protocol is not followed, it should be an application protocol error. That might just be an error code plus message on the stream abstraction provided by the transport level.
Alternatively we could provide a `JSONRPC` response back depending on the situation. But what if the server sends back something wrong? Early termination at the stream level is probably sufficient.
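A hypothetical test sketch of that behaviour (helper names like `createStreamPair` and `rpcServer.handleStream` are assumptions standing in for the repo's actual test utilities, not its real API):

```ts
declare function createStreamPair(): {
  clientSide: {
    readable: ReadableStream<Uint8Array>;
    writable: WritableStream<Uint8Array>;
  };
  serverSide: unknown;
};
declare const rpcServer: { handleStream(stream: unknown): void };

test('a malformed initial message terminates the stream', async () => {
  const { clientSide, serverSide } = createStreamPair();
  rpcServer.handleStream(serverSide);

  // Send bytes that are not a valid JSONRPC message
  const writer = clientSide.writable.getWriter();
  await writer.write(Buffer.from('not json at all'));

  // The server should close/error the stream rather than leave it hanging
  const reader = clientSide.readable.getReader();
  await expect(reader.read()).rejects.toThrow();
});
```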
This is connected to #3 and https://github.com/MatrixAI/js-ws/issues/4.
More handlers in Polykey-CLI need to be checked to see if they are using raw streams. For example, uploading secrets from local disk to the PK agent via the PK CLI should be using raw streams, but that can be reviewed soon on the PK CLI.
This was done in https://github.com/MatrixAI/Polykey/pull/535.
Specification
RPC methods that need to send binary data, such as secrets data for the client RPC or git pack data for the agent RPC, need to make use of the raw streams in the RPC system.
Raw streams take a header message and then act as a normal binary stream. The header can include any metadata you need. The binary data can then be streamed. To ensure there are no problems with endianness we can chunk the data into protobuf messages.
Some things to note:
See https://github.com/MatrixAI/js-rpc/issues/2 for clarification on the usage of binary data formats like CBOR or Protobuf. It's only needed in limited use cases, and can be done as part of the RPC caller and handler.
When dropping to raw streams there should still be an exchange protocol between client and server.
The client sends a "header" message, and then should expect the server to send back a "header" message. Both should be in JSON. Only then can the client drop down to binary. However this can be just one of the ways of using it. This primarily applies to streaming: client streaming, server streaming and duplex streaming.
For example:
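(A hedged illustration; the method name and param values are placeholders, following the JSONRPC framing described above.)

```ts
// Client opens the stream and sends a JSONRPC request as its header:
const clientHeader = {
  jsonrpc: '2.0',
  method: 'vaultsGitPackGet', // placeholder method
  id: 1,
  params: { vaultNameOrId: 'my-vault' }, // any JSON value is a valid header
};

// Server replies with a JSONRPC response as its header; `{}` when it has
// no metadata to send:
const serverHeader = {
  jsonrpc: '2.0',
  id: 1,
  result: {},
};

// Only after both headers are exchanged do both sides drop down to raw
// binary data on the same stream.
```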
It seems we should enforce that there be a header message sent, even if it is empty like a `{}`.

Additional context
Tasks