MatrixAI / js-rpc

Stream-based JSON RPC for JavaScript/TypeScript Applications
https://polykey.com
Apache License 2.0
4 stars 0 forks source link

RPC message parser can support arbitrary whitespace such as '\n' within and between JSON RPC messages #1

Open tegefaulkes opened 1 year ago

tegefaulkes commented 1 year ago

Specification

Currently the agnostic RPC message parser separates the messages based on the}{ boundry., which makes it difficult to read messages since they are in a single line.

The objective is to improve readability by formatting a message with newlines and whitespaces, and ensuring that the parser is able to handle these characters.

// OLD
{some: 'message'}{other: 'message'}...

// new
{some: 'message'}
{other: 'message'}
...

This should be possible by adjusting the separator parameter which should be able to discard the newline and other whitespace characters when constructing the JsonStreamParser. (Constructing the stream)

Additional context

Related MatrixAI/Polykey#249 Related MatrixAI/Polykey#498

Tasks

CMCDragonkai commented 1 year ago

Arbitrary newlines and spaces should be possible.

CMCDragonkai commented 1 year ago

Any whitespace that could exist... can be automatically discarded and should not hit the buffer limit.

CMCDragonkai commented 1 year ago

Might need to just check what jsonstream parser supports. It could be limited to only a single character delimiter or no delimiters, if it can take in variable length delimitation of arbitrary whitespace, that would also be useful.

CMCDragonkai commented 1 year ago

Regarding websocket transport, each websocket frame is already a separate message frame.

Therefore, when we are translating websocket connections to web streams, we are simply concatenating the messages in the stream.

Due to being transport agnostic, the stream has no idea that the underlying transport, in this case websocket is already doing message framing. It just appears as a continuous byte stream to the RPC handler.

This issue is therefore about enabling the ability to parse optional unlimited whitespace in between complete JSON RPC messages.

In fact... it should be possible to parse unlimited whitespace anywhere that the JSON spec allows. This would depend on the jsonstream parser and what it is capable of doing.

Note that after the HTTP handshake, every websocket frame is already binary on the wire-level. It's not a text protocol like HTTP 1.1.

This means even if \n were to exist possibly due to something like conn.write(jsonMessage + '\n');, it would not actually aid in debugging the client service. This is because if you were to use wireshark or something similar, you would still have to decode the websocket binary frames first and then get to the payload which you would have to parse.

The addition of \n is there not necessary when using websocket as a transport for the RPC which is the case for client service.

However for quic as the transport, there's no equivalent to the websocket frame.

Anyway... the problem is 2 things:

  1. Enabling \n isn't really going to increase the ability to debug RPC messages because of the underlying transport. At the end of the day, something is going to be looking at the transport protocol and having to stitch together complete JSON messages.
  2. However being able to accept arbitrary whitespace including newlines would make our RPC system more robust to third party JSON RPC implementations... it would make it less strict in this sense.

It is important that if whitespace were to exist inside the JSON stream, that these whitespace characters are dropped to avoid a build up of memory. If they are in fact dropped, they should not count towards our buffer limit. This needs to be verified.

CMCDragonkai commented 1 year ago

Going to move this to js-rpc as it would be done over there.

CMCDragonkai commented 1 year ago

Hey @addievo don't close issues directly. Only when a PR is merged is the issue auto-closed.

CMCDragonkai commented 1 year ago

Since the PR hasn't been merged. You need to say Fixes #1 in the list. No need for checkboxes. The template was designed to be good enough.

tegefaulkes commented 1 year ago

The core goal of this issue is to allow the stream JSON parser to allow for arbitrary white-space for it's input.

What the binaryToJsonMessageStream middleware does is takes a stringified JSON stream and converts it to JSON objects. Right now the input stream is expected to be separated by nothing. That is the messages are formatted with {...message}{...message}.

We want to allow for arbitrary white space in the stream when separating the messages. So any combination of white space should be valid on the stream. For example the following should be valid

// Normal separation
{...message}{...message}{...message}

// Spaces
{...message} {...message}    {...message}

// New lines
{...message}
{...message}
{...message}

// Any combination
{...message}                                  {...message}
                       {...message}
{...message}

Note that this is on the input to the parser and stream transform.

This is to allow flexibility with 3rd parties using the RPC API. I for example could connect to the rpc through netcat and write my own hand crafted JSONRPC messages that are separated by newlines.

addievo commented 1 year ago

Whitespace removal - possible change required in jsonstreamparser. See https://github.com/juanjoDiaz/streamparser-json/issues/35 - I suspect this requires going into the source code and modifying the tokeniser so it tokenises a whitespace token based on an array of candidate characters. Extend the separator option to have an array of potential strings instead of a single string like ['\n', '\t', '\r']. Alternatively, allow separator to instead be a regular expression. Remember that the tokeniser should be capable of greedily eating up all the separators into 1 token.

CMCDragonkai commented 1 year ago

Will depend on https://github.com/juanjoDiaz/streamparser-json/issues/35. Putting this into the to do to revisit.

addievo commented 1 year ago

https://github.com/juanjoDiaz/streamparser-json/pull/36

CMCDragonkai commented 1 year ago

juanjoDiaz/streamparser-json#36

@addievo what does that have to do with this problem?

juanjoDiaz commented 6 months ago

In case you don't follow the streamparser-json repo, it turns our that using random whitespaces as separators was always allowed by simply setting the separator to and empty string ''

CMCDragonkai commented 2 months ago

@amydevs can you verify if this is is already done as per the comment by @juanjoDiaz?