MatrixAI / js-quic

QUIC Networking for TypeScript & JavaScript
https://matrixai.github.io/js-quic/
Apache License 2.0

Create QUIC library that can be exposed to JS and uses the Node `dgram` module #1

Closed CMCDragonkai closed 1 year ago

CMCDragonkai commented 1 year ago

Specification

We need QUIC in order to simplify our networking stack in PK.

QUIC is a superior UDP layer that can make use of any UDP socket, and create multiplexed reliable streams. It is also capable of hole punching either just by attempting to send ping frames on the stream, or through the unreliable datagrams.

Our goal is to make use of a QUIC library (something that can be compiled for desktop and mobile operating systems), expose its functionality to JS, but have the JS runtime manage the actual sockets.

NodeJS can already manage the underlying UDP sockets, and relying on NodeJS also ensures that these sockets will mix well with the concurrency/parallelism used by the rest of the NodeJS system (due to libuv), thus avoiding a second IO system running in parallel.

Mobile runtimes may not have a dgram module readily available. In such cases, having an IO runtime supply the UDP sockets may be required. But it is likely there are already existing libraries that provide this, like https://github.com/tradle/react-native-udp.

The underlying QUIC library is expected to be agnostic to the socket runtime. It will give you the data that you need to write to the UDP socket, and it will take the data that comes off the UDP socket.

However it does have 2 major duties:

  1. The multiplexing and managing of streams.
  2. The TLS encryption/decryption side.

Again, if we want to stay cross-platform, we would not want to bind into Node.js's OpenSSL crypto. It would instead require that the library can take callbacks for the crypto routines to use. However I've found that this is generally not the case with most existing QUIC libraries. But let's see how we go with this.

Additional context

QUIC and NAPI-RS

Sub issues

Tasks

  1. [x] Experiment with Neon or Napi-rs
  2. [x] Experiment with quiche by cloudflare
  3. [x] Create bridge code plumbing UDP sockets and QUIC functions
  4. [x] Create self signed TLS certificate during development - https://github.com/MatrixAI/js-quic/issues/1#issuecomment-1356229150
  5. ~Extract out the TLS configuration so that it can be set via in-memory PEM variable and key variable. - 2 day~ - see #2
  6. ~Decide whether application protocols are necessary here, or abstract the quiche Config so the user can decide this (especially since this is not a HTTP3 library). - 0.5 day~ - see #13
  7. [x] Fix the timeout events and ensure that when a timeout occurs, that the connection gets cleaned up, and we are not inadvertently clearing the timeout due to null. Right now when a quiche client connects to the server, even after closing, the server side is keeping the connection alive. - 1 day
  8. [x] We need lifecycle events for QUICConnection and QUICStream and QUICServer and QUICSocket. This will allow users to hook into the destruction of the object, and perhaps remove their event listeners. These events must be post-facto events. - 0.5 day
  9. ~[ ] Test the QUICStream and change to BYOB style, so that there can be a byte buffer for it. Testing code should be able to use generator functions similar to our RPC handlers. - 1 day~ - see #5.
  10. [x] Complete the QUICClient with the shared socket QUICSocket. - 3 day
  11. ~Test the multiplexing/demultiplexing of the UDP socket with multiple QUICClient and a single QUICServer. - 1 day~ - See #14
  12. ~[ ] Test the error handling of the QUIC stream, and life cycle and destruction routines. - 1 day~ - #10
  13. ~Benchmark this system, by sending lots of data through. - 1 day~ - See #15
  14. ~Propagate the rinfo from the UDP datagram into the conn.recv() so that the streams (either during construction or otherwise) can have its rinfo updated. Perhaps we can just "set" the rinfo properties of the connection every time we do a conn.recv(). Or... we just mutate the conn parameters every time we receive a UDP packet.~ - See #16
  15. ~Ensure that when a user asks stream.connection they can acquire the remote information and also all the remote peer's certificate chain.~ - See #16
  16. ~[ ] Integrate dgram sending and recving for hole punching logic.~ - see #4
  17. ~[ ] Integrate the napi program into the scripts/prebuild.js so we can actually build the package in a similar way to other native packages we are using like js-db~ - see #7
CMCDragonkai commented 1 year ago

Ok I have managed to get quiche to produce a packet to send. I haven't actually sent the packet to the UDP socket yet, but I can see how quiche creates an initial packet for me to plumb into the UDP socket.

This involved several challenges.

The first is that self.0.send takes in a reference to a mutable buffer. Rust, like C and C++, prefers to avoid copying.

This buffer can actually be supplied from the outside, back in JS land. This might be a better idea, as it minimises the amount of copying going on.

Alternatively we can create the array in Rust, then return it as a Buffer. However, the send call only returns the number of bytes written.

Therefore the buffer is pre-allocated to be MAX_DATAGRAM_SIZE and it is subsequently written to.

This means the array later needs to be sliced according to the write size and then returned as a buffer.

This means we generally have 2 styles of calling the Rust functions. We can call them the way Rust expects to be called, by passing in a buffer as a parameter and getting back the write length. It turns out that socket.send actually easily supports taking a buffer, an offset and a length: https://nodejs.org/api/dgram.html#socketsendmsg-offset-length-port-address-callback. In terms of speed, it would make sense to re-use the same buffer over and over.

The other way is more idiomatic JS, which would mean that the function should return the buffer to be sent that is properly sliced up.

I think we should preserve the rust style, and then do any JS-specific idiomatic abstractions on the JS side.

At the same time, the send can also return an error which indicates that it is Done. If it is Done, then there are no more packets to send. Using a JS-style call, we could just return undefined in that situation. However I'm not entirely sure if this is actually possible using napi-rs. It seems you have to return a Rust value, and it automatically gets converted to JS via the napi-rs macros.

With Rust-style calls, Done is returned as an exception, which is kind of strange, but doable. However this is very different from how JS is normally used, and it means we also have to catch non-Done exceptions.
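As a sketch of the JS-idiomatic wrapper this implies, assuming the binding throws an exception whose message is "Done" (the `conn.send` shape here is a hypothetical stand-in for the napi-exported quiche binding, not the final API):

```typescript
// Hypothetical wrapper: convert the Rust-style call (which throws an
// exception whose message is "Done") into a JS-style call that
// returns `undefined` when there are no more packets to send.
function sendJS(
  conn: { send: (buf: Uint8Array) => number },
  buf: Uint8Array,
): number | undefined {
  try {
    return conn.send(buf); // Rust style: returns the write length
  } catch (e) {
    if (e instanceof Error && e.message === 'Done') {
      return undefined; // Done: no more packets to send
    }
    throw e; // non-Done errors must still propagate
  }
}

// Stub connection that yields one packet then is Done
const stubConn = {
  packets: [3],
  send(_buf: Uint8Array): number {
    const n = this.packets.shift();
    if (n === undefined) throw new Error('Done');
    return n;
  },
};
```

This keeps the Rust-style binding intact and layers the idiomatic behaviour purely on the JS side.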

Another thing to realise is that whenever you have a buffer of some fixed size and a write length indicating how much of that buffer was used, this will always eventually lead to a slicing operation.

When doing any slicing, the function's stack ends up with a pointer (reference) to the slice; it won't do any copying.

Rust does not have dynamically allocated arrays that sit on the stack, so you have to use vectors instead. Vectors are heap-allocated arrays that can be dynamically resized. String is likewise heap-allocated, relative to &str. At the same time, Rust manages the memory of vectors and knows when to deallocate them automatically.

So if we wanted to slice out from the array and also return it as a Buffer to JS, then it has to be turned into a vector first, and that vector can be dynamically allocated simply through the vec! macro.

Finally, another problem is sending structs back to JS as objects. This requires the usage of the #[napi(object)] macro. However this only works on structs where all the fields are public.

The send info contains information that is necessary to know how to send the packet data to the UDP socket.

It however ends up containing std::net::SocketAddr and std::time::Instant, both of which contain private fields. So these types cannot easily be automatically converted to JS objects. It could probably be done if I implemented some sort of interface for the SocketAddr, like https://docs.rs/napi/latest/napi/trait.NapiRaw.html and https://docs.rs/napi/latest/napi/trait.NapiValue.html.

Alternatively, I created my own Host struct that just contains a String and a u16. This results in a conversion from the SocketAddr to Host, and wrapping std::time::Instant in External::new() because it's supposed to be an opaque value compared at the Rust level.

This led to a few other issues:

  1. You cannot declare nested structs in Rust, you have to give each one a separate name.
  2. There does not appear to be a way to send Rust tuples to JS, napi-rs does not understand how to deal with tuples. I would have thought that they just become arrays on the JS side... but no.
  3. Rust enums cannot be represented either, they become objects on the JS side. This is because Rust enums are actually ADTs. They have a tag, and then the subsequent value/values.

So with all this in mind, I ended up creating another custom struct just so I can get the value out of conn.send.

This reveals a structure that always contains the initial packet. This is probably the packet that leads to connection negotiation. If you call it again right now, you get a JS error with Done as the message.

{
  out: <Buffer c3 00 00 00 01 10 98 f4 9d 84 85 7a dd 8e f6 ef fe 25 d7 84 b3 e1 14 d3 46 1a ab 5d d2 25 f2 61 c7 0d b0 d7 83 10 88 f0 b3 f8 eb 00 40 e8 58 ad 8e 1e ... 1150 more bytes>,
  info: {
    from: { hostname: '127.0.0.1', port: 55551 },
    to: { hostname: '127.0.0.1', port: 55552 },
    at: [External: 2d723f0]
  }
}

Now we can just plumb this into the dgram module to send.
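For illustration, a minimal sketch of that plumbing, assuming the Rust-style result shape shown above (a pre-allocated buffer, the write length, and a SendInfo-like object); `sendArgs` is a hypothetical helper, not part of the library:

```typescript
// Map the result of conn.send() onto the argument list of dgram's
// socket.send(msg, offset, length, port, address). Using offset and
// length lets us re-use the same pre-allocated buffer without slicing.
interface SendInfoLike {
  from: { hostname: string; port: number };
  to: { hostname: string; port: number };
}

function sendArgs(
  out: Buffer,
  writeLen: number,
  info: SendInfoLike,
): [Buffer, number, number, number, string] {
  return [out, 0, writeLen, info.to.port, info.to.hostname];
}

// Against a real socket this would be:
//   const socket = dgram.createSocket('udp4');
//   socket.send(...sendArgs(out, writeLen, info));
```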

CMCDragonkai commented 1 year ago

When should the send method be called? Apparently it should be triggered under a number of different conditions:

https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.send

And in fact it should be looped until it is Done.

The conditions are:

The application should call send() multiple times until Done is returned, indicating that there are no more packets to send. It is recommended that send() be called in the following cases:

  • When the application receives QUIC packets from the peer (that is, any time recv() is also called).
  • When the connection timer expires (that is, any time on_timeout() is also called).
  • When the application sends data to the peer (for example, any time stream_send() or stream_shutdown() are called).
  • When the application receives data from the peer (for example any time stream_recv() is called).

Once is_draining() returns true, it is no longer necessary to call send() and all calls will return Done.

CMCDragonkai commented 1 year ago

So I can imagine attaching callbacks to things like on_timeout to trigger "sending", and likewise any time we receive packets, send data to the peer, or receive data from the peer.

We could call this "socketFlushing": it will loop until Done. The socket flushing would be triggered every time any of the above things happen, which in turn will flush all the packets to the socket. Will it wait for the socket to have actually sent it by waiting on the callback? I'm not sure if that's necessary...

I guess it depends. At the QUIC stream level it should be considered independent; you don't wait for the socket to be flushed before returning to the user. But one could await a promise that resolves when they really want the entire socket flushed.
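The "socketFlushing" idea could be sketched like this, with the `Conn` shape a hypothetical stand-in for the napi binding (here `null` stands in for Done):

```typescript
interface Conn {
  // Returns [writeLen, destination] or null when Done
  send(buf: Uint8Array): [number, string] | null;
}

// Loop conn.send() until Done, shipping each packet to the socket.
// This would be triggered on recv, on_timeout, stream_send, etc.
function socketFlush(
  conn: Conn,
  buf: Uint8Array,
  sendTo: (data: Uint8Array, to: string) => void,
): number {
  let packets = 0;
  while (true) {
    const result = conn.send(buf);
    if (result === null) break; // Done: nothing left to send
    const [writeLen, to] = result;
    sendTo(buf.subarray(0, writeLen), to);
    packets++;
  }
  return packets;
}

// Stub connection that has 2 packets queued, then is Done
const queue: Array<[number, string]> = [[5, 'peerA'], [7, 'peerA']];
const stubConn: Conn = { send: () => queue.shift() ?? null };
const sentLengths: number[] = [];
const flushed = socketFlush(stubConn, new Uint8Array(1350), (data) => {
  sentLengths.push(data.length);
});
```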

CMCDragonkai commented 1 year ago

Ok I can see that there is no support for tuples at all here: https://github.com/napi-rs/napi-rs/tree/main/examples/napi. Also not for tuple-structs which I use in a newtype style pattern.

The only thing closest is to create an array like so:

#[napi]
fn to_js_obj(env: Env) -> napi::Result<JsObject> {
  let mut arr = env.create_array(0)?;
  arr.insert("a string")?;
  arr.insert(42)?;
  arr.coerce_to_object()
}

In the generated TS it looks like an object.

I reckon it would be dynamically destructurable as an array. Pretty useful here, to avoid having to define one-off structs.

CMCDragonkai commented 1 year ago

I've settled on just returning an Array. It beats having to redefine the output type all the time.

CMCDragonkai commented 1 year ago

Streams in quiche don't have their own data type. Instead they are identified by a stream ID. The connection struct provides methods like https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.stream_recv and you have to keep track of the stream IDs somehow.

So we would expect that upon creating an RPC call, it would call into the network to create a new stream for a connection.

It seems creating a new stream is basically performing this call: https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.stream_send

You select a stream ID like 0, and you just keep incrementing that number to ensure you have unique stream IDs. The client-side just increments the number, the server side just receives the streams.

Note that once we have a stream, the stream is fully duplex (not half-duplex), in-order and reliable.
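As a side note on "just keep incrementing": in QUIC the two low bits of a stream ID encode initiator and directionality, so client-initiated bidirectional stream IDs go 0, 4, 8, ... A small sketch (the allocator is illustrative, not the library's API):

```typescript
// Client-side stream ID allocation. Stepping by 4 keeps the two low
// bits fixed, staying within the client-initiated bidirectional class.
function makeStreamIdAllocator(first: number = 0): () => number {
  let next = first;
  return () => {
    const id = next;
    next += 4;
    return id;
  };
}

const nextStreamId = makeStreamIdAllocator();
```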

CMCDragonkai commented 1 year ago

quiche::connect and quiche::accept are both factory methods for creating the Connection class.

However in the case of the server side, you can't create a new connection until you've taken a specific packet from the UDP socket and parsed it first. You have to check the internal packet information to decide whether to accept a new connection, or to simply plumb it in as if you received the data for a given stream.

So the order of operations:

  1. Receive packet from UDP socket.
  2. Check if the packet is a new connection, if so, create the new connection, otherwise retrieve an existing connection.
  3. Call conn.recv() to plumb the packet data into some relevant stream.

This does mean we maintain a clients hashmap of ConnectionId => Connection.
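That order of operations might be sketched as follows; `Connection`, `accept`, and `handleMessage` here are hypothetical stand-ins for the real bindings and handler:

```typescript
interface Connection {
  recv(data: Uint8Array): void;
}

// Map of connection ID (hex-encoded) to connection object
const clients = new Map<string, Connection>();

function handleMessage(
  dcid: Buffer, // destination connection ID parsed from the packet header
  data: Uint8Array,
  accept: () => Connection, // runs quiche::accept after validation
): Connection {
  const key = dcid.toString('hex');
  let conn = clients.get(key);
  if (conn === undefined) {
    // 2. New connection: create it and start tracking it
    conn = accept();
    clients.set(key, conn);
  }
  // 3. Plumb the packet data into the connection
  conn.recv(data);
  return conn;
}
```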

CMCDragonkai commented 1 year ago

I created a function to randomly generate the connection ID.

It's a random 20 bytes.

Instead of using Rust's ring library to get the random data, I'm just calling into a passed-in JS function to acquire it. This unifies random data usage directly through the Node runtime (and reduces how much Rust code we are using here).

/// Creates random connection ID
///
/// Relies on the JS runtime to provide the randomness system
#[napi]
pub fn create_connection_id<T: Fn(Buffer) -> Result<()>>(
  get_random_values: T
) -> Result<External<quiche::ConnectionId<'static>>> {
  let scid = [0; quiche::MAX_CONN_ID_LEN].to_vec();
  let scid = Buffer::from(scid);
  get_random_values(scid.clone()).or_else(
    |err| Err(Error::from_reason(err.to_string()))
  )?;
  let scid = quiche::ConnectionId::from_vec(scid.to_vec());
  eprintln!("New connection with scid {:?}", scid);
  return Ok(External::new(scid));
}

However I realised that since it's just a 20-byte buffer, instead of passing an External<quiche::ConnectionId<'static>> around, we can just take a Buffer directly that we expect to be of length MAX_CONN_ID_LEN.

Meaning the connection ID is just made on the JS side, and then passed in.

On the Rust side, we can use quiche::ConnectionId::from_ref(&scid), which means there's no need to keep any memory on the Rust side. It's all just on the JS side.

This basically means we generally try to keep as much data on the JS side, and manage minimal amounts of memory on the Rust side.
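A sketch of generating the connection ID purely on the JS side, assuming quiche's MAX_CONN_ID_LEN of 20 (per the 20-byte figure above):

```typescript
import { randomBytes } from 'node:crypto';

// quiche's maximum connection ID length; the ID is just random bytes
const MAX_CONN_ID_LEN = 20;

// Generate the connection ID on the JS side; Rust can borrow it via
// quiche::ConnectionId::from_ref(&scid) without owning any memory
function createConnectionId(): Buffer {
  return randomBytes(MAX_CONN_ID_LEN);
}
```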

CMCDragonkai commented 1 year ago

The <'static> is used because it's sort of a dummy lifetime: ConnectionId is actually an enum of either an owned vector or a reference to memory that exists elsewhere, and the owned vector doesn't need a lifetime annotation.

CMCDragonkai commented 1 year ago

A better alternative to just Buffer would be our own ConnectionId newtype.

But I can't seem to create a good constructor for it that would take a callback. Problem in https://github.com/napi-rs/napi-rs/issues/1374.

CMCDragonkai commented 1 year ago

On the connecting side, there's an option for the server name.

If QUIC uses the standard TLS verification, we could imagine that this will not work for us. I'm not entirely sure.

We need to provide our custom TLS verification logic. I haven't found how to do this yet with QUIC but it should be possible.


Found it: https://github.com/cloudflare/quiche/issues/326#issuecomment-577281881

It is in fact possible.

The standard certificate verification must be disabled, and you have to fetch the certificate itself to verify.

CMCDragonkai commented 1 year ago

The stream methods are mostly done on the rust side.

I've found that the largest integer type I can make use of is i64, not u32. This gets closer to what JS numbers are capable of, although not all of i64 can be represented in JS; but those are edge values we're unlikely to reach.

So I've converted to those. I'm still using as instead of the "safer" variants try_into or try_from, because it's just more convenient to coerce the numbers.

I've also found how to write the FromNapiValue trait. Basically it ends up calling NAPI-related C++ functions that convert the JS value (known as NAPI value) into a C-like value, and that value can be stored in Rust.

So basically, whenever napi-rs can't automatically derive the marshalling code, we have to implement traits like FromNapiValue and ToNapiValue ourselves.

I think the napi macros also expose ways of writing out exactly what the generated types should be. I'm still not entirely sure we want to use the generated types from napi-rs, because we could do it like in js-db where we just write out our TypeScript types manually.

CMCDragonkai commented 1 year ago

On the JS side, I'm hoping to manage most of the state there. That means things like a map of connection IDs to connection objects, and maps of stream IDs to stream data.

That way the Rust part stays as pure as possible. We would only have 1 UDP socket for all of this, which simplifies the network management.

CMCDragonkai commented 1 year ago

It turns out in order to send dgrams, you need to have a connection object created. However if we aren't calling conn.send, we aren't sending the initial packet for handshaking purposes.

This means we can use conn.dgram_send for the hole punching packets first, before performing the conn.send which would initiate the handshaking process.

One thing I'm not sure about is how TLS interacts with the dgram send. For hole punching packets we don't care much about encrypting the data, but if we could encrypt the hole punching packets that would be nice. However that would at the very least require some handshake for the TLS to work... and if we are hole punching then we cannot have already done the TLS handshake.

CMCDragonkai commented 1 year ago

I've started modularising the rust code for consumption on the TS side.

Some things I've found out.

For enums, there are 2 ways to expose third party enums to the TS side.

The first solution is to do something like this:

use napi::bindgen_prelude::{
  Error,
  FromNapiValue,
  Result,
  Status,
  sys,
};

#[napi]
pub struct CongestionControlAlgorithm(quiche::CongestionControlAlgorithm);

impl FromNapiValue for CongestionControlAlgorithm {
  unsafe fn from_napi_value(env: sys::napi_env, value: sys::napi_value) -> Result<Self> {
    let value = i64::from_napi_value(env, value)?;
    match value {
      0 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::Reno)),
      1 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::CUBIC)),
      2 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::BBR)),
      _ => Err(Error::new(
        Status::InvalidArg,
        "Invalid congestion control algorithm value".to_string(),
      )),
    }
  }
}

The basic idea is the new type pattern, but then we have to implement the FromNapiValue trait, and marshal numbers from JS side to the equivalent Rust values.

The problem here is that the generated code results in class Shutdown {}, which is just an object type.

On the TS side, there's no indication of how you're supposed to construct this object.

We would most likely have to create a constructor or factory method implementation that enables constructing CongestionControlAlgorithm like CongestionControlAlgorithm.reno() or CongestionControlAlgorithm.bbr().

However this is kind of verbose for what we want.

Another alternative is to use the https://napi.rs/docs/concepts/types-overwrite on the method that uses CongestionControlAlgorithm.

#[napi(ts_arg_type = "0 | 1 | 2")] algo: CongestionControlAlgorithm,

However I found another way.

Redefine the macro:

/// Equivalent to quiche::CongestionControlAlgorithm
#[napi]
pub enum CongestionControlAlgorithm {
  Reno = 0,
  CUBIC = 1,
  BBR = 2,
}

Then this is converted to TS like:

/** Equivalent to quiche::CongestionControlAlgorithm */
export const enum CongestionControlAlgorithm {
  Reno = 0,
  CUBIC = 1,
  BBR = 2
}

Then when we use it, we use a match to map the values from our enum to the third-party enum.

  #[napi]
  pub fn set_cc_algorithm(&mut self, algo: CongestionControlAlgorithm) -> () {
    return self.0.set_cc_algorithm(match algo {
      CongestionControlAlgorithm::Reno => quiche::CongestionControlAlgorithm::Reno,
      CongestionControlAlgorithm::CUBIC => quiche::CongestionControlAlgorithm::CUBIC,
      CongestionControlAlgorithm::BBR => quiche::CongestionControlAlgorithm::BBR,
    });
  }

This application of the match can be a bit verbose... so if we want to do it multiple times, it can be abstracted into an internal function next to the enum.

Maybe we can provide a Into trait implementation? https://doc.rust-lang.org/rust-by-example/conversion/from_into.html That might help automate this process.

CMCDragonkai commented 1 year ago

By doing this:

/// Equivalent to quiche::CongestionControlAlgorithm
#[napi]
pub enum CongestionControlAlgorithm {
  Reno = 0,
  CUBIC = 1,
  BBR = 2,
}

impl From<CongestionControlAlgorithm> for quiche::CongestionControlAlgorithm {
  fn from(algo: CongestionControlAlgorithm) -> Self {
    match algo {
      CongestionControlAlgorithm::Reno => quiche::CongestionControlAlgorithm::Reno,
      CongestionControlAlgorithm::CUBIC => quiche::CongestionControlAlgorithm::CUBIC,
      CongestionControlAlgorithm::BBR => quiche::CongestionControlAlgorithm::BBR,
    }
  }
}

impl From<quiche::CongestionControlAlgorithm> for CongestionControlAlgorithm {
  fn from(item: quiche::CongestionControlAlgorithm) -> Self {
    match item {
      quiche::CongestionControlAlgorithm::Reno => CongestionControlAlgorithm::Reno,
      quiche::CongestionControlAlgorithm::CUBIC => CongestionControlAlgorithm::CUBIC,
      quiche::CongestionControlAlgorithm::BBR => CongestionControlAlgorithm::BBR,
    }
  }
}

We get bidirectional transformation of the enums and we can use algo.into() generically.

CMCDragonkai commented 1 year ago

The pacing at field gives us an Instant, which is not consistent across platforms. On non-macOS, non-iOS and non-Windows systems, it can be converted into a timespec.

#[cfg(not(any(target_os = "macos", target_os = "ios", target_os = "windows")))]
fn std_time_to_c(time: &std::time::Instant, out: &mut timespec) {
    unsafe {
        ptr::copy_nonoverlapping(time as *const _ as *const timespec, out, 1)
    }
}

#[cfg(any(target_os = "macos", target_os = "ios", target_os = "windows"))]
fn std_time_to_c(_time: &std::time::Instant, out: &mut timespec) {
    // TODO: implement Instant conversion for systems that don't use timespec.
    out.tv_sec = 0;
    out.tv_nsec = 0;
}

A timespec is basically a timestamp with a seconds component and a nanoseconds component.

There aren't any portable Rust functions right now that convert a std::time::Instant into something like milliseconds, where I could just turn it into a JS timestamp, that being a Date.

On the other hand, it is intended to be an opaque type, which is why I have it as an External; however this doesn't help us figure out how to pace the sending of the packet data.

One idea is to just do what the ffi.rs does and convert it to a timespec, then extract the underlying second and nanosecond data and convert to milliseconds or a Date object in JS.

However there's no information on how to do this for the macOS, iOS and Windows cases. So it's kind of useless atm.
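If the at Instant were surfaced as a timespec-like pair of seconds and nanoseconds, the JS-side conversion described above is simple arithmetic (the field names here are illustrative):

```typescript
interface TimeSpec {
  seconds: number;
  nanoseconds: number;
}

// Convert a timespec-like value into milliseconds, suitable for
// JS timers or constructing a Date
function timespecToMillis(t: TimeSpec): number {
  return t.seconds * 1000 + t.nanoseconds / 1e6;
}
```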

CMCDragonkai commented 1 year ago

I think pacing isn't necessary since it isn't used in the examples, so we can leave it out for now. But the generated types won't be usable as-is, since they refer to Instant.

I'm not entirely sure how generated types are supposed to work with External. Unless you're supposed to just create a dummy class type for that type that it is returning.

I'll likely end up writing our own TS file instead of relying on the generated code, but the generated code is useful for debugging.

CMCDragonkai commented 1 year ago

Ok so I'm starting to experiment with the rust library by calling it from the JS.

One thing I'm unclear about is whether this library, @matrixai/quic, will essentially expose the quiche library directly, or whether it is intended to wrap it in its own set of abstractions.

Do we also create the socket in this library, and thus expose a socket runtime, such as by creating a class Server and thus a class Client?

It seems so, because quiche would expose quiche::accept to create a server connection, and quiche::connect to create a client connection.

We know that in both cases, they can use the same socket address, thus using 1 socket address to act as both client and server.

If we do this, we might just create a class Connection. However we already have this as provided by the native library.

So if we just expose the QUICHE design, then the actual socket management will be in PK.

CMCDragonkai commented 1 year ago

Ok so then only in the tests would we be making use of the sockets.

CMCDragonkai commented 1 year ago

Ok, as I have been creating the QUIC server, I've found that there are many interactions with the socket during connection establishment. It makes sense that this library handles this complexity too: things like address validation and stateless retries all happen during connection establishment.

CMCDragonkai commented 1 year ago

Ok I'm at the point where it's time to deal with streams.

So this library does have to:

  1. Handle the interaction between the connection establishment and socket.
  2. Convert the QUIC streams into actual Web/DOM streams

So there's quite a bit of work still necessary on the JS side.

CMCDragonkai commented 1 year ago

The ALPNs may not be relevant unless we are working with HTTP3. It also seems we could just use our own special strings, since our application-level protocol isn't HTTP3.

CMCDragonkai commented 1 year ago

We could do something like polykey1 as the ALPN, where 1 would be the version of the application layer protocol, which would be RPC oriented...? I'm not sure if this is even needed.

CMCDragonkai commented 1 year ago

In the server.rs examples in quiche, we see a sort of structure like this:

loop {
  poll(events, timeout); // This is blocking on any event and timeout event
  loop {
    // If no events occurred, then it was a timeout that occurred
    if (!events) {
      for (const conn of conns) {
        // Calls the `on_timeout()` when any timeout event occurs
        conn.on_timeout();
      }
      break;
    }
    // Receive data from UDP socket
    packet = socketRecv();
    Header.parse(packet)
    // Do a bunch of validation and other stuff
    conn = Connection.accept();
    // Plumb into connection receive
    conn.recv();
    if (conn.inEarlyData() || conn.isEstablished()) {
      // handle writable streams (which calls streamSend())
      // handle readable streams (which calls streamRecv() then streamSend())
      // Note all of this is being buffered up in-memory, nothing gets sent out to the sockets yet
    }
  }
  // Now we process things to be sent out
  for (const conn of conns) {
    conn.send();
    socketTo();
  }
  // Now we GC all the conns that are dead
}

In JS we have to do things by registering handlers against event emitters, we cannot block the main thread. The above can be translated into several possible events.

  1. On socket message
  2. On connection timeout (any connection) (although we can be more specific)
  3. On stream data being written
  4. On connection being dead

For the first event, this is easy to do. The UDP dgram socket is already an event emitter; we simply register socket.on('message', handleMessage); and we can do the first part of the loop above, which is to parse the packet data, decide whether to do retries or other protocol actions, then eventually accept a new connection or plumb the data into an existing one.

The final plumbing is a bit strange in terms of processing the streams. Right now it iterates over streams that are writable and then over streams that are readable. In JS this would make sense as event propagation, since quiche doesn't expose a readable event for stream objects (which aren't even objects, they are IDs relative to quiche). So as we iterate over the streams, we would convert these into events on actual stream objects: on the JS side we create stream objects, and emit events on them as we iterate over readable and writable.

This only works for streamRecv(). For streamSend() we would have to make a "sink", meaning every time a user calls stream.write() it ends up calling streamSend(). However there's no callback here, it's all synchronous. So how does one know when their stream write has been flushed to the socket? There's no way to know, because quiche decides the multiplexing. Could that mean stream.write(data, () => { /* resolves only when buffered into quiche, does not mean data is flushed to the socket */ })?
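The stream.write() question could be prototyped like this, with `streamSend` a hypothetical stand-in for the napi binding; the callback can only ever mean "buffered into quiche", never "flushed to the socket":

```typescript
function streamWrite(
  streamSend: (streamId: number, data: Uint8Array, fin: boolean) => number,
  streamId: number,
  data: Uint8Array,
  onBuffered: () => void,
): void {
  // Synchronous call into quiche; the data now sits in quiche's buffer
  streamSend(streamId, data, false);
  // Resolve immediately: this does NOT mean the data hit the socket,
  // since quiche decides the multiplexing and flush timing
  onBuffered();
}
```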

CMCDragonkai commented 1 year ago

The above structure also implies that upon timeout and readable socket events, we also start to send packets out (the // Now we process things to be sent out part). That works because handling writable and readable streams ends up calling streamSend, which puts data in quiche's buffer that needs to be sent out.

However in JS this again doesn't seem idiomatic. Instead we should have an event emitter that emits an event upon any streamSend occurring, indicating there's data to be sent out. After all, any data we buffer into quiche should eventually be flushed.

Upon doing so, we would want to then perform the conn.send() which plumbs the data out and then socketTo().

Note how it iterates over all connections in an attempt to find if there's any data to send. It would only make sense to send data from connections that have stream sends.

Unless... it's possible that connections have data to send even if no stream has sent anything, for example keepalive packets. I'm not sure. But in that case, shouldn't that be done on the basis of timeouts?

All of this requires some prototyping.

CMCDragonkai commented 1 year ago

I started working on the web stream abstraction.

In the web streams there's no default DuplexStream to construct. It has to be done by yourself.

But this ends up being easy to do:

class MyStream<R = any, W = any> implements ReadableWritablePair<R, W> {
  public readable: ReadableStream<R>;
  public writable: WritableStream<W>;
  public constructor() {
    this.readable = new ReadableStream();
    this.writable = new WritableStream();
  }
}

In our case, we have created a QUICStream that just uses Uint8Array as the R and W.

Now for the readable side it has to be constructed while "wrapping" a push-based data source.

Quiche's QUIC streams are in fact push-based. This is because we start at the UDP socket, where we attach an event handler for the message event. This triggers quiche processing, which eventually leads us to iterating over connection.readable() stream IDs, which tells us all the QUIC stream IDs that have readable data. At this point we do connection.streamRecv and plumb this data into the ReadableStream<Uint8Array> above.

With the help of https://exploringjs.com/nodejs-shell-scripting/ch_web-streams.html#using-a-readablestream-to-wrap-a-push-source-or-a-pull-source

As a quick prototype, I setup a readable stream like this:

this.readable = new ReadableStream({
  type: 'bytes',
  start(controller) {
    events.addEventListener('data', (event) => {
      controller.enqueue(event.detail);
      // `desiredSize` can be null once the stream has errored
      if (controller.desiredSize != null && controller.desiredSize <= 0) {
        events.pause();
      }
    });
    events.addEventListener('end', () => {
      controller.close();
    }, { once: true });
    events.addEventListener('error', (event) => {
      controller.error(event.detail);
    }, { once: true });
  },
  pull() {
    events.resume();
  },
  cancel() {
    events.destroy();
  }
});

Note that the default queuing strategy is CountQueuingStrategy with a high water mark of 1. However, I was not able to use Node's new CountQueuingStrategy; I think there's a bug in NodeJS. The reason we stick with the default anyway is that flow control is already configured on quiche, and there's no need to do further complicated byte-length flow control on the web streams; instead chunks are passed one at a time to the reader of the readable stream.

Now, what is events? It is a custom EventTarget that will emit events. I like to think of it as a "derived event emitter". Remember that event emitters are not as flexible as subjects/observables. If we think of the socket as an event emitter/push-based data source, this implies that the push-flow is preserved through quiche processing, which then translates to event emitters for every stream that is readable.

Socket Events -> Stream 1 Events -> Web Stream 1
                 Stream 2 Events -> Web Stream 2
                 Stream 3 Events -> Web Stream 3
                 ...             -> ...

So 1 single socket event could translate to multiple stream events, and each stream event is pushed to each web stream.

I considered using 1 event emitter for all stream events, indexing each event by the stream ID, but it seemed more idiomatic to create a separate EventTarget object for each stream, after which the event names become easier to work with.

Now the UDP socket is an EventEmitter in NodeJS. But quiche's streams are not actually event emitters. So we have to create an events object for each stream to bridge this data flow.
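A sketch of that bridging, with simplified stand-ins for quiche's readable()/streamRecv (not the actual binding):

```typescript
// Custom event carrying a chunk, since plain Event has no payload field.
class DataEvent extends Event {
  constructor(public detail: Uint8Array) {
    super('data');
  }
}

// Hypothetical receive-side shape of the quiche connection binding.
interface RecvConn {
  readable(): Iterable<number>; // stream IDs with pending data
  streamRecv(streamId: number, buf: Uint8Array): [number, boolean]; // [read, fin]
}

class StreamDemuxer {
  protected streamEvents: Map<number, EventTarget> = new Map();

  constructor(protected conn: RecvConn) {}

  // Lazily create one EventTarget per stream ID.
  events(streamId: number): EventTarget {
    let target = this.streamEvents.get(streamId);
    if (target == null) {
      target = new EventTarget();
      this.streamEvents.set(streamId, target);
    }
    return target;
  }

  // Called after each socket message has been fed into conn.recv().
  demux(): void {
    const buf = new Uint8Array(65535);
    for (const streamId of this.conn.readable()) {
      const target = this.events(streamId);
      const [read, fin] = this.conn.streamRecv(streamId, buf);
      target.dispatchEvent(new DataEvent(buf.slice(0, read)));
      if (fin) target.dispatchEvent(new Event('end'));
    }
  }
}
```

Each per-stream EventTarget is then what the ReadableStream's start() hooks its 'data'/'end'/'error' listeners onto.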

If everything was an observable, this could probably be done in a higher order way... transforming socket events into an observable..., then passing it into a transformer/demuxer pipeline which then goes to multiple web streams. But I'm going to avoid using any rxjs magic here and just use plain old JS stuff to be portable.

The pause and resume are used for controlling backpressure on the quiche stream. If we are pausing, it means we stop reading from the quiche stream. As soon as we resume, we must immediately attempt to read from the quiche stream. If there's nothing to be read, that's fine; we just wait for the next event that will deliver some data.


Integrating this back into the QUIC structure above allows the message handler to create EventTarget objects and emit events on them for subsequent processing. So there will be some side effects going on in this code base... That is what our // handle readable streams is really doing.

CMCDragonkai commented 1 year ago

For the writable side, it's a bit simpler. We only need to provide the write handler which will plumb the data back into a QUIC stream with connection.streamSend.

this.writable = new WritableStream({
  write(chunk) {
    // Replace this with `streamSend`
    console.log(chunk);
  }
});

Finally the entire QUICStream object can be constructed in one of 2 ways:

  1. Upon receiving a connection - and the creation of new streams
  2. When starting a new stream to specific connection

Because 1. is a "push based dataflow", it would make sense to provide a higher-order handler that is called upon the creation of a new stream.

// Receiving a new stream
quic.on('stream', handleStream);

// Start a new stream to some connection
const stream = quic.streamStart();

Interestingly push-based dataflow ultimately look like event handlers.

CMCDragonkai commented 1 year ago

Thinking about the event architecture more https://github.com/MatrixAI/js-quic/issues/1#issuecomment-1343746712, there are more events than just the UDP socket event. We will need to attach an event handler for all of those possible events together. To do that we should be able to "create" a QUICServer instance that internally attaches event handlers to all those possible events.

CMCDragonkai commented 1 year ago

Important to remember that event emitters and event targets are fully synchronous. They are not async pub/sub, which is different from observables.
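A tiny demonstration of that synchrony:

```typescript
// EventTarget dispatch is fully synchronous: listeners run to completion
// before dispatchEvent returns.
const target = new EventTarget();
const order: string[] = [];
target.addEventListener('x', () => order.push('listener'));
order.push('before');
target.dispatchEvent(new Event('x'));
order.push('after');
// order is ['before', 'listener', 'after']
```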

CMCDragonkai commented 1 year ago

Discovered a bug with napi-rs: https://github.com/napi-rs/napi-rs/issues/1390. Need to make sure that where we have factory methods, we don't use the #[napi(constructor)] macro on top of the struct itself.

CMCDragonkai commented 1 year ago

I've made a start on the QUICServer class. It currently extends EventTarget so that it is similar to other NodeJS objects that emit events, but I didn't want to use EventEmitter so that it can be portable to other runtimes later.

This class will have a number of protected handlers, each which will be attached to all the possible events that may occur in the process of running QUIC.

Furthermore, binding to a host is worked out by using ip-num to determine whether the host is IPv4 or IPv6, defaulting to IPv4 if it is a hostname. The binding may still fail, so we need to race against a possible failure.

Then we get the resolved host and port as properties.

That's when we can start handling messages. In a way, that is what starts all the other possible events, because we have to have at least 1 connection before we can have any of the other connection events.

I imagine the handleMessage will translate to further events like stream events. If it is a new stream, a new QUICStream object will be created and then emitted as an event on the QUICServer. It is up to the user to register a handler for quicServer.addEventListener('stream', (stream) => { ... }).
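A sketch of how that emission could look; QUICStreamEvent and the streamId-only detail are illustrative placeholders for the eventual QUICStream object:

```typescript
// Custom event for newly seen streams; detail is a placeholder for QUICStream.
class QUICStreamEvent extends Event {
  constructor(public detail: { streamId: number }) {
    super('stream');
  }
}

class QUICServerSketch extends EventTarget {
  protected known: Set<number> = new Set();

  // Called from handleMessage for every readable stream ID.
  onStreamReadable(streamId: number): void {
    if (!this.known.has(streamId)) {
      // First time we see this ID: it's a new stream, so emit it.
      this.known.add(streamId);
      this.dispatchEvent(new QUICStreamEvent({ streamId }));
    }
  }
}
```

The user-facing side is then just `server.addEventListener('stream', handler)`, matching the usage above.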

I wonder what we would do if we have no handlers for the streams? We would still need to register them as state, and eventually garbage collect them? I'm not sure. Perhaps without a stream handler, all new streams should be immediately closed. Would need to check what exactly is supposed to happen with respect to stream lifecycles.

CMCDragonkai commented 1 year ago

I asked ChatGPT what happens if you create an HTTP server with http.createServer without a request listener...

If you create an HTTP server using http.createServer() without passing a request listener function, the server will not have any way to handle incoming HTTP requests. As a result, any HTTP requests sent to the server will simply be ignored and will hang until they time out.

Therefore we can have similar behaviour: if we don't have anything handling the streams, then they get created... but they should eventually time out.

However, due to https://github.com/cloudflare/quiche/issues/908 there is no such thing as "stream timeouts" in the QUIC spec, and config.set_max_idle_timeout applies only to the entire connection: https://datatracker.ietf.org/doc/html/draft-ietf-quic-transport-34#section-10.1

Therefore timing out streams is an application-level concern; in principle one could have an infinite stream.

But if we leave infinite streams around, along with the possibility of not having an on-stream handler, then we have a possible resource leak.

So either we:

  1. Setup per-stream timeouts just like how the HTTP server has timeouts for requests/responses
  2. Ensure that when we construct the QUIC server, that a stream handler is provided.

Note that there are still limits on how many streams can exist for any given connection, and if the connection times out, it ends up closing all the streams anyway.

In fact... technically the server side of an HTTP server doesn't actually time out the request anyway; it waits for the client to time it out. So yes, there can be dangling resources in NodeJS too. Which means we could just accumulate stream objects internally... although without any handler, I'm guessing any created QUICStream objects would just be GCed. So maybe this is not a problem.

CMCDragonkai commented 1 year ago

Furthermore the HTTP server has this https://nodejs.org/api/http.html#servertimeout which sets the timeout for all requests. The default is 0 which means there's actually no timeouts for requests. So we can replicate this model.
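A sketch of replicating that model per stream, where 0 disables the timer just like server.timeout; the names here are illustrative, not the final API:

```typescript
// Per-stream inactivity timer mirroring http.Server's `timeout` semantics.
class StreamTimeout {
  protected timer?: ReturnType<typeof setTimeout>;

  constructor(
    public timeout: number, // milliseconds; 0 means no timeout
    protected onTimeout: () => void,
  ) {
    this.refresh();
  }

  // Call on any stream activity to push the deadline back.
  refresh(): void {
    if (this.timer != null) clearTimeout(this.timer);
    if (this.timeout > 0) {
      this.timer = setTimeout(this.onTimeout, this.timeout);
    }
  }

  // Call when the stream closes normally.
  stop(): void {
    if (this.timer != null) clearTimeout(this.timer);
  }
}
```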

CMCDragonkai commented 1 year ago

This means abstraction-wise HTTP server's "request/response" transaction is equivalent to QUIC server's "stream".

CMCDragonkai commented 1 year ago

I'm using cargo run --bin quiche-client -- 'http://127.0.0.1:55555' in the quiche project for testing the QUIC server. I'm not sure how to run quiche's example code, but the app code tries to perform an HTTP3 request. From our QUIC server's perspective, any data received is just another QUIC stream. Would you be able to create an HTTP3 server? Sure, you could, but that's not what we care about; that would have to be done on top of the streams.

CMCDragonkai commented 1 year ago

I'm actually running the QUICServer now and I've gotten to the stateless retry. And I can see the protocol flow now:

┌────────┐                                ┌────────┐
│ Client │                                │ Server │
└───┬────┘                                └────┬───┘
    │          ┌───────────────────┐           │
  1.├─────────►│Initial            ├──────────►│
    │          │version: 3132799674│           │
    │          │token: []          │           │
    │          │dcid: A            │           │
    │          │scid: B            │           │
    │          └───────────────────┘           │
    │                                          │
    │          ┌───────────────────┐           │
  2.│◄─────────┤Version Negotiate  │◄──────────┤
    │          │version: 1         │           │
    │          └───────────────────┘           │
    │                                          │
    │          ┌───────────────────┐           │
    │          │Initial            │           │
    │          │version: 1         │           │
  3.├─────────►│token: []          ├──────────►│
    │          │dcid: A            │           │
    │          │scid: B            │           │
    │          └───────────────────┘           │
    │                                          │
    │          ┌───────────────────┐           │
    │          │Retry              │           │
  4.│◄─────────┤token: [A]         │◄──────────┤
    │          │scid: C            │           │
    │          └───────────────────┘           │
    │                                          │
    │          ┌───────────────────┐           │
    │          │Initial            │           │
    │          │version: 1         │           │
  5.├─────────►│token: [A]         ├──────────►│
    │          │dcid: C            │           │
    │          │scid: B            │           │
    │          └───────────────────┘           │
    │                                          │
    │                                          │

Note that after the stateless retry, it passes a signed token containing the original DCID, but it also passes the key-derived connection ID from the original DCID as the new_scid.

The client is supposed to use the new SCID and pass back that token.

One thing that is confusing to me is that the subsequent initial packet has the new SCID appear in header.dcid and not header.scid so I asked this here: https://github.com/cloudflare/quiche/issues/1370

In any case, after stage 5, the server now has both the original DCID odcid AND the originally key-derived connection ID.
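For illustration, a hedged sketch of how a stateless retry token can bind the original DCID to the client address so the server can recover odcid without keeping state; quiche's actual token construction differs:

```typescript
import * as crypto from 'crypto';

// Per-server secret used to sign retry tokens (regenerated on restart).
const KEY = crypto.randomBytes(32);

function mintToken(dcid: Buffer, addr: string): Buffer {
  // MAC over the client address and original DCID, then append the DCID.
  const body = Buffer.concat([Buffer.from(addr), Buffer.from('\0'), dcid]);
  const mac = crypto.createHmac('sha256', KEY).update(body).digest();
  return Buffer.concat([mac, dcid]);
}

function validateToken(token: Buffer, addr: string): Buffer | null {
  const mac = token.subarray(0, 32);
  const dcid = token.subarray(32);
  const body = Buffer.concat([Buffer.from(addr), Buffer.from('\0'), dcid]);
  const expected = crypto.createHmac('sha256', KEY).update(body).digest();
  // Constant-time comparison; reject tokens minted for another address.
  if (!crypto.timingSafeEqual(mac, expected)) return null;
  return Buffer.from(dcid); // the recovered odcid
}
```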

CMCDragonkai commented 1 year ago

It appears that QUIC server requires a TLS certificate and TLS key even if we are not verifying the peer. If we don't load the TLS certificate or TLS key, when we do conn.recv() we will get a TlsFail error.

[2022-12-17T11:48:19.531713597Z ERROR quiche_server] d9fb4a3b84d6d01156eaa167bd6028c3a527ba60 recv failed: TlsFail
CMCDragonkai commented 1 year ago

So even if you do config.verify_peer(false), you still need to provide a TLS certificate and key, because TLS is built into the protocol. It still seems kind of confusing, because a random certificate and key should then be sufficient if we are not verifying peers. Of course, the issue is you don't know whether the client will want to verify or not.

We can think of the TLS certificate and key as something that provides the server an "identity". It seems it must always be generated.

CMCDragonkai commented 1 year ago

And with respect to https://github.com/cloudflare/quiche/issues/1250. Let's assume then certs and keys are necessary.

Then we can use step to generate a self-signed certificate for development purposes. Let's generate it for localhost and 127.0.0.1.

step certificate create \
  localhost localhost.crt localhost.key \
  --profile self-signed \
  --subtle \
  --no-password \
  --insecure \
  --force \
  --san 127.0.0.1 \
  --san ::1 \
  --not-after 31536000s

The above generates a cert like:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 105144264462316474925316115295518905462 (0x4f1a0c7a47d87b139dac28b12e0cd476)
    Signature Algorithm: ECDSA-SHA256
        Issuer: CN=localhost
        Validity
            Not Before: Dec 17 12:11:00 2022 UTC
            Not After : Dec 17 12:11:00 2023 UTC
        Subject: CN=localhost
        Subject Public Key Info:
            Public Key Algorithm: ECDSA
                Public-Key: (256 bit)
                X:
                    6c:ca:34:fd:43:9e:a3:c4:96:07:3a:aa:d2:83:37:
                    bd:99:15:df:34:4f:ff:e9:be:e3:f8:4d:85:7c:73:
                    2a:91
                Y:
                    b1:5b:10:fc:22:83:a3:19:9a:80:a6:8d:36:0e:8d:
                    73:78:86:25:58:67:26:b2:60:f3:c4:97:aa:7b:11:
                    98:50
                Curve: P-256
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature
            X509v3 Extended Key Usage:
                Server Authentication, Client Authentication
            X509v3 Subject Key Identifier:
                81:56:BA:8B:E2:DF:0E:73:1E:5F:E4:82:61:43:4C:09:4F:D4:47:98
            X509v3 Subject Alternative Name:
                IP Address:127.0.0.1, IP Address:::1
    Signature Algorithm: ECDSA-SHA256
         30:45:02:21:00:d9:de:15:52:d2:8d:c1:a9:b2:a1:39:ff:c2:
         4b:8d:33:75:63:f5:02:d5:25:90:c7:ec:e6:09:41:76:70:69:
         b0:02:20:4e:87:92:67:8b:78:6f:e5:2d:9a:7b:b2:48:82:31:
         93:3d:34:fc:34:61:83:1c:2b:04:d8:3c:d1:51:f5:61:45
CMCDragonkai commented 1 year ago

In production usage, we would probably want to do this without having to load from a file. However, the QUIC config only provides ways to load from a file, or to take a boringssl context object. I'm not even sure if the latter would work here.

So that could mean PK may need to produce a temporary file and then call these functions to load it. This library could facilitate this: produce a temporary file somehow, load the certificate and key from there, then delete the temporary file.
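A sketch of that temporary-file workaround; the callback stands in for whatever load-from-PEM-file call the binding ends up exposing:

```typescript
import * as fs from 'fs/promises';
import * as os from 'os';
import * as path from 'path';

async function withCertFiles<T>(
  certPem: string,
  keyPem: string,
  f: (certPath: string, keyPath: string) => Promise<T>,
): Promise<T> {
  // Write the in-memory PEMs into a private temporary directory...
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'quic-'));
  const certPath = path.join(dir, 'cert.pem');
  const keyPath = path.join(dir, 'key.pem');
  try {
    await fs.writeFile(certPath, certPem, { mode: 0o600 });
    await fs.writeFile(keyPath, keyPem, { mode: 0o600 });
    // ...let quiche's config load them from disk...
    return await f(certPath, keyPath);
  } finally {
    // ...then remove the files regardless of success or failure.
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```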

CMCDragonkai commented 1 year ago

And we have a successful connection between the pre-existing quiche-client and our JS/Rust quicserver.ts prototype!

To test:

npm run napi-build
npm run ts-node -- ./quicserver.ts

Then in quiche project:

cargo run --bin quiche-client -- --no-verify 'http://127.0.0.1:55555'

This is using the certificates that have been put into ./tmp/localhost.crt and ./tmp/localhost.key.

Without the --no-verify flag, the client will fail immediately after the server first responds with conn.send; it probably checks the contents of the cert, decides it is invalid, and sends a Handshake packet which probably indicates an error or a closing of the connection.

Furthermore, I can see that after the first conn.recv() the TLS handshake hasn't even started, so the connection is in neither early data nor established, even after sending the 1200-byte response with conn.send:

  WE GOT A CONNECTION! Connection {}
  Connection Trace ID b263221dc542ab564ed3dddcf1c1d3adfb34a367
  RECV INFO {
    to: { addr: '127.0.0.1', port: 55555 },
    from: { addr: '127.0.0.1', port: 50194 }
  }
  CONNECTION processed this many bytes: 1200
  NOT In early data or is established!
  SENDING loop
    SENDING 1200 {
      from: { addr: '127.0.0.1', port: 55555 },
      to: { addr: '127.0.0.1', port: 50194 },
      at: [External: 37621b0]
    }
    SENDING 0 null
    NOTHING TO SEND

The next packet the client sends is still an Initial packet, and I'm guessing this is where config.enableEarlyData() comes in, with actual data.

In this case the connection is now in early data or established, and there is both a writable and readable stream:

  EXISTING CONNECTION
  Connection Trace ID b263221dc542ab564ed3dddcf1c1d3adfb34a367
  RECV INFO {
    to: { addr: '127.0.0.1', port: 55555 },
    from: { addr: '127.0.0.1', port: 50194 }
  }
  CONNECTION processed this many bytes: 1350
  WRITABLE stream 0
  READABLE stream 0
  SENDING loop
    SENDING 516 {
      from: { addr: '127.0.0.1', port: 55555 },
      to: { addr: '127.0.0.1', port: 50194 },
      at: [External: 38f6be0]
    }
    SENDING 0 null
    NOTHING TO SEND

At this point we can see that a stream can be writable... and the stream also has data to be read.

CMCDragonkai commented 1 year ago

At this point we need to hook the stream events into the DOM/web streams we were prototyping earlier. I believe we don't require any kind of partial data objects; it should all be managed as web stream backpressure.

CMCDragonkai commented 1 year ago

Right now the above all happens in one handler, handleMessage; as we mentioned earlier, getting a message is just 1 event. We should handle multiple events and emit further events that can trigger other handlers. So handling the message event could be split into other events depending on whether we have a connection or not. We can also factor the connection setup out into its own function...

CMCDragonkai commented 1 year ago

I'm trying to connect the QUIC stream to the WritableStream.

So in our QUICStream class we have both a readable and writable properties. The writable is a WritableStream. This stream ultimately connects the RPC writer to the QUIC stream.

RPC Writer -> WritableStream -> QUIC stream

It is the latter half that is confusing to implement.

In the Rust quiche server example, they manage this by storing a map of partial responses. The handle_writable function first looks up whether there's a partial response. If there is, it performs conn.stream_send. This may send fewer bytes than need to be sent; in that case, the remaining response is saved to be used again later.

Now it turns out the reason why you cannot just keep pushing bytes into conn.stream_send, even if you are flushing the QUIC connection to the UDP socket, is that the stream's flow-control window may be exhausted: the peer's receive buffer may still be full.

This means we have a few complications:

  1. The WritableStream has to have backpressure that slows down the RPC writer; the RPC writer can just wait for the ready promise before doing anything further.
  2. This backpressure mechanism has to be derived from any situation where conn.streamSend returns a sent length less than the input chunk's length.
  3. If the write() method of the WritableStream returns a promise, only when that promise resolves will the WritableStream proceed to continue writing.
  4. It seems that if write() can be made to resolve only when the chunk is fully sent out, that would be sufficient backpressure.
  5. The problem is that for a chunk to be fully sent out, you have to wait for the next "event" before attempting another conn.streamSend. This "event" could be another UDP socket message, or it could be a QUIC connection timeout event.
  6. Therefore, how does the write() promise wait for such a thing in an asynchronous way?
CMCDragonkai commented 1 year ago

When we iterate over conn.writable(), what it is telling us is that these streams have capacity to be written to. Ultimately the fact that a stream is writable is a push-flow event that originates from the UDP socket datagram being received.

This can be translated to an event being emitted like events.dispatchEvent(new QUICStreamWritableEvent({ detail: streamId })).

Could this event somehow result in the write() promise end up resolving?

For that to occur, we would need the write() promise to await on an event handler to be called, and where the event results in the remaining chunk to be completely sent to the QUIC stream.

CMCDragonkai commented 1 year ago

That may look like something like:

async function streamSendFully(chunk) {
  const sentLength = conn.streamSend(streamId, chunk);
  if (sentLength < chunk.length) {
    // Wait for the next writable event before sending the remainder
    await new Promise((resolve) => {
      streamEvents.addEventListener(
        'writable',
        resolve,
        { once: true }
      );
    });
    return await streamSendFully(
      chunk.subarray(sentLength)
    );
  }
}

new WritableStream({
  async write(chunk: Uint8Array) {
    await streamSendFully(chunk);
  }
});

Notice that the streamSendFully is essentially awaiting on the writable event before it attempts to finish the streamSendFully.

This writable event will need to come from an EventTarget for a specific stream.

In our QUICServer we are going to have a map of QUICConnection objects (each of which wraps the JS-bridged object from Rust), and these must then maintain QUICStream objects.

Since our QUICStream is going to wrap ReadableStream and WritableStream, we can potentially have this event emitted against the QUICStream object itself.

Thus:

async function streamSendFully(chunk) {
  const sentLength = conn.streamSend(streamId, chunk);
  if (sentLength < chunk.length) {
    await new Promise((resolve) => {
      this.addEventListener(
        'writable',
        () => {
          resolve();
        },
        { once: true }
      );
    });
    return await streamSendFully(
      chunk.subarray(sentLength)
    );
  }
}

Some things above still need consideration:

  1. How does stream closing affect this?
  2. How do we deal with fin? Or 0-length chunks?
  3. Should there be a situation where the internal promise can be rejected?
  4. Do you need to do return await streamSendFully(...) at the end or can we just do return streamSendFully(...)?

Also note that the above snippet was iterated from a recursive callback function.


The result here is that we have a promise that only resolves when the chunk is fully sent onto the stream, which itself depends on a writable event being emitted to the QUICStream object, which can only occur when the conn.writable() iteration occurs, deriving this "event" from receiving a UDP socket message or a QUIC connection timeout.

CMCDragonkai commented 1 year ago

The mixing of promises and event handlers is always going to be a bit complicated, because promises are a pull-oriented API while event handlers are a push-oriented API.

And in this case, it looks like we have situations where a write() promise pulled to be resolved can only be resolved while waiting for push-events... and may not be resolved if those push-events never occur. Failure scenarios could mean the stream gets closed or there is an error.

We could also use a lock for plugging and unplugging, similar to what we did using js-async-locks in other areas of the code. Using a lock in this way is basically very similar to monitors: https://en.wikipedia.org/wiki/Monitor_(synchronization).

The write() ends up being blocked until the message handler triggers an event indicating the stream is ready to be written to again. We also have Barrier and Semaphore available to be used.
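A minimal plug/unplug latch in plain promises, monitor-style; this is illustrative, not the js-async-locks API:

```typescript
// A "plug": waiters block while plugged, and all release on unplug.
class Plug {
  protected resolveP?: () => void;
  protected p: Promise<void> = Promise.resolve();

  // Block subsequent waiters until unplug() is called.
  plug(): void {
    if (this.resolveP != null) return; // already plugged
    this.p = new Promise<void>((resolve) => {
      this.resolveP = resolve;
    });
  }

  // Release all current waiters.
  unplug(): void {
    if (this.resolveP != null) {
      this.resolveP();
      this.resolveP = undefined;
    }
  }

  async wait(): Promise<void> {
    await this.p;
  }
}
```

In this scheme write() would plug and wait after a short streamSend, and the message handler would unplug when the stream shows up in conn.writable() again.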

CMCDragonkai commented 1 year ago

Note that in order to run the quiche/apps:

cd quiche/apps
cargo run --bin quiche-server
cargo run --bin quiche-client -- --no-verify https://127.0.0.1:4433
CMCDragonkai commented 1 year ago

By extending QUICStream with EventTarget, we get something like this:

class QUICStream extends EventTarget implements ReadableWritablePair<Uint8Array, Uint8Array> { ... }

QUIC streams in quiche don't have any kind of object; they are just identified by the streamId.

So this object is what will be exposed to the end user when they get new streams to work with.

We can see that this is a duplex stream, so web-stream functions can work on it.

In its constructor, we construct the this.readable = new ReadableStream and this.writable = new WritableStream.

Now as per the above discussion, I'm prototyping the streamSendFully method, an asynchronous function that only resolves when all the chunks are fully sent. Right now there's no specialised error handling.

It does this by checking whether the sent length is less than the input data's length; if so, it waits for the writable event to be emitted before proceeding. This is our "plug"/lock.

The idea is that the QUIC connection would emit a writable event to the QUIC stream to unplug it.

However, using events here might leak too much implementation detail. An alternative might be to use a function and a lock from async-locks, but this should be sufficient for prototyping.

Actually we should revisit the plug/unplug pattern: https://github.com/MatrixAI/Polykey/blob/91287ab57f9cc34c337f1dc9f7523ea122aa96e5/src/nodes/Queue.ts#L76-L99

Furthermore the writable stream: