Closed CMCDragonkai closed 1 year ago
Ok I have managed to get quiche to send a packet. Actually I haven't sent the packet to the UDP socket yet, but I can see how quiche creates an initial packet for me to plumb into the UDP socket.
This involved several challenges.
The first is that `self.0.send` takes in a reference to a mutable buffer. Rust prefers to avoid copying, similar to C and C++.
This buffer can actually be allocated outside, in JS land. This might be a better idea, as it minimises the amount of copying going on.
Alternatively we can create the array in Rust, then return it as a Buffer. However the `send` call returns the number of bytes written.
Therefore the buffer is pre-allocated to `MAX_DATAGRAM_SIZE` and is subsequently written to. This means the array later needs to be sliced according to the write size and then returned as a buffer.
This means we generally have 2 styles of calling the Rust functions. We can call them the way Rust expects to be called, by passing in a buffer as a parameter and returning the write length. It turns out that `socket.send` actually easily supports taking a buffer, offset and length: https://nodejs.org/api/dgram.html#socketsendmsg-offset-length-port-address-callback. In terms of speed, it would make sense to re-use the same buffer over and over.
The other way is more idiomatic JS, which would mean that the function should return the buffer to be sent that is properly sliced up.
I think we should preserve the rust style, and then do any JS-specific idiomatic abstractions on the JS side.
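To make the two styles concrete, here is a TypeScript sketch. The `rustSend` function and `MAX_DATAGRAM_SIZE` constant are hypothetical stand-ins for the native binding, not the actual API:

```typescript
// Two styles for getting packet data out of Rust: "Rust style" writes into a
// caller-provided buffer and returns the write length; "JS style" returns a
// freshly sliced buffer of the right size.
const MAX_DATAGRAM_SIZE = 1350;

// Stand-in for the native `send` writing some bytes into `buf`.
function rustSend(buf: Uint8Array): number {
  const bytes = [0xc3, 0x00, 0x00, 0x00, 0x01];
  buf.set(bytes);
  return bytes.length;
}

// Rust style: re-use one buffer, then pass (buffer, offset, length) onward,
// e.g. `socket.send(scratch, 0, writeLen, port, address)`.
const scratch = new Uint8Array(MAX_DATAGRAM_SIZE);
const writeLen = rustSend(scratch);

// JS style: return a correctly sized buffer (a copy via `slice`).
function jsSend(): Uint8Array {
  const buf = new Uint8Array(MAX_DATAGRAM_SIZE);
  const len = rustSend(buf);
  return buf.slice(0, len);
}
const packet = jsSend();
```

The Rust style avoids allocation per packet; the JS style is more convenient at call sites.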
At the same time, `send` can also return an error which indicates that it is `Done`. If it is done, then there are no more packets to send. Using a JS-style call, we can just return `undefined` in that situation. However I'm not entirely sure this is actually possible using napi-rs. It seems you have to return a Rust value, and it automatically gets converted to JS via the napi-rs macros.
With Rust-style calls, the `Done` is returned as an exception, which is kind of strange, but also doable. But this is very much different from how JS should be used, and it means we also have to catch non-done exceptions.
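A sketch of how the JS side could paper over the `Done`-as-exception convention. The `nativeSend` function here is a mock standing in for the real binding, which is assumed to throw an error whose message is `Done`:

```typescript
// Stand-in for the Rust-style binding: throws `Done` when there are no
// more packets, otherwise returns the next packet.
function nativeSend(queue: Uint8Array[]): Uint8Array {
  const next = queue.shift();
  if (next === undefined) throw new Error('Done');
  return next;
}

// JS-idiomatic wrapper: `undefined` means there is nothing left to send.
function send(queue: Uint8Array[]): Uint8Array | undefined {
  try {
    return nativeSend(queue);
  } catch (e) {
    // Only swallow the `Done` sentinel; re-throw real errors.
    if (e instanceof Error && e.message === 'Done') return undefined;
    throw e;
  }
}

const queue = [new Uint8Array([1, 2, 3])];
const first = send(queue);  // the packet
const second = send(queue); // undefined: no more packets
```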
Another thing to realise is that whenever you have a buffer of some fixed size, and a write length indicating how much of that buffer was used, this will always eventually lead to a slicing operation.
When doing any slicing, the function's stack ends up with a pointer (reference) to the slice; it won't do any copying.
Rust does not have dynamically allocated arrays that sit on the stack, so you have to use vectors instead. Vectors are heap-allocated arrays that can be dynamically resized. `String` is similarly heap-allocated relative to `&str`. At the same time, Rust manages the memory of vectors and knows when to deallocate them automatically.
So if we wanted to slice out from the array and also return it as a `Buffer` to JS, then it has to be turned into a vector first, and that vector will be dynamically allocated simply through the `vec!` macro.
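The view-versus-copy distinction exists on the JS side too: on a `Uint8Array`, `subarray` returns a view over the same memory (like a Rust slice), while `slice` copies the bytes:

```typescript
// A buffer with a write length, as in the quiche send case.
const full = new Uint8Array(10);
const writeLen = 4;

const view = full.subarray(0, writeLen); // no copy, shares memory
const copy = full.slice(0, writeLen);    // copies the bytes

// Mutating the original is visible through the view, not the copy.
full[0] = 42;
```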
Finally, another problem is sending structs back to JS as objects. This requires the usage of the `#[napi(object)]` macro. However this only works on structs whose fields are all public.
The send info contains information that is necessary to know how to send the packet data to the UDP socket.
It however ends up containing `std::net::SocketAddr` and `std::time::Instant`, both of which contain private fields. So these types cannot be automatically converted to JS objects. It could probably be done if I implemented some sort of interface for the `SocketAddr`, like https://docs.rs/napi/latest/napi/trait.NapiRaw.html and https://docs.rs/napi/latest/napi/trait.NapiValue.html.
Alternatively, I created my own `Host` struct that just contains a `String` and a `u16`. This results in a conversion from the `SocketAddr` to `Host`, and wrapping `std::time::Instant` in `External::new()` because it's supposed to be opaquely compared at the Rust level.
This led to a few other issues:
So with all this in mind, I ended up creating another custom struct just so I can get the value out of `conn.send`.
This reveals a structure that always contains the initial packet. This is probably the packet that leads to connection negotiation. If you call it again right now, you get a JS error with `Done` as the message.
{
out: <Buffer c3 00 00 00 01 10 98 f4 9d 84 85 7a dd 8e f6 ef fe 25 d7 84 b3 e1 14 d3 46 1a ab 5d d2 25 f2 61 c7 0d b0 d7 83 10 88 f0 b3 f8 eb 00 40 e8 58 ad 8e 1e ... 1150 more bytes>,
info: {
from: { hostname: '127.0.0.1', port: 55551 },
to: { hostname: '127.0.0.1', port: 55552 },
at: [External: 2d723f0]
}
}
Now we can just plumb this into the dgram module to send.
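A sketch of the shapes involved, mirroring the logged structure above. The `sendArgs` helper is a hypothetical convenience producing the argument list for `socket.send(msg, offset, length, port, address)`:

```typescript
// Shapes mirroring the logged `conn.send` result.
interface Host {
  hostname: string;
  port: number;
}

interface SendInfo {
  from: Host;
  to: Host;
  at: unknown; // External<Instant>, opaque on the JS side
}

// Pure helper extracting the dgram `send` arguments from the send result.
function sendArgs(
  out: Uint8Array,
  info: SendInfo,
): [Uint8Array, number, number, number, string] {
  return [out, 0, out.length, info.to.port, info.to.hostname];
}

const info: SendInfo = {
  from: { hostname: '127.0.0.1', port: 55551 },
  to: { hostname: '127.0.0.1', port: 55552 },
  at: null,
};
const args = sendArgs(new Uint8Array(1200), info);
```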
When should the `send` method be called? Apparently it should be triggered under a number of different conditions:
https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.send
And it should be looped in fact, until it is done.
The conditions are:
The application should call send() multiple times until Done is returned, indicating that there are no more packets to send. It is recommended that send() be called in the following cases:
- When the application receives QUIC packets from the peer (that is, any time recv() is also called).
- When the connection timer expires (that is, any time on_timeout() is also called).
- When the application sends data to the peer (for example, any time stream_send() or stream_shutdown() are called).
- When the application receives data from the peer (for example any time stream_recv() is called).
Once is_draining() returns true, it is no longer necessary to call send() and all calls will return Done.
So I can imagine attaching callbacks to things like `on_timeout` to trigger sending, and any time we receive packets, send data to the peer, or receive data from the peer.
We could call this "socketFlushing"; it will loop until done. The socket flushing would be triggered every time any of the above things happen, which in turn will flush all the packets to the socket. Will it wait for the socket to have actually sent the data by waiting on the callback? I'm not sure if that's necessary.
I guess it depends. At the QUIC stream level, it should be considered independent: you don't wait for the socket to be flushed before returning to the user. But one could await a promise that resolves when they really want the entire socket flushed.
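A minimal sketch of such a "socketFlushing" loop, with a mock connection standing in for the native one (`undefined` models the `Done` condition here):

```typescript
// Minimal connection shape: `send` returns the next packet, or undefined
// when there is nothing left to flush (the `Done` condition).
interface Conn {
  send(): Uint8Array | undefined;
}

// Loop `conn.send` until Done, handing each packet to the socket.
function socketFlush(conn: Conn, sendToSocket: (p: Uint8Array) => void): number {
  let count = 0;
  while (true) {
    const packet = conn.send();
    if (packet === undefined) break; // Done: nothing more to send
    sendToSocket(packet);
    count++;
  }
  return count;
}

// Mock connection with 3 pending packets.
const pending = [1, 2, 3].map((n) => new Uint8Array([n]));
const conn: Conn = { send: () => pending.shift() };
const sent: Uint8Array[] = [];
const flushed = socketFlush(conn, (p) => sent.push(p));
```

This would be the body triggered by each of the conditions listed above (recv, timeout, stream send/recv).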
Ok, I can see that there is no support for tuples at all here: https://github.com/napi-rs/napi-rs/tree/main/examples/napi. Nor for tuple structs, which I use in a newtype-style pattern.
The closest thing is to create an array like so:
#[napi]
fn to_js_obj(env: Env) -> napi::Result<JsObject> {
let mut arr = env.create_array(0)?;
arr.insert("a string")?;
arr.insert(42)?;
arr.coerce_to_object()
}
In the generated TS it looks like an `object`.
I reckon it would be dynamically deconstructable as an array. Pretty useful here, to avoid having to define one-off structs.
I've settled on just returning an `Array`. It beats having to redefine the output type all the time.
Streams in quiche don't have their own data type. Instead they are identified by a stream ID. The connection struct provides the methods (https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.stream_recv) and you have to keep track of the stream IDs somehow.
So we would expect that upon creating an RPC call, it would call into the network to create a new stream for a connection.
It seems creating a new stream is basically performing this call: https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.stream_send
You select a stream ID like 0, and you just keep incrementing that number to ensure you have unique stream IDs. The client-side just increments the number, the server side just receives the streams.
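One detail worth noting: per RFC 9000, the two low bits of a stream ID encode the initiator and directionality, so client-initiated bidirectional stream IDs go 0, 4, 8, ... — "incrementing" the number means stepping by 4. A sketch of a client-side allocator:

```typescript
// Client-side stream ID allocation. The two low bits of a QUIC stream ID
// encode initiator (0x01) and directionality (0x02), so client-initiated
// bidirectional streams are 0, 4, 8, ...
class StreamIdAllocator {
  private next = 0; // client-initiated bidirectional

  allocate(): number {
    const id = this.next;
    this.next += 4;
    return id;
  }
}

const ids = new StreamIdAllocator();
const a = ids.allocate(); // 0
const b = ids.allocate(); // 4
const c = ids.allocate(); // 8
```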
Note that once we have a stream, the stream is fully duplex (not half-duplex), in-order and reliable.
`quiche::connect` and `quiche::accept` should both be factory methods for creating the `Connection` class.
However in the case of the server side, you can't create a new connection until you've taken a specific packet from the UDP socket and parsed it first. You have to check the internal packet information to decide whether to accept a new connection, or to simply plumb it in as if you received the data for a given stream.
So the order of operations: `conn.recv()` to plumb the packet data into the relevant stream. This does mean we maintain a `clients` hashmap of `ConnectionId => Connection`.
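A sketch of that `clients` map on the JS side. Since `Buffer` keys compare by identity in a `Map`, the connection ID bytes would be keyed by their hex encoding; this keying choice is an assumption for illustration:

```typescript
// Stand-in for the native Connection object.
interface Connection {}

const clients = new Map<string, Connection>();

// Buffers compare by identity in a Map, so key by the hex encoding of the
// connection ID bytes instead.
function connectionIdKey(scid: Uint8Array): string {
  return Buffer.from(scid).toString('hex');
}

const scid = new Uint8Array([0xde, 0xad, 0xbe, 0xef]);
const conn: Connection = {};
clients.set(connectionIdKey(scid), conn);

// A different Uint8Array with the same bytes still finds the connection.
const found = clients.get(connectionIdKey(new Uint8Array([0xde, 0xad, 0xbe, 0xef])));
```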
I created a function to randomly generate the connection ID. It's a random 20 bytes.
Instead of using Rust's ring library to get the random data, I'm just calling into a passed-in JS function to acquire it. This unifies the random data usage directly from the Node runtime (and reduces how much Rust code we are using here).
/// Creates random connection ID
///
/// Relies on the JS runtime to provide the randomness system
#[napi]
pub fn create_connection_id<T: Fn(Buffer) -> Result<()>>(
  get_random_values: T
) -> Result<External<quiche::ConnectionId<'static>>> {
  // Zeroed buffer that the JS callback fills with random bytes
  let scid = [0; quiche::MAX_CONN_ID_LEN].to_vec();
  let scid = Buffer::from(scid);
  get_random_values(scid.clone()).map_err(
    |err| Error::from_reason(err.to_string())
  )?;
  let scid = quiche::ConnectionId::from_vec(scid.to_vec());
  eprintln!("New connection with scid {:?}", scid);
  Ok(External::new(scid))
}
However I realised that since it's just a 20-byte buffer, instead of passing an `External<quiche::ConnectionId<'static>>` around, we can just take a `Buffer` directly that we expect to be of length `MAX_CONN_ID_LEN`.
Meaning the connection ID is just made on the JS side, and then passed in.
On the Rust side, we can use `quiche::ConnectionId::from_ref(&scid)`, which means there's no need to keep any memory on the Rust side. It's all just on the JS side.
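A sketch of creating the connection ID purely on the JS side, using Node's webcrypto for the randomness (`MAX_CONN_ID_LEN` is 20 in quiche):

```typescript
import { webcrypto } from 'crypto';

// Matches quiche::MAX_CONN_ID_LEN
const MAX_CONN_ID_LEN = 20;

// The connection ID is just 20 random bytes, made in JS and handed to Rust,
// which borrows it via quiche::ConnectionId::from_ref.
function createConnectionId(): Buffer {
  const scid = Buffer.allocUnsafe(MAX_CONN_ID_LEN);
  webcrypto.getRandomValues(scid);
  return scid;
}

const scid = createConnectionId();
```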
This basically means we generally try to keep as much data on the JS side, and manage minimal amounts of memory on the Rust side.
The `<'static>` is used because it's sort of a dummy lifetime: the `ConnectionId` is actually an enum of either an owned vector or a reference to some memory that exists elsewhere, and the owned vector doesn't have a lifetime annotation.
A better alternative to a plain `Buffer` would be our own `ConnectionId` newtype. But I can't seem to create a good constructor for it that would take a callback. Problem in https://github.com/napi-rs/napi-rs/issues/1374.
On the connecting side, there's an option for the server name.
If QUIC uses the standard TLS verification, we could imagine that this will not work for us. I'm not entirely sure.
We need to provide our custom TLS verification logic. I haven't found how to do this yet with QUIC but it should be possible.
Found it: https://github.com/cloudflare/quiche/issues/326#issuecomment-577281881
It is in fact possible.
The standard certificate verification must be disabled, and you have to fetch the certificate itself to verify.
The stream methods are mostly done on the Rust side.
I've found that the largest integer type I can make use of is `i64`, not `u32`. This gets closer to what JS numbers are capable of, although not all of `i64` can be represented in JS. But those are the edges, and we're unlikely to reach them.
So I've converted the types to `i64`. I'm still using `as` instead of the "safer" variants `try_into` or `try_from` because it's just more convenient to coerce the numbers.
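For reference, the edges in question: JS numbers are IEEE 754 doubles, exact only up to `Number.MAX_SAFE_INTEGER` (2^53 - 1), while `i64` goes up to 2^63 - 1. Beyond the safe range, distinct integers collapse:

```typescript
// JS numbers are exact integers only up to 2^53 - 1.
const safeMax = Number.MAX_SAFE_INTEGER; // 2^53 - 1
const collapses = 2 ** 53 === 2 ** 53 + 1; // true: precision loss past 2^53

// BigInt would keep full i64 precision if those edges ever mattered.
const i64Max = 2n ** 63n - 1n;
```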
I've also found how to write the `FromNapiValue` trait. Basically it ends up calling NAPI-related C functions that convert the JS value (known as a NAPI value) into a C-like value, and that value can be stored in Rust.
So basically, whenever napi-rs can't automatically derive the marshalling code, we have to implement traits like `FromNapiValue` and `ToNapiValue` ourselves.
I think the napi macros also expose ways of writing out exactly what the generated types should be. I'm still not entirely sure we want to use the generated types from napi-rs, because we could do it like in js-db where we just write out our TypeScript types manually.
On the JS side, I'm hoping to manage most of the state there. That means things like a map of connection IDs to connection objects, and maps of stream IDs to stream data.
That way the Rust part stays as pure as possible. We would only have 1 UDP socket for all of this, which simplifies the network management.
It turns out that in order to send dgrams, you need to have a connection object created. However if we aren't calling `conn.send`, we aren't sending the initial packet for handshaking purposes.
This means we can use `conn.dgram_send` for the hole-punching packets first, before performing the `conn.send` which would initiate the handshaking process.
One thing I'm not sure about is how the TLS interacts with the dgram send. For hole-punching packets we don't care about encrypting the data that much, but if we could encrypt the hole-punching packets that would be nice. However this would at the very least require some handshake for the TLS to work... and if we are hole punching then we cannot have already done the TLS handshake.
I've started modularising the rust code for consumption on the TS side.
Some things I've found out.
For enums, there are 2 ways to expose third-party enums to the TS side.
The first solution is to do something like this:
use napi::bindgen_prelude::{
    FromNapiValue,
    sys
};
use napi::{Error, Result, Status};

#[napi]
pub struct CongestionControlAlgorithm(quiche::CongestionControlAlgorithm);

impl FromNapiValue for CongestionControlAlgorithm {
    unsafe fn from_napi_value(env: sys::napi_env, value: sys::napi_value) -> Result<Self> {
        let value = i64::from_napi_value(env, value)?;
        match value {
            0 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::Reno)),
            1 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::CUBIC)),
            2 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::BBR)),
            _ => Err(Error::new(
                Status::InvalidArg,
                "Invalid congestion control algorithm value".to_string(),
            )),
        }
    }
}
The basic idea is the newtype pattern, but then we have to implement the `FromNapiValue` trait and marshal numbers from the JS side to the equivalent Rust values.
The problem here is that the generated code results in `class Shutdown {}`, which is just an object type. On the TS side, there's no indication of how you're supposed to construct this object.
We would most likely have to create a constructor or factory method implementation that enables constructing `CongestionControlAlgorithm` like `CongestionControlAlgorithm.reno()` or `CongestionControlAlgorithm.bbr()`.
However this is kind of verbose for what we want.
Another alternative is to use https://napi.rs/docs/concepts/types-overwrite on the method that uses `CongestionControlAlgorithm`:
#[napi(ts_arg_type = "0 | 1 | 2")] algo: CongestionControlAlgorithm,
However I found another way: redefine the enum under the `#[napi]` macro:
/// Equivalent to quiche::CongestionControlAlgorithm
#[napi]
pub enum CongestionControlAlgorithm {
Reno = 0,
CUBIC = 1,
BBR = 2,
}
Then this is converted to TS like:
/** Equivalent to quiche::CongestionControlAlgorithm */
export const enum CongestionControlAlgorithm {
Reno = 0,
CUBIC = 1,
BBR = 2
}
Then when we use it, we use `match` to map values from our enum to the third-party enum:
#[napi]
pub fn set_cc_algorithm(&mut self, algo: CongestionControlAlgorithm) {
    self.0.set_cc_algorithm(match algo {
        CongestionControlAlgorithm::Reno => quiche::CongestionControlAlgorithm::Reno,
        CongestionControlAlgorithm::CUBIC => quiche::CongestionControlAlgorithm::CUBIC,
        CongestionControlAlgorithm::BBR => quiche::CongestionControlAlgorithm::BBR,
    });
}
This application of `match` can be a bit verbose... so if we want to do this multiple times, it can be abstracted into an internal function next to the enum.
Maybe we can provide an `Into` trait implementation? https://doc.rust-lang.org/rust-by-example/conversion/from_into.html That might help automate this process.
By doing this:
/// Equivalent to quiche::CongestionControlAlgorithm
#[napi]
pub enum CongestionControlAlgorithm {
Reno = 0,
CUBIC = 1,
BBR = 2,
}
impl From<CongestionControlAlgorithm> for quiche::CongestionControlAlgorithm {
fn from(algo: CongestionControlAlgorithm) -> Self {
match algo {
CongestionControlAlgorithm::Reno => quiche::CongestionControlAlgorithm::Reno,
CongestionControlAlgorithm::CUBIC => quiche::CongestionControlAlgorithm::CUBIC,
CongestionControlAlgorithm::BBR => quiche::CongestionControlAlgorithm::BBR,
}
}
}
impl From<quiche::CongestionControlAlgorithm> for CongestionControlAlgorithm {
fn from(item: quiche::CongestionControlAlgorithm) -> Self {
match item {
quiche::CongestionControlAlgorithm::Reno => CongestionControlAlgorithm::Reno,
quiche::CongestionControlAlgorithm::CUBIC => CongestionControlAlgorithm::CUBIC,
quiche::CongestionControlAlgorithm::BBR => CongestionControlAlgorithm::BBR,
}
}
}
We get bidirectional transformation of the enums, and we can use `algo.into()` generically.
The pacing `at` gives us an `Instant`, which is not consistent across platforms. On non-macOS, non-iOS and non-Windows systems, it can be converted into a `timespec`:
#[cfg(not(any(target_os = "macos", target_os = "ios", target_os = "windows")))]
fn std_time_to_c(time: &std::time::Instant, out: &mut timespec) {
unsafe {
ptr::copy_nonoverlapping(time as *const _ as *const timespec, out, 1)
}
}
#[cfg(any(target_os = "macos", target_os = "ios", target_os = "windows"))]
fn std_time_to_c(_time: &std::time::Instant, out: &mut timespec) {
// TODO: implement Instant conversion for systems that don't use timespec.
out.tv_sec = 0;
out.tv_nsec = 0;
}
A `timespec` is basically a timestamp with a seconds component and a nanoseconds component.
There aren't any portable Rust functions right now that convert a `std::time::Instant` to something like milliseconds, where I could just turn it into a JS timestamp, i.e. a `Date`.
On the other hand it is intended to be an opaque type, which is why I have it as an `External`; however this doesn't help us figure out how to pace the sending of the packet data.
One idea is to just do what `ffi.rs` does: convert it to a `timespec`, extract the underlying second and nanosecond data, and convert to milliseconds or a `Date` object in JS.
However there's no information on how to do this for the macOS, iOS and Windows cases, so it's kind of useless atm.
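If the seconds/nanoseconds were extracted, the JS-side conversion would be straightforward. The `Timespec` shape here is hypothetical; quiche itself only hands us an opaque `Instant`:

```typescript
// Hypothetical shape extracted from the Rust side.
interface Timespec {
  tv_sec: number;  // seconds
  tv_nsec: number; // nanoseconds
}

// timespec -> milliseconds: seconds scale up, nanoseconds scale down.
function timespecToMillis(ts: Timespec): number {
  return ts.tv_sec * 1000 + ts.tv_nsec / 1e6;
}

const ts: Timespec = { tv_sec: 1, tv_nsec: 500_000_000 };
const millis = timespecToMillis(ts); // 1500
const at = new Date(millis);
```

Note that an `Instant` is monotonic rather than wall-clock time, so treating it as a `Date` is only meaningful relative to some epoch established by the caller.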
I think pacing isn't necessary since it isn't used in the examples, so we can leave it out for now. But the generated types won't be usable as-is, since they refer to `Instant`.
I'm not entirely sure how generated types are supposed to work with `External`, unless you're supposed to just create a dummy class type for the type being returned.
I'll likely end up writing our own TS file instead of relying on the generated code, but the generated code is useful for debugging.
Ok, so I'm starting to experiment with the Rust library by calling it from JS.
One thing I'm unclear about is whether this library, being `@matrixai/quic`, will essentially expose the quiche library directly, or wrap it in its own set of abstractions.
Do we also create the socket in this library, and thus expose a socket runtime, such as a `class Server` and a `class Client`?
It seems so, because `quiche` exposes `quiche::accept` to create a server connection, and `quiche::connect` to create a client connection.
We know that in both cases they can use the same socket address, thus using 1 socket address to act as both client and server.
If we do this, we might just create a `class Connection`. However we already have this as provided by the native library.
So if we just expose the QUICHE design, then the actual socket management will be in PK.
Ok so then only in the tests would we be making use of the sockets.
Ok as I have been creating the QUIC server, I've found that there are many interactions with the socket during the connection establishment. It makes sense that this library would handle this complexity too. Things like address validation, stateless retries and other things all happen during connection establishment.
Ok I'm at the point where it's time to deal with streams.
So therefore this library does have to do:
So there's quite a bit of work still necessary on the JS side.
The ALPNs may not be relevant unless we are working with HTTP/3. It also seems we could just use our own special strings, since our application-level protocol isn't HTTP/3.
We could do something like `polykey1` as the ALPN, where `1` indicates the application-layer protocol version, which would be RPC-oriented...? I'm not sure if this is even needed.
In the `server.rs` example in quiche, we see a structure like this:
loop {
poll(events, timeout); // This is blocking on any event and timeout event
loop {
// If no events occurred, then it was a timeout that occurred
if (!events) {
for (const conn of conns) {
// Calls the `on_timeout()` when any timeout event occurs
conn.on_timeout();
}
break;
}
// Receive data from UDP socket
packet = socketRecv();
Header.parse(packet)
// Do a bunch of validation and other stuff
conn = Connection.accept();
// Plumb into connection receive
conn.recv();
if (conn.inEarlyData() || conn.isEstablished()) {
// handle writable streams (which calls streamSend())
// handle readable streams (which calls streamRecv() then streamSend())
// Note all of this is being buffered up in-memory, nothing gets sent out to the sockets yet
}
}
// Now we process things to be sent out
for (const conn of conns) {
conn.send();
socketTo();
}
// Now we GC all the conns that are dead
}
In JS we have to do things by registering handlers against event emitters; we cannot block the main thread. The above can be translated into several possible events.
For the first event, this is easy to do. The UDP dgram socket is already an event emitter: we simply register `socket.on('message', handleMessage);` and we can do the first part of the loop above, which is to parse the packet data, decide whether to do retries or other protocol actions, then eventually accept a new connection or plumb the packet in.
The final plumbing is a bit strange in terms of processing the streams. Right now it iterates over streams that are writable and then over streams that are readable. In JS this would make sense as event propagation; quiche doesn't expose a `readable` event for the stream objects (which aren't even objects, they are IDs relative to quiche). Thus as we iterate over the streams, we would convert these into events on actual stream objects. On the JS side it makes sense to create stream objects, and then emit those events as we iterate over readable and writable.
This would only work for `streamRecv()`. For `streamSend()` we would have to do a "sink", meaning that every time a user calls `stream.write()` it ends up calling `streamSend()`. However there's no callback here; it's all synchronous. So how does one know when their stream write has been flushed to the socket? There's no way to know, because quiche decides on the multiplexing. Could that mean `stream.write(data, () => { /* resolves only when buffered into quiche, does not mean data is flushed to the socket */ })`?
The above structure also implies that upon timeout and readable socket events, we also start to send packets out (the `// Now we process things to be sent out` part). That works because handling writable and readable streams is assumed to end up calling `streamSend`, which puts data in quiche's buffer that needs to be sent out.
However in JS again, this doesn't seem idiomatic. Instead we should have an event emitter that emits events upon any `streamSend` occurring, indicating there's data to be sent out. After all, any data we buffer into quiche should eventually be flushed.
Upon doing so, we would want to perform the `conn.send()` which plumbs the data out, and then `socketTo()`.
Note how it iterates over all connections in an attempt to find whether there's any data to send. It would only make sense to send data from connections that have stream sends.
Unless... it's possible that connections have data to be sent even if no stream has sent anything, for example keepalive packets. I'm not sure. But in that case, shouldn't that be done on the basis of timeouts?
All of this requires some prototyping.
I started working on the web stream abstraction.
In web streams there's no default `DuplexStream` to construct. You have to do it yourself.
But this ends up being easy to do:
class MyStream<R = any, W = any> implements ReadableWritablePair<R, W> {
public readable: ReadableStream<R>;
public writable: WritableStream<W>;
public constructor() {
this.readable = new ReadableStream();
this.writable = new WritableStream();
}
}
In our case, we have created a `QUICStream` that just uses `Uint8Array` as both `R` and `W`.
Now the readable side has to be constructed by "wrapping" a push-based data source.
Quiche's QUIC streams are in fact push-based: we start at the UDP socket, where we attach an event handler for the `message` event. This triggers quiche processing, which eventually leads us to iterating over `connection.readable()` stream IDs, which tells us all the QUIC stream IDs that have readable data. At this point we would call `connection.streamRecv` and plumb this data into the `ReadableStream<Uint8Array>` above.
With the help of https://exploringjs.com/nodejs-shell-scripting/ch_web-streams.html#using-a-readablestream-to-wrap-a-push-source-or-a-pull-source, I set up a readable stream as a quick prototype:
this.readable = new ReadableStream({
type: 'bytes',
start(controller) {
events.addEventListener('data', (event) => {
controller.enqueue(event.detail);
if (controller.desiredSize <= 0) {
events.pause();
}
});
events.addEventListener('end', () => {
controller.close();
}, { once: true });
events.addEventListener('error', (event) => {
controller.error(event.detail);
}, { once: true });
},
pull() {
events.resume();
},
cancel() {
events.destroy();
}
});
Note that the default queuing strategy is `CountQueuingStrategy` with a high water mark of 1. However I was not able to use Node's `new CountQueuingStrategy`; I think there's a bug in NodeJS. The reason we stick with the default anyway is that flow control is already configured in quiche, and there's no need to do further complicated byte-length flow control in the web streams. Instead, chunks are passed to the reader of the readable stream one at a time.
Now what is `events`? It is a custom `EventTarget` that will emit events. I like to think of it as a "derived event emitter". Remember that event emitters are not as flexible as subjects/observables. If we think of the socket as an event emitter/push-based data source, this implies that the push-flow is preserved through quiche processing, which then translates to event emitters for every stream that is readable.
Socket Events -> Stream 1 Events -> Web Stream 1
Stream 2 Events -> Web Stream 2
Stream 3 Events -> Web Stream 3
... -> ...
So 1 single socket event could translate to multiple stream events, and each stream event is pushed to its web stream.
I considered using 1 event emitter for all stream events and indexing each event by the stream ID, but it seemed more idiomatic to just create a separate `EventTarget` object for each stream; then the event names become easier to work with.
Now, the UDP socket is an `EventEmitter` in NodeJS, but quiche's streams are not actually event emitters. So we have to create an events object for each stream to bridge this data flow.
If everything was an observable, this could probably be done in a higher order way... transforming socket events into an observable..., then passing it into a transformer/demuxer pipeline which then goes to multiple web streams. But I'm going to avoid using any rxjs magic here and just use plain old JS stuff to be portable.
The `pause` and `resume` are used for controlling the backpressure on the quiche stream. If we are pausing, it means we stop reading from the quiche stream. As soon as we resume, we must immediately attempt to read from the quiche stream. If there's nothing to be read, that's fine; we just wait for the next event, which will bring some data.
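A minimal sketch of such an `events` object, assuming a queue-while-paused design (the real implementation may differ; `DataEvent` and `StreamEvents` are illustrative names):

```typescript
// Event carrying a chunk, mirroring the `event.detail` usage above.
class DataEvent extends Event {
  constructor(public readonly detail: Uint8Array) {
    super('data');
  }
}

// Per-stream EventTarget with pause/resume for backpressure: while paused,
// chunks are queued instead of dispatched; resume flushes the queue.
class StreamEvents extends EventTarget {
  protected paused = false;
  protected queue: Uint8Array[] = [];

  public push(chunk: Uint8Array): void {
    if (this.paused) {
      this.queue.push(chunk);
    } else {
      this.dispatchEvent(new DataEvent(chunk));
    }
  }

  public pause(): void {
    this.paused = true;
  }

  public resume(): void {
    this.paused = false;
    // Flush anything buffered while paused.
    while (!this.paused && this.queue.length > 0) {
      this.dispatchEvent(new DataEvent(this.queue.shift()!));
    }
  }
}

const events = new StreamEvents();
const received: Uint8Array[] = [];
events.addEventListener('data', (e) => {
  received.push((e as DataEvent).detail);
});

events.push(new Uint8Array([1])); // delivered immediately
events.pause();
events.push(new Uint8Array([2])); // queued while paused
events.resume();                  // flushes the queue
```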
Integrating this back into the QUIC structure above allows the `message` handler to create `EventTarget` objects and emit events on them for subsequent processing. So there will be some side effects going on in this code base... That is what our `// handle readable streams` is really doing.
The writable side is a bit simpler. We only need to provide the `write` handler, which will plumb the data back into a QUIC stream with `connection.streamSend`:
this.writable = new WritableStream({
write(chunk) {
// Replace this with `streamSend`
console.log(chunk);
}
});
Finally, the entire `QUICStream` object can be constructed in one of 2 ways.
Because 1. is a "push-based dataflow", it would make sense to provide a higher-order handler that is called upon the creation of a new stream:
// Receiving a new stream
quic.on('stream', handleStream);
// Start a new stream to some connection
const stream = quic.streamStart();
Interestingly, push-based dataflows ultimately look like event handlers.
Thinking about the event architecture more (https://github.com/MatrixAI/js-quic/issues/1#issuecomment-1343746712), there are more events than just the UDP socket event. We will need to attach event handlers for all of those possible events. To do that, we should be able to "create" a `QUICServer` instance that internally attaches event handlers to all those possible events.
It's important to remember that event emitters and event targets are fully synchronous. They are not async pub/sub, which can be different from observables.
Discovered a bug with napi-rs: https://github.com/napi-rs/napi-rs/issues/1390. We need to make sure that where we have factory methods, we don't use the `#[napi(constructor)]` macro on top of the struct itself.
I've made a start on the `QUICServer` class. It currently extends `EventTarget` so I can make it similar to other NodeJS objects that emit events; I didn't want to use `EventEmitter` so it can be portable to other runtimes later.
This class will have a number of protected handlers, each which will be attached to all the possible events that may occur in the process of running QUIC.
Furthermore, the binding to a host is worked out by using `ip-num` to determine whether the host is IPv4 or IPv6, defaulting to IPv4 if it is a host name. The binding may still fail, so we need to race towards a possible failure.
Then we get the resolved `host` and `port` as properties.
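Node's built-in `net.isIP` can illustrate the same IPv4/IPv6 decision that `ip-num` is used for (this sketch is not the actual implementation):

```typescript
import { isIP } from 'net';

// isIP returns 4, 6, or 0 (not an IP literal, i.e. a hostname).
// Hostnames default to IPv4, matching the behaviour described above.
function udpType(host: string): 'udp4' | 'udp6' {
  return isIP(host) === 6 ? 'udp6' : 'udp4';
}

const v4 = udpType('127.0.0.1');   // 'udp4'
const v6 = udpType('::1');         // 'udp6'
const name = udpType('localhost'); // 'udp4' (hostname, defaults to IPv4)
```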
That's when we can start handling messages. In a way, that starts all the other possible events, because we have to have at least 1 connection before we can have any of the other connection events.
I imagine `handleMessage` will translate to further events like stream events. If it is a new stream, a new `QUICStream` object will be created and then emitted as an event on the `QUICServer`. It is up to the user to register a handler with `quicServer.addEventListener('stream', (stream) => { ... })`.
I wonder what we would do if we have no handlers for the streams. We would still need to register them as state, and eventually garbage collect them? I'm not sure. Perhaps without a stream handler, all new streams would be immediately closed. We'd need to check what exactly is supposed to happen with respect to stream lifecycles.
I asked ChatGPT what happens if you create an HTTP server with `http.createServer` without a request listener:
If you create an HTTP server using http.createServer() without passing a request listener function, the server will not have any way to handle incoming HTTP requests. As a result, any HTTP requests sent to the server will simply be ignored and will hang until they time out.
Therefore we can have similar behaviour: if nothing is handling the streams, they get created... but they should eventually time out.
However due to https://github.com/cloudflare/quiche/issues/908 there is no such thing as "stream timeouts" in the QUIC spec. And the config.set_max_idle_timeout
is only for the entire connection: https://datatracker.ietf.org/doc/html/draft-ietf-quic-transport-34#section-10.1
Therefore timing out streams is an application-level concern, it can be considered that one could have an infinite stream....
But if we leave infinite streams around along with the possibility of not having a on-stream handler, then we have a possible resource-leak here.
So either we:
Note that there are still limits on how many streams can exist for any given connection, and if the connection times out, it ends up closing all the streams anyway.
In fact... technically the server side for HTTP servers doesn't actually time out the request anyway. It waits for the client to time it out. So yes there can be dangling resources in NodeJS too. Which means we could just accumulate stream objects internally... although without any handler, I'm guessing any created `QUICStream` objects would just be GCed. So maybe this is not a problem.
Furthermore the HTTP server has this https://nodejs.org/api/http.html#servertimeout which sets the timeout for all requests. The default is 0
which means there's actually no timeouts for requests. So we can replicate this model.
This means abstraction-wise HTTP server's "request/response" transaction is equivalent to QUIC server's "stream".
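If we replicate `http.Server`'s timeout model, a per-stream idle timer might look like the following sketch, where `0` disables the timeout just like the Node default; `setStreamTimeout` and its callback are hypothetical names:

```typescript
// Schedule a stream teardown after `timeout` ms, or not at all when
// timeout is 0, mirroring http.Server's `server.timeout` default.
function setStreamTimeout(
  timeout: number,
  onTimeout: () => void,
): ReturnType<typeof setTimeout> | null {
  if (timeout === 0) {
    // 0 means streams never time out on the server side
    return null;
  }
  return setTimeout(onTimeout, timeout);
}
```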
I'm using `cargo run --bin quiche-client -- 'http://127.0.0.1:55555'` in the `quiche` project for testing the QUIC server. I'm not sure how to run quiche's example code, but the app code tries to perform an HTTP3 request. From our QUIC server's perspective, any data started is just another QUIC stream. Would you be able to create an HTTP3 server? Sure... you could, but that's not what we care about. That would have to be done on top of the streams.
I'm actually running the QUICServer
now and I've gotten to the stateless retry. And I can see the protocol flow now:
┌────────┐ ┌────────┐
│ Client │ │ Server │
└───┬────┘ └────┬───┘
│ ┌───────────────────┐ │
1.├─────────►│Initial ├──────────►│
│ │version: 3132799674│ │
│ │token: [] │ │
│ │dcid: A │ │
│ │scid: B │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
2.│◄─────────┤Version Negotiate │◄──────────┤
│ │version: 1 │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
│ │Initial │ │
│ │version: 1 │ │
3.├─────────►│token: [] ├──────────►│
│ │dcid: A │ │
│ │scid: B │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
│ │Retry │ │
4.│◄─────────┤token: [A] │◄──────────┤
│ │scid: C │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
│ │Initial │ │
│ │version: 1 │ │
5.├─────────►│token: [A] ├──────────►│
│ │dcid: C │ │
│ │scid: B │ │
│ └───────────────────┘ │
│ │
│ │
Note that after the stateless retry, it passes a signed token containing the original DCID, but it also passes the key-derived connection ID from the original DCID as the new_scid
.
The client is supposed to use the new SCID and pass back that token.
One thing that is confusing to me is that the subsequent initial packet has the new SCID appear in header.dcid
and not header.scid
so I asked this here: https://github.com/cloudflare/quiche/issues/1370
In any case, after stage 5, the server now has both the original DCID `odcid` AND the originally key-derived connection ID.
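The server-side decision flow in the diagram above could be factored like this sketch; `versionIsSupported` and `validateToken` are stand-ins for the quiche binding calls (version negotiation, retry token minting/validation), and the token format itself is left abstract:

```typescript
type Header = { version: number; token: Uint8Array | null; dcid: Uint8Array };

type Action =
  | { type: 'version-negotiation' }
  | { type: 'retry' }
  | { type: 'accept'; odcid: Uint8Array };

// Decide what to do with an Initial packet, following the diagram:
// step 2 is version negotiation, step 4 is the stateless retry, and
// step 5 onwards accepts the connection with the recovered odcid.
function dispatchInitial(
  header: Header,
  versionIsSupported: (v: number) => boolean,
  validateToken: (token: Uint8Array) => Uint8Array | null,
): Action {
  if (!versionIsSupported(header.version)) {
    return { type: 'version-negotiation' };
  }
  if (header.token == null || header.token.length === 0) {
    // No token yet: send a retry carrying a signed token with the odcid
    return { type: 'retry' };
  }
  const odcid = validateToken(header.token);
  if (odcid == null) {
    throw new Error('invalid address validation token');
  }
  return { type: 'accept', odcid };
}
```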
It appears that the QUIC server requires a TLS certificate and TLS key even if we are not verifying the peer. If we don't load the TLS certificate or TLS key, when we do `conn.recv()` we will get a `TlsFail` error.
[2022-12-17T11:48:19.531713597Z ERROR quiche_server] d9fb4a3b84d6d01156eaa167bd6028c3a527ba60 recv failed: TlsFail
So even if you do config.verify_peer(false)
you still need to provide a TLS certificate and key because TLS is something that is built into the protocol. Still it seems kind of confusing here, because then a random certificate and key should be sufficient too if we are not verifying peers. Of course the issue is you don't know whether the client will want to verify or not.
We can think of the TLS certificate and key as something that provides the server an "identity". It seems it must always be generated.
And with respect to https://github.com/cloudflare/quiche/issues/1250, let's assume then that certs and keys are necessary.
Then we can use step
to generate a self-signed certificate for development purposes. Let's generate it for localhost
and 127.0.0.1
.
step certificate create \
localhost localhost.crt localhost.key \
--profile self-signed \
--subtle \
--no-password \
--insecure \
--force \
--san 127.0.0.1 \
--san ::1 \
--not-after 31536000s
The above generates a cert like:
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 105144264462316474925316115295518905462 (0x4f1a0c7a47d87b139dac28b12e0cd476)
Signature Algorithm: ECDSA-SHA256
Issuer: CN=localhost
Validity
Not Before: Dec 17 12:11:00 2022 UTC
Not After : Dec 17 12:11:00 2023 UTC
Subject: CN=localhost
Subject Public Key Info:
Public Key Algorithm: ECDSA
Public-Key: (256 bit)
X:
6c:ca:34:fd:43:9e:a3:c4:96:07:3a:aa:d2:83:37:
bd:99:15:df:34:4f:ff:e9:be:e3:f8:4d:85:7c:73:
2a:91
Y:
b1:5b:10:fc:22:83:a3:19:9a:80:a6:8d:36:0e:8d:
73:78:86:25:58:67:26:b2:60:f3:c4:97:aa:7b:11:
98:50
Curve: P-256
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature
X509v3 Extended Key Usage:
Server Authentication, Client Authentication
X509v3 Subject Key Identifier:
81:56:BA:8B:E2:DF:0E:73:1E:5F:E4:82:61:43:4C:09:4F:D4:47:98
X509v3 Subject Alternative Name:
IP Address:127.0.0.1, IP Address:::1
Signature Algorithm: ECDSA-SHA256
30:45:02:21:00:d9:de:15:52:d2:8d:c1:a9:b2:a1:39:ff:c2:
4b:8d:33:75:63:f5:02:d5:25:90:c7:ec:e6:09:41:76:70:69:
b0:02:20:4e:87:92:67:8b:78:6f:e5:2d:9a:7b:b2:48:82:31:
93:3d:34:fc:34:61:83:1c:2b:04:d8:3c:d1:51:f5:61:45
In production usage, we would probably want to do this without having to load from a file. However the quiche config only provides ways to load from a file, or to take a boringssl context object. I'm not even sure if that would work here.
So that could mean PK may need to produce a temporary file and then call these functions to load it. This library could facilitate this: produce a temporary file somehow, load from there, then delete the temporary file.
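A possible shape for that workaround, sketched with Node's standard `fs`/`os`/`path` modules; the `loadCertChainFromPemFile`/`loadPrivKeyFromPemFile` names in the usage comment are assumed napi binding names mirroring quiche's Rust API:

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Materialise in-memory PEM strings as temp files so quiche's
// file-based loaders can be used, then clean up afterwards.
function withTmpCert<T>(
  certPem: string,
  keyPem: string,
  f: (certPath: string, keyPath: string) => T,
): T {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'quic-'));
  const certPath = path.join(dir, 'cert.pem');
  const keyPath = path.join(dir, 'key.pem');
  try {
    fs.writeFileSync(certPath, certPem);
    fs.writeFileSync(keyPath, keyPem);
    return f(certPath, keyPath);
  } finally {
    // Delete the temporary files regardless of success or failure
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Usage (config is the napi-bridged quiche Config; method names assumed):
// withTmpCert(certPem, keyPem, (c, k) => {
//   config.loadCertChainFromPemFile(c);
//   config.loadPrivKeyFromPemFile(k);
// });
```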
And we have a successful connection between the pre-existing quiche-client
and our JS/Rust quicserver.ts
prototype!
To test:
npm run napi-build
npm run ts-node -- ./quicserver.ts
Then in quiche
project:
cargo run --bin quiche-client -- --no-verify 'http://127.0.0.1:55555'
This is using the certificates that have been put into ./tmp/localhost.crt
and ./tmp/localhost.key
.
Without the `--no-verify` flag the client will fail immediately after the server first responds properly with `conn.send`; it probably checks the contents of the cert, decides it is invalid, and sends a `Handshake` packet which probably indicates an error or a closing of the connection.
Furthermore I can see that after the first `conn.recv()` the TLS handshake hasn't even started. So the connection is not in early data nor established. We then send the response with `conn.send`, which is 1200 bytes.
WE GOT A CONNECTION! Connection {}
Connection Trace ID b263221dc542ab564ed3dddcf1c1d3adfb34a367
RECV INFO {
to: { addr: '127.0.0.1', port: 55555 },
from: { addr: '127.0.0.1', port: 50194 }
}
CONNECTION processed this many bytes: 1200
NOT In early data or is established!
SENDING loop
SENDING 1200 {
from: { addr: '127.0.0.1', port: 55555 },
to: { addr: '127.0.0.1', port: 50194 },
at: [External: 37621b0]
}
SENDING 0 null
NOTHING TO SEND
The next packet that the client sends is still an Initial packet, and I'm guessing this is where `config.enableEarlyData()` comes in, with actual data.
In this case the connection is now in early data or established, and there is both a writable and readable stream:
EXISTING CONNECTION
Connection Trace ID b263221dc542ab564ed3dddcf1c1d3adfb34a367
RECV INFO {
to: { addr: '127.0.0.1', port: 55555 },
from: { addr: '127.0.0.1', port: 50194 }
}
CONNECTION processed this many bytes: 1350
WRITABLE stream 0
READABLE stream 0
SENDING loop
SENDING 516 {
from: { addr: '127.0.0.1', port: 55555 },
to: { addr: '127.0.0.1', port: 50194 },
at: [External: 38f6be0]
}
SENDING 0 null
NOTHING TO SEND
At this point we can see that a stream can be writable... and the stream also has data to be read.
At this point we need to hook the stream events into the DOM/web streams we were prototyping earlier. I believe we don't require any kind of partial data objects; it should all be managed as web stream backpressure.
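For the readable half, a sketch under the same event-driven model: `streamRecv` stands in for the napi-bridged `conn.streamRecv`, and the `'readable'` event is assumed to be dispatched by the connection when the stream has data. The `ReadableStream`'s `pull()` only runs when the consumer asks for data, so backpressure falls out of the web stream itself:

```typescript
// Build the readable side of a QUICStream from an event source and a
// receive function that returns [readLength, fin] or null when empty.
function makeReadable(
  events: EventTarget,
  streamRecv: (buf: Uint8Array) => [number, boolean] | null,
): ReadableStream<Uint8Array> {
  const buf = new Uint8Array(1024);
  return new ReadableStream<Uint8Array>({
    async pull(controller) {
      let result = streamRecv(buf);
      while (result == null) {
        // Nothing to read yet: wait until the connection signals that
        // this stream became readable again (driven by UDP datagrams)
        await new Promise<void>((resolve) => {
          events.addEventListener('readable', () => resolve(), { once: true });
        });
        result = streamRecv(buf);
      }
      const [readLength, fin] = result;
      controller.enqueue(buf.slice(0, readLength));
      if (fin) controller.close();
    },
  });
}
```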
Right now the above all happens in one handler `handleMessage`. As we mentioned earlier, getting a message is just 1 event. We should handle multiple events and emit other events that can trigger other handlers. So handling the message event could be split into other events once we have a connection or not. We can also factor out the connection setup into its own function...
I'm trying to connect the QUIC stream to the WritableStream
.
So in our QUICStream
class we have both a readable
and writable
properties. The writable
is a WritableStream
. This stream ultimately connects the RPC writer to the QUIC stream.
RPC Writer -> WritableStream -> QUIC stream
It is the latter half that is confusing to implement.
In the rust quic server example, they manage this by storing a map of partial responses. The handle_writable
first looks up to see if there's any partial responses. If there is, it will perform conn.stream_send
. This may send less bytes than what is needed to be sent. In that case, the remaining response is saved to be used again later.
Now it turns out the reason why you cannot just keep pushing bytes into conn.stream_send
even if you are flushing the QUIC connection to the UDP socket is because the other peer's send buffer may still be full.
This means we have a few complications:

- The `WritableStream` has to have backpressure that slows down the RPC writer; the RPC writer can just wait for the `ready` promise before doing anything further.
- Backpressure occurs when `conn.streamSend` returns a sent length less than the input chunk.
- The `write()` method of the `WritableStream` is a promise; only when the promise is resolved will the `WritableStream` proceed to continue writing.
- If `write()` can be made to only resolve when the chunk is fully sent out, then that would be sufficient backpressure.
- Some event has to tell us when to retry `conn.streamSend`. This "event" could be another UDP socket message, or it could be a QUIC connection timeout event.
- How can the `write()` promise wait for such a thing in an asynchronous way?

When we iterate over `conn.writable()`, what it is telling us is that these streams have capacity to be written to. Ultimately the fact that a stream is writable is a push-flow event that is pushed from the UDP socket datagram being received.
This can be translated to an event being emitted like events.dispatchEvent(new QUICStreamWritableEvent({ detail: streamId }))
.
Could this event somehow result in the `write()` promise resolving?
For that to occur, we would need the write()
promise to await on an event handler to be called, and where the event results in the remaining chunk to be completely sent to the QUIC stream.
That may look like something like:
async function streamSendFully(chunk) {
  const sentLength = conn.streamSend(streamId, chunk);
  if (sentLength < chunk.length) {
    // Wait for the stream to become writable again
    await new Promise((resolve) => {
      streamEvents.addEventListener(
        'writable',
        () => {
          resolve();
        },
        { once: true }
      );
    });
    // Retry with the remaining portion of the chunk
    return await streamSendFully(
      chunk.subarray(sentLength)
    );
  }
}
new WritableStream({
  async write(chunk: Uint8Array, controller) {
    await streamSendFully(chunk);
  }
});
Notice that the streamSendFully
is essentially awaiting on the writable
event before it attempts to finish the streamSendFully
.
This writable
event will need to come from an EventTarget
for a specific stream.
In our QUICServer
we are going to have a map of QUICConnection
(which itself wraps the JS bridged object from rust), which must then maintain QUICStream
objects.
Since our `QUICStream` is going to wrap `ReadableStream` and `WritableStream`, we can potentially have this event just be emitted against the same `QUICStream` object.
Thus:
async function streamSendFully(chunk) {
const sentLength = conn.streamSend(streamId, chunk);
if (sentLength < chunk.length) {
await new Promise((resolve) => {
this.addEventListener(
'writable',
() => {
resolve();
},
{ once: true }
);
});
return await streamSendFully(
chunk.subarray(sentLength)
);
}
}
Some things above still need consideration:

- What about `fin`? Or 0-length chunks?
- Do we need `return await streamSendFully(...)` at the end or can we just do `return streamSendFully(...)`?

Also note that the above snippet was iterated from a recursive callback function.
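On the `return await` question, for reference: both forms resolve to the same value, but only `return await` lets a surrounding try/catch in the same function observe a rejection. A minimal demonstration (plain async functions, nothing QUIC-specific):

```typescript
async function inner(): Promise<number> {
  throw new Error('boom');
}

async function withoutAwait(): Promise<number> {
  try {
    // The rejected promise is returned as-is; this try/catch never fires
    return inner();
  } catch {
    return -1; // never reached
  }
}

async function withAwait(): Promise<number> {
  try {
    // Awaiting here means the rejection is caught locally
    return await inner();
  } catch {
    return -1;
  }
}
```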
The result here is that we have a promise that only resolves when the chunk is fully sent onto the stream, which itself depends on a `writable` event being emitted to the `QUICStream` object, which can only occur when the `conn.writable()` iteration occurs, which derives this "event" from receiving a UDP socket message or a QUIC connection timeout.
The mixing of promises and event handlers is always going to be a bit complicated because promises are a pull-oriented API while event handlers are a push-oriented API.
And in this case, it looks like we have situations where a write()
promise pulled to be resolved can only be resolved while waiting for push-events... and may not be resolved if those push-events never occur. Failure scenarios could mean the stream gets closed or there is an error.
We could also use a lock for plugging and unplugging, similar to what we did using `js-async-locks` in other areas of the code. Using a lock in this way is basically very similar to monitors: https://en.wikipedia.org/wiki/Monitor_(synchronization).
The `write()` ends up being blocked until the message handler triggers an event indicating the stream is ready to be written again. We also have `Barrier` and `Semaphore` available to be used.
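A minimal plug in the spirit of a monitor, as a sketch (`js-async-locks` provides richer primitives than this): `write()` awaits the plug, and the message handler unplugs it when the stream becomes writable again.

```typescript
// A one-shot gate: plugged means writers must wait, unplugged releases
// every waiter. Re-plugging creates a fresh promise for the next round.
class Plug {
  protected resolveP: (() => void) | null = null;
  protected p: Promise<void> = Promise.resolve();

  plug(): void {
    if (this.resolveP != null) return; // already plugged
    this.p = new Promise<void>((resolve) => {
      this.resolveP = resolve;
    });
  }

  unplug(): void {
    if (this.resolveP != null) {
      this.resolveP();
      this.resolveP = null;
    }
  }

  async waitForUnplug(): Promise<void> {
    return this.p;
  }
}
```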
Note that in order to run the quiche/apps
:
cd quiche/apps
cargo run --bin quiche-server
cargo run --bin quiche-client -- --no-verify https://127.0.0.1:4433
By extending QUICStream
with EventTarget
, we get something like this:
class QUICStream extends EventTarget implements ReadableWritable<Uint8Array, Uint8Array> { ... }
QUIC streams in quiche don't have any kind of object. They are just identified by the `streamId`.
So this object is what will be exposed to the end user when they get new streams to work with.
We can see that this is a duplex stream, so web-stream functions can work on it.
In its constructor, we construct the this.readable = new ReadableStream
and this.writable = new WritableStream
.
Now as per the above discussion, I'm prototyping the method `streamSendFully`, an asynchronous function that only resolves when all the chunks are fully sent. Right now there's no specialised error handling.
It does this by checking that if the sent length is less than the input data, it will proceed to wait for the writable
event to be emitted before it proceeds. This is our "plug"/lock.
The idea is that the QUIC connection would emit a `writable` event to the QUIC stream to unplug it.
However using events here might leak too many implementation details. An alternative might be to use a function and a lock from async-locks, but this should be sufficient to prototype with.
Actually we should revisit the plug/unplug pattern: https://github.com/MatrixAI/Polykey/blob/91287ab57f9cc34c337f1dc9f7523ea122aa96e5/src/nodes/Queue.ts#L76-L99
Furthermore the writable stream:

- `start` doesn't do anything atm...; the assumption is that the stream does already exist. This may be more relevant later when one starts a QUIC connection/stream from the client side.
- `write` just calls `await this.streamSendFully`.
- `close` just calls `await this.streamSendFully(Buffer.from([]), true)` which sends a 0-length buffer and switches `fin` to `true`.
- `abort` calls `this.conn.streamShutdown(this.streamId, quic.Shutdown.write, reasonToCode(reason))`. This would send a `RESET_STREAM` frame indicating an abrupt termination of a stream with an application-supplied error code, where `0` is the general code in case we don't understand the reason.
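Putting that sink together as a self-contained sketch; `streamSendFully`, the shutdown call, and `reasonToCode` are injected as stand-ins for the prototype's actual methods:

```typescript
// Assemble the writable half of a QUICStream from the behaviours
// described above: write sends fully, close sends a fin, abort resets.
function makeWritable(
  streamSendFully: (chunk: Uint8Array, fin?: boolean) => Promise<void>,
  shutdownWrite: (code: number) => void,
  reasonToCode: (reason?: unknown) => number,
): WritableStream<Uint8Array> {
  return new WritableStream<Uint8Array>({
    start() {
      // Nothing to do: the QUIC stream already exists server-side
    },
    async write(chunk: Uint8Array) {
      await streamSendFully(chunk);
    },
    async close() {
      // A 0-length buffer with fin = true finishes the stream
      await streamSendFully(new Uint8Array(0), true);
    },
    abort(reason) {
      // Sends a RESET_STREAM with an application error code
      shutdownWrite(reasonToCode(reason));
    },
  });
}
```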
Specification
We need QUIC in order to simplify our networking stack in PK.
QUIC is a superior UDP layer that can make use of any UDP socket, and create multiplexed reliable streams. It is also capable of hole punching either just by attempting to send ping frames on the stream, or through the unreliable datagrams.
Our goal is to make use of a QUIC library, something that is compilable to desktop and mobile operating systems, expose its functionality to JS, but have the JS runtime manage the actual sockets.
On NodeJS, it can already manage the underlying UDP sockets, and by relying on NodeJS, it will also ensure that these sockets will mix well with the concurrency/parallelism used by the rest of the NodeJS system due to libuv and thus avoid creating a second IO system running in parallel.
On Mobile runtimes, they may not have a dgram module readily available. In such cases, having an IO runtime to supply the UDP sockets may be required. But it is likely there are already existing libraries that provide this like https://github.com/tradle/react-native-udp.
The underlying QUIC library is expected to be agnostic to the socket runtime. It will give you the data that you need to send to the UDP socket, and it will take the data that comes off the UDP socket.
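The flush loop on the send side can be sketched as follows; `connSend` stands in for the napi-bridged `conn.send` (assumed to return `[length, sendInfo]` or `null` when quiche reports `Done`), and the same buffer is reused for every packet as discussed earlier:

```typescript
const MAX_DATAGRAM_SIZE = 1350;

type SendInfo = { to: { addr: string; port: number } };

// Drain all pending packets from the connection into a reused buffer
// and hand each one to the runtime's UDP socket. Returns the number of
// packets flushed.
function flushSend(
  connSend: (buf: Uint8Array) => [number, SendInfo] | null,
  socketSend: (
    buf: Uint8Array,
    offset: number,
    length: number,
    port: number,
    addr: string,
  ) => void,
): number {
  const buf = new Uint8Array(MAX_DATAGRAM_SIZE); // reused every iteration
  let packets = 0;
  while (true) {
    const result = connSend(buf);
    if (result == null) break; // Done: nothing more to send right now
    const [length, info] = result;
    // dgram's socket.send accepts (buf, offset, length, port, address)
    socketSend(buf, 0, length, info.to.port, info.to.addr);
    packets++;
  }
  return packets;
}
```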
However it does have 2 major duties:
Again if we want to stay cross platform, we would not want to bind into Node.js's openssl crypto. It would require instead that the library can take a callback of crypto routines to use. However I've found that this is generally not the case with most existing QUIC libraries. But let's see how we go with this.
Additional context
QUIC and NAPI-RS
Tasks
- ...Config so the user can decide this (especially since this is not a HTTP3 library). - 0.5 day~ - see #13
- ...null. Right now when a quiche client connects to the server, even after closing, the server side is keeping the connection alive. - 1 day
- ...QUICConnection and QUICStream and QUICServer and QUICSocket. This will allow users to hook into the destruction of the object, and perhaps remove their event listeners. These events must be post-facto events. - 0.5 day
- ...QUICStream and change to BYOB style, so that way there can be a byte buffer for it. Testing code should be able to use generator functions similar to our RPC handlers. - 1 day~ - see #5
- ...QUICClient with the shared socket QUICSocket. - 3 day
- ...QUICClient and a single QUICServer. - 1 day~ - See #14
- ...rinfo from the UDP datagram into the conn.recv() so that the streams (either during construction or otherwise) can have its rinfo updated. Perhaps we can just "set" the rinfo properties of the connection every time we do a conn.recv(). Or... we just mutate the conn parameters every time we receive a UDP packet.~ - See #16
- ...stream.connection they can acquire the remote information and also all the remote peer's certificate chain.~ - See #16
- ...napi program into the scripts/prebuild.js so we can actually build the package in a similar way to other native packages we are using like js-db~ - see #7