CMCDragonkai closed this issue 1 year ago.
On the readable side of a QUIC stream, we are dealing with a push-flow source. In our `start` method, we register a handler for the `readable` event. We will use the `readable` event every time we know that a QUIC stream is readable.

Upon being readable we go through this logic:

- If the receive side is currently plugged, we just `return` here.
- We allocate a buffer of `1024` bytes and read from the stream.
- If the read returns `Done`, then there's nothing to do, we just return. (Here I have reversed the change of `Done` to 0-length reads/writes because it does seem to indicate something different from having an empty buffer.)
- If the read throws, we call `controller.error(e);` and we `return` here.
- If `fin` is true, then we do `controller.close()` and `return`.
- If `desiredSize <= 0` then we pause.

For the `pull()` method, we will:

- Dispatch the `readable` event, which will trigger another read via `handleReadable`.

For the `cancel()` method, we will:

- Remove the handler for `handleReadable`.
- Use `this.conn.streamShutdown` to shutdown the read side with the reason mapped to a code.

Some possible improvements:

- Exactly how `fin` and `Done` work will need to be experimented with.
- The case where `desiredSize` is `null` needs to be considered. It seems that users can set the stream's `desiredSize` to be `null` to avoid any blocking at all, and this would just cause the controller to buffer up all the data... basically removing any kind of backpressure and having an unbounded queue.

It turns out that plugging the receive was a simple matter of switching a boolean. No promises or events required.
On the other hand, for the send side, where we wait for a promise to resolve based on a `writable` event, an alternative is to use a plug and expect the emitter of `writable` to explicitly unplug; however, this is not symmetric.
By symmetry I mean that the `QUICStream` right now receives 2 internal events: `readable` and `writable`. Both events are ultimately determined by whether the QUIC stream from quiche is in fact readable or writable, and both events are derived from every UDP socket message and timeout event.
If we are going to expose the `Done` exception, we will need to do that for the other functions too, not just `stream_recv` and `stream_send`.
Just a note: when we have the `QUICStream` later, we can do things like `await quicStream.writable.getWriter().ready;`.
This would essentially be waiting for the stream to be ready to write. This is how the backpressure works at the writer to web stream stage.
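That wait can be demonstrated with plain web streams, no QUIC involved. The slow sink here (a 10ms delay standing in for a slow network) and the `highWaterMark` of 1 are illustrative choices:

```typescript
import { WritableStream } from 'stream/web';

async function demo(): Promise<void> {
  // Backpressure kicks in once more than 1 chunk is queued
  const writable = new WritableStream<Uint8Array>(
    {
      async write(_chunk) {
        await new Promise((r) => setTimeout(r, 10)); // simulate slow send
      },
    },
    { highWaterMark: 1 },
  );
  const writer = writable.getWriter();
  await writer.ready; // resolves immediately: queue is empty
  void writer.write(new Uint8Array(1024));
  void writer.write(new Uint8Array(1024));
  // desiredSize is now <= 0: the queue is over the high water mark
  await writer.ready; // waits here until the sink drains below the mark
  await writer.close();
}
void demo();
```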
Since we are using `EventTarget` right now, we need to consider error handling. The `EventTarget` does not have any special handling for exceptions or errors. That means we should not use `async` functions as event handlers for `EventTarget`. Any rejections would be considered an `uncaughtException` and will go to `process.on('uncaughtException', () => { })`.
This means any errors on socket sending or other callbacks must be handled with callbacks that then emit the event to the `error` handler.

A default `error` handler should be available... one that will then throw the event detail as an uncaught exception if no custom error handler is made available.
So the command:

napi build --platform --js ./src/native/index.js --dts ./src/native/index.d.ts

will compile a single binary `index.linux-x64-gnu.node`, while putting the `index.js` and `index.d.ts` into the same directory. As long as this is the case, TSC will realise that `index.d.ts` contains the types for `index.js`.

Now I can choose to name the `index.linux-x64-gnu.node` differently; this is controlled by the napi config inside `package.json`.
The generated JS file does the equivalent of `node-gyp-build` in js-db. It figures out which platform we are on, and then loads the appropriate dependency. Interestingly it does in fact support `@matrixai/quic-android-arm64`.

I'm wondering how to control these names. It seems to automatically derive them as suffixes of the main package name, being `@matrixai/quic`.
So we would end up with something like:
@matrixai/quic
@matrixai/quic-android-arm64
@matrixai/quic-android-arm-eabi
@matrixai/quic-win32-x64-msvc
@matrixai/quic-win32-ia32-msvc
@matrixai/quic-win32-arm64-msvc
@matrixai/quic-darwin-x64
@matrixai/quic-darwin-arm64
@matrixai/quic-freebsd-x64
@matrixai/quic-linux-x64-musl
@matrixai/quic-linux-x64-gnu
@matrixai/quic-linux-arm64-musl
@matrixai/quic-linux-arm64-gnu
@matrixai/quic-linux-arm-gnueabihf
One problem here is that for situation 1., it currently expects the binary to exist in the same directory as `index.js`. Now we don't have to always use the generated `.js` file; we could change it ourselves accordingly. So we could hardcode a fix if autogeneration doesn't work well for us.
If we go with the optional package route, we will need to have a multi-package repo, since each separate platform will need its own directory with a `package.json`.

Also it seems that once we create optional dependencies, we can apply constraints to each optional dependency using the `os` and `cpu` keys:
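As a sketch, an optional platform package's `package.json` might look like the following. The package name follows the suffix scheme above; the exact fields the napi tooling generates may differ:

```json
{
  "name": "@matrixai/quic-linux-x64-gnu",
  "version": "0.0.1",
  "os": ["linux"],
  "cpu": ["x64"],
  "main": "index.linux-x64-gnu.node"
}
```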
This should constrain which ones get installed.

However there's another issue. Suppose you are creating a cross-platform native package that depends on another cross-platform native package. In that scenario, if you are only allowed to install the linux package because you're on linux, it can make it more difficult to distribute your own set of binaries, unless you are also changing the entire OS via CI/CD. For example, for things like `android` above, who would be building using an Android OS?

At the moment, it seems like all optional packages would be installed without the constraints.
I've moved all the rust code into src/native/napi
. The Cargo.toml
has been updated accordingly.
We are probably going to create a subdirectory called packages
and actually make use of optionalDependencies
in this main package.
The packages
directory then requires each of the built binaries to be put into the directory during the CI/CD build process and then published.
The os/arch constraint will then be applied.
Ok, one of the things that is different between `EventEmitter` and `EventTarget` is the handling of errors. Here's a demo:

const et = new EventTarget();
et.dispatchEvent(new Event('error'));
Nothing happens. No handler for the `error` event, no special handling.

Now compare `EventEmitter`, which is how the TCP server behaves along with other node constructs:
import events from 'events';
const ee = new events.EventEmitter();
ee.emit('error');
Immediately we get:
Error [ERR_UNHANDLED_ERROR]: Unhandled error. (undefined)
This is the case with TCP servers:
import net from 'net';
async function main() {
const server = net.createServer((conn) => {
console.log(conn);
});
server.listen(55555, () => {
server.emit('error');
});
}
void main();
This results in the same error. Only when we add an event handler for `error` does it not end up breaking the entire node runtime.

So if we want to replicate the behaviour of TCP server and TCP connections, we would need to do something similar with `EventTarget`. To do so, we would need to add a "default" error handler to the `error` event, and then check that the `error` event hasn't previously been handled, so it's the handler of last resort.

Alternatively we can overload `addEventListener` so that if an `error` handler is being assigned, we remove the default `error` handler.

Note that it's possible the `error` handler is later removed, so as long as there are no handlers, throwing the exception would need to be ensured.
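The handler-of-last-resort idea can be sketched like this. The class and flag names are illustrative, and for simplicity this version only tracks whether an `error` listener was ever added (it does not handle the listener-removal case discussed above):

```typescript
// Throw on an unhandled 'error' event, mimicking EventEmitter's
// ERR_UNHANDLED_ERROR; drop that behaviour once a user registers a handler.
class EventedTarget extends EventTarget {
  protected errorHandled = false;

  public addEventListener(
    type: string,
    listener: EventListener | EventListenerObject | null,
    options?: AddEventListenerOptions | boolean,
  ): void {
    if (type === 'error') this.errorHandled = true;
    super.addEventListener(type, listener, options);
  }

  public dispatchEvent(event: Event): boolean {
    const result = super.dispatchEvent(event);
    if (event.type === 'error' && !this.errorHandled) {
      // Handler of last resort: throw the event detail as an exception
      throw (event as { detail?: unknown }).detail ??
        new Error('Unhandled error event');
    }
    return result;
  }
}
```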
There's another thing: it would be important to ensure that if errors are just ignored, the server/connection can continue to work, especially given the node runtime will continue to run. But one could also differentiate between recoverable errors and unrecoverable errors.

The main top-level classes should avoid using as many node-isms as possible. So I'll need that defaulting behaviour in the classes extending `EventTarget`.
There's some additional complexity: the roles of the `QUICServer` and the `QUICConnection`. In TCP there's just the server and the socket. That's it. The socket itself can have errors too, but that may also result in errors that exit the node process. In QUIC we would have 3 concepts: server (wrapping the UDP socket, and propagating UDP errors), connection, and stream.

I need to check whether the same thing happens if an error is emitted on the TCP connection (while the server is running). And I just realised that I need to propagate the UDP socket error to the QUICServer too.
Actually, I cannot do the above error handling with `EventTarget` very easily. This is because `EventTarget` does not allow the same listener instance to be registered multiple times unless `capture` is toggled. We don't know how many listeners there would be for the `error` event. We would end up having to keep track of these listener instances... which just overcomplicates the situation.
Options are:
I think the best solution right now is 1., to not replicate the behaviour; users can still decide what to do with their `addEventListener('error')`.
Regarding garbage collection: for streams, stream close happens in 2 separate ways.

For the read side, a stream could be closed when we receive a fin packet. This is indicated by the `fin` boolean through `connection.streamRecv()`.
In this case, it would mean that we should close the stream. Because this means the QUIC stream is closed, and therefore our web stream is closed too.
Alternatively it is possible that our web stream is cancelled, and thus `stream.cancel()` is called. In this case, we perform `connection.streamShutdown` on the read side.
I'm not entirely sure what would happen if there's an error during `streamRecv`: whether that means we should attempt to shut down the stream, or whether that would be the case already. The examples don't show what happens. Right now I just propagate the error using `controller.error(e);`. However I imagine that if there is in fact an error here... we should attempt to shut down the stream. I'm trying to see if this is a problem.
For the write side, we have both `stream.close` and `stream.abort`. The `stream.close` sends a fin packet, whereas `stream.abort` immediately shuts down.
So because these streams are duplex (and we are only dealing with duplex streams atm in this library), a `QUICStream` is only truly closed when both sides are closed. And right now a half-open state is possible.

So upon any of the above closings of the readable and writable, we call a function right now called `gcStream` that checks that both the read and send sides are closed; if so, it proceeds to delete the `QUICStream` from the parent streams map.
This means stream lifecycle can be determined by the user of the `QUICStream`, and by the `QUICConnection`. It's also possible that the `QUICConnection` explicitly stops the readable and writable sides via `QUICConnection.stop`. This proceeds to cancel the readable, while doing `writable.close`, because the writable side should be gracefully done.
This seems to make sense for QUICStream
.
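The `gcStream` condition can be sketched minimally like so. The class and flag names here are illustrative, not the real implementation:

```typescript
// Only remove the stream from the parent's map once BOTH sides are closed;
// half-open (one side closed) keeps the stream alive.
class QUICStreamSketch {
  protected readableClosed = false;
  protected writableClosed = false;

  public constructor(
    protected streamId: number,
    protected streams: Map<number, QUICStreamSketch>,
  ) {
    streams.set(streamId, this);
  }

  public closeReadable(): void {
    this.readableClosed = true;
    this.gcStream();
  }

  public closeWritable(): void {
    this.writableClosed = true;
    this.gcStream();
  }

  protected gcStream(): void {
    if (this.readableClosed && this.writableClosed) {
      this.streams.delete(this.streamId); // fully closed: release it
    }
  }
}
```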
Now for `QUICConnection`, we have a separate problem. A connection can also be closed by outside events, primarily due to `recv`, but it could also happen during an error with `send`.

In the case of `recv`, an error is dispatched. But in the examples, they just continue the read loop. They don't do anything to the connection.
I believe this is because:
On success the number of bytes processed from the input buffer is returned. On error the connection will be closed by calling close() with the appropriate error code.
Which implies that the recv
itself will end up calling the close()
if there was an error. So we don't need to do this on our end.
Ok I can see it:
// In case of error processing the incoming packet, close
// the connection.
self.close(false, e.to_wire(), b"").ok();
So connection.recv
does in fact automatically close the underlying connection.
On the send side, if the send fails, we have to explicitly call connection.close
to indicate the fact that we failed to send a packet. We have to choose whether it is an application error or a library error, the error code and error message.
The problem is that `connection.close()` does not mean `connection.isClosed()` is true. It's a lazy operation, meaning there is still some work that needs to be done.
Note that the connection will not be closed immediately. An application should continue calling the recv(), send(), timeout() and on_timeout() methods as normal, until the is_closed() method returns true.
And because there are no callbacks in the quiche library, we have to actually poll the quiche library to know when a connection is actually closed. And only then can we proceed to remove the connection from the `connections` map in `QUICServer`.
The other issue is that I'm not sure how closing a connection affects all of its existing underlying streams.
It would appear that closing a connection should also mean all its underlying streams are closed. But if that's the case, how does our QUICStream
get knowledge about this?
It's possible we get a memory leak here, since if the streams are all closed, they could potentially not be readable or writable anymore, and in that case, we never get a quic stream error, and can therefore not propagate such errors to the web stream.
One alternative is to proceed to do an explicit stop on all our streams if there's a connection error. Actually we need to do it probably non-gracefully since if we cannot send data on the connection, it's rather useless to attempt to write an explicit stream close.
So a couple of problems here:

1. We need to clean up the `QUICStream` object so that it can be properly garbage collected. Beware of how the underlying stream works, and how the `QUICStream` object itself would be aware or not aware of its underlying object state.
2. The `QUICConnection` itself cannot just be closed synchronously. We have to poll whether it is closed before we can remove it from the `connections` map. The polling seems to occur on every event, either when we receive a UDP socket message, or when there's a timeout event.

This is relevant too: https://datatracker.ietf.org/doc/html/rfc9000#section-10.2
For 2., I'm just doing connection iteration at the end of `QUICServer`'s `handleMessage` and `handleTimeout`.
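That connection iteration can be sketched as a reap pass over the map. The `isClosed()` method mirrors quiche's `is_closed()`; the function and interface names are illustrative:

```typescript
interface ConnLike {
  isClosed(): boolean;
}

// After handling a message or timeout, poll each connection's closed state
// and remove fully-closed connections from the map.
function reapClosedConnections(connections: Map<string, ConnLike>): number {
  let reaped = 0;
  for (const [dcid, conn] of connections) {
    if (conn.isClosed()) {
      connections.delete(dcid); // Map iteration tolerates deletes
      reaped++;
    }
  }
  return reaped;
}
```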
It's time to do the next practical test and map out what happens with the new class structure. And then from this point onwards we need to iterate on the class structure, as there are still unknown unknowns and known unknowns. It's difficult translating a loopy example codebase in quiche to a Node.js evented codebase.
Do note that in order to pass in in-memory private key and certificates we will need to use boringssl's builder for the config struct: https://docs.rs/quiche/0.16.0/quiche/struct.Config.html#method.with_boring_ssl_ctx

It appears to support building up things like: https://docs.rs/boring/2.1.0/boring/pkey/struct.PKey.html#method.private_key_from_pem

And then you use functions like https://docs.rs/boring/2.1.0/boring/ssl/struct.SslContextBuilder.html#method.set_private_key which set it up. The builder is consumed and returns the ssl context: https://docs.rs/boring/2.1.0/boring/ssl/struct.SslContextBuilder.html#method.build.

After building this, more config settings can be set. This can be done as a separate issue.
Ok, I'm able to use wireshark to inspect the protocol using the filter `udp.port == 55555 && not icmp`.
Here we only want to see the interaction between the client and server.
One issue is that we do need to decrypt the TLS, apparently I need to make use of https://docs.rs/quiche/0.16.0/quiche/struct.Config.html#method.log_keys. That can dump the keylog file that can be loaded by wireshark.
Actually you need more than this. You have to also use: https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.set_keylog
So right now wireshark will be of limited use... if we want to decrypt contents. I don't have that available in the native code yet. Can be done later.
The file is apparently meant to be supplied with the `SSLKEYLOGFILE` environment variable. I think the node runtime can provide this. But it may just be a file path that the rust side has to turn into a writer object.
In the initial packet the client sends to the server, it has both the DCID and SCID.
The DCID is what the client uses to identify the server. The SCID is what the client uses to identify itself.
Now why are we using dcid in our code to identify the connection? Because as chatgpt says:
The reason that the DCID is typically used to identify the connection in Quiche (and other Quic implementations) is because the DCID is included in every packet that is sent over the connection, while the SCID is only included in the initial packet sent by the client. This means that the DCID is more readily available for use as a connection identifier, as it is present in every packet.
And this is in fact true according to wireshark; there are QUIC protected payload packets that only contain the DCID and no longer the SCID, because they use a short header.
Tracing the QUIC implementation is tough without some sort of tracing system for the logs. And this is a pending thing to do later.
I just realised a new logging standard that might be useful:

Receive X <-- start
Receiving X <-- only use this if you intend 1 log message
Received X <-- stop

We're currently using -ing when we start, but really we should use -ing only when we intend to emit 1 log message, like an "event", compared to a trace which has a start event and a stop event. I think in opentracing this is called a log? Not sure about the terminology there.
Furthermore, suppose a function call happens: should the -ing message go before or after it? It's a bit ambiguous since it can be either. It's really its own point in time wherever it is relevant, whereas start and stop should be at the beginning and end of the relevant function call. However I err on before, because then if the operation fails, the exception occurs after the message.
With respect to the timeout: it turns out that for a new connection that was just accepted, calling `timeout()` returns `null`. This must imply that `timeout()` is intended to be "polled"; that is, we must call into the quiche library on different events to find out whether the connection has a timeout that we must set up. In the rust examples, these checks appear to occur on every new message sent. Because otherwise how would we ever know if a connection ever needs to set a timeout or not? We may be able to spread out checks for this depending on various state changes on the connection, but this requires further experimentation.
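The polling can be sketched as re-arming a JS timer after every event on the connection. The interface mirrors quiche's `timeout()`/`on_timeout()` pair; the function and property names are illustrative:

```typescript
interface TimeoutConn {
  timeout(): number | null; // ms until next timeout, or null if none pending
  onTimeout(): void; // advances quiche's internal timeout state
}

// Call this after every recv/send event on the connection to (re)arm a timer
function checkTimeout(
  conn: TimeoutConn,
  timers: { handle?: NodeJS.Timeout },
): void {
  if (timers.handle != null) clearTimeout(timers.handle);
  const ms = conn.timeout();
  if (ms == null) return; // nothing to arm yet (e.g. freshly accepted)
  timers.handle = setTimeout(() => {
    conn.onTimeout(); // let quiche process the timeout
    checkTimeout(conn, timers); // re-poll: a new timeout may now be pending
  }, ms);
}
```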
We might need to start working on the `QUICClient` class to fully work out the protocol... the server is getting to the point where the streams are just waiting for work to be done. And the example quiche client is using HTTP3, and we also need to somehow deal with the connection streams too...
So when we create a quic server, we need to have an event handler for new connections, but for each connection a handler for new streams.
Having a handler for new streams is similar to like HTTP handlers for request/response transactions. Each new stream is like new request, only that we can both read and write. A stream in this case is like a TCP connection/socket. But have it all multiplexed as part of connections too.
The user of the quic server needs to handle connections, but do they also handle streams directly, like `addEventListener('stream')`? Or would these events be part of the connection instead? I think it makes more sense to expect the user to add event listeners to the connection objects themselves, so the server only works on connections.
We need to start building out the QUICClient
in order to prove that all these requirements can be met: https://github.com/MatrixAI/Polykey/issues/234#issuecomment-1123133027.
Specifically:
- The ability to multiplex separate connections into 1 socket, and therefore that means 1 port. This is necessary for NAT busting. A connection is between node to node. This means the same port/socket is used for multiple connections to different nodes.
Right now quiche doesn't do anything with sockets, so we are just receiving messages on the UDP socket and then processing it using the header parse. But we need to see how this would be done on the client side.
This was possible via Go: https://github.com/lucas-clemente/quic-go/issues/561 so we need to be able to replicate this here. When using the same socket, it is likely we would use the same dgram socket for multiple client connections to different servers, and also for the server. We also need to be able to identify which client connection a received packet is intended for.
The `client.rs` in quiche apps is more comprehensive than the example `client.rs`; it also demonstrates how connection migration would be done on the client side. Only clients can do connection migration right now.
In terms of identifying packets, we can work with parsing their connection IDs, or their remote addresses. If we identified packets as coming from a certain address we could route to the right client. But we would have to first see if it is intended for the server, and not for the clients.
Also discussion on long and short headers:
Packets with long headers include Source Connection ID and Destination Connection ID fields. These fields are used to set the connection IDs for new connections; see Section 7.2 for details.
Packets with short headers (Section 17.3) only include the Destination Connection ID and omit the explicit length. The length of the Destination Connection ID field is expected to be known to endpoints. Endpoints using a load balancer that routes based on connection ID could agree with the load balancer on a fixed length for connection IDs or agree on an encoding scheme. A fixed portion could encode an explicit length, which allows the entire connection ID to vary in length and still be used by the load balancer.
But the problem even with addresses is that we would need to know if these addresses are legitimate, which is why the server does the stateless retry.
This version of QUIC uses the long packet header during connection establishment; see Section 17.2. Packets with the long header are Initial (Section 17.2.2), 0-RTT (Section 17.2.3), Handshake (Section 17.2.4), and Retry (Section 17.2.5). Version negotiation uses a version-independent packet with a long header; see Section 17.2.1.
Packets with the short header are designed for minimal overhead and are used after a connection is established and 1-RTT keys are available; see Section 17.3.
In reviewing Polykey's usage of UTP, it does seem that UTP was made for this use case, where we have the ability to create a UTP object when doing `UTP.createServer()`, and then it was subsequently possible to create UTP connections on the same UTP object. Thus it is able to act like a server, with the ability to create multiple clients.
It seems that quiche should be capable of doing this. There's no restriction on sharing the same UDP socket after doing bind, we can have multiple QUICClient
and a QUICServer
all using the same UDP socket.
The only issue is that all the examples show that the clients and server all depend on `handleMessage`, but we just need a way of parsing these UDP messages, identifying what kind of packet each is... and then directing it. Although I'm also confused that it's possible for there to be coalesced QUIC packets in 1 UDP datagram. If so, then how does `Header.fromSlice` work? Maybe it only reads 1 QUIC packet out of the UDP datagram, meaning it's possible for there to be more than one.
I think `Header.fromSlice` is ONLY used to parse the INITIAL QUIC packet. This means there could in fact be multiple QUIC packets coalesced in a single `handleMessage`. However `Header.fromSlice` will only parse the first one. The subsequent packets don't need to be parsed, as they will just be handled by `connection.recv`. The primary use of `Header.fromSlice` is to acquire the connection IDs so we can route them appropriately. Remember that the server is already muxing multiple connections from different clients. Therefore if there is a UDP socket used for both clients and server, then at the end of the day we just have an even larger map of connections, all identified by connection IDs.
The client's INITIAL packet sent to the server contains a randomly set SCID and DCID.
When an Initial packet is sent by a client that has not previously received an Initial or Retry packet from the server, the client populates the Destination Connection ID field with an unpredictable value. This Destination Connection ID MUST be at least 8 bytes in length. Until a packet is received from the server, the client MUST use the same Destination Connection ID value on all packets in this connection.
The Destination Connection ID field from the first Initial packet sent by a client is used to determine packet protection keys for Initial packets. These keys change after receiving a Retry packet; see Section 5.2 of [QUIC-TLS].
The client populates the Source Connection ID field with a value of its choosing and sets the Source Connection ID Length field to indicate the length.
Ok, so I think the idea here is that if we are to combine clients and server, we then have to have a combined `handleMessage` demuxer. It must analyse the packet header to get the connection ID. If the UDP datagram has multiple coalesced QUIC packets, then it is assumed that all these QUIC packets are intended for the same connection. If any of the packets deviates from the first packet's connection IDs, that will result in a processing error later by quiche.

Then inside this demuxer it has to refer to a `ConnectionMap`. This map is basically something that will need to be shared across `QUICClient` and `QUICServer`, because it has to be used to identify QUIC packets intended for client-side connections or server-side connections.

Further prototyping required on this front.
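The combined demuxer can be sketched as follows. Here `parseDcid` is a stand-in for parsing the first packet's header (as `Header.fromSlice` would), and all names are illustrative:

```typescript
interface ConnectionLike {
  recv(data: Uint8Array): void;
}

// Shared across client- and server-side connections, keyed by DCID
class ConnectionMap extends Map<string, ConnectionLike> {}

// Parse only the FIRST QUIC packet's header for its DCID; route via the
// shared map; unknown DCIDs go to the server (if one is registered).
function handleMessage(
  data: Uint8Array,
  parseDcid: (data: Uint8Array) => string,
  connections: ConnectionMap,
  server?: { handleNewConnection: (data: Uint8Array) => void },
): void {
  const dcid = parseDcid(data);
  const conn = connections.get(dcid);
  if (conn != null) {
    conn.recv(data); // existing client- or server-side connection
  } else if (server != null) {
    server.handleNewConnection(data); // only the server accepts new DCIDs
  }
  // otherwise: no server registered for this socket, discard the packet
}
```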
In the client examples, it binds to a random IPv4 address and port if the server/target address is IPv4, and a random IPv6 address and port if the server/target address is IPv6. This actually means we do need to know the target address via a DNS resolution first.
I'm not entirely sure if NodeJS supports dual stack properly where ipv4 mapped ipv6 addresses are possible: https://stackoverflow.com/questions/61741547/udp-client-server-mix-ipv4-and-ipv6
See https://blog.apify.com/ipv4-mapped-ipv6-in-nodejs/
Suppose we use the `rinfo`: if we were bound to `::`, would we then get ipv4 packets as `::ffff:127.0.0.1`? And also this only works if we bind to `::` and not any other IPv6 addresses. This will require further testing later.
I got a confirmation that using connection IDs is how we would demux packets for a shared UDP socket.

So I'm creating a `QUICSocket` to encapsulate the management of a shared socket object, since creating an appropriate UDP socket is a bit complex.

Now that we know that some packet headers are short-form and thus do not have an `SCID`: when we do `quiche.header.fromSlice`, what happens to the `scid` property? Well, it still exists, but the Rust code uses `ConnectionId::default()`, which produces an empty byte array.
Updated handshake diagram with DCID and SCID.
The SCID is the source connection ID, and it is meant to be chosen by the peer to represent the ID for itself.
The DCID gets "agreed" upon using the below protocol.
Retry and version negotiation packets and packets with short header cannot be coalesced with other packets in the same datagram. This means they are always by themselves.
Notice that in the retry packet, a new SCID `S2` is used. This is derived from `S1` via HMAC signing. The HMAC signing is not part of the QUIC specification; however, it is a security feature to ensure integrity and prevent replay attacks. It is just chosen by the quiche implementation for use in its examples.

The client has to change its DCID in response to this retry packet. But it is possible for the server to change its SCID again in its next initial packet, in which case the client has to change the DCID again. There's no specific reason why this might occur.
Note that at this point in time, the QUIC spec only supports client-side connection migration. It's not expected that the server would have its address change, although this may be supported in the future. Connection migration just means that the client can change its address (possibly because it changed networks, e.g. on a mobile network).

In PK's P2P case, any change in address would affect both the client side and the server side, so it would require performing a new handshake and reconnecting all connections.
┌────────┐ ┌────────┐
│ Client │ │ Server │
└───┬────┘ └────┬───┘
│ ┌───────────────────┐ │
1.├─────────►│Initial ├──────────►│
│ │version: 3132799674│ │
│ │token: [] │ │
│ │dcid: S1 │ │
│ │scid: C1 │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
2.│◄─────────┤Version Negotiate │◄──────────┤
│ │version: 1 │ │
│ │dcid: C1 │ │
│ │scid: S1 │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
│ │Initial │ │
│ │version: 1 │ │
3.├─────────►│token: [] ├──────────►│
│ │dcid: S1 │ │
│ │scid: C1 │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
│ │Retry │ │
4.│◄─────────┤token: [S1] │◄──────────┤
│ │dcid: C1 │ │
│ │scid: S2 │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
│ │Initial │ │
│ │version: 1 │ │
5.├─────────►│token: [S1] ├──────────►│
│ │dcid: S2 │ │
│ │scid: C1 │ │
│ └───────────────────┘ │
│ │
│ ┌───────────────────┐ │
│ │Initial │ │
│ │version: 1 │ │
6.│◄─────────┤token: [] │◄──────────┤
│ │dcid: C1 │ │
│ │scid: S3 │ │
│ └───────────────────┘ │
Ok with respect to the connection map. We are therefore always mapping DCID to the connections.
But due to connection migration, it is possible that multiple connection IDs may point to the same connection, but won't worry about that for the moment.
Now in our implementation, there's a double lookup of both the QUIC packet's DCID and also the derived connection ID. The reason is mainly that, in the middle of processing packets, it's possible that the connection does already exist, but just under the name of the derived connection ID. Apparently this is due to potentially splitting the `ClientHello` across multiple `Initial` QUIC packets.
See: https://github.com/cloudflare/quiche/commit/06c0d497a4e08da31e8d3684a7bcf03cca38448d
Ok so therefore, ultimately it's the client sent packet's DCID (which itself is derived from the server's SCID) which is used to identify the connection (this is called the "server-generated DCID" for identifying the server).
On a client-side perspective, using the received packet's DCID would mean that sometimes this DCID is actually our own client-generated DCID when we first connected to the server. When we create a client-side connection, we create a SCID randomly, and subsequent packets sent in response to this connection has the DCID equal to this.
So our connection map would then map EITHER:
And there would be no overlap... or exceedingly low probability of it occurring.
In terms of demuxing the `handleMessage`: it is also possible to register multiple handlers for the `message` event. However this is not efficient, because every time a message event occurs on the dgram socket, it ends up calling every single handler. If we use `event.stopImmediatePropagation()` we can cancel the handling of subsequent handlers. But again, with 100 QUIC clients, it could still check all 100 clients before hitting the server.
So our demuxing logic instead happens within a single `message` handler that checks a shared connection map to decide what to do, depending on whether a server is registered or not, or whether the connection exists or not.
So now I'm at the point where the `QUICSocket` can call into the `QUICServer` and ask it to handle a new connection, but only if it is registered as the server for the socket. Otherwise it just discards those kinds of packets.
It can also acquire an existing `QUICConnection` from the connection map if the connection's DCID matches one of them.
However some decisions have to be made now.

1. Who puts the connection into the connection map — the `QUICSocket` `handleMessage`, or is that something we delegate to the `QUICClient` or `QUICServer`? I'm still not entirely sure what the `QUICClient` really does besides bootstrap a `QUICConnection`.
2. Within the `QUICConnection`, there doesn't seem to be a way to identify whether it is a client or server connection, and if we identify the client, how to then access the client object too. Not sure if we even need this.

In the case of 1., we could say that neither should; instead, the construction of a `QUICConnection` leads to it being put into the map, and correspondingly the destruction of the connection takes it out of the map. Creation and destruction can be approached with a `CreateDestroy` pattern of asynchronous creation and asynchronous destruction.
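The `CreateDestroy` idea above can be sketched like this; the class shape and method names are assumptions for illustration, not the real API:

```typescript
// Sketch: neither the socket nor the server mutates the connection map
// directly. Creation inserts the connection; destruction removes it.
class Connection {
  protected constructor(
    public readonly connectionId: string,
    protected map: Map<string, Connection>,
  ) {}

  // Asynchronous creation: the connection registers itself into the map.
  public static async createConnection(
    connectionId: string,
    map: Map<string, Connection>,
  ): Promise<Connection> {
    const conn = new Connection(connectionId, map);
    map.set(connectionId, conn);
    return conn;
  }

  // Asynchronous destruction: the connection deregisters itself.
  public async destroy(): Promise<void> {
    this.map.delete(this.connectionId);
  }
}
```

This keeps the map's lifecycle invariant ("in the map iff alive") in one place rather than scattered across the socket, client and server.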
For the `QUICConnection`, I've made the 2 methods `recv` and `send` represent:

- `recv` means bridging `socket -> connection -> streams`.
- `send` means bridging `socket <- connection <- streams`.

This means `QUICConnection.recv` is supposed to be called by the socket, since that's where the data flows from. The `QUICConnection.send` on the other hand is supposed to be called by the stream, since that's where the data flows from.
HOWEVER, `QUICConnection.send` may also have data flowing from other sources, such as the handshake protocol itself, so even the socket's message handler can trigger `QUICConnection.send`.
So the control flow can come from other sources.
This means `send` actually loops over the connection object and flushes all data to the socket.
This also means there's some tight coupling between all these objects, since the socket injects itself into the server/client, which itself injects it into the connection object when it creates it... etc. I've realised that there's no other way to do it; it has to be a somewhat tightly coupled mechanism.
Finally the problem is errors: where do they go? Should they go to the caller? Or in the case where the caller isn't available, or it's all just event driven, it seems errors have to go to a generic error event (that is, a parent object listening for the error event). But in that case it goes against how we normally write things with promise-based methods, where errors flow back up the control flow.
I'm still not entirely sure how to do it, so right now I'm doing both and going to figure out which is the best soon.
This paper https://www.dsi.fceia.unr.edu.ar/downloads/informatica/info_III/eventexcep.pdf talks about exactly the issue I'm dealing with right now: how to manage errors from methods that are basically "event handlers". The composition of these objects is not strictly hierarchical. The event invoker isn't actually any more capable of handling these errors, and especially with a promise API, having the invoker discard errors results in ugly line noise of `await p.catch(() => {});` (and in some cases there should be exceptions, where something was written incorrectly).
Ok, the `QUICConnection` now works against 3 possible events: `send`, `recv` and `timeout`. These 3 events drive all interactions that could occur with a connection. The `send` results in a UDP socket event. The `recv` results in stream events. The `timeout` ends up closing the connection eventually, due to the draining timer, idle timer or path timer.
I think server-side created connections don't have a timeout until the first `recv` call occurs (which happens immediately for newly created server-side connections).
Errors don't go to the caller; there's a no-exception guarantee, and instead they get emitted. Of course this is for "expected exceptions"; anything unexpected will still be thrown up, but those would be considered programmer errors, not runtime errors.
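The no-exception guarantee can be sketched like this, using Node's `EventEmitter`; the class and error names are illustrative, not the actual implementation:

```typescript
import { EventEmitter } from 'node:events';

// Sketch: "expected" runtime errors are emitted rather than thrown, so
// event-driven callers (like the socket's message handler) never need
// `await p.catch(() => {});`. Programmer errors still propagate.
class ExpectedError extends Error {}

class ConnectionLike extends EventEmitter {
  public async recv(data: Uint8Array): Promise<void> {
    try {
      if (data.length === 0) throw new ExpectedError('empty datagram');
      // ... feed data into the underlying connection here ...
    } catch (e) {
      if (e instanceof ExpectedError) {
        this.emit('error', e); // runtime error: emitted, not thrown
        return;
      }
      throw e; // programmer error: thrown up as usual
    }
  }
}
```

This matches the paper's observation: the event invoker can't meaningfully handle these errors, so they flow to whichever parent registered an `error` listener.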
Now we proceed back to testing the `QUICSocket`, `QUICServer`, `QUICConnection` and `QUICStream` before going back to `QUICClient`.
Cool, it is sort of working right now. I can see the timeouts being set, but it's not really cleaning up the connection just yet.
```
INFO:QUICServer:Starting QUICServer on 127.0.0.1:55555
INFO:QUICSocket:Starting QUICSocket on 127.0.0.1:55555
INFO:QUICSocket:Started QUICSocket on 127.0.0.1:55555
INFO:QUICServer:Started QUICServer on 127.0.0.1:55555
DEBUG:QUICServer:QUIC packet version is not supported, performing version negotiation
DEBUG:QUICServer:Send VersionNegotiation packet to 127.0.0.1:34101
DEBUG:QUICServer:Sent VersionNegotiation packet to 127.0.0.1:34101
DEBUG:QUICServer:Send Retry packet to 127.0.0.1:34101
DEBUG:QUICServer:Sent Retry packet to 127.0.0.1:34101
DEBUG:QUICServer:Accepting new connection from QUIC packet
INFO:QUICConnection ad0c415a56fd5c885fd2a35a2fa9f2580932122a:Creating QUICConnection
EVER TIMEOUT null
INFO:QUICConnection ad0c415a56fd5c885fd2a35a2fa9f2580932122a:Created QUICConnection
got the connection QUICConnection {}
EVER TIMEOUT 4999
EVER TIMEOUT 998
EVER TIMEOUT 4997
EVER TIMEOUT 28
EVER TIMEOUT 4999
EVER TIMEOUT 4999
EVER TIMEOUT null
^CINFO:QUICServer:Stopping QUICServer on 127.0.0.1:55555
INFO:QUICConnection ad0c415a56fd5c885fd2a35a2fa9f2580932122a:Destroying QUICConnection
INFO:QUICConnection ad0c415a56fd5c885fd2a35a2fa9f2580932122a:Stopped QUICConnection
INFO:QUICSocket:Stopping QUICSocket on 127.0.0.1:55555
INFO:QUICSocket:Stopped QUICSocket on 127.0.0.1:55555
INFO:QUICServer:Stopped QUICServer on 127.0.0.1:55555
```
Important that this library will only focus on QUIC, and not HTTP3.
We need to investigate how the timers work and ensure that they are actually being timed out properly and then it should proceed to close the connection if there's no response on the other side.
@tegefaulkes regarding 14.

> - [ ] Propagate the `rinfo` from the UDP datagram into the `conn.recv()` so that the streams (either during construction or otherwise) can have its `rinfo` updated. Perhaps we can just "set" the `rinfo` properties of the connection every time we do a `conn.recv()`. Or... we just mutate the `conn` parameters every time we receive a UDP packet.
We need to understand that under QUIC, it's possible for the same connection to have different remote host and remote port.
See: https://www.rfc-editor.org/rfc/rfc9000.html#name-connection-migration
Now I'm actually not entirely clear atm. It's possible that when the client migrates to a new network path, the DCID or SCID changes.
I haven't handled this case yet in the `QUICConnection` class. That is, it would mean maintaining the same `QUICConnection` object instance (because we don't want to lose the `streamMap` state), and transitioning the connection ID.
In any case, this means the `remoteHost` and `remotePort` could change at any point in time when querying the connection.
It's even possible that there could be multiple valid concurrent remote hosts and ports that are going to different streams on the same connection, I haven't confirmed this case yet.
Furthermore you're handling events on the stream itself, and you want to give it an object of remote host/port information at the point of handling that stream.
My current solution is to put the properties `localHost`, `localPort`, `remoteHost` and `remotePort` into `QUICConnection`; however the `remoteHost` and `remotePort` can change at any time, so the information you are passing in the handler is only valid at the specific point in time where you are constructing the object. This may be sufficient for your use case, since you are just using it for logging and nothing else.
So now there is:

- `QUICConnection.localHost`
- `QUICConnection.localPort`
- `QUICConnection.remoteHost`
- `QUICConnection.remotePort`

The `remoteHost` and `remotePort` can change on every `QUICConnection.recv` invocation.
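The snapshot behaviour described above can be sketched like this (illustrative names only; this is not the actual `QUICConnection`):

```typescript
// Sketch: the remote address is mutable state overwritten on every recv,
// so any { remoteHost, remotePort } object handed to a stream handler is
// only a point-in-time snapshot, not a live view.
interface RemoteInfo {
  remoteHost: string;
  remotePort: number;
}

class ConnectionAddress {
  public remoteHost: string;
  public remotePort: number;

  constructor(initial: RemoteInfo) {
    this.remoteHost = initial.remoteHost;
    this.remotePort = initial.remotePort;
  }

  // Called on every UDP datagram; connection migration may change the address.
  public recv(rinfo: RemoteInfo): void {
    this.remoteHost = rinfo.remoteHost;
    this.remotePort = rinfo.remotePort;
  }

  // Snapshot for handlers/logging; valid only at the moment it is taken.
  public snapshot(): RemoteInfo {
    return { remoteHost: this.remoteHost, remotePort: this.remotePort };
  }
}
```

Which is fine for logging, as noted, but anything that needs the current address must read it off the connection at the time of use.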
This still needs to be tested with connection migration. Atm only client connections can migrate, servers cannot migrate according to the QUIC spec.
Since every PK agent is both client and server, this makes it a bit weird.

> Migrating a connection to a new server address mid-connection is not supported by the version of QUIC specified in this document. If a client receives packets from a new server address when the client has not initiated a migration to that address, the client SHOULD discard these packets.
That means if a PK agent were to be migrating to new IP/port due to disconnection, then the servers would have to restart. This would be important on mobile networks, it's something we will think about when we get PK on the mobile phone. We don't actually need to do "live migration" for our PK agent, we can rely on our kademlia system to do this, but indeed all client connections would have to automatically disconnect and retry on the new address upon detecting this on the node graph.
The `config.setMaxIdleTimeout` does control the initial timeout.
So on the server side, upon constructing the `QUICConnection`, the initial call to `conn.timeout()` gives back `null`.
However after the first `conn.recv()`, the next call gives back 6000 ms (actually 5999).
What's a bit confusing is that after calling `conn.send()`, the next call changes this to 1000 ms (actually 998).
The max idle timeout parameter is explained as:
```rust
/// Sets the `max_idle_timeout` transport parameter, in milliseconds.
///
/// The default value is infinite, that is, no timeout is used.
pub fn set_max_idle_timeout(&mut self, v: u64) {
    self.local_transport_params.max_idle_timeout = v;
}
```
This parameter is actually exchanged with the client in order to get the lowest possible max idle timeout.
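Per RFC 9000 §10.1, each endpoint advertises its own `max_idle_timeout`, a value of 0 (the default here) means it imposes no timeout, and the effective idle timeout is the minimum of the non-zero advertised values. A sketch of that negotiation:

```typescript
// Sketch of idle-timeout negotiation (RFC 9000 §10.1): each endpoint
// advertises max_idle_timeout in ms, 0 meaning "no timeout"; the effective
// timeout is the minimum of the non-zero advertised values.
function effectiveIdleTimeout(localMs: number, peerMs: number): number {
  if (localMs === 0) return peerMs; // local side imposes no timeout
  if (peerMs === 0) return localMs; // peer imposes no timeout
  return Math.min(localMs, peerMs);
}
```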
On the second time and third time we call it, we are checking if we are draining. In both cases this is false (probably because the connection isn't closed).
There's a check as to the lowest loss detection timer, and again compared with the idle timer.
This means on the second time it is called, the idle timer must win, and there probably isn't a loss detection timer. And thus we get 5000ms.
Then on the third time, there must be a loss detection timer set, and therefore the 1000ms is lower, and this is returned instead.
The config does not have any way of setting this loss detection timer; it must be set automatically.
```rust
.filter_map(|(_, p)| p.recovery.loss_detection_timer())
```

The loss detection timer is set inside the `Recovery` struct. It appears to be set by a function `Recovery::set_loss_detection_timer`. This feels like it is called by `Connection::send`. Note that the `loss_detection_timer` is an `Option<Instant>`.
I suspect this loss detection timer has something to do with: https://www.rfc-editor.org/rfc/rfc9000.html#name-loss-detection-and-congesti.
This would make sense that the loss detection time may be something that is lower than the idle timeout, since we are more concerned if some packet was lost, and therefore probably needs to be resent.
From ChatGPT:

> The draining timer is used to ensure that all data has been sent on a connection before it is closed. The idle timer is used to close a connection if no data is sent or received for a certain period of time. The loss detection timer is used to detect when packets have been lost in transit and retransmit them if necessary. All 3 timers are used to manage the state of a connection and ensure reliable communication.
It does make sense that we would reset the timer upon additional events.
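That resetting can be sketched as re-arming a single JS timer from `conn.timeout()` after every `recv`/`send`/`onTimeout`; `null` means no timer is needed. The scheduler is injected here only so the logic is testable, and all names are illustrative:

```typescript
// Sketch: one timer per connection, re-armed from conn.timeout() after
// every event. quiche reports whichever timer (idle, loss detection,
// draining) is closest; null means no timer is currently needed.
interface TimeoutSource {
  timeout(): number | null;
}

function rearmTimer(
  conn: TimeoutSource,
  current: { handle: unknown },
  schedule: (ms: number) => unknown,
  cancel: (handle: unknown) => void,
): void {
  if (current.handle != null) {
    cancel(current.handle);
    current.handle = null;
  }
  const ms = conn.timeout();
  if (ms != null) {
    current.handle = schedule(ms);
  }
}
```

In the real class, `schedule`/`cancel` would just be `setTimeout`/`clearTimeout` with the timeout handler calling `conn.onTimeout()` and then `send()`.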
Ok, so when the connection times out (and it does eventually), the handler is in fact called.
Now I put in:
```ts
console.log('draining', this.conn.isDraining());
console.log('closed', this.conn.isClosed());
console.log('timed out', this.conn.isTimedOut());
console.log('established', this.conn.isEstablished());
console.log('in early data', this.conn.isInEarlyData());
console.log('resumed', this.conn.isResumed());
```
Before and after calling `this.conn.onTimeout()`.
I can see something like this:
```
TIMEOUT HANDLER CALLED
draining false
closed false
timed out false
established true
in early data false
resumed false
AFTER ON TIMEOUT
draining false
closed true
timed out true
established true
in early data false
resumed false
```
It is interesting here that upon calling `onTimeout()` the connection is closed, but `draining` remains `false`.
This suggests that once it is closed, `draining` does not remain true, even if the connection had been draining before the timeout occurred and the close happened.
Anyway this leads us to call `this.send()` again, just in case timing out means that we are draining now.
In the `send`, the finally clause kicks in, and here it goes to run `setTimeout` again.
At this point we re-run the `setTimeout` before then checking if the connection is closing or draining.
Now 2 things happen here: the next call to `this.conn.timeout()` is `null`, because the connection is already closed and therefore no further timeout is necessary. However I imagine that if instead the connection was draining, another timeout would be necessary.
The problem was that I was checking that the status is not `destroying` and that the connection was `isClosed() && isDraining()`; that's obviously not possible now that we know what happens in between `onTimeout()`.
So I've changed this to `isClosed() || isDraining()`.
This fixes the problem, and now the connection is in fact destroyed after the timeout.
However I think this revealed another problem. Which is that suppose it was draining instead of being closed.
When we call `destroy`, this section will run:
```ts
if (!this.conn.isClosed()) {
  // The `recv`, `send`, `timeout`, and `on_timeout` should continue to be called
  const { p: closeP, resolveP: resolveCloseP } = utils.promise();
  this.resolveCloseP = resolveCloseP;
  await Promise.all([
    this.send(),
    closeP
  ]);
}
```
Which means it blocks on the `closeP` promise while running `send()` simultaneously.
Now 2 things could happen here.
The original timeout that was set before calling `destroy()` might be running; it might be the draining timer. That draining timer could activate, which then triggers the `onTimeout()`, which would close the connection, which then results in a call to `QUICConnection.send`, which should then resolve the `closeP` and allow the destruction to complete.
Alternatively the `send` runs, the connection would still not be closed, and it ends up flushing the data to the socket.
Now if the data is all completely flushed into the socket, that could mean the connection is in fact closed now.
If it runs `setTimeout()`, this may then clear the timeout because no more timers are needed, and then, because the status is destroying, that ends the `send()` promise, but nothing is resolving the `closeP`.
Which means we now have a blocked promise.
So this is a potential problem.
It seems the solution here is that we need to move the `resolveCloseP` to a different location. One solution is something like this:
```ts
if (
  this[status] !== 'destroying' &&
  (this.conn.isClosed() || this.conn.isDraining())
) {
  await this.destroy();
} else if (
  this[status] === 'destroying' &&
  (this.conn.isClosed() && this.resolveCloseP != null)
) {
  // If we flushed the draining, then this is what will happen
  this.resolveCloseP();
}
```
But I need more testing on flushing situations so we can see how it behaves.
Tagging @tegefaulkes to keep up with this progress.
@tegefaulkes I'm testing the handling of streams atm.
It looks like:
```ts
const writer = stream.writable.getWriter();
for await (const read of stream.readable) {
  console.log(read);
}
await writer.ready;
await writer.write(Buffer.from('Hello World'));
await writer.ready;
writer.releaseLock();
await stream.destroy();
```
Now I have some questions...
It seems that a "writer" is a locked reference to the writable stream. It's a way of maintaining control/ownership of the stream, such that only 1 writer can be the one using the stream.

1. If I call `await writer.close()`, and later inside `stream.destroy` it calls `await this.writable.close()`, there's an error `TypeError [ERR_INVALID_STATE]: Invalid state: WritableStream is closed`. How are we supposed to close a writable stream if there's a writer that still exists? Are we meant to check that if the writer is still locked (that is, with an active writer), it therefore cannot be destroyed?
2. It seems `this.readable.cancel()` then `await this.writable.close()` is the right order of events. But now the issue is: how do you co-ordinate that with reader/writer objects?
3. What is the purpose of `writer.releaseLock()`? Is there actually meant to be multiple potential writers to the same stream?
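On question 1, one possible approach is to consult `writable.locked` before closing: a writer is an exclusive lock on the writable side, and the lock state is observable. A sketch using Node's web streams (`node:stream/web`); `safeClose` is a hypothetical helper, not the actual `QUICStream` behaviour:

```typescript
import { WritableStream } from 'node:stream/web';

// Sketch: refuse to close the writable side while a writer still holds the
// exclusive lock; closing a locked stream would reject anyway.
async function safeClose(writable: WritableStream): Promise<void> {
  if (writable.locked) {
    // A writer still owns the stream; it must close() or releaseLock() first.
    throw new Error('writable is locked by an active writer');
  }
  await writable.close();
}
```

On question 3: `releaseLock()` just gives up the exclusive handle without closing the stream, so there can only ever be one writer at a time, but writers can be acquired and released sequentially.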
Specification
We need QUIC in order to simplify our networking stack in PK.
QUIC is a superior UDP layer that can make use of any UDP socket, and create multiplexed reliable streams. It is also capable of hole punching either just by attempting to send ping frames on the stream, or through the unreliable datagrams.
Our goal is to make use of a QUIC library, something that is compilable to desktop and mobile operating systems, exposing its functionality to JS, while having the JS runtime manage the actual sockets.
On NodeJS, it can already manage the underlying UDP sockets, and by relying on NodeJS, it will also ensure that these sockets will mix well with the concurrency/parallelism used by the rest of the NodeJS system due to libuv and thus avoid creating a second IO system running in parallel.
On Mobile runtimes, they may not have a dgram module readily available. In such cases, having an IO runtime to supply the UDP sockets may be required. But it is likely there are already existing libraries that provide this like https://github.com/tradle/react-native-udp.
The underlying QUIC library is expected to be agnostic to the socket runtime. It will give you the data that you need to put on the UDP socket, and it will take the data that comes off the UDP socket.
However it does have 2 major duties:
Again, if we want to stay cross-platform, we would not want to bind into Node.js's OpenSSL crypto. It would instead require that the library can take a callback of crypto routines to use. However, I've found that this is generally not the case with most existing QUIC libraries. But let's see how we go with this.
Additional context
QUIC and NAPI-RS
Sub issues: #2, #3, #4, #5, #6, #7, #8, #9, #10, #13, #14, #15, #16, #17, #18
Tasks
- [ ] … `Config` so the user can decide this (especially since this is not a HTTP3 library). - 0.5 day ~ - see #13
- [ ] … `null`. Right now when a quiche client connects to the server, even after closing, the server side is keeping the connection alive. - 1 day
- [ ] … `QUICConnection` and `QUICStream` and `QUICServer` and `QUICSocket`. This will allow users to hook into the destruction of the object, and perhaps remove their event listeners. These events must be post-facto events. - 0.5 day
- [ ] … `QUICStream` and change to BYOB style, so that way there can be a byte buffer for it. Testing code should be able to use generator functions similar to our RPC handlers. - 1 day ~ - see #5
- [ ] … `QUICClient` with the shared socket `QUICSocket`. - 3 day
- [ ] … a single `QUICClient` and a single `QUICServer`. - 1 day ~ - See #14
- [ ] Propagate the `rinfo` from the UDP datagram into the `conn.recv()` so that the streams (either during construction or otherwise) can have its `rinfo` updated. Perhaps we can just "set" the `rinfo` properties of the connection every time we do a `conn.recv()`. Or... we just mutate the `conn` parameters every time we receive a UDP packet. ~ - See #16
- [ ] … `stream.connection` they can acquire the remote information and also all the remote peer's certificate chain. ~ - See #16
- [ ] … the `napi` program into the `scripts/prebuild.js` so we can actually build the package in a way similar to other native packages we are using like `js-db` ~ - see #7