crossbario / crossbar

Crossbar.io - WAMP application router
https://crossbar.io/

rlinks: multiplexing based on QUIC ? #1898

Open om26er opened 2 years ago

om26er commented 2 years ago

Currently, when running a Crossbar mesh, all the realms are connected to their "counterparts" in each node. With dynamic realms that means realms_count * nodes_count connections. That doesn't scale well.
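To put rough numbers on that, here is a back-of-the-envelope sketch. It takes the realms_count * nodes_count figure above at face value and assumes (my assumption, not something measured) that a multiplexed transport would need only one connection per node; the function name is made up for illustration.

```python
def rlink_transports(realms_count: int, nodes_count: int, multiplexed: bool = False) -> int:
    """Rough number of transport connections needed for rlinks.

    Assumption: without multiplexing every realm needs its own connection
    per node; with multiplexing all realms share one connection per node.
    """
    if realms_count < 0 or nodes_count < 0:
        raise ValueError("counts must be non-negative")
    per_node = 1 if multiplexed else realms_count
    return per_node * nodes_count


if __name__ == "__main__":
    for realms in (10, 100, 1000):
        plain = rlink_transports(realms, nodes_count=5)
        muxed = rlink_transports(realms, nodes_count=5, multiplexed=True)
        print(f"{realms} realms on 5 nodes: {plain} connections vs {muxed} multiplexed")
```

The point is only that the non-multiplexed count grows with the number of realms, while the multiplexed count does not.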

QUIC, which doesn't have the head-of-line blocking issue and is multiplexed, would serve "best" in this case. So basically doing RawSocket over QUIC (as opposed to TCP).

The current de facto implementation of QUIC in Python could be used for that: https://github.com/aiortc/aioquic. Even though the project name says AIO, the good thing is that it was written in a "sansIO" architecture, which makes adapting it to Twisted relatively simple.

oberstet commented 2 years ago

yeah, I more or less agree with above comments!

however, before jumping into details, I think we should discuss the top-level design options a bit first.

namely because the issue here conflates 2 things:

  1. multiplexing WAMP sessions over 1 transport
  2. having QUIC as a new bytes transport alternative for websocket, TCP, ...

the "head of line" blocking is related to "backpressure control", and yet a 3rd thing ..

now, QUIC has "streams".

a QUIC stream could hence serve instead of a single plain TCP connection.

both could run "websocket" or "rawsocket" WAMP framing, with any supported serialization

in this model, a QUIC stream would be mapped to a new Autobahn connection (an AB tx protocol instance).

the whole QUIC connection would be sth outside/top of an AB connection
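To illustrate why "rawsocket" framing does not care whether the bytes come from a TCP connection or a QUIC stream, here is a minimal sketch of the RawSocket message framing (one octet whose low 3 bits carry the frame type, followed by a 24-bit big-endian length). It deliberately omits the opening handshake and the negotiated maximum message length; `encode_frame` and `FrameDecoder` are illustrative names, not Autobahn APIs.

```python
# Frame types carried in the low 3 bits of the first header octet.
MSG_WAMP, MSG_PING, MSG_PONG = 0, 1, 2


def encode_frame(payload: bytes, msg_type: int = MSG_WAMP) -> bytes:
    """Prefix a serialized message with the 4-octet RawSocket frame header."""
    if not 0 <= msg_type <= 7:
        raise ValueError("frame type must fit in 3 bits")
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit a 24-bit length")
    return bytes([msg_type]) + len(payload).to_bytes(3, "big") + payload


class FrameDecoder:
    """Incremental decoder: feed it arbitrary chunks of a reliable byte stream."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Buffer data and return all complete (type, payload) frames."""
        self._buf.extend(data)
        frames = []
        while len(self._buf) >= 4:
            msg_type = self._buf[0] & 0x07
            length = int.from_bytes(self._buf[1:4], "big")
            if len(self._buf) < 4 + length:
                break  # wait for the rest of this frame
            frames.append((msg_type, bytes(self._buf[4:4 + length])))
            del self._buf[:4 + length]
        return frames
```

Because the decoder only needs ordered, reliable bytes, one instance per QUIC stream would behave the same as one instance per TCP connection.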


an alternative would be:


also see the discussion here: https://github.com/wamp-proto/wamp-proto/issues/363


aioquic : yes, this looks very good. I also like "Both the QUIC and the HTTP/3 APIs follow the "bring your own I/O" pattern, leaving actual I/O operations to the API user. "

because: we need this to integrate with Twisted! all of CB runs on tx.

oberstet commented 2 years ago

CB can dynamically start rlinks. Consequently, being able to dynamically start a new WAMP session in a new QUIC stream over an already established WAMP-QUIC connection to a remote node would be great of course! also tricky;) in the sense, bits to be changed deep inside and in multiple places.

However, what I'm sure of is: such a project would definitely be worth the effort. It fits nicely into WAMP and CB with pretty much zero user-level ripples, and opens up WAMP to QUIC - which is the future ... it might become the new TCP actually:

oberstet commented 2 years ago

in any case, I am very much ++1 on this=)

couple of more notes:

om26er commented 2 years ago

an alternative would be:

I'll be honest, I don't fully understand the two options. However, the first option seems relatively straightforward compared to the second one, which would require changing the WAMP protocol. From a WAMP perspective, it does make sense to make it future-proof and adapt it for QUIC. Would that be WAMP 2.5 though, given that it would be a somewhat fundamental change in how a WAMP session is started?

For the first approach, IMO a tangible proof-of-concept would be to just run WAMP-over-RawSocket-over-QUIC, without multiplexing. Of course that doesn't help much, but it could help get a "feel" for running WAMP over QUIC. I can probably play with that during the weekend, just for kicks.

om26er commented 2 years ago

I agree, writing on the wamp-proto issue would probably also involve feedback from the other router and client implementers.

  • but since there are multiple approaches and details, these should be set out (defined/nailed) before starting with code.

Yeah, makes sense :+1:

om26er commented 2 years ago

sidenote: Having a multiplexed transport could also mean starting more than 65535 WAMP connections from a single IP address. That can be useful for benchmarks etc.
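For context on why the port limit would stop mattering: QUIC stream IDs are 62-bit integers whose two low bits encode the initiator and the direction, so the ID space is far larger than the 16-bit port space. A small sketch of that numbering (the helper name is made up; how many streams can be open at once is still capped by the peer's flow-control limits, which this does not model):

```python
def quic_stream_id(n: int, server_initiated: bool = False, unidirectional: bool = False) -> int:
    """Return the ID of the n-th stream (0-based) of a given stream type.

    Bit 0 of a QUIC stream ID marks the initiator (0 = client, 1 = server),
    bit 1 marks the direction (0 = bidirectional, 1 = unidirectional).
    """
    if not 0 <= n < 1 << 60:
        raise ValueError("stream sequence number out of range")
    return (n << 2) | (int(unidirectional) << 1) | int(server_initiated)


if __name__ == "__main__":
    # The 70000th client-initiated bidirectional stream: beyond what 16-bit
    # source ports allow towards a single destination address and port.
    print(quic_stream_id(70_000))
```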

oberstet commented 2 years ago

For the first approach ...

ok, cool!! so this approach is roughly outlined here:

https://github.com/wamp-proto/wamp-proto/issues/363#issuecomment-582812770

=> mapping WAMP transports to QUIC streams - not QUIC connections. keep 1 WAMP session per WAMP transport. multiplexing works at the WAMP transport / QUIC streams level.

I'll be honest, I don't fully understand the two options

no worries. the advantage of the alternative option would be: multiplexing of multiple WAMP sessions over any WAMP transport (eg a single RawSocket connection).

in any case, yes, this requires deeper changes:

om26er commented 2 years ago

CB can dynamically start rlinks. Consequently, being able to dynamically start a new WAMP session in a new QUIC stream over an already established WAMP-QUIC connection to a remote node would be great of course! also tricky;) in the sense, bits to be changed deep inside and in multiple places.

I agree, this is a very important thing to have. The real power lies in rlinks + dynamic realms.

Separately, I am curious what the API on the autobahn-python side would look like. There are two flows that come to mind.

  1. The QUIC connection is made separately and that "connection" object is passed to each new session
  2. Autobahn internally keeps a reference to the transport and a new WAMP session provides the "transport_id" that it wants to attach to.

Option 1 is similar to how the Client class in autobahn-java is implemented i.e. it can be initialized by a list of ITransports https://github.com/crossbario/autobahn-java/blob/1dba43fbeea92dff70be423f7a067c4e8ccce5c3/autobahn/src/main/java/io/crossbar/autobahn/wamp/Client.java#L87 and then new WAMP sessions can be "attached" https://github.com/crossbario/autobahn-java/blob/1dba43fbeea92dff70be423f7a067c4e8ccce5c3/autobahn/src/main/java/io/crossbar/autobahn/wamp/Client.java#L109
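To make the two flows a bit more concrete, here is a purely hypothetical sketch. None of these classes exist in autobahn-python; `SharedQuicConnection`, `Session`, `TransportRegistry` and the `quic://` URL are placeholders, and "opening a stream" is reduced to handing out an index.

```python
class Session:
    """Placeholder for a WAMP session bound to one realm."""

    def __init__(self, realm: str) -> None:
        self.realm = realm
        self.stream_index = None  # set once attached to a connection


class SharedQuicConnection:
    """Placeholder for one established QUIC connection to a remote node."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.sessions = []

    def attach(self, session: Session) -> Session:
        # A real implementation would open a new QUIC stream here.
        session.stream_index = len(self.sessions)
        self.sessions.append(session)
        return session


class TransportRegistry:
    """Flow 2: the library keeps the connections, callers only pass an ID."""

    def __init__(self) -> None:
        self._transports = {}

    def register(self, transport_id: str, connection: SharedQuicConnection) -> None:
        self._transports[transport_id] = connection

    def attach(self, transport_id: str, session: Session) -> Session:
        return self._transports[transport_id].attach(session)


if __name__ == "__main__":
    conn = SharedQuicConnection("quic://node2.example:4433")
    conn.attach(Session("realm1"))              # flow 1: pass the connection object

    registry = TransportRegistry()
    registry.register("node2", conn)
    registry.attach("node2", Session("realm2"))  # flow 2: pass a transport ID
    print([s.realm for s in conn.sessions])
```

In both flows the sessions end up on the same underlying connection; the difference is only who holds the reference to it.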

oberstet commented 2 years ago

so in general, my thinking would be more along option 1:

I've never tried that kind of nesting, but it sounds plausible .. @meejah any opinions?

in any case, the 2nd IProtocol instance that wraps a QUIC stream provides a transparent bidirectional reliable byte stream (which is what Twisted's IProtocol abstracts), and on top of that, can run WebSocket or RawSocket framed WAMP messages.

a "transport ID" is implicit in that 2nd instance, and the containing QUIC connection is the dual 1st IProtocol/IFactory

oberstet commented 2 years ago

just writing down a possible class structure (which maybe can be simplified for "client"/"rawsocket" to reduce number of classes):

an instance of QuicServerFactory would live in the router worker as the top-level object for a WAMP listening transport in the worker

om26er commented 2 years ago

just writing down a possible class structure (which maybe can be simplified for "client"/"rawsocket" to reduce number of classes):

  • class QuicServerFactory : accepts incoming QUIC connections, producing instances of QuicServerConnection for accepted connections
  • class QuicServerConnection : represent the server side of a single, connected QUIC connection, and holds an instance of:
  • class QuicServerStreamFactory : produces instances of QuicServerStreamProtocol per QUIC stream opened on above connection
  • class QuicServerStreamProtocol : represent the server side of a single, connected QUIC stream - a transparent bidirectional reliable byte stream
  • class QuicWebSocketServerFactory(WebSocketServerFactory, QuicServerStreamFactory) : adds WebSocket message framing on top of the byte stream provided in a QUIC stream

an instance of QuicServerFactory would live in the router worker as the top-level object for a WAMP listening transport in the worker
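As a rough skeleton of how the first four classes quoted above could nest, here is a plain-Python sketch without Twisted or aioquic. The class names come from the list; the method names (`buildConnection`, `buildProtocol`, `streamDataReceived`, `dataReceived`) are guesses in Twisted style, and QuicWebSocketServerFactory is left out because it depends on Autobahn.

```python
class QuicServerStreamProtocol:
    """Server side of one QUIC stream: a transparent reliable byte stream."""

    def __init__(self, stream_id: int) -> None:
        self.stream_id = stream_id
        self.buffer = bytearray()

    def dataReceived(self, data: bytes) -> None:
        # WebSocket or RawSocket framing would be layered on top of this.
        self.buffer.extend(data)


class QuicServerStreamFactory:
    """Produces one QuicServerStreamProtocol per stream opened by the peer."""

    protocol = QuicServerStreamProtocol

    def buildProtocol(self, stream_id: int) -> QuicServerStreamProtocol:
        return self.protocol(stream_id)


class QuicServerConnection:
    """Server side of one QUIC connection; routes stream data to protocols."""

    def __init__(self, peer, stream_factory: QuicServerStreamFactory) -> None:
        self.peer = peer
        self.stream_factory = stream_factory
        self.streams = {}

    def streamDataReceived(self, stream_id: int, data: bytes) -> QuicServerStreamProtocol:
        proto = self.streams.get(stream_id)
        if proto is None:
            # First data on an unknown stream: create its protocol instance.
            proto = self.streams[stream_id] = self.stream_factory.buildProtocol(stream_id)
        proto.dataReceived(data)
        return proto


class QuicServerFactory:
    """Top-level object: one QuicServerConnection per accepted connection."""

    def __init__(self, stream_factory: QuicServerStreamFactory = None) -> None:
        self.stream_factory = stream_factory or QuicServerStreamFactory()
        self.connections = []

    def buildConnection(self, peer) -> QuicServerConnection:
        conn = QuicServerConnection(peer, self.stream_factory)
        self.connections.append(conn)
        return conn
```

In a real integration the QUIC events coming out of the sans-IO library would drive `streamDataReceived`; that wiring is not shown here.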

I like the proposal, I did start working on that a while ago but got side tracked.

One issue I expect we'll face is performance when using QUIC. Since the transport is encrypted by default, it's definitely going to be slower (read: CPU intensive comparatively) than WAMP-Over-RawSocket-Over-TCP that we use today for rlinks. We'll know the real numbers when we have something running, of course.

oberstet commented 2 years ago

it's definitely going to be slower (read: CPU intensive comparatively) than WAMP-Over-RawSocket-Over-TCP that we use today for rlinks.

yes, true. I think using UDS for host-local rlinks, even if non-multiplexed, which results in "many" UDS connections, will be very very hard to beat. essentially, only with a true 0-copy IPC thing ..

for non-local rlinks running over WAN the connection should use TLS anyways .. hence no added cycles compared to quic

so the only case remaining is data-center local rlinks .. eg where all hosts connected by such rlinks reside within a separate, firewalled layer2 private LAN

further, I am assuming NICs are used without HW crypto offloads.

not sure .. is there a mellanox, chelsio, netronome, etc accelerated NIC with HW crypto for Quic, eg via openssl?

oberstet commented 2 years ago

Up to 50Gb/s of SSL, IPsec and TLS/KTLS inline offload and acceleration

https://www.netronome.com/media/documents/PB_Agilio-CX-OCP_50G-7-20.pdf

question is, is that usable from quic? it doesn't say "DTLS support" ...

om26er commented 2 years ago
  • add multiplexing to WAMP itself (I had a PR with code somewhere .. definitely on GH .. but I forgot where. could be AB, CB or CFX. likely a now closed PR)

Quite likely this https://github.com/crossbario/autobahn-python/pull/989

oberstet commented 2 years ago

yes, indeed! you found it=) cool.

so from a quick look, the code is unfinished, but the design can be seen already:

  1. it introduces new WAMP transport variants "muxed" which can be combined with serializer and batched/non-batched

"json.muxed.batched" "json.batched" "json.muxed" "json"

  2. it uses a WAMP connection that is established as normally, and then allows the WAMP (base) session to open new WAMP muxed sessions via open_mux_transport

  3. muxed messages are transported as [MUX_MESSAGE_TYPE, self._mux_session_id, self.marshal()]
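A minimal sketch of that envelope with the JSON serializer, just to show the shape of the data on the wire. The numeric value of MUX_MESSAGE_TYPE below is a made-up placeholder (the real value would have to come from the PR or the spec), and `wrap_muxed` / `unwrap_muxed` are illustrative names, not functions from the PR.

```python
import json

# Hypothetical placeholder; chosen only to stay clear of the regular
# WAMP message type codes.
MUX_MESSAGE_TYPE = 1000


def wrap_muxed(mux_session_id: int, wamp_message: list) -> bytes:
    """Wrap an already marshalled WAMP message for one muxed session."""
    envelope = [MUX_MESSAGE_TYPE, mux_session_id, wamp_message]
    return json.dumps(envelope, separators=(",", ":")).encode("utf8")


def unwrap_muxed(data: bytes) -> tuple:
    """Return (mux_session_id, wamp_message) from a muxed envelope."""
    msg_type, mux_session_id, wamp_message = json.loads(data.decode("utf8"))
    if msg_type != MUX_MESSAGE_TYPE:
        raise ValueError("not a muxed message")
    return mux_session_id, wamp_message


if __name__ == "__main__":
    hello = [1, "realm1", {"roles": {}}]  # a marshalled HELLO message
    print(wrap_muxed(7, hello))
```

On the receiving side, the mux session ID would select which muxed session gets the inner message.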