Open · Demi-Marie opened this issue 4 years ago
I assume you have some data that has shown this is a bottleneck for your application? Can you share some such data? I'd like to better understand what the scenario is.
I don't know whether it is for my application, but I know it is for Google: I read this design in Google presentation slides from 2017. Cloudflare making the same decision in quiche is consistent with this. It also avoids needing to copy packets on ingress, even if the buffer is owned by the event loop.
What does matter for libp2p is preventing a deadlock without requiring unbounded channels: #637.
This is definitely an interesting case, particularly when combined with support for user-generated connection IDs and a NIC or load balancer that can route packets to different queues or sockets intelligently. I think accepting packets directly at the connection with a fallback path for unrecognizable packets (e.g. stateless resets) could make sense, and I'd be happy to review concrete proposals, and ultimately patches, towards that. Because it's a bit niche, I'm unlikely to personally prioritize it above other issues like conformance and performance.
User-generated connection IDs might also reduce the amount of message passing strictly necessary for an endpoint to function, which would be nice for cosmetic reasons as well.
We should provide utility functions for generating and parsing them, though.
They're totally opaque blobs; I'm not sure what we could provide, beyond an out-of-the-box implementation that uses random IDs like the current hardwired one.
By parsing, I mean parsing the connection ID from an incoming UDP packet, so that the user’s code knows where to send it.
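For concreteness, here is a rough sketch of what such a parsing helper could look like (not quinn's actual API, and assuming the endpoint issues fixed-length connection IDs):

```rust
/// Illustrative only: extract the destination connection ID from a QUIC packet
/// so the caller knows which connection (or thread) should receive it.
/// `local_cid_len` is the fixed length this endpoint uses for the CIDs it issues.
fn destination_cid(packet: &[u8], local_cid_len: usize) -> Option<&[u8]> {
    let first = *packet.first()?;
    if first & 0x80 != 0 {
        // Long header: 1 flags byte + 4 version bytes, then a length-prefixed DCID.
        let len = *packet.get(5)? as usize;
        packet.get(6..6 + len)
    } else {
        // Short header: the DCID follows the first byte and has the length
        // this endpoint chose when it issued the CID.
        packet.get(1..1 + local_cid_len)
    }
}
```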
Oh, yes, that's a good call. Though I expect nearly anyone who actually needs to deal with such an interface rather than using the basic out-of-the-box stuff will in practice be delegating that responsibility to specialized hardware that must be programmed through other means.
Not necessarily! Linux’s SO_REUSEPORT does steering based on the 4-tuple, and that is accurate the vast majority of the time. The few packets for which it is wrong can be sent to the thread that can process them.
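As an illustration (assuming the socket2 crate with its `all` feature; none of these names are quinn's), each worker thread could bind its own SO_REUSEPORT socket like this:

```rust
use socket2::{Domain, Protocol, Socket, Type};
use std::net::{SocketAddr, UdpSocket};

/// Sketch: every worker binds its own UDP socket with SO_REUSEPORT set,
/// so the kernel steers incoming packets across the sockets by 4-tuple hash.
fn bind_reuseport(addr: SocketAddr) -> std::io::Result<UdpSocket> {
    let socket = Socket::new(Domain::for_address(addr), Type::DGRAM, Some(Protocol::UDP))?;
    socket.set_reuse_port(true)?; // Linux-specific steering across all sockets on this port
    socket.bind(&addr.into())?;
    Ok(socket.into())
}
```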
It seems to me that the simple path for a short-header packet through the endpoint should hardly be a bottleneck. In the common case I also don't think it requires a mutable Endpoint reference. Maybe we should split the error-handling path that requires mutability from the simple path, so that the Endpoint can be wrapped in an RwLock or some such and more easily shared across threads?
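Something like the following sketch is what I have in mind (purely illustrative types, not quinn's internals): the lookup on the fast path only takes a read lock, and only the slow path needs the write lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Hypothetical stand-ins for quinn's types, for illustration only.
type ConnectionId = Vec<u8>;
struct ConnectionHandle;

struct SharedEndpoint {
    connections: RwLock<HashMap<ConnectionId, Arc<ConnectionHandle>>>,
}

impl SharedEndpoint {
    /// Fast path: a short-header packet for a known CID only needs a read lock.
    fn route(&self, dcid: &[u8]) -> Option<Arc<ConnectionHandle>> {
        self.connections.read().unwrap().get(dcid).cloned()
    }

    /// Slow path (new connections, stateless resets, ...) takes the write lock.
    fn insert(&self, dcid: ConnectionId, conn: Arc<ConnectionHandle>) {
        self.connections.write().unwrap().insert(dcid, conn);
    }
}
```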
I would like to see the common case not require an Endpoint reference at all, so that it can be entirely lock-free. Even an RwLock can be a bottleneck.
How are you going to route datagrams to connections, then? Seems like you'll always need some kind of table of 4-tuples or connection IDs, which might as well be called Endpoint.
What I would like is to be able to shard an Endpoint across multiple tasks/threads and avoid locking. This is easy provided that different Endpoints assign disjoint connection IDs. I would also like to be able to deal with #637.
Specifically, the fast path should not require any synchronization at all. Sharding across cores, and relying on SO_REUSEPORT for traffic steering, is one way to do this. A lock-free mechanism for reading from the table, such as RCU, is another.
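One sketch of keeping the IDs disjoint (illustrative only, assuming the rand crate): reserve the first CID byte for the shard index, so any thread can immediately tell which shard owns a stray packet.

```rust
use rand::RngCore;

const CID_LEN: usize = 8; // example length; not quinn's choice

/// Issue a connection ID whose first byte is the owning shard's index,
/// making the ID spaces of different shards disjoint by construction.
fn new_cid(shard: u8, rng: &mut impl RngCore) -> [u8; CID_LEN] {
    let mut cid = [0u8; CID_LEN];
    rng.fill_bytes(&mut cid);
    cid[0] = shard;
    cid
}

/// Recover the owning shard from a destination CID.
fn owning_shard(dcid: &[u8]) -> Option<u8> {
    dcid.first().copied()
}
```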
To clarify: SO_REUSEPORT enables load balancing in the kernel, so each thread holding its own socket receives packets pre-filtered by source address/port. Most of the time, all packets belonging to a given connection will therefore arrive at the same thread, except for roaming clients.
To be clear, I think this would be great to have, it's just a good way down the list.
This is also needed for per-connection backpressure.
libp2p-quic really could use this, so that it can implement per-connection backpressure.
I'd be happy to provide guidance to help you draft a design proposal and ultimately a PR. Hop in the chat if you'd like to discuss.
If users can dispatch incoming packets themselves, they can avoid all traffic having to go through Endpoint, and potentially obtain a significant speedup. In particular, they can shard the entire state machine across threads, and thus avoid any sort of synchronization in the fast path. NAT rebinding and migration can cause a packet to be delivered to the wrong thread, but such packets are rare, and can be sent to the correct one for processing.