Open ryleung-solana opened 1 week ago
@Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests. We merely accept all incoming connections and then immediately drop some Connecting future based on our own logic.
@Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests.
Why not? I think we had some Solana people look at that as an option for this exact reason.
@Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests.
Why not? I think we had some Solana people look at that as an option for this exact reason.
Mostly because we currently only do some basic rate-limiting and nothing more. It's probably a good idea to look at using reject when over the rate limit instead of accepting and immediately dropping :sweat: but the thing is that we currently don't really have a great idea of what the limit should be, since we don't have insight into the resource usage from handshakes, especially crypto (but we think it's significant).
Would you guys have any ideas of how we might be able to get some better numbers on e.g. the mem usage from crypto in accept() before the future is returned?
Can you elaborate on "mem usage from crypto"? I don't expect cryptographic operations to consume significant memory, or really anything but O(input). Usually the concern with handshakes is CPU usage, which could in principle be measured with some sort of high resolution timer around accept
and, most importantly, the poll
calls on the internal connection driver future.
Can you elaborate on "mem usage from crypto"? I don't expect cryptographic operations to consume significant memory, or really anything but O(input). Usually the concern with handshakes is CPU usage, which could in principle be measured with some sort of high resolution timer around
accept
and, most importantly, thepoll
calls on the internal connection driver future.
I think that's (CPU usage) also something to look at as well. We're looking specifically at x509 verification, as it happens before we get the future returned to us when we accept() making it a prime target for rate limiting given some data. It would also allow us to determine once and for all, whether crypto does indeed take a significant amount of memory under attack (e.g. getting spammed with lots of handshake attempts, or some variation of this), which we suspect from a recent security advisory, but don't know for sure.
In general I think a quinn_proto::Connection
takes a decent amount of memory, but only a small part of that is due to the crypto involved. To the extent that you can, it's probably better to reject connections based on the Incoming
data.
We're looking specifically at x509 verification, as it happens before we get the future returned to us when we accept() making it a prime target for rate limiting given some data.
This is no longer the case as of 0.11. No CPU-intensive work is performed until you accept a connection.
In Anza, we need to get more insight into the resources being used by quinn in servicing quic handshakes. This PR seeks to add an initial interface to the Endpoint struct to allow for getting more detailed stats than just the open connections, and some initial data relevant to the handshakes. On a side note to the Quinn devs, if there is a good way to support getting the current number of open handshakes, we would be interested in know it. Currently, it seems that the Endpoint is not signaled when a Connecting future finishes, and to add such an event would require some refactoring, not to mention handling an extra event, locking, etc.