Closed Yaoxiao1 closed 2 years ago
If I understand the flamegraph correctly, it's actually rustls that holds on to a large allocation. I think https://github.com/rustls/rustls/issues/794 would be the relevant issue. (You'll also want to read through the relevant PR.)
Thank you for running this detailed analysis! To be clear, is this memory freed when the connections themselves are destroyed? If so, this is not a leak, just a bit wasteful. It should not pose an issue for a long-running server, because a long-running server must limit the number of concurrent connections it handles anyway.
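To illustrate the point about limiting concurrent connections, here is a minimal std-only sketch of a connection cap. `ConnectionLimiter`, `ConnectionSlot`, and `worst_case_bytes` are hypothetical names for this example, not quinn or rustls APIs, and the per-connection byte count is an input you would measure yourself. The idea is that if every live connection retains a bounded amount of memory and the number of live connections is capped, total retained memory is bounded too.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical cap on concurrent connections (not a quinn type).
#[derive(Clone)]
struct ConnectionLimiter {
    active: Arc<AtomicUsize>,
    max: usize,
}

/// RAII guard: the slot is released when the guard is dropped,
/// i.e. when the connection it belongs to goes away.
struct ConnectionSlot {
    active: Arc<AtomicUsize>,
}

impl ConnectionLimiter {
    fn new(max: usize) -> Self {
        Self { active: Arc::new(AtomicUsize::new(0)), max }
    }

    /// Try to reserve a slot; returns `None` when the cap is reached.
    fn try_acquire(&self) -> Option<ConnectionSlot> {
        let mut current = self.active.load(Ordering::Relaxed);
        loop {
            if current >= self.max {
                return None;
            }
            match self.active.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(ConnectionSlot { active: Arc::clone(&self.active) }),
                // Another thread changed the counter; retry with the observed value.
                Err(observed) => current = observed,
            }
        }
    }

    /// Number of slots currently held.
    fn active(&self) -> usize {
        self.active.load(Ordering::Relaxed)
    }

    /// Upper bound on retained memory, assuming each live connection
    /// keeps at most `per_conn_bytes` (a figure to be measured, not known here).
    fn worst_case_bytes(&self, per_conn_bytes: usize) -> usize {
        self.max * per_conn_bytes
    }
}

impl Drop for ConnectionSlot {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::AcqRel);
    }
}
```

A server would call `try_acquire` before accepting a connection, refuse the connection on `None`, and keep the returned `ConnectionSlot` alive for as long as the connection exists.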
FWIW, most of the rustls state shouldn't be required at all after the connection was initially established. Maybe there's some potential to compact or drop things.
For the question itself, it might also be useful to look into configuring suitable idle and keepalive timeouts to make sure connection state is not kept around forever. If one side runs a keepalive mechanism with an interval shorter than the idle timeout of both sides, then connection state will stay around pretty much forever in the absence of networking issues.
Yeah, an appropriate idle timeout is highly recommended for applications where connections may be idle for long periods.
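The interaction between keepalive and idle timeout described above can be modelled with a small std-only sketch. The function names are hypothetical; in quinn the corresponding knobs are `TransportConfig::max_idle_timeout` and `TransportConfig::keep_alive_interval`. The model assumes the effective idle timeout is the smaller of the two sides' values, as the QUIC spec describes, and ignores packet loss and timing races.

```rust
use std::time::Duration;

/// Effective idle timeout of a connection: the smaller of the two sides'
/// values. `None` means that side does not set an idle timeout.
fn effective_idle_timeout(
    local: Option<Duration>,
    remote: Option<Duration>,
) -> Option<Duration> {
    match (local, remote) {
        (Some(a), Some(b)) => Some(a.min(b)),
        (Some(a), None) | (None, Some(a)) => Some(a),
        (None, None) => None,
    }
}

/// Simplified model: returns true if a connection that carries no
/// application traffic will eventually be closed by the idle timeout.
fn idle_connection_expires(
    local_idle: Option<Duration>,
    remote_idle: Option<Duration>,
    keep_alive: Option<Duration>,
) -> bool {
    match effective_idle_timeout(local_idle, remote_idle) {
        // Neither side has an idle timeout: state is kept indefinitely.
        None => false,
        Some(timeout) => match keep_alive {
            // A keepalive that fires before the timeout keeps resetting it.
            Some(interval) => interval >= timeout,
            None => true,
        },
    }
}
```

With idle timeouts of 30 s and 60 s, a 10 s keepalive keeps the connection (and its state) alive indefinitely, while a 45 s keepalive or no keepalive at all lets it expire.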
> Maybe there's some potential to compact or drop things.
Maybe we should tweak rustls to drop buffers after the handshake is complete when in QUIC mode?
rustls already supports dropping the Connection and only retaining the required key state.
CRYPTO frames may be delivered on an established connection, so that might be too big a hammer.
> Once the handshake completes, if an endpoint is unable to buffer all data in a CRYPTO frame, it MAY discard that CRYPTO frame and all CRYPTO frames received in the future
So we can actually just drop data at that point; that might be a good next step, then.
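As a rough sketch of what "just drop data at that point" could look like, here is a std-only model of a CRYPTO buffer that starts discarding once the handshake is complete and a frame no longer fits. `CryptoBuffer` and `PushOutcome` are hypothetical names, not rustls or quinn code. Two behaviours are assumptions of this sketch: overflow before the handshake completes is reported as an error, and the existing allocation is released when discarding starts.

```rust
/// Result of offering a CRYPTO frame payload to the buffer.
#[derive(Debug, PartialEq)]
enum PushOutcome {
    Buffered,
    Discarded,
    /// Overflow during the handshake (assumed here to be fatal for the connection).
    BufferExceeded,
}

/// Hypothetical buffer for CRYPTO frame payloads.
struct CryptoBuffer {
    data: Vec<u8>,
    capacity: usize,
    handshake_complete: bool,
    discarding: bool,
}

impl CryptoBuffer {
    fn new(capacity: usize) -> Self {
        Self { data: Vec::new(), capacity, handshake_complete: false, discarding: false }
    }

    fn complete_handshake(&mut self) {
        self.handshake_complete = true;
    }

    fn buffered_len(&self) -> usize {
        self.data.len()
    }

    fn push(&mut self, payload: &[u8]) -> PushOutcome {
        if self.discarding {
            // Once one frame was discarded, all future frames are discarded too.
            return PushOutcome::Discarded;
        }
        if self.data.len() + payload.len() <= self.capacity {
            self.data.extend_from_slice(payload);
            PushOutcome::Buffered
        } else if self.handshake_complete {
            self.discarding = true;
            // Assumption of this sketch: free the allocation instead of keeping it.
            self.data = Vec::new();
            PushOutcome::Discarded
        } else {
            PushOutcome::BufferExceeded
        }
    }
}
```

The `discarding` flag is what makes the decision sticky: after the first discarded frame, later frames are dropped without touching the buffer.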
Closing as there's no evidence of an actual leak, nor further engagement regarding optimizing memory use in general. Feel free to open a new issue for either if you want to pursue that optimization.
My program uses a relay server to transfer messages from peer to peer. Each time a connection is created, the server takes more memory than before, and since the server may run for a long period, it will eventually run out of memory. I wrote a test program and used a memory profiler called bytehound to check for leaked memory. The test program creates 100 pairs of connections under tokio::spawn; let's call each one a P1-P2 pair. In each pair, P1 first sends a 20 KB message to P2, P2 sends another 20 KB message back, and this repeats for several seconds. According to the memory profiler, memory usage grows quickly while connections are being created. After the message transfer most of the memory is freed, but a small part remains unfreed. Please see the picture below: overall about 1 MB of memory was leaked, and Connection accounts for 493 KB of it. As more connections are created, the portion of the leak caused by Connection gets higher and higher. Below is a flamegraph of the leaked memory. It looks like Connection allocates some reserved space for messages and does not free it, but I'm not sure. Can you help explain? Thanks.