quinn-rs / quinn

Async-friendly QUIC implementation in Rust
Apache License 2.0
3.85k stars 394 forks source link

memory leakage #1348

Closed Yaoxiao1 closed 2 years ago

Yaoxiao1 commented 2 years ago

My program is using a relay server to transfer message from peer to peer, each time when a connection was created, the server would take more memory than before, as the server might running for a long period, the memory will finally run out. I wrote a test program and used a memory profiler called bytehound to check the leaked memory , the test program created 100 pairs of connecntions under tokio::spawn, let call it P1-P2 pair, in each connection, P1 will send a 20KB message to P2 first, and P2 will send another 20KB message back, and repeat for serveral seconds. according to the memory profiler, the memory usage will grow fast when creating connections, after the message-transfer, most of the memory will be freed, but there is still small part of them remain unfreed. please see below picture, 2022-04-25 14-44-56 的屏幕截图 2022-04-25 14-45-13 的屏幕截图 overall it created about 1MB memory leakage, and the Connection took 493KB of them, and when creating more connections, the protion of leakage caused by Connection will be higher and higher, below is a flamegraph of leaked memory 2022-04-25 14-53-25 的屏幕截图 it looks like the Connection allocate some reserved space for message and didnot free them, I'm not sure, can you help explain? Thanks.

djc commented 2 years ago

If I understand the flamegraph correctly, it's actually rustls that holds on to a large allocation. I think https://github.com/rustls/rustls/issues/794 would be the relevant issue. (You'll also want to read through the relevant PR.)

Ralith commented 2 years ago

Thank you for running this detailed analysis! To be clear, is this memory freed when the connections themselves are destroyed? If so, this is not a leak, just a bit wasteful. It should not pose an issue for a long-running server, because a long-running server must limit the number of concurrent connections it handles anyway.

Matthias247 commented 2 years ago

FWIW, most of the rustls state shouldn't be required at all after the connection was initially established. Maybe there's some potential to compact or drop things.

Matthias247 commented 2 years ago

For the question itself, it might also be useful to look into configuring suitable idle and keepalive timeouts to make sure connection state is not kept around forever. If one side has a keepalive mechanism running that is lower than the idle timeout of both sides, then connection state will be around for pretty much forever in the absence of networking issues.

Ralith commented 2 years ago

Yeah, an appropriate idle timeout is highly recommended for applications where connections may be idle for long periods.

Maybe there's some potential to compact or drop things.

Maybe we should tweak rustls to drop buffers after the handshake is complete when in QUIC mode?

djc commented 2 years ago

rustls already supports dropping the Connection and only retaining the required key state.

Ralith commented 2 years ago

CRYPTO frames may be delivered on an established connection, so that might be too big a hammer.

Ralith commented 2 years ago

Once the handshake completes, if an endpoint is unable to buffer all data in a CRYPTO frame, it MAY discard that CRYPTO frame and all CRYPTO frames received in the future

So we can actually just drop data at that point; that might be a good next step, then.

Ralith commented 2 years ago

Closing as there's no evidence of an actual leak, nor further engagement regarding optimizing memory use in general. Feel free to open a new issue for either if you want to pursue that optimization.