cloudflare / quiche

🥧 Savoury implementation of the QUIC transport protocol and HTTP/3
https://docs.quic.tech/quiche/
BSD 2-Clause "Simplified" License

RecvBuf + SendBuf Memory Usage #632

Open victorstewart opened 4 years ago

victorstewart commented 4 years ago

I saw that quiche_config_set_initial_max_data controls the amount of unacknowledged data allowed to buffer into RecvBuf's and SendBuf's data: BinaryHeap<RangeBuf>.

I'm concerned about overall memory usage of the server over time.

It seems to me that, in the worst case, the limits set via quiche_config_set_initial_max_data, quiche_config_set_initial_max_stream_data_bidi_local, etc. could cause both buffers to grow to their capacities, and that memory is never reclaimed?

Even at a modest 512KB value for the send and receive buffers, with 100,000 connections, we're talking 100GB.
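The estimate above can be checked with a few lines of Rust. This is only a back-of-the-envelope sketch: the function names are hypothetical, and it assumes exactly one send and one receive buffer per connection, each filled to the configured limit.

```rust
/// Worst-case bytes buffered if every connection fills both its send and
/// its receive buffer up to `per_buffer_limit`.
/// Assumption: exactly two buffers (one send, one receive) per connection.
fn worst_case_bytes(per_buffer_limit: u64, connections: u64) -> u64 {
    const BUFFERS_PER_CONNECTION: u64 = 2;
    per_buffer_limit * BUFFERS_PER_CONNECTION * connections
}

/// Convert a byte count to GiB (2^30 bytes) for readability.
fn to_gib(bytes: u64) -> f64 {
    bytes as f64 / (1u64 << 30) as f64
}
```

With a 512 KiB limit and 100,000 connections, `worst_case_bytes(512 * 1024, 100_000)` comes to 104,857,600,000 bytes, which is roughly 105 GB or about 97.7 GiB, so "100GB" is a fair round figure.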

I haven't had a chance to learn Rust yet, so maybe I'm missing something.

In C, my approach would be to mmap(..., MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, ...) the memory and then madvise(MADV_COLD) the pages after consuming them. Otherwise a MASSIVE amount of unused memory stays trapped. (In my case this is untenable, because I don't have that much memory per connection available on my machines.)

I wish Rust had per-container-instance allocator specialization like C++, so we could back the BinaryHeaps with mmap-ed memory, keeping the operational convenience while maximizing memory efficiency.

Maybe a simple solution is to just add some shrink_to_fit or shrink_to calls?

And on this subject, are there any other parts of the implementation that would suffer from similar memory usage problems at such a high connection count?

victorstewart commented 4 years ago

since Rust can call mmap and madvise (for example through the libc crate), i suggest we replace the BinaryHeap object with a raw memory buffer whose head and tail we track, shifting the contents when needed and madvise-ing pages cold once they are consumed.

just requires rounding up the supplied max data config values to the nearest page boundary.

whether we go in this direction or another, i can work on a PR once a direction is agreed upon.
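To make the proposal concrete, here is a minimal sketch of a head/tail buffer with page rounding. It is not quiche code: `FlatBuf` and `round_up_to_page` are hypothetical names, the page size is passed in by the caller rather than queried from the OS, a plain `Vec<u8>` stands in for the mmap-ed region, and the madvise step is only marked with a comment. It also handles in-order data only.

```rust
/// Round `len` up to the next multiple of `page_size`.
/// Assumption: `page_size` is a power of two (e.g. 4096).
fn round_up_to_page(len: usize, page_size: usize) -> usize {
    assert!(page_size.is_power_of_two());
    (len + page_size - 1) & !(page_size - 1)
}

/// Hypothetical fixed-capacity byte buffer with an explicit head and tail.
struct FlatBuf {
    data: Vec<u8>, // stand-in for an mmap-ed region
    head: usize,   // index of the first unread byte
    tail: usize,   // index one past the last written byte
}

impl FlatBuf {
    fn new(max_data: usize, page_size: usize) -> Self {
        let cap = round_up_to_page(max_data, page_size);
        FlatBuf { data: vec![0; cap], head: 0, tail: 0 }
    }

    /// Number of unread bytes.
    fn len(&self) -> usize {
        self.tail - self.head
    }

    /// Append `src`; returns false if it does not fit.
    /// Unread bytes are shifted to the front when the tail would overflow.
    fn write(&mut self, src: &[u8]) -> bool {
        if src.len() > self.data.len() - self.len() {
            return false;
        }
        if self.tail + src.len() > self.data.len() {
            self.data.copy_within(self.head..self.tail, 0);
            self.tail -= self.head;
            self.head = 0;
        }
        self.data[self.tail..self.tail + src.len()].copy_from_slice(src);
        self.tail += src.len();
        true
    }

    /// Copy up to `out.len()` unread bytes into `out`; returns the count.
    fn read(&mut self, out: &mut [u8]) -> usize {
        let n = out.len().min(self.len());
        out[..n].copy_from_slice(&self.data[self.head..self.head + n]);
        self.head += n;
        if self.head == self.tail {
            // Buffer drained: reset so later writes start at the front.
            self.head = 0;
            self.tail = 0;
        }
        // An mmap-backed version would madvise the consumed pages here.
        n
    }
}
```

One cost of this layout is that the full rounded-up capacity is reserved per buffer up front, which is why the proposal pairs it with MAP_NORESERVE and madvise.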

ghedo commented 4 years ago

It seems to me that, in the worst case, the limits set via quiche_config_set_initial_max_data, quiche_config_set_initial_max_stream_data_bidi_local, etc. could cause both buffers to grow to their capacities, and that memory is never reclaimed?

What are you basing that on? The data is stored as a list of buffers. Those buffers own the chunks of application data sent/received (i.e. what initial_max_data and initial_max_stream_data limit), and those chunks are moved/freed after they are read by the application (in the case of incoming data) or when they are acked (in the case of outgoing data). So what you are saying is not really true (unless of course there is a bug in the code).

The lists themselves are not reduced in size, but the overall memory consumed by them should be pretty low, even in cases where they have been grown a lot.
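The distinction between the list and the chunks it owns can be illustrated with a small sketch. This is not quiche's actual code: `Chunk` is a simplified stand-in for RangeBuf, and `fill_and_drain` is a hypothetical helper. It pushes chunks into a `BinaryHeap`, pops them all (dropping each payload), and reports how much the emptied heap itself still holds.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::mem::size_of;

/// Simplified stand-in for a RangeBuf: a chunk of stream data at an offset.
struct Chunk {
    off: u64,
    data: Vec<u8>, // the payload is owned by the chunk, not by the heap
}

impl PartialEq for Chunk {
    fn eq(&self, other: &Self) -> bool {
        self.off == other.off
    }
}
impl Eq for Chunk {}
impl PartialOrd for Chunk {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Chunk {
    // Reversed so the lowest offset pops first from the max-heap.
    fn cmp(&self, other: &Self) -> Ordering {
        other.off.cmp(&self.off)
    }
}

/// Push `count` chunks of `chunk_len` bytes, pop them all, and return
/// (slots still reserved by the heap, bytes those slots occupy).
fn fill_and_drain(count: usize, chunk_len: usize) -> (usize, usize) {
    let mut heap = BinaryHeap::new();
    for i in 0..count {
        heap.push(Chunk {
            off: (i * chunk_len) as u64,
            data: vec![0u8; chunk_len],
        });
    }
    let mut next_off = 0u64;
    while let Some(chunk) = heap.pop() {
        assert_eq!(chunk.off, next_off); // chunks come out in offset order
        next_off += chunk.data.len() as u64;
        // `chunk` is dropped here, which frees its payload.
    }
    let slots = heap.capacity();
    (slots, slots * size_of::<Chunk>())
}
```

After draining, only the slot array remains, at a few dozen bytes per slot regardless of how large the payloads were. Calling `shrink_to_fit()` on the emptied heap, as suggested earlier in the thread, would ask the allocator to release that array as well.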