apple / swift-nio

Event-driven network application framework for high performance protocol servers & clients, non-blocking.
https://swiftpackageindex.com/apple/swift-nio/documentation
Apache License 2.0
7.85k stars 633 forks source link

ByteBufferAllocator.allocate(capacity: Int) implementation does not correspond to the definition. #2723

Open ser-0xff opened 1 month ago

ser-0xff commented 1 month ago

Expected behavior

The documentation says `ByteBufferAllocator.allocate(capacity: Int)

The passed `capacity` is the `ByteBuffer`'s initial capacity, it will grow automatically if necessary.

while in the reality it always allocates ByteBuffer with storage with size equal to nearest power of 2. In some cases we definitely know the required size and think would be really nice if ByteBufferAllocator API would give a possibility to allocate ByteBuffer with precise size of the storage. Probably it can reduce amount of used memory when work with a big number of big byte buffers.

Actual behavior

Allocates a storage of size equal to nearest power of 2.

Steps to reproduce

Allocate a ByteBuffer using ByteBufferAllocator().buffer(capacity:)

If possible, minimal yet complete reproducer code (or URL to code)

SwiftNIO version/commit hash

System & version information

Lukasa commented 1 month ago

Thanks for filing this. I have to admit that I don't have particularly strong feelings about this. Generally speaking it's nice to allocate slightly more space than the user expected, because it allows for some slack for immediate modifications, and it is essentially "free": no extra computation is required.

You're right that it does cause us to very slightly overcommit memory. I'm not sure how much that matters, honestly: at large enough allocation sizes the extra space is essentially fake, because the tail of the storage will be lazily allocated when it is first faulted. Its impact is most noticeable at the middle allocation ranges, and we haven't really seen this cause any issues.

@weissi, got thoughts here? I expect you to have a strong opinion one way or another, and I don't 😁

hassila commented 1 month ago

I can just mention for our use case we have an exact byte count and sometime when you are close to the power of two it basically balloons to a twice-the-size-as-desired, so even if you want to overcommit a bit, it would be super nice to have the option to have the exact requested size too.

glbrntt commented 1 month ago

I just want to note that the reserveCapacity(_:) and reserveCapacity(minimumWritableBytes:) are more explicit about overcommitting, so at a minimum we should improve the docs for buffer(capacity:).

There have been times when I've wanted a buffer with an exact capacity, but I'm not sure if the memory difference would've been meaningful in practice, but it certainly feels wasteful.

hassila commented 1 month ago

Yeah, it's also this feeling of overall tiny unavoidable additions to memory footprint - we're trying to be quite careful to keep it as small as we can to get great cache utilization, so just a minor nit we stumbled on that would be nice to clean if possible.

weissi commented 1 month ago

@hassila Out of interest, what are ballpark sizes you're working with?

weissi commented 1 month ago

Also related https://github.com/apple/swift-nio/issues/2729 . I was sure I had filed this many years ago but apparently I didn't (or couldn't find it).

hassila commented 1 month ago

@weissi for this specific case I believe we were in the range 300-600 bytes approx (but will follow up internally).

Edit (follow-up: we typically have two modes, mode 1 is typically <1KB with significant variation, mode 2 is 50K up to a couple of MB. Mode 1 continuous running, Mode 2 used for synchronisation of cache state (which is mostly startup, but can happen during runtime too, then it is typically a multiple of 1000x the mode1 payload)).

One comment related to #2729 is that I wouldn't make the assumption that the underlying allocator always uses a power-of-two, IIRC both e.g jemalloc and mimalloc has several interim sizes in that ranges <1K, and even if ByteBufferAllocator.allocate means well, it can't know easily what the underlying memory allocation library does (different platforms, possible library interposing for optimization).

E.g.. we are using jemalloc specifically for the benchmark numbers with the following size classes:

image

Not saying we will use that for production, we have an outstanding to-do to evaluate different allocators for swift at some point but haven't come around to it, e.g. mimalloc looks worth trying out in addition to jemalloc/tcmalloc etc..