jeromefroe / circbuf-rs

A growable circular buffer for working with bytes
https://docs.rs/circbuf/
MIT License

Why increase capacity to next power of 2 - 1? #5

Open EkardNT opened 4 years ago

EkardNT commented 4 years ago

The documentation states that the size of the circular buffer is increased to the next power of 2 greater than or equal to the provided capacity, and I've confirmed that behavior with the following test.

#[cfg(test)]
#[test]
fn capacity_demo() {
    use circbuf::CircBuf;

    // Requesting capacity 10 rounds the underlying buffer up to 16,
    // but one slot is reserved, so the usable capacity is 15.
    let mut c = CircBuf::with_capacity(10).unwrap();
    assert_eq!(c.cap(), 15);
    assert_eq!(c.get_avail()[0].len(), 15);
}

The test also shows that the actual available capacity is one less than that power of 2. My question is: why are both of these behaviors present?

For the first question, I thought maybe the package was using bitmasking of some form that only works on power-of-2 array lengths, but I looked through the code and wasn't able to find anything of that nature. For example, advance_read and advance_write both use modular arithmetic, which would work just as well with non-power-of-2 lengths, as far as I can tell.
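To illustrate the kind of bitmasking I had in mind, here is a generic sketch of the classic trick (my own illustration, not code from this crate):

// Wrapping an index into a ring buffer of length `len`.
fn wrap_modulo(index: usize, len: usize) -> usize {
    index % len // works for any length, but costs a division
}

fn wrap_mask(index: usize, len: usize) -> usize {
    debug_assert!(len.is_power_of_two());
    index & (len - 1) // only valid when `len` is a power of 2
}

fn main() {
    for i in 0..100 {
        assert_eq!(wrap_modulo(i, 16), wrap_mask(i, 16));
    }
}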

For the second question, there is this explanation in the code: [The capacity] is equal to one less than the length of the underlying buffer because we cannot let the write cursor ever circle back to being equal with the read cursor. That's clear enough, but I don't understand why not just increase the length of the buffer by 1 to hide this implementation detail from the user. Perhaps it is tied to the need to have power-of-2 lengths? Which leads back to the first question.
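As I understand it, the reserved slot resolves the classic full/empty ambiguity: if read == write meant "empty" and the write cursor were allowed to wrap all the way around, read == write would also mean "full". A toy sketch of the invariant (my own illustration, not this crate's code):

// Toy ring that keeps one slot unused so `read == write` can
// only ever mean "empty".
struct ToyRing {
    buf: Vec<u8>,
    read: usize,
    write: usize,
}

impl ToyRing {
    fn new(len: usize) -> Self {
        ToyRing { buf: vec![0; len], read: 0, write: 0 }
    }

    fn is_empty(&self) -> bool {
        self.read == self.write
    }

    fn is_full(&self) -> bool {
        // The write cursor stops one slot short of the read cursor,
        // so the usable capacity is `len - 1`. If the write cursor
        // could reach the read cursor, "full" would be
        // indistinguishable from "empty".
        (self.write + 1) % self.buf.len() == self.read
    }

    fn cap(&self) -> usize {
        self.buf.len() - 1
    }
}

fn main() {
    let r = ToyRing::new(16);
    assert!(r.is_empty() && !r.is_full());
    assert_eq!(r.cap(), 15); // matches the cap() == 15 in the test above
}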

jeromefroe commented 4 years ago

Hi @EkardNT!

Why is the capacity rounded to the next power of 2?

This approach was taken because it strikes a good balance between minimizing the total number of allocations required while the buffer grows to its maximum size and minimizing the amount of wasted memory. If N is the maximum size of the buffer, the total number of allocations is guaranteed to be O(log N) and the amount of unused memory is O(N).

Doubling the size of a buffer when growing it is a fairly common approach (for example, alloc::raw_vec::RawVec has a double method), though one improvement to this approach that I've seen is to grow the buffer exponentially up to a point and then linearly after that. The idea is that most of the time the buffer's maximum size will fall into the exponential growth phase, where the amount of wasted memory is small because the buffer is small, but for the use cases where a large buffer is required the linear growth provides a hard cap on the amount of wasted memory.
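To sketch what that hybrid policy could look like (the threshold and step below are illustrative values, not something this crate implements):

const LINEAR_THRESHOLD: usize = 4096; // illustrative, not from this crate
const LINEAR_STEP: usize = 4096;      // illustrative, not from this crate

// Hypothetical growth policy: exponential below a threshold,
// linear above it.
fn next_capacity(needed: usize) -> usize {
    if needed <= LINEAR_THRESHOLD {
        // Exponential phase: O(log N) allocations, at most ~2x waste,
        // which is small in absolute terms because the buffer is small.
        needed.next_power_of_two()
    } else {
        // Linear phase: round up to the next multiple of LINEAR_STEP,
        // capping the wasted memory at LINEAR_STEP bytes.
        (needed + LINEAR_STEP - 1) / LINEAR_STEP * LINEAR_STEP
    }
}

fn main() {
    assert_eq!(next_capacity(10), 16);
    assert_eq!(next_capacity(5000), 8192);
}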

Why is the available capacity one less than the next power of 2?... That's clear enough, but I don't understand why not just increase the length of the buffer by 1 to hide this implementation detail from the user.

Admittedly this does expose an implementation detail to the user, and it is a somewhat unsightly feature of the API. The reasoning behind not hiding it is to reduce the memory requirements of the buffer. As far as I understand, most allocators serve requests in power-of-2 size classes, so if the buffer wanted to have a power-of-2 capacity itself, say 2^n, it would need to request (2^n)+1 bytes from the allocator. The system allocator would then likely round this allocation up to 2^(n+1), roughly doubling the memory required by the crate. This crate therefore trades a reduction in the ergonomics of the API for an increase in efficiency.
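As a worked example of that argument (assuming, as above, an allocator that rounds requests up to power-of-2 size classes; real allocators vary):

// Models an allocator that rounds every request up to a power-of-2
// size class. This is an assumption for the sake of the argument,
// not a description of any particular allocator.
fn size_class(request: usize) -> usize {
    request.next_power_of_two()
}

fn main() {
    // A buffer with 4096 usable bytes would need 4097 bytes of
    // backing storage (one slot reserved), which this model rounds
    // up to 8192: roughly 50% waste.
    assert_eq!(size_class(4096 + 1), 8192);

    // Requesting exactly 4096 bytes wastes nothing under this model,
    // and the usable capacity becomes 4095: the trade-off this crate makes.
    assert_eq!(size_class(4096), 4096);
}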

Stargateur commented 2 years ago

As far as I understand, most allocators serve requests in power-of-2 size classes [...] This crate therefore trades a reduction in the ergonomics of the API for an increase in efficiency.

I'm not certain that's true. An allocation actually requires more than 1 byte of metadata, generally something like 24-32 bytes, so forcing a power of 2 is probably counterproductive, though I can't be sure.

I agree that doubling the size is a good strategy, but I don't see why the length must be forced to a power of 2 (Vec, for example, doesn't do that). A method that allows the user to add only a few bytes would be a plus. Having shrink/shrink_to would be nice too.
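For comparison, Vec keeps the requested capacity rather than rounding it up, and already offers the shrinking methods suggested here:

fn main() {
    let mut v: Vec<u8> = Vec::with_capacity(10);
    // capacity() is only guaranteed to be at least 10, though in
    // practice the standard allocator gives exactly 10 here.
    assert!(v.capacity() >= 10);

    v.extend_from_slice(b"hello");
    v.shrink_to(8);    // keep capacity at least 8
    v.shrink_to_fit(); // shrink capacity toward len() == 5
    assert!(v.capacity() >= v.len());
}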

On a related note, the choice of a boxed slice without any realloc support is also counterproductive: it disallows reusing the same allocation when there is enough space available to grow the original block in place.
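To illustrate the reuse I mean, std::alloc::realloc lets the allocator try to extend the existing block in place, whereas a Box<[u8]>-based design must always allocate a new block and copy. A minimal unsafe sketch:

use std::alloc::{alloc_zeroed, dealloc, realloc, Layout};

fn main() {
    unsafe {
        let old_layout = Layout::array::<u8>(16).unwrap();
        let ptr = alloc_zeroed(old_layout);
        assert!(!ptr.is_null());

        // realloc may grow the block in place when the allocator has
        // room after it, avoiding the copy a fresh allocation forces.
        let grown = realloc(ptr, old_layout, 32);
        assert!(!grown.is_null());

        dealloc(grown, Layout::array::<u8>(32).unwrap());
    }
}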