llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.02k stars 11.06k forks source link

[libc++] `shrink_to_fit` can increase capacity if `allocate_at_least` returns a bigger allocation #95161

Open MitalAshok opened 1 month ago

MitalAshok commented 1 month ago

https://eel.is/c++draft/vector.capacity#9.sentence-3

constexpr void shrink_to_fit();
Effects: [...] It does not increase capacity(), but may reduce capacity() by causing reallocation.

https://godbolt.org/z/hP63rn661

#include <vector>
#include <memory>
#include <algorithm>

std::size_t min_bytes = 1000;

template<typename T>
struct increasing_allocator {
    using value_type = T;
    increasing_allocator() = default;
    template<typename U>
    constexpr increasing_allocator(const increasing_allocator<U>&) noexcept {}
    std::allocation_result<T*> allocate_at_least(std::size_t n) {
        std::size_t allocation_amount = n * sizeof(T);
        if (allocation_amount < min_bytes) allocation_amount = min_bytes;
        min_bytes += 1000;
        return { static_cast<T*>(::operator new(allocation_amount)), allocation_amount };
    }
    T* allocate(std::size_t n) {
        return allocate_at_least(n).ptr;
    }
    void deallocate(T* p, std::size_t n) noexcept {
        ::operator delete(static_cast<void*>(p));
    }
};

template<typename T, typename U>
constexpr bool operator==(increasing_allocator<T>, increasing_allocator<U>) { return true; }

int main() {
    std::vector<int, increasing_allocator<int>> x;
    x.push_back(1);
    __builtin_printf("Before shrink cap: %zu\n", x.capacity());
    x.shrink_to_fit();
    __builtin_printf("After  shrink cap: %zu\n", x.capacity());
}
Before shrink cap: 1000
After  shrink cap: 2000

The new capacity isn't checked before the elements are moved to the new allocation:

https://github.com/llvm/llvm-project/blob/a13bc9714a6bfb766693aa7900217f6f9be6f25d/libcxx/include/vector#L1438-L1440

This came up when using an arena allocator and the smaller arenas filled up and shrink_to_fit didn't help

frederick-vs-ja commented 4 weeks ago

Yeah, basic_string is similarly buggy (link).

mordante commented 1 week ago

I had a look and there are 4 usages of shrink_to_fit in the Standard

philnik777 commented 1 week ago

Is there a use-case where this is actually a problem and not just happens to make a bug more visible? AFAICT our previous implementation was conforming, but didn't solve the problem you describe in any way. The only thing it would have done is throw away usable space.

MitalAshok commented 1 week ago

I noticed it because I was more vigilant about the capacity when migrating to std::allocation_result. The previous behaviour of allocating too much and then underreporting the capacity (i.e., the current behaviour but with setting the capacity to size() afterwards) was standards conforming.

This might be an LWG issue: libc++ should be allowed to increase the capacity if the new allocation happens to be larger, which it currently isn't.

mordante commented 1 week ago

Is there a use-case where this is actually a problem and not just happens to make a bug more visible? AFAICT our previous implementation was conforming, but didn't solve the problem you describe in any way. The only thing it would have done is throw away usable space.

What do you mean with "AFAICT our previous implementation was conforming"? The provided test code shows the code is not conforming. The example is a bit far-fetched, however @MitalAshok mentioned it happened with an arena allocator. So this sounds like a real-world issue to me.

philnik777 commented 1 week ago

Is there a use-case where this is actually a problem and not just happens to make a bug more visible? AFAICT our previous implementation was conforming, but didn't solve the problem you describe in any way. The only thing it would have done is throw away usable space.

What do you mean with "AFAICT our previous implementation was conforming"? The provided test code shows the code is not conforming. The example is a bit far-fetched, however @MitalAshok mentioned it happened with an arena allocator. So this sounds like a real-world issue to me.

Before we used allocate_at_least, we used allocate. We just assumed that the block was exactly the requested size. I'm not convinced that any "fix" on our side would actually solve a problem. In the example the fundamental problem is that the allocator didn't have enough memory for a given size. If the arena allocator didn't provide an allocate_at_least member or we didn't use it, we would simply set the capacity() to the requested allocation size and we would be conforming.

mordante commented 1 week ago

I see what you mean. allocate indeed always honors the request exactly, but allocate_at_least may return more than we requested. The issue in the example is that we have container using 4 bytes and a buffer of 1000 bytes. When we shrink we end up with a buffer of 2000 bytes. In that case it would be better to keep the 1000 byte buffer and release the 2000 byte buffer. That would also be Standard conforming, whereas using the 2000 byte buffer is not.

Of course in the past with allocate we probably were not aware that "behind our backs" the buffer increased from 1000 to 2000 bytes and thus were conforming.

mordante commented 1 week ago
* `vector<bool>` this needs more investigation; `shrink_to_fit` looks wrong.

This uses the C++98 containter-swap-with-empty-container idom which only shrinks when the container is empty. This will never grow the current container's capacity. This could be improved, but it's conforming.