rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.63k stars 12.63k forks source link

`Vec::into_boxed_slice` no-excess guarantee regressed in 1.77.0 #125941

Open str4d opened 4 months ago

str4d commented 4 months ago

36284 asked for clarity about how to use Box<[T]>::into_raw. The core issue was that the documentation for Vec::into_boxed_slice said it reduced excess capacity in a way that was equivalent to calling Vec::shrink_to_fit. However, the latter makes no guarantees about there being no excess capacity afterwards. The resolution of #36284 (specifically in https://github.com/rust-lang/rust/issues/36284#issuecomment-297116448) was for Vec::into_boxed_slice to itself guarantee that it results in an exact allocation with no excess capacity, and to decouple it from Vec::shrink_to_fit. The documentation was updated to reflect this guarantee in #44960.

120110 reintroduced the equivalence to Vec::shrink_to_fit, and removed the guarantee by removing this language:

    /// If `v` has excess capacity, its items will be moved into a
    /// newly-allocated buffer with exactly the right capacity.

I note however that the PR did not update the example, which still appears to encode the intention of the guarantee (it checks the post-shrinking capacity is exactly the length, vs the corresponding example in Vec::shrink_to_fit which checks the post-shrinking capacity is at least the length).

As such, it is unclear whether this guarantee regression was intentional or not. But as things currently stand, we are back in a position where Box<[T]>::into_raw is functionally useless.

Version with regression

120110 was merged during the 1.77.0 cycle.

the8472 commented 4 months ago

"exact capacity" wording is obsolete due to memory fitting. For example reserve_exact was updated to reflect that.

The intent is that into_boxed_slice and shrink_to_fit return a fitting allocation. Which means it might have some excess capacity but it's ok to treat it as-if length == capacity when reconstituting it.

str4d commented 4 months ago

it's ok to treat it as-if length == capacity when reconstituting it.

If that is the case, then the guarantee remains in place and it would be good to update the documentation again to re-introduce language that guarantees it.

For avoidance of doubt, this is the code that I want to ensure is guaranteed to work (and that to my understanding has been guaranteed to work since 2017), in that it should correctly deallocate all allocated memory:

/// Converts the given vector into a raw pointer and length.
///
/// # Safety
///
/// The memory associated with the returned pointer must be freed with an appropriate
/// method ([`free_ptr_from_vec`] or [`free_ptr_from_vec_with`]).
fn ptr_from_vec<T>(v: Vec<T>) -> (*mut T, usize) {
    // Going from Vec<_> to Box<[_]> drops the (extra) `capacity`.
    let boxed_slice: Box<[T]> = v.into_boxed_slice();
    let len = boxed_slice.len();
    let fat_ptr: *mut [T] = Box::into_raw(boxed_slice);
    // It is guaranteed to be possible to obtain a raw pointer to the start
    // of a slice by casting the pointer-to-slice, as documented e.g. at
    // <https://doc.rust-lang.org/std/primitive.pointer.html#method.as_mut_ptr>.
    // TODO: replace with `as_mut_ptr()` when that is stable.
    let slim_ptr: *mut T = fat_ptr as _;
    (slim_ptr, len)
}

/// Frees vectors that had been converted into raw pointers.
///
/// # Safety
///
/// - `ptr` and `len` must have been returned from the same call to `ptr_from_vec`.
fn free_ptr_from_vec<T>(ptr: *mut T, len: usize) {
    free_ptr_from_vec_with(ptr, len, |_| ());
}

/// Frees vectors that had been converted into raw pointers, the elements of which
/// themselves contain raw pointers that need freeing.
///
/// # Safety
///
/// - `ptr` and `len` must have been returned from the same call to `ptr_from_vec`.
fn free_ptr_from_vec_with<T>(ptr: *mut T, len: usize, f: impl Fn(&mut T)) {
    if !ptr.is_null() {
        let mut s = unsafe { Box::from_raw(slice::from_raw_parts_mut(ptr, len)) };
        for k in s.iter_mut() {
            f(k);
        }
        drop(s);
    }
}
the8472 commented 3 months ago

We discussed this during today's libs-meeting.

We're ok with a documentation update. The behavior is still in place, but now more nuanced due to the unstable Allocator API which introduces the concept of fitting which GlobalAlloc does not have. Vec::shrink_to_fit already refers to the Allocator API, so into_boxed_slice can do that too.

It can say something along the line that the length of the returned slice is appropriate for sized deallocation APIs without using the "exact capacity" wording and also refer to the memory fitting section.