Open jmaargh opened 1 year ago
I don't quite understand what the problem is. Either value is fine for rawvec to use because ultimately that's what's passed back to the allocator and the allocator API only requires the value to fit the allocation, which both values do.
Given that you must use the same allocator to do any operations on an allocation and that the allocator must accept the value that you requested for any deallocation or similar APIs, I don't think this is possible to cause anything worse than maybe some unnecessary reallocation where it might not need to happen. (And even then only under "weird" allocators/platforms)
These are good points, and on reflection I'm struggling to come up with many concrete bugs that could arise from this discrepancy (which doesn't mean they don't exist).
Intuitively, the most likely place to actually need a precise number for the allocated size is when building around custom allocators (e.g., perhaps you're wrapping a third-party allocator whose deallocate
does require that the given size is exact---and you've also found a situation where RawVec::cap
is where you want to track that---but I couldn't find any examples of such an allocator with a quick look).
If we can extend this "I can't think of examples where this would cause a bug" can be extended to a proof of "every time returned capacity is needed, allocated capacity is equally correct---including in all FFI scenarios" then I suppose this boils down to:
Vec::into_raw_parts*
(and any others that might have the same issue), which currently claims to return the "allocated capacity", when it actually only has the same guarantees as Vec::capacity
There are two "capacity"s for any successful allocation: the requested capacity and the returned capacity. The
Allocator
trait explicitly allows these to not be the same. However, it appears thatRawVec
assumes they are the same.Everywhere
RawVec
request an allocation, it then sets its internalcap
field for the capacity to be equal to the requested capacity, not the returned capacity The canonical example of this is here, when a newRawVec
is constructed with a given capacity (though the same behaviour also applies when growing allocation). The comment even stateswhich directly contradicts the docs for the
Allocator
trait.In principle, we could design
RawVec
to only store and report the requested capacity and that could be an entirely sound design. However, I worry that the assumption embodied by this comment has wormed its way throughout the standard library and probably the ecosystem. For example, if breaking apart aVec
-based structure to send for FFI, the docs until recently strongly suggested thatVec::capacity
would return the allocated capacity (which is only true if the allocator always returns the requested capacity). In #99790 the docs are currently being updated to reflect that this isn't the case.Another example of where this is problematic is
Vec::to_raw_parts
which clearly assumes thatRawVec::cap
is the allocator returned capacity. I worry that this assumption is implicitly present through a lot of code.I suggest therefore
RawVec
s allocation methods to store the capacity returned from the allocator, unfortunately requiring an extra division to do soRawVec
and structures that use it to ensure that it is clear thatcapacity()
may return more than what was requested, and associated methodscapacity()
returns exactly the allocator returned capacity, and for which datastructures.The alternative that I see is to instead commit to
RawVec::cap
being the requested capacity. In which case, it would be good to review where this may have been assumed elsewhere and what might need to be fixed. Happy to hear views on whether people think this would be a better path.