rust-lang / libs-team

The home of the library team
Apache License 2.0
116 stars 18 forks source link

add wrapping_offset_from #244

Closed RalfJung closed 1 year ago

RalfJung commented 1 year ago

Proposal

Problem statement

Using wrapping_offset it is possible (even at const-time) to create pointers that are still associated with the original allocation they came from, but may point out-of-bounds of that allocation. One fairly common use of such an operation is for iterators over slices of ZST, where the begin and end pointers are re-interpreted as "the difference between them (in bytes) is the remaining number of elements" but where they actually point is irrelevant (since ZST values do not carry any data, only their count is relevant). This need an operation to compute the distance between two such pointers created by wrapping_offset. Currently, we do not offer such an operation: at runtime it is possible to cast the pointers to integers and do the subtraction there, but at compile time that will not work -- and also it seems preferable to avoid ptr-to-int casts when they are not actually needed.

Motivating examples or use cases

I mentioned the ZST iterator above. Also see https://github.com/rust-lang/rust/issues/92512.

Solution sketch

Add a wrapping_offset_from that allows subtracting pointers even if they are out-of-bounds or have wrapped the address space, as long as they originate from the same allocation.

See https://github.com/rust-lang/rust/pull/112837 for an implementation of this.

Alternatives

If we do nothing, implementing iterators in this way in const fn will remain impossible.

We could weaken the offset_from requirements to allow out-of-bounds and wrapping difference. However that would destroy the symmetry indicating by the naming of offset and offset_from.

Links and related work

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution:

RalfJung commented 1 year ago

After looking at the slice iterator code again, actually the ZST code path does not do pointer subtraction any more. So I guess that concrete motivation for wrapping_offset_from does not hold up.

I still think it would make sense to have a function that mirrors wrapping_offset, but given that wrapping_offset is rarely required, it is hard to come up with concrete motivations for both wrapping_offset and wrapping_offset_from.

thomcc commented 1 year ago

So, I've wanted a wrapping equivalent to offset_from often, but had always hoped it (like wrapping_offset) would be a safe function.

In the PR your version is unsafe because it requires that the pointers be from the same object (and the pointers be an exact multiple of T apart, which is indeed tricky), but this requirement is just for const support:

The requirement for pointers to be derived from the same allocated object is primarily needed for const-compatibility: at compile-time, pointers into different allocated object do not have a known distance to each other.

This is somewhat surprising, since I think it'd be reasonable for users to expect the operation to behave as wrapping arithmetic on the addresses.

IMO it's also not worth it -- offset_from supports const, and the users who can't or don't want to use that can either separately track the offset, or do something like what core does for slice iterators of ZST (as you mention).


That said, the non-const case seems a lot more useful... if the function were safe. While it's true that wrapping_(add|offset|sub) is rarely needed, it's actually semi-common to see these used in the wild, since they're safe (and thus easier to review, and less likely to cause major regret in the future).

I think this would be true for wrapping_offset_from, since the returned value can be used (safely) in a bounds/range check, as an index, as a count, and various other things which are not that likely to be require other unsafe code.

I've found this to be the case in my own projects, which often end up with some utility function like the following (typed from memory):

// in a utils.rs somewhere...
pub(crate) fn ptr_byte_diff<T>(a: *const T, b: *const T) -> isize {
    a.addr().wrapping_sub(b.addr()) as isize
}

Which is basically something like a (safe) x.wrapping_byte_offset_from(y)... or something along those lines.

Sadly, I have no idea what the non _byte_ version of this should do if the pointers are not a multiple of size_of::<T> apart. All the options seem bad; "panic", "round up/down", "abort", "UB"... Oof. That said, maybe we have existing design precedent for the behavior here in some other API, IDK.

I also don't know if this is sufficiently useful to justify adding (especially if it would just be the version for the byte offset). I've found it useful, but perhaps it's too niche?

(Not to mention, it can't support const in any way I can see. I don't think this is that bad (as mentioned) but it was the goal of your proposal...)

RalfJung commented 1 year ago

IMO it's also not worth it -- offset_from supports const, and the users who can't or don't want to use that can either separately track the offset, or do something like what core does for slice iterators of ZST (as you mention).

I'm doubtful that will always work satisfyingly. But I don't have a counterexample.

That said, the non-const case seems a lot more useful... if the function were safe.

A safe version of this can't be the exact dual to wrapping_offset though, in the sense that it is valid to call if and only if self could have been computed as origin.wrapping_offset(n) for some n, and it will then return that n. Maybe that's not a desirable property, but it is at least a design principle.

And indeed, it cannot be const (except maybe via "just halt evaluation if the pointers are not derived from the same allocated object").

Note that a safe wrapping_offset_from existed for a short time but got removed again (https://github.com/rust-lang/rust/pull/73580).

the8472 commented 1 year ago

We discussed this in the libs-meeting today. With https://github.com/rust-lang/rust/issues/92512#issuecomment-1620562580 the primary motivation seems to be gone.

And as thomcc explains other use-cases many want this to be a safe function but then we can't make it const.

So we're closing this based on the given motivation. If there's a better one, maybe with different tradeoffs you can make an updated ACP.