Open scottmcm opened 2 years ago
I will always prefer this to the use of isize
. The name is very generic. Thus I don't have better idea. Maybe usize_from
or distance_from
. I don't like the order of parameter but guess it's too late for pointer.
I also wonder if we could just say this doesn't panic but just return the absolute since we assert the greater than equal why not just make it return absolute ? this would cost nothing more and would remove the burden for dev to think about order, and we could just call this distance()
.
I would be strongly against this returning the absolute difference between the pointers. Part of the value of this API is that it's more efficient than offset_from
-- that's why slice iterators were open-coding this before the function existed -- and if it needed to do tests for ordering it would no longer be that.
Adding a separate absolute difference API for pointers might make sense, like the integers have, but it wouldn't be a serviceable replacement for the current sub_ptr
. (#106393 wouldn't use an absolute difference version, for example.)
(Oh, and as prior art, std::distance
in C++ isn't an absolute distance either, so calling it distance
might mislead people.)
I proposed this cause you write in the PR:
// SAFETY: The comparison has no side-effects, and the intrinsic // does this check internally in the CTFE implementation. unsafe { assert_unsafe_precondition!(self >= origin) };
so I expect it to be "free", I'm all in favor of remove all assert since the method is unsafe anyway, this kind of operation should be as fast as possible. I just see this line and wonder, well if it's free why not ?
(Oh, and as prior art, std::distance in C++ isn't an absolute distance either, so calling it distance might mislead people.)
I don't think C++ std using distance name wrongly is a good point, have you ever see "this object is -1cm
" ? a distance is always positive, that why we return usize. Distance is also mention many time in https://github.com/rust-lang/rust/pull/95837.
cause you write in the PR:
assert_unsafe_precondition!
is a magic compiler thing that only does something in CTFE. At codegen time, sub_ptr
doesn't do any checks, just an "I promise not to overflow" subtraction and a "I promise it's an exact multiple" division: https://rust.godbolt.org/z/fE79dv6WT.
Since offset_from
has a parallel byte_offset_from
method (tracked in https://github.com/rust-lang/rust/issues/96283), should this method have an equivalent byte_sub_ptr
method?
I am aware that this was added specifically for slice iterators, and there it doesn't seem like there has been a usecase for a bytewise method so far - but looking at the docs it seemed somewhat surprising that this is missing.
Yeah, if this stabilizes after the other byte
methods we probably want byte_sub_ptr
as well.
I've read some of the bikeshedding in #95837, and I'd like to second @Gankra's suggestion of "offset_to" - it's not perfect, but when I was looking for this method, I couldn't find it in the autocomplete and had to search for the -> usize
return type instead. Having offset
in the name would have made it easier to locate, as that's one of the first things I looked for.
Are there any blockers for stabilization besides the name? I'm not familiar with the stabilization process, but I'd be willing to open a stabilization PR for this if that would help.
So offset_from
returns isize
but offset_to
returns usize
? That makes little sense to me.
Once this is stable we can put a note in the offset_from
docs pointing to sub_ptr
.
So
offset_from
returnsisize
butoffset_to
returnsusize
? That makes little sense to me.
In isolation, it doesn't make much sense to me either - given the chance to go back in time, I would have made offset_from
return usize
instead, as the earlier discussion mentioned, but it's too late for that now.
Another possibility may be count_from
- it still retains a word from offset_from
, so it shouldn't take more than a couple tries to discover through autocomplete, but it still communicates its unsignedness - an element count will/should always be unsigned, as opposed to a relative offset which can go in either direction. (Maybe also len_from
.)
@rust-lang/libs-api what are your current thoughts on this function? Is this generally something we'd want to have, in terms of its semantics? Any opinions on the name?
sub_ptr
seems like an okay name to me. None of the proposed alternatives (offset_to
, count_from
) obviously beat it.
This doesn't currently have a "byte" variant but I assume we'll want that, too. (It is being added in https://github.com/rust-lang/rust/pull/132459.)
I'm somewhat partial to offset_from_unsigned
which clearly indicates that this produces a usize
and that the offset must be positive. Although this is actually longer than offset_from() as usize
, it does involve additional safety conditions and may be easier for people to type with IDE auto-completion.
I think it would be a shame if it ended up with a name as long as offset_from_unsigned
since we found out it's actually substantially more commonly the case than offset_from
. This is the one you probably want, not the weird one -- and it's more efficient than offset_from() as usize
too.
(Worse, these are all subtly different, so it's not obvious that clippy can tell you to change one pattern to another, either.)
I do agree it's not great compared to offset_from
, but the question here is whether offset_from_unsigned
is better than sticking with the currently proposed sub_ptr
.
Another proposal: p1.offset_after(p2)
. That would make it clear that p1
must be after p2
, and since it must be after, the use of unsigned naturally follows from that.
We discussed this in today's libs-api meeting, but did not reach a consensus. We were having difficulty resolving the tradeoffs between brevity (not being so long that it discourages usage) and clarity (making it clear that it's UB if the offset would be negative).
FWIW I quite like offset_after
.
I could go for offset_after
👍
(Sure, it's one character longer than offset_from
, but that's not what I was referring to above -- the extra _unsigned
would feel bad, but after
-vs-from
is essentially the same.)
Some other suggestions that came up in that meeting were: positive_offset_from
, offset_within
("within" an allocation), distance_from
, and offset_to
(or distance_to
), but there was no consensus for any of these.
For offset_to
the arguments would be reversed: first_ptr.offset_to(second_ptr)
rather than second_ptr.sub_ptr(first_ptr)
. (Which I personally like, because then you write the two pointers in the same order they appear in memory.)
I noted in the meeting that if I was new to a language that had both "offset_from" and "sub[_ptr]", I would expect offset_from
to be unsigned, and sub
to be signed, which is the opposite of what we have now. (But unfortunately, offset_from is already stable.)
I dislike offset_after
, because then we'll end up having both offset_from
and offset_after
, which kind of sound like opposites, even though they aren't. I can already see them getting mixing them up all the time.
I personally like offset_to
or distance_to
best, because it separates it a bit from offset_from
and puts the two arguments in a (to me) more natural order, since this new method is all about (unsafely) assuming the order of the pointers.
(Similar to how we do collection.find(element)
instead of element.find_in(collection)
, I feel like it should be slice_ptr.offset_to(element_ptr)
rather than element_ptr.offset_from(slice_ptr)
when finding the index. (Edit: Seems like I'm not the only one who thinks that: comment found on github))
It seems quite confusing that offset_to
would be unsigned if offset_from
is signed.
We discussed this in the libs-api meeting. Team members were of differing opinions, but in the end we reached a consensus on offset_from_unsigned
. It is slightly better than the status quo of casting to usize
because of IDE auto-completion, and the name clearly indicates that the result is unsigned and must be positive.
@rfcbot merge
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
:bell: This is now entering its final comment period, as per the review above. :bell:
Feature gate:
#![feature(ptr_sub_ptr)]
&#![feature(const_ptr_sub_ptr)]
This is a tracking issue for the
<*const _>::sub_ptr
&<*mut _>::sub_ptr
methods.This is the produces-
usize
version ofoffset_from
, the same way thatadd
andsub
are the takes-usize
versions ofoffset
.It turns out that people almost always actually know which pointer is greater than which when doing this operation, and would rather a
usize
instead of anisize
-- every use ofoffset_from
in the library was followed withas usize
in practice. So like how.add(d)
greatly improved code compared to needing.offset(d as isize)
, being able to useptr.sub_ptr(origin)
instead ofptr.offset_from(origin) as usize
is also a major improvement. And Miri can check the unsafety better, too, since if you get the order wrong it'll detect that, unlike happens with theas usize
approach.This also tracks the
const
ness of operations, though with #92980 stabilizingoffset_from
beingconst
, this beingconst
is likely uncontroversial.Public API
Steps / History
NonNull
method to this gate https://github.com/rust-lang/rust/pull/124498Unresolved Questions
add
andsub
methods, but there might be another name that would be more evocative for people https://github.com/rust-lang/rust/pull/95837#issuecomment-1094134054