Closed danakj closed 1 year ago
`RangeFrom` needs to return a lower bound for its size_hint, which is normally `T::MAX`. But if you constructed a `RangeFrom` over `uptr` then that doesn't work because there's no `MAX`.
`MAX_ADDR` would do the wrong thing, because that just excludes capabilities from its value, but pointer values may be larger with the capabilities present on CHERI.
`MAX_BIT_PATTERN` is the closest thing to correct here. If you iterate so far that you are incrementing past `MAX_ADDR` in the address part of the integer then you've done bad things to your pointer, but that's where the iteration would technically go.
Here `MAX` being `2^N-1` (not matching uintptr_t on CHERI) would be the most correct thing to use, and it's annoying in this case that it does not exist. There's nothing... better.
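To make the problem concrete, here is a minimal sketch of generic code that leans on `T::MAX`. `FakeU32`, `FakeUptr`, `HasMax` and `size_hint_lower_bound` are hypothetical stand-ins made up for this example, not the Subspace types, and falling back to `MAX_BIT_PATTERN` is only the "closest thing to correct" option described above, not a recommendation.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for an ordinary u* type that has MAX.
struct FakeU32 {
  static constexpr std::uint32_t MAX =
      std::numeric_limits<std::uint32_t>::max();
};

// Hypothetical stand-in for uptr: deliberately has no MAX.
struct FakeUptr {
  static constexpr std::uintptr_t MAX_BIT_PATTERN =
      std::numeric_limits<std::uintptr_t>::max();
};

// Detects whether T::MAX exists (C++17 detection idiom).
template <class T, class = void>
struct HasMax : std::false_type {};
template <class T>
struct HasMax<T, std::void_t<decltype(T::MAX)>> : std::true_type {};

// Lower bound a RangeFrom-like iterator could report for its size hint.
// Uses T::MAX when it exists, otherwise falls back to MAX_BIT_PATTERN.
template <class T>
constexpr auto size_hint_lower_bound() {
  if constexpr (HasMax<T>::value) {
    return T::MAX;
  } else {
    return T::MAX_BIT_PATTERN;
  }
}
```

Without the fallback branch, `size_hint_lower_bound<FakeUptr>()` would fail to compile, which is the situation described above.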
Edit: Well, it turns out `uptr` is weird in lots of ways, so it doesn't have things like try_from(unsigned) because it needs to be constructed with provenance, so this use case evaporated.
Maybe `uptr` shouldn't be included in the `Integer` concept though.
Right now you have to reinterpret_cast to a primitive value then construct a usize. Yuck.
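For reference, the current two-step dance looks roughly like this. It is a sketch using `std::size_t` as a stand-in for `usize` (an assumption, since the real type is not shown here), and `address_via_primitive` is a made-up helper name.

```cpp
#include <cstddef>
#include <cstdint>

// Step 1: reinterpret_cast the pointer to a primitive integer.
// Step 2: construct the size type from that primitive.
// std::size_t stands in for usize in this sketch.
template <class T>
std::size_t address_via_primitive(T* ptr) {
  const auto primitive = reinterpret_cast<std::uintptr_t>(ptr);
  return static_cast<std::size_t>(primitive);
}
```

On a platform where `std::uintptr_t` is wider than `std::size_t`, the second step would silently drop bits, which is part of why a dedicated pointer-sized type is attractive.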
Instead `template <class T> usize::from(T*)`? But `usize` has size equal to `size_t`. Should we now introduce `uptr` before things get out of hand with `usize` holding pointers?
I think it's clear from reading the Rusty stuff below that there's regret they didn't do this sooner: they would use `uptr` as a name, and they would have it be the size of a pointer, not of `ptraddr_t`, which is too theoretical to need to support in the stdlib.
TODO: `u*` as laid out below.
Rust project thoughts
Here are Rusty thoughts on it: https://github.com/rust-lang/rust/issues/95228
Highlights:
- `sizeof(void *) == 16`
- `UINTPTR_MAX == UINT64_MAX`
Rust stdlib docs on it: https://doc.rust-lang.org/std/ptr/#pointer-vs-addresses
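The two highlights describe a platform where `uintptr_t` occupies more bits than `UINTPTR_MAX` accounts for. A small sketch for checking whether the platform you are building on has that gap (the function names are made up for this example):

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Number of bits of storage that uintptr_t occupies.
constexpr std::size_t uintptr_storage_bits() {
  return sizeof(std::uintptr_t) * CHAR_BIT;
}

// Number of bits needed to represent UINTPTR_MAX.
constexpr std::size_t uintptr_max_bits() {
  std::size_t bits = 0;
  for (std::uintmax_t v = UINTPTR_MAX; v != 0; v >>= 1) {
    ++bits;
  }
  return bits;
}

// True when UINTPTR_MAX is 2^N-1 for N equal to the storage size in bits.
// Expected to be false on a platform matching the highlights above
// (16 bytes of storage, a 64-bit UINTPTR_MAX).
constexpr bool uintptr_max_covers_storage() {
  return uintptr_max_bits() == uintptr_storage_bits();
}
```

On conventional 64-bit targets both counts should come out as 64, so the question of which maximum to expose only becomes visible on capability hardware.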
Divergence from `u*` integer types.
MAX
Of interest here is that if `uintptr_t` holds 128 bits but `UINTPTR_MAX` reports (2^64-1), we will need to either make `uptr::MAX` not self-consistent (ouch), or be very clear in the docs that `uptr::MAX` is not the same as `UINTPTR_MAX`, or just not have `uptr::MAX` at all. If `uptr::MAX` is self-consistent and returns (2^128-1), any rewrite of `UINTPTR_MAX` into a Subspace constant should use `usize::MAX`, which will convert up to the larger `uptr::MAX` if needed, unless the authors were sure they wanted to include the non-address part of the pointer in their max. Having to go through `usize` seems like the wrong path though; there should be a clearer, well-lit Good Path.
Another possibility here would be to have two max values on `uptr` instead, like
`uptr::MAX_ADDR` // 2^64-1 on CHERI
`uptr::MAX_BIT_PATTERN` // 2^128-1 on CHERI
This would break the use of `uptr` in ducktyped generic code that relies on `T::MAX`, though perhaps in a good way. If `MAX == MAX_ADDR` it would be incorrect to use in generic code that wants to do bit masking with it, as it would miss half the bits. If `MAX == MAX_BIT_PATTERN` it would be incorrect to use in generic code that wants to write `MAX` into the type, as it would be writing non-zeros into the capabilities and then produce Bad Things in CHERI.
FWIW `usize::MAX` is not really a valid value for `usize` for the things it is intended for, which is offsets into arrays, as the max size of any allocation/array is `isize::MAX` or a 31-bit number. But it can also just be used as "an integer" so `MAX` is not (2^31-1). That said, `uptr::MAX` being `uptr::MAX_BIT_PATTERN` feels much worse as it will directly cause bad CHERI things to occur when converted back to a pointer, whereas `x_usize >= 2^31` can be caught by bounds checks (or compiler checks for Array sizes) if misused. So I don't see this as a strong argument for having `uptr::MAX == MAX_BIT_PATTERN`.
Proposal
Let's start with `MAX_ADDR` and `MAX_BIT_PATTERN` and drop `MAX` for now (and also have `MAX_ADDR_PRIMITIVE` and `MAX_BIT_PATTERN_PRIMITIVE`).
Conversions
- `uptr` should be implicitly constructible from any pointer, unlike other `u*`.
- `uptr` should not be implicitly constructible from smaller integer types, unlike other `u*`.
- `uptr` should not be explicitly convertible to primitive integer types except uintptr_t. Up for consideration is whether this should go through a method call, to help avoid code compiling and relying on it converting to other types due to type aliases and then breaking under CHERI. A method call would give an explicit name/intent to callers, but would be less type-safe than an `operator std::same_as<uintptr_t>()`.
- For `usize`, it should provide `.addr()` (like Rust's proposal), which returns `usize`.
- `uptr` can be constructible from a `usize`, but explicitly, through means that specify the upper bits.
  - `uptr.with_addr(usize)` (like the Rust proposal) can copy the high bits from `*this` to the new pointer, e.g. `uptr(actual_pointer).with_addr(a_different_address)`.
  - `static uptr::from_ptr_and_addr(T*, usize)`, but it's basically syntactic sugar for `with_addr`. It's not clear if it would be painful enough to construct a `uptr` and call `.with_addr` that it's worth adding this shorter path.
- `uptr` should not be `From<u*>` for explicit conversions, as it should instead be constructed through `with_addr()` through another pointer.
- `uptr` should also not be `Into<u*>` (or IOW `u*` should not be `From<uptr>`). If `size_of<uptr>() > size_of<usize>()` then basically every conversion from `uptr` to `usize` will be out of bounds due to the capabilities in the high bits, and thus `.addr()` would need to be used to not panic.
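
The proposal above (two max constants, `.addr()`, `with_addr()`) can be sketched with a simulated 128-bit pointer integer. Everything here is hypothetical: `SimUptr`, `from_parts` and `capability_bits` are made up for illustration, the 64/64 split between capability bits and address bits is an assumption, and `from_ptr_and_addr` takes a `SimUptr` rather than a `T*` to keep the model self-contained. It is not the Subspace `uptr`.

```cpp
#include <cstdint>

// Hypothetical model of a CHERI-style pointer integer: 64 address bits plus
// 64 bits standing in for capability metadata.
class SimUptr {
 public:
  // Builds a value from raw parts. A real uptr would come from a pointer.
  static constexpr SimUptr from_parts(std::uint64_t capability,
                                      std::uint64_t address) {
    return SimUptr(capability, address);
  }

  // The two proposed maxima, defined below once the class is complete.
  static const SimUptr MAX_ADDR;         // address bits set, capability zero
  static const SimUptr MAX_BIT_PATTERN;  // every bit set

  // Address part only; the capability bits are dropped.
  constexpr std::uint64_t addr() const { return address_; }
  constexpr std::uint64_t capability_bits() const { return capability_; }

  // New value with the given address and the high bits copied from *this.
  constexpr SimUptr with_addr(std::uint64_t address) const {
    return SimUptr(capability_, address);
  }

  // Sugar for ptr.with_addr(address).
  static constexpr SimUptr from_ptr_and_addr(SimUptr ptr,
                                             std::uint64_t address) {
    return ptr.with_addr(address);
  }

  friend constexpr bool operator==(const SimUptr& a, const SimUptr& b) {
    return a.capability_ == b.capability_ && a.address_ == b.address_;
  }

 private:
  constexpr SimUptr(std::uint64_t capability, std::uint64_t address)
      : capability_(capability), address_(address) {}

  std::uint64_t capability_;
  std::uint64_t address_;
};

inline constexpr SimUptr SimUptr::MAX_ADDR =
    SimUptr::from_parts(0, UINT64_MAX);
inline constexpr SimUptr SimUptr::MAX_BIT_PATTERN =
    SimUptr::from_parts(UINT64_MAX, UINT64_MAX);
```

In this model `SimUptr::MAX_BIT_PATTERN` carries non-zero capability bits, which is the reason writing it back into a pointer would be a problem, while `with_addr()` only ever changes the address half.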