chromium / subspace

A concept-centered standard library for C++20, enabling safer and more reliable products and a more modern feel for C++ code. Also home of Subdoc, the code-documentation generator.
https://suslib.cc
Apache License 2.0

Conversion from pointer to integer #238

Closed danakj closed 1 year ago

danakj commented 1 year ago

Right now you have to reinterpret_cast to a primitive value then construct a usize. Yuck.
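For reference, the status quo described above looks roughly like this. It's a minimal sketch only: `Usize` is a hypothetical stand-in for the library's `usize` (not the real type), and `pointer_to_usize` and `example` are names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the library's usize: a thin wrapper over size_t.
struct Usize {
  std::size_t primitive_value;
  constexpr explicit Usize(std::size_t v) : primitive_value(v) {}
};

// The two-step dance: pointer -> primitive integer -> wrapper type.
template <class T>
Usize pointer_to_usize(T* ptr) {
  // Step 1: reinterpret_cast the pointer to a primitive integer.
  const auto raw = reinterpret_cast<std::uintptr_t>(ptr);
  // Step 2: construct the wrapper from the primitive. This narrows on any
  // target where uintptr_t is wider than size_t.
  return Usize(static_cast<std::size_t>(raw));
}

// Usage: the wrapper holds the same integer the raw cast produces
// (assuming size_t and uintptr_t have the same width on this target).
inline void example() {
  int x = 0;
  const Usize u = pointer_to_usize(&x);
  assert(u.primitive_value == reinterpret_cast<std::uintptr_t>(&x));
}
```

The narrowing in step 2 is the part that motivates a separate pointer-sized type.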

Instead template <class T> usize::from(T*)? But usize has size equal to size_t. Should we now introduce uptr before things get out of hand with usize holding pointers?

I think it's clear from reading the Rusty stuff below that there's regret they didn't do this sooner. They would use uptr as the name, and they would have it be the size of a pointer, not of ptraddr_t, which is too theoretical to need support in the stdlib.
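To make the idea concrete, here is a rough sketch of what a pointer-sized integer type with a `from(T*)` conversion could look like. `Uptr`, `to_pointer`, `primitive_value` and `example_round_trip` are hypothetical names for illustration, not the library's actual API, and the `static_assert` assumes `uintptr_t` is exactly pointer-sized, which holds on mainstream targets but isn't guaranteed by the standard.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a pointer-sized integer type; not the library's uptr.
class Uptr {
 public:
  // Build from any object pointer, keeping the full pointer-sized bit pattern.
  template <class T>
  static Uptr from(T* ptr) {
    return Uptr(reinterpret_cast<std::uintptr_t>(ptr));
  }

  // Convert back to a pointer of the requested type.
  template <class T>
  T* to_pointer() const {
    return reinterpret_cast<T*>(value_);
  }

  std::uintptr_t primitive_value() const { return value_; }

 private:
  explicit Uptr(std::uintptr_t v) : value_(v) {}
  std::uintptr_t value_;
};

// Sized like a pointer, not like size_t or an address-only type.
// Assumption: uintptr_t has the same size as void* on this target.
static_assert(sizeof(Uptr) == sizeof(void*));

// Usage: a pointer survives the trip through Uptr unchanged.
inline void example_round_trip() {
  int x = 7;
  const Uptr p = Uptr::from(&x);
  assert(p.to_pointer<int>() == &x);
}
```

Keeping the constructor private means the only way in is through a pointer, so the type can't be built from a bare integer by accident.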

TODO:

Rust project thoughts

Here are Rusty thoughts on it: https://github.com/rust-lang/rust/issues/95228 Highlights:

Rust stdlib docs on it: https://doc.rust-lang.org/std/ptr/#pointer-vs-addresses

Divergence from u* integer types.

MAX

Of interest here is that if uintptr_t holds 128 bits but UINTPTR_MAX reports (2^64-1), we will need to make uptr::MAX not self-consistent (ouch), be very clear in the docs that uptr::MAX is not the same as UINTPTR_MAX, or just not have uptr::MAX at all. If uptr::MAX is self-consistent and returns (2^128-1), any rewriting of UINTPTR_MAX into a Subspace constant should be rewritten to usize::MAX, which will convert up to the larger uptr::MAX if needed, unless the authors were sure they wanted to include the non-address part of the pointer in their max. Having to go through usize seems like the wrong path though; there should be a clearer, well-lit Good Path.

Another possibility here would be to have two max values on uptr instead, like MAX_ADDR and MAX_BIT_PATTERN (the names used in the proposal below).

FWIW usize::MAX is not really a valid value for usize for the thing it is intended for, which is offsets into arrays, as the max size of any allocation/array is isize::MAX, a 31-bit number on a 32-bit target. But usize can also just be used as "an integer", so MAX is not (2^31-1). That said, uptr::MAX being uptr::MAX_BIT_PATTERN feels much worse, as it will directly cause bad CHERI things to occur when converted back to a pointer, whereas an x_usize >= 2^31 can be caught by bounds checks (or compiler checks for Array sizes) if misused. So I don't see this as a strong argument for having uptr::MAX == MAX_BIT_PATTERN.

Proposal

Let's start with MAX_ADDR and MAX_BIT_PATTERN and drop MAX for now (and also have MAX_ADDR_PRIMITIVE and MAX_BIT_PATTERN_PRIMITIVE).
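A rough sketch of how the two primitive constants might be defined, to show the distinction. `UptrLimits` is a hypothetical holder struct, and mapping MAX_ADDR_PRIMITIVE to UINTPTR_MAX is an assumption drawn from the discussion above, not a confirmed implementation. On ordinary targets the two values are equal; they would only differ where uintptr_t carries non-address bits, as described for CHERI.

```cpp
#include <cstdint>

// Hypothetical holder for the proposed constants; not the library's uptr.
struct UptrLimits {
  // Largest address value. Assumption: this is what UINTPTR_MAX reports,
  // even on targets where uintptr_t has extra non-address bits.
  static constexpr std::uintptr_t MAX_ADDR_PRIMITIVE = UINTPTR_MAX;

  // Every bit of the underlying integer set, address bits or not.
  static constexpr std::uintptr_t MAX_BIT_PATTERN_PRIMITIVE =
      static_cast<std::uintptr_t>(~std::uintptr_t{0});
};

// The address-only max can never exceed the all-bits-set pattern.
static_assert(UptrLimits::MAX_ADDR_PRIMITIVE <=
              UptrLimits::MAX_BIT_PATTERN_PRIMITIVE);
```

Having no plain MAX forces callers to say which of the two meanings they want.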

Conversions

danakj commented 1 year ago

Done by https://github.com/chromium/subspace/pull/284

danakj commented 1 year ago

RangeFrom needs to return a lower bound for its size_hint which is normally T::MAX. But if you constructed a RangeFrom over uptr then that doesn't work cuz there's no MAX.

MAX_ADDR would do the wrong thing, cuz that just excludes capabilities from its value, but pointer values may be larger with the capabilities present on CHERI.

MAX_BIT_PATTERN is the closest thing to correct here. If you iterate so far that you are incrementing past MAX_ADDR in the address part of the integer then you've done bad things to your pointer, but that's where the iteration would technically go.

Here MAX being 2^N-1 (not matching uintptr_t on CHERI) would be the most correct thing to use, and it's annoying in this case that it does not exist. There's nothing... better.

Edit: Well, it turns out uptr is weird in lots of ways, so it doesn't have things like try_from(unsigned) cuz it needs to be constructed with provenance, so this use case evaporated.

Maybe uptr shouldn't be included in Integer concept though..