CTSRD-CHERI / cheri-c-programming

CHERI C/C++ Programming Guide
28 stars 3 forks source link

Find a better alternative to vaddr_t and actually define it in cheriintrin.h #8

Closed kevin-brodsky-arm closed 3 years ago

kevin-brodsky-arm commented 4 years ago

The programming guide currently uses vaddr_t in the prototype of many of the cheri_* functions, but cheriintrin.h does not declare it or use it in any way. Further discussions showed that vaddr_t is probably not a good choice because many codebases already define a type with that name. It would be great to find another, less overloaded name, which could actually be exposed in cheriintrin.h.

ruben-arm commented 4 years ago

One of the possible names: ptraddr_t - it isn't used much nowadays, while seems to reflect the essence of the underlying concept being introduced.

arichardson commented 4 years ago

I did a quick search on github for possible names:

There are probably more variations we could try, but ptraddr_t sounds reasonable.

bsdjhb commented 4 years ago

I like ptraddr_t. In particular not all addresses in pointers are virtual in some cases (M-mode caps in RISC-V but we have also talked about physical memory capabilities for DMA, etc.)

brooksdavis commented 4 years ago

ptraddr_t works for me and has a clear meaning (although what about MTE).

Do we need signed and unsigned variants?

arichardson commented 4 years ago

ptraddr_t works for me and has a clear meaning (although what about MTE).

Do we need signed and unsigned variants?

Do we want to support architectures where the signed variant is not ptrdiff_t?

jrtc27 commented 4 years ago

ptraddr_t works for me and has a clear meaning (although what about MTE). Do we need signed and unsigned variants?

Do we want to support architectures where the signed variant is not ptrdiff_t?

ptrdiff_t only needs to hold values for valid pointer subtractions (and even then is allowed to not cover all possible values, since technically you can have a 3 GiB object on a 32-bit system...), and valid pointer subtractions are ones where both pointers involved point to the same object (one-past-the-end allowed as normal). So on segmented architectures it is permissible to have sizeof(ptrdiff_t) < sizeof(ptraddr_t); IA-64 comes to mind as an example where the segment ("region") is part of the seemingly-linear address, and so in theory could have used a smaller ptrdiff_t, but in practice the region number is only 3 bits so you still need to round up to 64 bits. But ptrsaddr_t (sptraddr_t?) is a conceptually different thing from ptrdiff_t, even if in practice they are the same thing, just like ptraddr_t is likely to be the same as size_t, and ptrsaddr_t also the same as ssize_t (the latter being particularly weird; the only negative number it's required to be able to store is -1).

arichardson commented 4 years ago

Yes, I realize that ptrdiff_t is not the same thing. However I do wonder what the meaning of a "signed address" is supposed to be. It might make signed comparisons easier but maybe you should convert to an appropriately sized signed type instead.

brooksdavis commented 4 years ago

I guess the question is, what does the subtraction of two ptraddr_ts notionally produce?

(IIRC @nwf mentioned a radically segmented architecture with something like 32-bit addresses and 16-bit ptrdiff_t's, but that's not especially relevent here.)

bsdjhb commented 4 years ago

I think for subtraction, since this is an integer type, it can still return the same type (e.g. subtracting of two size_t values gives a size_t value). I don't think this is any different.

I do think Jess is right in that the only real use cases for a signed ptraddr_t is probably -1 cookie values.

I'd probably prefer the "default" name to be unsigned (like size_t). I don't have a strong opinion on where to put the 's': ptrsaddr_t vs sptraddr_t. Most existing typs I can think of use prefixes (uintptr_t and ssize_t come to mind). Verbally I'd probably say "signed pointer address" rather than "pointer signed address" which favors sptraddr_t.

arichardson commented 4 years ago

I'm not convinced we need a signed variant. The -1 magic value can still be used with an unsigned type and appropriate casts or a __PTRADDR_MAX__ define? However, I don't feel strongly about this and I'd be happy to add sptraddr_t to stddef.h if anyone can give me an example where this would be useful.

jrtc27 commented 4 years ago

Whether or not they should use it, people will want it. Let's not make the same mistake as C99 did by defining size_t but not ssize_t making the latter a POSIX extension. People might also want it as an int/address union in the same way they currently use uintptr_t and intptr_t depending on what types of integers they want to store.

jrtc27 commented 4 years ago

I believe this has been addressed for a while now, the only outstanding issue is whether we want explicit sptraddr_t/uptraddr_t. Currently ptraddr_t is unsigned and I don't think we want to go down the implementation-defined char route, so should we take a size_t-like approach and define an sptraddr_t?

bsdjhb commented 4 years ago

I think sptraddr_t following the convention of size_t and ssize_t is fine. I do think someone, somewhere will want to return -1 as an error value for some function that otherwise returns ptraddr_t.

jrtc27 commented 4 years ago

Arguably the correct return value there would be whatever (ptraddr_t)NULL is (0 in any sane modern architecture) as -1 could be a valid address (and < 0 as an error check breaks in most kernels) but we should give people the flexibility to use it if it's what they really want (e.g. an alternative use could be to distinguish user vs kernel addresses on some architectures fairly concisely).

arichardson commented 4 years ago

I agree that (ptraddr_t)NULL seems like the more sensible error return value. I don't really see much need for sptraddr_t since we already have intmax_t/intprt_t for signed int/address unions and comparisons < 0. I don't object to adding it but I won't make that change.

We should not add uptraddr_t and mandate that ptraddr_t should be unsigned. If someone feels like typing another character they can always add the typedef :)

kevin-brodsky-arm commented 3 years ago

I've just stumbled upon this issue again, LLVM and the guide itself are now using ptraddr_t, and so is Android / Morello. I guess we can now close that issue?

jrtc27 commented 3 years ago

sptraddr_t doesn't yet exist and I have run into a case where I've wanted it.

kevin-brodsky-arm commented 3 years ago

OK, maybe it would be better to open a new issue for it then because that one doesn't originally cover sptraddr_t.