CTSRD-CHERI / llvm-project

Fork of LLVM adding CHERI support

No way to get an address constant in a `constexpr` context in C++ #651

Open davidchisnall opened 2 years ago

davidchisnall commented 2 years ago

Long ago, before we made __capability a proper qualifier, it was possible to declare a global in address space 0 and then take the address of it and get an integer address value. It would be nice to have an idiom for doing this sensibly and lowering to a relocation. For example, something like:

int someGlobal;
ptraddr_t address = (ptraddr_t)&someGlobal; // <- This should be an Integer Constant Expression

We currently fudge this with some inline assembly, but then it can't be an ICE.

jrtc27 commented 2 years ago

That code works today? https://cheri-compiler-explorer.cl.cam.ac.uk/z/WWYdr6

davidchisnall commented 2 years ago

Interesting. It is a C ICE, but it is not constexpr in C++. I've updated the issue title to more accurately reflect our real problem.

jrtc27 commented 2 years ago

Ok, but pointer->integer casts aren't constexpr for non-CHERI either owing to the reinterpret_cast that backs the C-style cast.
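As a rough illustration of the distinction being drawn here, the sketch below uses plain `std::uintptr_t` on a non-CHERI target. The names `globalPointer`, `globalAddress` and `address_now` are made up for this example. It assumes a compiler that follows the standard rule; the rejected line is left commented out so the rest still builds.

```cpp
#include <cassert>
#include <cstdint>

int someGlobal;

// Taking the address of an object with static storage duration is a
// constant expression, so the pointer itself can be constexpr.
constexpr int *globalPointer = &someGlobal;
static_assert(globalPointer == &someGlobal, "pointer comparison is constant");

// The pointer-to-integer conversion is a reinterpret_cast, which the standard
// excludes from constant expressions. Clang rejects the following line:
// constexpr std::uintptr_t bad = reinterpret_cast<std::uintptr_t>(&someGlobal);

// Without constexpr the same initialiser is accepted. Compilers commonly emit
// it as a relocation rather than a dynamic initialiser, but that is an
// implementation choice, not something the language guarantees.
const std::uintptr_t globalAddress = reinterpret_cast<std::uintptr_t>(&someGlobal);

// Converting at run time is always allowed.
std::uintptr_t address_now() {
    return reinterpret_cast<std::uintptr_t>(globalPointer);
}
```

The value is still usable at run time; what is lost is the ability to use it where the language requires a constant expression, such as a `constexpr` variable or a template argument.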

davidchisnall commented 2 years ago

This is the case on CHERI, but this does work for the non-CHERI version of the same code. This even works if you change the C-style cast to an explicit reinterpret_cast.

In the CHERI version, I believe the C-style cast is expanded to something roughly analogous to static_cast<ptraddr_t>(reinterpret_cast<uintptr_t>(ptr)). I am not sure which sequences of casts are allowed in constexpr, but unfortunately there doesn't seem to be any mechanism for getting the address of a global as an integer in a constexpr context in CHERI C++, whereas there is in non-CHERI C++. I'd be quite happy with this being an intrinsic, because the only place where we really want this is in early initialisation code that is very CHERI-specific (it derives the capabilities for things from root capabilities, based on their addresses).
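A minimal sketch of the cast sequence described above, written so that it builds on a non-CHERI target. `addr_t` is a hypothetical stand-in for `ptraddr_t`, and `Region`, `bootGlobal`, `address_of` and `region_end` are invented for illustration. It is not the requested intrinsic: it only shows that the pointer can live in a `constexpr` object while the integer conversion is deferred to run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CHERI's ptraddr_t (an address-sized integer).
using addr_t = std::size_t;

int bootGlobal;

// Assumed shape of what `(ptraddr_t)ptr` expands to:
// pointer -> uintptr_t -> plain address. Not usable in a constant expression.
inline addr_t address_of(const void *ptr) {
    return static_cast<addr_t>(reinterpret_cast<std::uintptr_t>(ptr));
}

// The pointer is a valid constant expression, so a table of regions can be
// constexpr as long as it stores pointers rather than integer addresses.
struct Region {
    const void *base;
    std::size_t size;
};
constexpr Region bootRegion{&bootGlobal, sizeof(bootGlobal)};
static_assert(bootRegion.size == sizeof(int), "size is known at compile time");

// The integer address is only computed when the code runs.
inline addr_t region_end(const Region &r) {
    return address_of(r.base) + r.size;
}
```

This does not help when the address itself must appear in a constant expression, which is the case the issue is about.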

arichardson commented 2 years ago

The example fails if you actually use constexpr for the variables though: https://godbolt.org/z/T5433xfs1

davidchisnall commented 2 years ago

The one on line 12 is the reduced test case from our code.

arichardson commented 2 years ago

GCC seems to be weirdly inconsistent here: it allows constexpr initialization of a struct containing a single ptraddr_t but disallows it for a plain ptraddr_t. Clang is more consistent and rejects both cases: https://godbolt.org/z/bvWbWcvKG

jrtc27 commented 2 years ago

The official rule in the standard is that no reinterpret_cast is a constant expression, and pointer-to-integer casts are reinterpret_casts.