Open saethlin opened 1 month ago
This ends up not mattering, since LLVM also folds inttoptr i64 0 to ptr to null:
define void @src(i64 noundef) {
start:
%1 = inttoptr i64 0 to ptr
%2 = getelementptr i8, ptr %1, i64 %0
store i8 0, ptr %2
ret void
}
With -passes='early-cse,instcombine' -debug:
; EarlyCSE Simplify: %1 = inttoptr i64 0 to ptr to: ptr null
define void @src(i64 noundef %0) {
%1 = getelementptr i8, ptr null, i64 %0
store i8 poison, ptr %1, align 1
ret void
}
Alive2 thinks this is illegal:
https://alive2.llvm.org/ce/z/EJxazG
However, their LangRef also says that:
- An integer constant other than zero or a pointer value returned from a function not defined within LLVM may be associated with address ranges allocated through mechanisms other than those provided by LLVM.
- A pointer value formed from a scalar getelementptr operation is based on the pointer-typed operand of the getelementptr.
- A pointer value formed by an inttoptr is based on all pointer values that contribute (directly or indirectly) to the computation of the pointer’s value.
So they might have a special case where inttoptr i64 0 to ptr has no provenance? I'm not sure what their intended semantics are for that.
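For readers more at home in Rust than in LLVM IR, the shape of @src can be sketched at the source level. This is an illustration under assumptions, not the program from the original report: the helper names `offset_from_zero` and `zero_cast_is_null` are hypothetical, and a plain `as` cast stands in for the int-to-pointer conversion. The sketch only computes addresses and deliberately does not store through the result, since whether that store is defined is exactly what is in question here.

```rust
/// Hypothetical helper mirroring the shape of `@src`: cast the integer 0 to
/// a pointer, then offset it by `addr` bytes (roughly the `inttoptr`
/// followed by a non-inbounds `getelementptr i8`).
pub fn offset_from_zero(addr: usize) -> *mut u8 {
    let base = 0usize as *mut u8; // int-to-pointer cast of the constant 0
    base.wrapping_add(addr) // wrapping_add has no in-bounds requirement
}

/// Hypothetical helper: reports whether the int-to-pointer cast of 0 is
/// indistinguishable from the null pointer when only the address is compared.
pub fn zero_cast_is_null() -> bool {
    (0usize as *const u8).is_null()
}
```

Calling `offset_from_zero(p as usize)` for some valid pointer `p` yields a pointer with the same address as `p`; the open question is what provenance, if any, that result carries once the compiler has replaced the cast of 0 with null.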
> This ends up not mattering
It definitely matters. There's no hope of the full compiler getting this right if rustc corrupts the program. We should have an LLVM bug filed for this; can you file one or link an existing issue?
There appears to be another issue about this in the UCG repo here: https://github.com/rust-lang/unsafe-code-guidelines/issues/507.
Also, I think this is a similar issue about i2p 0 != null: https://github.com/rust-lang/rust/issues/107326
On the LLVM side:
I'll see if I can find any issues about i2p 0 != null. If not I'll open a new issue.
I think the underlying cause is this line of code: https://github.com/llvm/llvm-project/blob/ec78f0da0e9b1b8e2b2323e434ea742e272dd913/llvm/lib/IR/ConstantFold.cpp#L146
if (V->isNullValue() && !DestTy->isX86_AMXTy() &&
    opc != Instruction::AddrSpaceCast)
  return Constant::getNullValue(DestTy);
which gets called by simplifyInstruction on int2ptr.
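The effect of that check can be sketched with a toy model. Everything below is hypothetical and is not LLVM's API: the `CastOp` and `Const` types and the `fold_cast` function are invented for illustration, only two cast opcodes are modeled, and the X86 AMX exception is left out. The sketch just shows that a zero operand of an int-to-pointer cast collapses to the null pointer, while an addrspacecast is left unfolded.

```rust
/// Cast opcodes relevant to this sketch (a small, hypothetical subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CastOp {
    IntToPtr,
    AddrSpaceCast,
}

/// A toy constant: an integer, the null pointer, or an unfolded cast.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Const {
    Int(u64),
    NullPtr,
    Cast(CastOp, Box<Const>),
}

impl Const {
    /// Toy analogue of `isNullValue()`: integer zero or the null pointer.
    fn is_null_value(&self) -> bool {
        matches!(self, Const::Int(0) | Const::NullPtr)
    }
}

/// Toy analogue of the quoted check: a null-valued operand folds to the
/// null value of the destination type unless the cast is an addrspacecast.
/// Both casts modeled here produce pointers, so that null value is `NullPtr`.
pub fn fold_cast(op: CastOp, operand: Const) -> Const {
    if operand.is_null_value() && op != CastOp::AddrSpaceCast {
        return Const::NullPtr;
    }
    // Anything else stays as an unfolded cast expression in this model.
    Const::Cast(op, Box::new(operand))
}
```

In this model `fold_cast(CastOp::IntToPtr, Const::Int(0))` returns `Const::NullPtr`, which is the step that discards any distinction between an int-to-pointer cast of 0 and a literal null.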
Cc @rust-lang/opsem @nikic
Yeah, the fact that inttoptr 0 folds to null is a "well known" issue in LLVM, and as usual, hard to fix :) We do go out of our way not to produce ptrtoint 0 in cases where this is likely to matter in practice.
See also https://github.com/AliveToolkit/alive2/issues/929 for some related discussion from an alive2 perspective. I guess the memset(0) question at the end is resolved on the Rust side by saying that transmuting that memset to a pointer wouldn't have provenance anyway, but on the LLVM side this is still an unresolved issue (cf byte type).
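As a rough illustration of the memset(0) case on the Rust side, the following sketch zero-fills a pointer-sized slot byte by byte and then reads it back as a raw pointer. The function name `pointer_from_zeroed_bytes` is hypothetical, and the sketch only checks the resulting address; that such a pointer carries no provenance is the claim made in the comment above, not something the code can demonstrate.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: build a raw pointer from bytes that were all set to
/// zero, the Rust-level analogue of a memset(0) followed by a pointer load.
pub fn pointer_from_zeroed_bytes() -> *const u8 {
    let mut slot = MaybeUninit::<*const u8>::uninit();
    unsafe {
        // Zero every byte of the pointer-sized slot (count is in units of T).
        slot.as_mut_ptr().write_bytes(0, 1);
        // An all-zero bit pattern is a valid raw pointer value.
        slot.assume_init()
    }
}
```

The returned pointer compares equal to null by address, so at this level it looks the same as both a literal null and an int-to-pointer cast of 0.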
Miri executes this program without error, but I believe our lowering to LLVM IR adds UB:
With optimizations enabled, LLVM cleans up from_exposed_null to
With -Cno-prepopulate-passes, I believe the codegen also returns a pointer that definitely has no provenance, though it's just harder to read. godbolt: https://godbolt.org/z/s414rfK44
I think this happens because codegen is implicitly const-propagating through the int-to-pointer cast via OperandValue. At least I don't see any other way to get from this MIR:
To this LLVM IR
godbolt: https://godbolt.org/z/MjPvcdj9h