I ran out of time to debug this but I've got a pretty good idea of what's wrong.
The SROA pass kicks in and becomes confused by the `is_weak: bool` field. The pass gets something wrong (without tripping any assertion) and ends up writing the `is_weak` part of `.temp_buffer = .{ .index = 0, .is_weak = false },` into the slice `len` field of `result`.
Running the following LLVM IR code with `opt -debug -S -O3 < file.ll` gives something like this (I've highlighted the important parts):
%0 = alloca %"struct:96:29", align 8
%old.sroa.2.0.sroa_cast = bitcast i8* %old.sroa.2 to i1*
store i1 true, i1* %old.sroa.2.0.sroa_cast, align 8
%result.sroa.0.0..sroa_cast = bitcast %"struct:96:29"* %0 to i64*
store i64 or (i64 shl (i64 zext (i56 trunc (i64 lshr (i64 ptrtoint ([20 x i8]* @0 to i64), i64 8) to i56) to i64), i64 8), i64 zext (i8 ptrtoint ([20 x i8]* @0 to i8) to i64)), i64* %result.sroa.0.0..sroa_cast, align 8
; This is where the `len` parameter is smashed
%result.sroa.6.0..sroa_idx21 = getelementptr inbounds %"struct:96:29", %"struct:96:29"* %0, i64 0, i32 0, i32 1
%result.sroa.6.0..sroa_cast = bitcast i64* %result.sroa.6.0..sroa_idx21 to i1*
store i1 true, i1* %result.sroa.6.0..sroa_cast, align 8
; ..snip..
%7 = bitcast %"struct:96:29"* %0 to %"[]u8"*
call void @llvm.memset.p0i8.i64(i8* nonnull align 8 dereferenceable(32) %4, i8 0, i64 32, i1 false) #7
; This is where things go south, the ptr part of the slice is fine while the length becomes 1
%8 = call fastcc i16 @std.fmt.formatType(%"[]u8"* %7, %std.fmt.FormatOptions* %options.i.i.i, %"std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write)"* %stderr.i) #7
Having someone from the LLVM team confirm this hypothesis and take a look would be helpful.
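To make the symptom concrete, here is a small Python model of what a misplaced one-byte store does to a slice. It is only a sketch under stated assumptions: the slice is laid out as `{ ptr, len }` with 8 bytes each in little-endian order (matching the x86_64 target), the `i1` store is lowered to a single-byte write of 0 or 1, and the pointer value `0x1000` and the helper names are made up for illustration.

```python
import struct

# Model of a `[]u8` slice on x86_64: 8-byte pointer followed by 8-byte length,
# both little-endian.
SLICE_FMT = "<QQ"


def make_slice(ptr, length):
    """Pack a (ptr, len) pair into a 16-byte buffer."""
    return bytearray(struct.pack(SLICE_FMT, ptr, length))


def store_bool_at(buf, offset, value):
    """Assumption: an `i1` store becomes a one-byte write of 0 or 1."""
    buf[offset] = 1 if value else 0


def read_slice(buf):
    """Unpack the buffer back into a (ptr, len) tuple."""
    return struct.unpack(SLICE_FMT, bytes(buf))


# Hypothetical address; 19 is the length of the string in the test case.
buf = make_slice(ptr=0x1000, length=19)
# The misplaced `store i1 true` lands on offset 8, the first byte of `len`.
store_bool_at(buf, 8, True)
ptr, length = read_slice(buf)
print(hex(ptr), length)
```

In this model the pointer survives and the length collapses to 1, which is the same shape as the corruption described above. It says nothing about why SROA emits the store at that offset.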
0.9.0 milestone, haha, ouch.
LemonBoy are you saying that this is an LLVM bug, not a Zig compiler bug?
It really sounds like one. You can (hopefully) work around it by making `is_weak` a `u8`.
You're right, my project is "fixed" by changing that field from a bool to a u8. It doesn't give me a lot of confidence though.
Do you think the problem is specific to boolean struct fields getting returned through (possibly multiple layers of) result locations?
Hard to tell, the problem disappears if you touch any part of it. You've managed to line-up all the pieces in perfect order to let the over-eager optimization pass kick in.
Good news (?), I've managed to pin-point the bad code generation to an unfortunate interaction between the inliner pass and the SROA one. The bad news is that I don't have enough time to dig into the pass implementation, it's better to punt this bug to the LLVM developers.
Here's a nice not-so-reduced test case, I've already run it through `opt -inline`.
You can observe the bug by running:
opt-11 -sroa <.ll file> | lli-11 # The output is corrupted
lli-11 <.ll file> # The output is fine
I managed to shrink the test case down to ~200LoC (from ~7k), this is now ready to be reported upstream (@andrewrk can you open a ticket for this?).
LLVM IR source:
; ModuleID = '/tmp/foo.inl.ll'
source_filename = "foo"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
%CallArg = type { %Expression }
%Expression = type { %EnumLiteral, i1 }
%EnumLiteral = type { %"[]u8" }
%"[]u8" = type { i8*, i64 }
%ExpressionResult = type { %TempRef, i2 }
%TempRef = type { i64, i1 }
%"struct:78:50" = type { %"[]u8" }
%"struct:132:52" = type { %"[]u8" }
@0 = internal unnamed_addr constant [20 x i8] c"what happened to me\00", align 1
@1 = internal unnamed_addr constant %CallArg { %Expression { %EnumLiteral { %"[]u8" { i8* getelementptr inbounds ([20 x i8], [20 x i8]* @0, i64 0, i64 0), i64 19 } }, i1 true } }, align 8
; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0
; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
declare { i64, i1 } @llvm.uadd.with.overflow.i64(i64, i64) #1
; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
declare { i64, i1 } @llvm.ssub.with.overflow.i64(i64, i64) #1
define internal fastcc void @main() unnamed_addr {
Entry:
%v.i = alloca %EnumLiteral, align 8
%result.i = alloca %ExpressionResult, align 8
%w.i = alloca %"[]u8", align 8
%0 = alloca %"struct:78:50", align 8
%arg = alloca %CallArg, align 8
%result = alloca %ExpressionResult, align 8
%w = alloca %"[]u8", align 8
%1 = alloca %"struct:132:52", align 8
%derp = alloca %ExpressionResult, align 8
%2 = bitcast %CallArg* %arg to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* bitcast (%CallArg* @1 to i8*), i64 24, i1 false)
%3 = getelementptr inbounds %CallArg, %CallArg* %arg, i32 0, i32 0
%4 = getelementptr inbounds %Expression, %Expression* %3, i32 0, i32 1
%5 = load i1, i1* %4, align 1
switch i1 %5, label %SwitchElse3.i [
i1 true, label %SwitchProng1.i
]
SwitchProng1.i: ; preds = %Entry
%6 = getelementptr inbounds %Expression, %Expression* %3, i32 0, i32 0
%7 = bitcast %EnumLiteral* %6 to i8*
%8 = bitcast %EnumLiteral* %v.i to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %8, i8* %7, i64 16, i1 false)
%9 = getelementptr inbounds %ExpressionResult, %ExpressionResult* %result.i, i32 0, i32 1
store i2 -2, i2* %9, align 1
%10 = getelementptr inbounds %ExpressionResult, %ExpressionResult* %result.i, i32 0, i32 0
%11 = bitcast %TempRef* %10 to %"[]u8"*
%12 = getelementptr inbounds %EnumLiteral, %EnumLiteral* %v.i, i32 0, i32 0
%13 = bitcast %"[]u8"* %12 to i8*
%14 = bitcast %"[]u8"* %11 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %14, i8* %13, i64 16, i1 false)
br label %SwitchProng2.i
SwitchProng2.i: ; preds = %SwitchProng1.i
%15 = getelementptr inbounds %ExpressionResult, %ExpressionResult* %result.i, i32 0, i32 0
%16 = bitcast %TempRef* %15 to %"[]u8"*
%17 = bitcast %"[]u8"* %16 to i8*
%18 = bitcast %"[]u8"* %w.i to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %18, i8* %17, i64 16, i1 false)
%19 = getelementptr inbounds %"struct:78:50", %"struct:78:50"* %0, i32 0, i32 0
%20 = bitcast %"[]u8"* %w.i to i8*
%21 = bitcast %"[]u8"* %19 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %21, i8* %20, i64 16, i1 false)
%22 = getelementptr inbounds %"struct:78:50", %"struct:78:50"* %0, i32 0, i32 0
call fastcc void @std.os.write(i32 1, %"[]u8"* %22)
br label %ErrRetContinue.i.i.i
ErrRetContinue.i.i.i: ; preds = %SwitchProng2.i
br label %f.exit.i.i
f.exit.i.i: ; preds = %ErrRetContinue.i.i.i
br label %UnwrapErrError.i.i
UnwrapErrError.i.i: ; preds = %f.exit.i.i
br label %print.18.exit.i
print.18.exit.i: ; preds = %UnwrapErrError.i.i
br label %SwitchEnd.i
SwitchEnd.i: ; preds = %print.18.exit.i
%23 = bitcast %ExpressionResult* %result.i to i8*
%24 = bitcast %ExpressionResult* %result to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %24, i8* %23, i64 24, i1 false)
br label %genExpression.exit
SwitchElse3.i: ; preds = %Entry
ret void
genExpression.exit: ; preds = %SwitchEnd.i
%25 = getelementptr inbounds %ExpressionResult, %ExpressionResult* %result, i32 0, i32 1
%26 = load i2, i2* %25, align 1
switch i2 %26, label %SwitchElse [
i2 -2, label %SwitchProng
]
SwitchElse: ; preds = %genExpression.exit
ret void
SwitchProng: ; preds = %genExpression.exit
%27 = getelementptr inbounds %ExpressionResult, %ExpressionResult* %result, i32 0, i32 0
%28 = bitcast %TempRef* %27 to %"[]u8"*
%29 = bitcast %"[]u8"* %28 to i8*
%30 = bitcast %"[]u8"* %w to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %30, i8* %29, i64 16, i1 false)
%31 = getelementptr inbounds %"struct:132:52", %"struct:132:52"* %1, i32 0, i32 0
%32 = bitcast %"[]u8"* %w to i8*
%33 = bitcast %"[]u8"* %31 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %33, i8* %32, i64 16, i1 false)
%34 = getelementptr inbounds %"struct:132:52", %"struct:132:52"* %1, i32 0, i32 0
call fastcc void @std.os.write(i32 1, %"[]u8"* %34)
ret void
SwitchProng1.i3: ; No predecessors!
%35 = getelementptr inbounds %ExpressionResult, %ExpressionResult* %derp, i32 0, i32 0
%36 = getelementptr inbounds %TempRef, %TempRef* %35, i32 0, i32 1
store i1 false, i1* %36, align 1
ret void
SwitchProng2.i4: ; No predecessors!
%37 = bitcast %ExpressionResult* %result to i8*
%38 = bitcast %ExpressionResult* %derp to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %38, i8* %37, i64 24, i1 false)
ret void
}
define internal fastcc void @std.os.write(i32 %0, %"[]u8"* %1) unnamed_addr {
Entry:
%result.i8 = alloca i64, align 8
%y.i = alloca i64, align 8
%number.i.i = alloca i64, align 8
%arg2.i.i = alloca i64, align 8
%arg3.i.i = alloca i64, align 8
%buf.i = alloca i8*, align 8
%count.i = alloca i64, align 8
%adjusted_len = alloca i64, align 8
%2 = getelementptr inbounds %"[]u8", %"[]u8"* %1, i32 0, i32 1
%3 = load i64, i64* %2, align 8
store i64 %3, i64* %y.i, align 8
br label %Else.i10
Else.i10: ; preds = %Entry
%4 = load i64, i64* %y.i, align 8
store i64 %4, i64* %result.i8, align 8
%5 = load i64, i64* %result.i8, align 8
br label %std.math.min.exit
std.math.min.exit: ; preds = %Else.i10
store i64 %5, i64* %adjusted_len, align 8
br label %WhileCond
WhileCond: ; preds = %std.math.min.exit
br label %WhileBody
WhileBody: ; preds = %WhileCond
%6 = getelementptr inbounds %"[]u8", %"[]u8"* %1, i32 0, i32 0
%7 = load i8*, i8** %6, align 8
%8 = load i64, i64* %adjusted_len, align 8
store i8* %7, i8** %buf.i, align 8
store i64 %8, i64* %count.i, align 8
%9 = load i8*, i8** %buf.i, align 8
%10 = ptrtoint i8* %9 to i64
%11 = load i64, i64* %count.i, align 8
store i64 1, i64* %number.i.i, align 8
store i64 %10, i64* %arg2.i.i, align 8
store i64 %11, i64* %arg3.i.i, align 8
%12 = load i64, i64* %number.i.i, align 8
%13 = load i64, i64* %arg2.i.i, align 8
%14 = load i64, i64* %arg3.i.i, align 8
%15 = call i64 asm sideeffect "syscall", "={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11},~{memory},~{dirflag},~{fpsr},~{flags}"(i64 %12, i64 undef, i64 %13, i64 %14)
ret void
}
attributes #0 = { argmemonly nofree nosync nounwind willreturn }
attributes #1 = { nofree nosync nounwind readnone speculatable willreturn }
The different behaviour can be observed by running
opt-11 -sroa <.ll file> | lli-11 # The output is corrupted
lli-11 <.ll file> # The output is fine
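The two commands above amount to a differential test: run the module as-is, run it again after `-sroa`, and compare what gets printed. A minimal Python sketch of that comparison follows. The tool names `opt-11` and `lli-11` and the `-sroa` flag come from the commands above; the function names, the dictionary keys, and the choice to return `None` when the tools are missing are assumptions made for illustration.

```python
import shlex
import shutil
import subprocess


def build_pipelines(ll_path, opt="opt-11", lli="lli-11"):
    """Return the two shell pipelines from the reproduction steps."""
    quoted = shlex.quote(ll_path)
    return {
        "plain": f"{lli} {quoted}",
        "sroa": f"{opt} -sroa {quoted} | {lli}",
    }


def run_pipelines(ll_path):
    """Run both pipelines and collect stdout; None if the tools are absent."""
    if shutil.which("opt-11") is None or shutil.which("lli-11") is None:
        return None
    outputs = {}
    for name, cmd in build_pipelines(ll_path).items():
        done = subprocess.run(cmd, shell=True, capture_output=True)
        outputs[name] = done.stdout
    return outputs


def miscompiled(outputs):
    """True when the SROA'd module prints something different."""
    return outputs["plain"] != outputs["sroa"]
```

Calling `run_pipelines("foo.ll")` and passing the result to `miscompiled` would flag the corrupted output without eyeballing it, assuming both tools are on `PATH`.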
(@andrewrk can you open a ticket for this?).
Sure thing. Upstream bug report
This is either over-reduced, or the original IR is already broken, because alive2 says that the transformation is valid:
$ /repositories/alive2/build-Clang-release/alive-tv input.ll output.ll
----------------------------------------
@1 = constant 24 bytes, align 8
@0 = constant 20 bytes, align 1
define void @main() {
#init:
%__constexpr_1 = gep inbounds * @0, 20 x i64 0, 1 x i64 0
%__copy_0 = {*, i64} { %__constexpr_1, 19 }
%__copy_1 = {{*, i64}} { %__copy_0 }
%__copy_2 = {{{*, i64}}, i1, i56} { %__copy_1, 1, [padding] }
%__copy_3 = {{{{*, i64}}, i1, i56}} { %__copy_2 }
store [20 x i8] { 119, 104, 97, 116, 32, 104, 97, 112, 112, 101, 110, 101, 100, 32, 116, 111, 32, 109, 101, 0 }, * @0, align 1
store {{{{*, i64}}, i1, i56}} %__copy_3, * @1, align 8
br label %Entry
%Entry:
%v.i = alloca i64 16, align 8
%result.i = alloca i64 24, align 8
%w.i = alloca i64 16, align 8
%0 = alloca i64 16, align 8
%arg = alloca i64 24, align 8
%result = alloca i64 24, align 8
%w = alloca i64 16, align 8
%1 = alloca i64 16, align 8
%derp = alloca i64 24, align 8
%2 = bitcast * %arg to *
%__constexpr_0 = bitcast * @1 to *
memcpy * %2 align 1, * %__constexpr_0 align 1, i64 24
%3 = gep inbounds * %arg, 24 x i32 0, 1 x i64 0
%4 = gep inbounds * %3, 24 x i32 0, 1 x i64 16
%5 = load i1, * %4, align 1
switch i1 %5, label %SwitchElse3.i [
i1 1, label %SwitchProng1.i
]
%SwitchProng1.i:
%10 = gep inbounds * %3, 24 x i32 0, 1 x i64 0
%11 = bitcast * %10 to *
%12 = bitcast * %v.i to *
memcpy * %12 align 1, * %11 align 1, i64 16
%13 = gep inbounds * %result.i, 24 x i32 0, 1 x i64 16
store i2 2, * %13, align 1
%14 = gep inbounds * %result.i, 24 x i32 0, 1 x i64 0
%15 = bitcast * %14 to *
%16 = gep inbounds * %v.i, 16 x i32 0, 1 x i64 0
%17 = bitcast * %16 to *
%18 = bitcast * %15 to *
memcpy * %18 align 1, * %17 align 1, i64 16
br label %SwitchProng2.i
%SwitchProng2.i:
%19 = gep inbounds * %result.i, 24 x i32 0, 1 x i64 0
%20 = bitcast * %19 to *
%21 = bitcast * %20 to *
%22 = bitcast * %w.i to *
memcpy * %22 align 1, * %21 align 1, i64 16
%23 = gep inbounds * %0, 16 x i32 0, 1 x i64 0
%24 = bitcast * %w.i to *
%25 = bitcast * %23 to *
memcpy * %25 align 1, * %24 align 1, i64 16
%26 = gep inbounds * %0, 16 x i32 0, 1 x i64 0
call void @std.os.write(i32 1, * %26)
br label %ErrRetContinue.i.i.i
%ErrRetContinue.i.i.i:
br label %f.exit.i.i
%f.exit.i.i:
br label %UnwrapErrError.i.i
%UnwrapErrError.i.i:
br label %print.18.exit.i
%print.18.exit.i:
br label %SwitchEnd.i
%SwitchEnd.i:
%27 = bitcast * %result.i to *
%28 = bitcast * %result to *
memcpy * %28 align 1, * %27 align 1, i64 24
br label %genExpression.exit
%genExpression.exit:
%29 = gep inbounds * %result, 24 x i32 0, 1 x i64 16
%30 = load i2, * %29, align 1
switch i2 %30, label %SwitchElse [
i2 2, label %SwitchProng
]
%SwitchElse:
ret void
%SwitchProng:
%31 = gep inbounds * %result, 24 x i32 0, 1 x i64 0
%32 = bitcast * %31 to *
%33 = bitcast * %32 to *
%34 = bitcast * %w to *
memcpy * %34 align 1, * %33 align 1, i64 16
%35 = gep inbounds * %1, 16 x i32 0, 1 x i64 0
%36 = bitcast * %w to *
%37 = bitcast * %35 to *
memcpy * %37 align 1, * %36 align 1, i64 16
%38 = gep inbounds * %1, 16 x i32 0, 1 x i64 0
call void @std.os.write(i32 1, * %38)
ret void
%SwitchElse3.i:
ret void
}
=>
@1 = constant 24 bytes, align 8
@0 = constant 20 bytes, align 1
define void @main() {
#init:
%__constexpr_8 = gep inbounds * @0, 20 x i64 0, 1 x i64 0
%__copy_0 = {*, i64} { %__constexpr_8, 19 }
%__copy_1 = {{*, i64}} { %__copy_0 }
%__copy_2 = {{{*, i64}}, i1, i56} { %__copy_1, 1, [padding] }
%__copy_3 = {{{{*, i64}}, i1, i56}} { %__copy_2 }
store [20 x i8] { 119, 104, 97, 116, 32, 104, 97, 112, 112, 101, 110, 101, 100, 32, 116, 111, 32, 109, 101, 0 }, * @0, align 1
store {{{{*, i64}}, i1, i56}} %__copy_3, * @1, align 8
br label %Entry
%Entry:
%v.i.sroa.2 = alloca i64 1, align 8
%v.i.sroa.3 = alloca i64 7, align 1
%result.i.sroa.4 = alloca i64 7, align 1
%result.i.sroa.7 = alloca i64 7, align 1
%w.i.sroa.2 = alloca i64 1, align 8
%w.i.sroa.3 = alloca i64 7, align 1
%0 = alloca i64 16, align 8
%arg.sroa.0.sroa.3 = alloca i64 7, align 1
%arg.sroa.3 = alloca i64 7, align 1
%result.sroa.4 = alloca i64 7, align 1
%result.sroa.6 = alloca i64 7, align 1
%w.sroa.2 = alloca i64 1, align 8
%w.sroa.3 = alloca i64 7, align 1
%1 = alloca i64 16, align 8
%derp.sroa.2.sroa.0 = alloca i64 7, align 1
%derp.sroa.2.sroa.1 = alloca i64 1, align 1
%derp.sroa.2.sroa.2 = alloca i64 7, align 1
%__constexpr_0 = bitcast * @1 to *
%arg.sroa.0.sroa.0.0.copyload = load i64, * %__constexpr_0, align 1
%__constexpr_2 = gep inbounds * @1, 24 x i64 0, 1 x i64 0, 1 x i64 0, 1 x i64 0, 1 x i64 8
%__constexpr_1 = bitcast * %__constexpr_2 to *
%arg.sroa.0.sroa.2.0.copyload = load i8, * %__constexpr_1, align 1
%arg.sroa.0.sroa.3.0..sroa_idx = gep inbounds * %arg.sroa.0.sroa.3, 7 x i64 0, 1 x i64 0
%__constexpr_4 = bitcast * @1 to *
%__constexpr_3 = gep inbounds * %__constexpr_4, 1 x i64 9
memcpy * %arg.sroa.0.sroa.3.0..sroa_idx align 1, * %__constexpr_3 align 1, i64 7
%__constexpr_5 = gep inbounds * @1, 24 x i64 0, 1 x i64 0, 1 x i64 16
%arg.sroa.2.0.copyload = load i1, * %__constexpr_5, align 1
%arg.sroa.3.0..sroa_idx = gep inbounds * %arg.sroa.3, 7 x i64 0, 1 x i64 0
%__constexpr_7 = bitcast * @1 to *
%__constexpr_6 = gep inbounds * %__constexpr_7, 1 x i64 17
memcpy * %arg.sroa.3.0..sroa_idx align 1, * %__constexpr_6 align 1, i64 7
switch i1 %arg.sroa.2.0.copyload, label %SwitchElse3.i [
i1 1, label %SwitchProng1.i
]
%SwitchProng1.i:
store i8 %arg.sroa.0.sroa.2.0.copyload, * %v.i.sroa.2, align 8
%arg.sroa.0.sroa.3.9.v.i.sroa.3.0..sroa_idx.sroa_idx = gep inbounds * %v.i.sroa.3, 7 x i64 0, 1 x i64 0
%arg.sroa.0.sroa.3.9..sroa_idx = gep inbounds * %arg.sroa.0.sroa.3, 7 x i64 0, 1 x i64 0
memcpy * %arg.sroa.0.sroa.3.9.v.i.sroa.3.0..sroa_idx.sroa_idx align 1, * %arg.sroa.0.sroa.3.9..sroa_idx align 1, i64 7
%v.i.sroa.2.0..sroa_cast = bitcast * %v.i.sroa.2 to *
%v.i.sroa.2.0.v.i.sroa.2.8.result.i.sroa.3.0.copyload = load i1, * %v.i.sroa.2.0..sroa_cast, align 8
%v.i.sroa.3.9.result.i.sroa.4.0..sroa_idx.sroa_idx = gep inbounds * %result.i.sroa.4, 7 x i64 0, 1 x i64 0
%v.i.sroa.3.9..sroa_idx = gep inbounds * %v.i.sroa.3, 7 x i64 0, 1 x i64 0
memcpy * %v.i.sroa.3.9.result.i.sroa.4.0..sroa_idx.sroa_idx align 1, * %v.i.sroa.3.9..sroa_idx align 1, i64 7
br label %SwitchProng2.i
%SwitchProng2.i:
%w.i.sroa.2.0..sroa_cast22 = bitcast * %w.i.sroa.2 to *
store i1 %v.i.sroa.2.0.v.i.sroa.2.8.result.i.sroa.3.0.copyload, * %w.i.sroa.2.0..sroa_cast22, align 8
%w.i.sroa.3.9.result.i.sroa.4.0..sroa_idx21.sroa_idx = gep inbounds * %result.i.sroa.4, 7 x i64 0, 1 x i64 0
%w.i.sroa.3.9..sroa_idx = gep inbounds * %w.i.sroa.3, 7 x i64 0, 1 x i64 0
memcpy * %w.i.sroa.3.9..sroa_idx align 1, * %w.i.sroa.3.9.result.i.sroa.4.0..sroa_idx21.sroa_idx align 1, i64 7
%w.i.sroa.0.0..sroa_cast = bitcast * %0 to *
store i64 %arg.sroa.0.sroa.0.0.copyload, * %w.i.sroa.0.0..sroa_cast, align 1
%w.i.sroa.2.0..sroa_idx = gep inbounds * %0, 16 x i64 0, 1 x i64 0, 1 x i64 8
%w.i.sroa.2.0..sroa_cast = bitcast * %w.i.sroa.2.0..sroa_idx to *
%w.i.sroa.2.0.w.i.sroa.2.0.copyload = load i8, * %w.i.sroa.2, align 8
store i8 %w.i.sroa.2.0.w.i.sroa.2.0.copyload, * %w.i.sroa.2.0..sroa_cast, align 1
%w.i.sroa.3.0..sroa_raw_cast = bitcast * %0 to *
%w.i.sroa.3.0..sroa_raw_idx = gep inbounds * %w.i.sroa.3.0..sroa_raw_cast, 1 x i64 9
%w.i.sroa.3.0..sroa_idx = gep inbounds * %w.i.sroa.3, 7 x i64 0, 1 x i64 0
memcpy * %w.i.sroa.3.0..sroa_raw_idx align 1, * %w.i.sroa.3.0..sroa_idx align 1, i64 7
%2 = gep inbounds * %0, 16 x i32 0, 1 x i64 0
call void @std.os.write(i32 1, * %2)
br label %ErrRetContinue.i.i.i
%ErrRetContinue.i.i.i:
br label %f.exit.i.i
%f.exit.i.i:
br label %UnwrapErrError.i.i
%UnwrapErrError.i.i:
br label %print.18.exit.i
%print.18.exit.i:
br label %SwitchEnd.i
%SwitchEnd.i:
%result.i.sroa.4.9.result.sroa.4.0..sroa_idx.sroa_idx = gep inbounds * %result.sroa.4, 7 x i64 0, 1 x i64 0
%result.i.sroa.4.9..sroa_idx = gep inbounds * %result.i.sroa.4, 7 x i64 0, 1 x i64 0
memcpy * %result.i.sroa.4.9.result.sroa.4.0..sroa_idx.sroa_idx align 1, * %result.i.sroa.4.9..sroa_idx align 1, i64 7
%result.i.sroa.7.17.result.sroa.6.0..sroa_idx.sroa_idx = gep inbounds * %result.sroa.6, 7 x i64 0, 1 x i64 0
%result.i.sroa.7.17..sroa_idx = gep inbounds * %result.i.sroa.7, 7 x i64 0, 1 x i64 0
memcpy * %result.i.sroa.7.17.result.sroa.6.0..sroa_idx.sroa_idx align 1, * %result.i.sroa.7.17..sroa_idx align 1, i64 7
br label %genExpression.exit
%genExpression.exit:
switch i2 2, label %SwitchElse [
i2 2, label %SwitchProng
]
%SwitchElse:
ret void
%SwitchProng:
%w.sroa.2.0..sroa_cast10 = bitcast * %w.sroa.2 to *
store i1 %v.i.sroa.2.0.v.i.sroa.2.8.result.i.sroa.3.0.copyload, * %w.sroa.2.0..sroa_cast10, align 8
%w.sroa.3.9.result.sroa.4.0..sroa_idx8.sroa_idx = gep inbounds * %result.sroa.4, 7 x i64 0, 1 x i64 0
%w.sroa.3.9..sroa_idx = gep inbounds * %w.sroa.3, 7 x i64 0, 1 x i64 0
memcpy * %w.sroa.3.9..sroa_idx align 1, * %w.sroa.3.9.result.sroa.4.0..sroa_idx8.sroa_idx align 1, i64 7
%w.sroa.0.0..sroa_cast = bitcast * %1 to *
store i64 %arg.sroa.0.sroa.0.0.copyload, * %w.sroa.0.0..sroa_cast, align 1
%w.sroa.2.0..sroa_idx = gep inbounds * %1, 16 x i64 0, 1 x i64 0, 1 x i64 8
%w.sroa.2.0..sroa_cast = bitcast * %w.sroa.2.0..sroa_idx to *
%w.sroa.2.0.w.sroa.2.0.copyload = load i8, * %w.sroa.2, align 8
store i8 %w.sroa.2.0.w.sroa.2.0.copyload, * %w.sroa.2.0..sroa_cast, align 1
%w.sroa.3.0..sroa_raw_cast = bitcast * %1 to *
%w.sroa.3.0..sroa_raw_idx = gep inbounds * %w.sroa.3.0..sroa_raw_cast, 1 x i64 9
%w.sroa.3.0..sroa_idx = gep inbounds * %w.sroa.3, 7 x i64 0, 1 x i64 0
memcpy * %w.sroa.3.0..sroa_raw_idx align 1, * %w.sroa.3.0..sroa_idx align 1, i64 7
%3 = gep inbounds * %1, 16 x i32 0, 1 x i64 0
call void @std.os.write(i32 1, * %3)
ret void
%SwitchElse3.i:
ret void
}
Transformation seems to be correct!
ERROR: Unsupported instruction: %15 = call i64 asm sideeffect "syscall", "={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11},~{memory},~{dirflag},~{fpsr},~{flags}"(i64 %12, i64 undef, i64 %13, i64 %14)
ERROR: Could not translate 'std.os.write' to Alive IR
Summary:
1 correct transformations
0 incorrect transformations
1 Alive2 errors
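The sizes in the alive2 dump can be cross-checked with a bit of layout arithmetic. The sketch below is my reading of the dump, not something alive2 reports directly: `@1` is 24 bytes (the 16-byte slice, one `i1` byte, and 7 bytes of padding), and after SROA the 16-byte `%v.i` is covered by an 8-byte value that is forwarded as an `i64`, plus the 1-byte `%v.i.sroa.2` and the 7-byte `%v.i.sroa.3` allocas. The dictionary keys are labels chosen for illustration.

```python
import struct

# %"[]u8" = { i8*, i64 }: 8 + 8 bytes on x86_64.
SLICE_SIZE = struct.calcsize("<QQ")

# %CallArg wraps the slice plus an i1; alive2 prints it as
# {{{{*, i64}}, i1, i56}}, i.e. one bool byte and 7 bytes of padding.
CALL_ARG_SIZE = struct.calcsize("<QQ?7x")

# Pieces covering the 16-byte %v.i in the "after" half of the dump.
# Bytes 0..8 are forwarded as an i64 value; bytes 8..16, which hold `len`,
# are split into a 1-byte alloca and a 7-byte alloca.
V_I_PIECES = {
    "forwarded i64 (ptr)": 8,
    "v.i.sroa.2 (low byte of len)": 1,
    "v.i.sroa.3 (rest of len)": 7,
}

print(SLICE_SIZE, CALL_ARG_SIZE, sum(V_I_PIECES.values()))
```

The point to notice is that the low byte of `len` ends up in its own 1-byte piece, and the dump reads that piece back with `load i1`, which is where a length could lose everything but its lowest bit.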
It is hard to say whether the original (not-so-reduced) input also exhibits the same problems, it's pretty big :) With increased timeout limits, alive2 still doesn't pinpoint any miscompiles in it, which usually means that there aren't any, because it usually comes up with them real fast. Full log: full-log.txt
Just for completeness, if I disable poison/undef input in alive2, it starts complaining about `@std.os.unexpectedErrno`:
----------------------------------------
@stderr_mutex = global 1 bytes, align 1
@63 = constant 16 bytes, align 8
@99 = constant 40 bytes, align 8
@101 = constant 16 bytes, align 8
@102 = constant 16 bytes, align 8
@103 = constant 16 bytes, align 8
@104 = constant 16 bytes, align 8
@105 = constant 16 bytes, align 8
@106 = constant 16 bytes, align 8
@86 = constant 40 bytes, align 8
@87 = constant 16 bytes, align 8
@19 = constant 16 bytes, align 8
@100 = constant 22 bytes, align 1
@84 = constant 49 bytes, align 1
@18 = constant 18 bytes, align 1
define i16 @std.os.unexpectedErrno(i64 %0) {
#init:
%__copy_0 = {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} { { undef, 0, [padding] }, { undef, 0, [padding] }, 2, { 32, undef } }
%__copy_3 = {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} { { undef, 0, [padding] }, { undef, 0, [padding] }, 2, { 32, undef } }
store [22 x i8] { 117, 110, 101, 120, 112, 101, 99, 116, 101, 100, 32, 101, 114, 114, 110, 111, 58, 32, 123, 125, 10, 0 }, * @100, align 1
store {i64, i1, i56} { undef, 0, [padding] }, * @102, align 8
store {i64, i1, i56} { undef, 0, [padding] }, * @103, align 8
store {i64, i1, i56} { undef, 0, [padding] }, * @104, align 8
store {i64, i1, i56} { undef, 0, [padding] }, * @105, align 8
store [18 x i8] { 100, 101, 97, 100, 108, 111, 99, 107, 32, 100, 101, 116, 101, 99, 116, 101, 100, 0 }, * @18, align 1
store {{*}, i1, i56} { undef, 0, [padding] }, * @63, align 8
store [49 x i8] { 85, 110, 97, 98, 108, 101, 32, 116, 111, 32, 100, 117, 109, 112, 32, 115, 116, 97, 99, 107, 32, 116, 114, 97, 99, 101, 58, 32, 100, 101, 98, 117, 103, 32, 105, 110, 102, 111, 32, 115, 116, 114, 105, 112, 112, 101, 100, 10, 0 }, * @84, align 1
store {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_3, * @86, align 8
store {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_0, * @99, align 8
br label %Entry
%Entry:
%result.i.i1.i1 = alloca i64 2, align 2
%options.i.i.i2 = alloca i64 40, align 8
%1 = alloca i64 2, align 2
%result.i.i3 = alloca i64 2, align 2
%2 = alloca i64 4, align 4
%result.i.i.i4 = alloca i64 4, align 4
%3 = alloca i64 4, align 4
%stderr.i5 = alloca i64 4, align 4
%4 = alloca i64 2, align 2
%result.i.i.i.i.i.i = alloca i64 2, align 2
%int_value.i.i.i.i.i.i = alloca i64 8, align 8
%5 = alloca i64 8, align 8
%6 = alloca i64 40, align 8
%7 = alloca i64 4, align 4
%value.i.i.i.i.i.i = alloca i64 8, align 8
%result.i.i.i.i.i = alloca i64 2, align 2
%8 = alloca i64 8, align 8
%9 = alloca i64 40, align 8
%10 = alloca i64 4, align 4
%value.i.i.i.i.i = alloca i64 8, align 8
%result.i.i.i.i = alloca i64 2, align 2
%11 = alloca i64 8, align 8
%12 = alloca i64 40, align 8
%13 = alloca i64 4, align 4
%value.i.i.i.i = alloca i64 8, align 8
%max_depth.i.i.i.i = alloca i64 8, align 8
%result.i.i1.i = alloca i64 2, align 2
%options.i.i.i = alloca i64 40, align 8
%14 = alloca i64 2, align 2
%15 = alloca i64 8, align 8
%16 = alloca i64 40, align 8
%17 = alloca i64 4, align 4
%18 = alloca i64 2, align 2
%19 = alloca i64 2, align 2
%result.i.i = alloca i64 2, align 2
%20 = alloca i64 4, align 4
%21 = alloca i64 8, align 8
%self.i.i.i = alloca i64 8, align 8
%22 = alloca i64 16, align 8
%self.i.i = alloca i64 8, align 8
%result.i.i.i = alloca i64 4, align 4
%held.i = alloca i64 8, align 8
%23 = alloca i64 4, align 4
%stderr.i = alloca i64 4, align 4
%24 = alloca i64 8, align 8
%25 = alloca i64 2, align 2
%result = alloca i64 2, align 2
%26 = alloca i64 8, align 8
%err = alloca i64 8, align 8
store i64 %0, * %err, align 8
%27 = load i64, * %err, align 8
%28 = gep inbounds * %26, 8 x i32 0, 1 x i64 0
store i64 %27, * %28, align 8
store * @stderr_mutex, * %self.i.i, align 8
%36 = load *, * %self.i.i, align 8
store * %36, * %self.i.i.i, align 8
%38 = load *, * %self.i.i.i, align 8
%39 = gep inbounds * %38, 1 x i32 0, 1 x i64 0
%40 = load i1, * %39, align 1
br i1 %40, label %Then.i.i.i, label %Else.i.i.i
%Else.i.i.i:
%41 = load *, * %self.i.i.i, align 8
%42 = gep inbounds * %41, 1 x i32 0, 1 x i64 0
store i1 1, * %42, align 1
%43 = gep inbounds * %22, 16 x i32 0, 1 x i64 8
store i1 1, * %43, align 1
%44 = gep inbounds * %22, 16 x i32 0, 1 x i64 0
%45 = gep inbounds * %44, 8 x i32 0, 1 x i64 0
%46 = load *, * %self.i.i.i, align 8
store * %46, * %45, align 8
%47 = gep inbounds * %22, 16 x i32 0, 1 x i64 8
store i1 1, * %47, align 1
%48 = gep inbounds * %22, 16 x i32 0, 1 x i64 0
%49 = bitcast * %44 to *
%50 = bitcast * %48 to *
memcpy * %50 align 8, * %49 align 8, i64 8
br label %std.mutex.Dummy.tryAcquire.exit.i.i
%Then.i.i.i:
%52 = bitcast * %22 to *
%__constexpr_0 = bitcast * @63 to *
memcpy * %52 align 8, * %__constexpr_0 align 8, i64 16
br label %std.mutex.Dummy.tryAcquire.exit.i.i
%std.mutex.Dummy.tryAcquire.exit.i.i:
%54 = gep inbounds * %22, 16 x i32 0, 1 x i64 8
%55 = load i1, * %54, align 1
br i1 %55, label %std.mutex.Dummy.acquire.exit.i, label %OptionalNull.i.i
%std.mutex.Dummy.acquire.exit.i:
%56 = gep inbounds * %22, 16 x i32 0, 1 x i64 0
%57 = bitcast * %56 to *
%58 = bitcast * %held.i to *
memcpy * %58 align 8, * %57 align 8, i64 8
%61 = gep inbounds * %23, 4 x i32 0, 1 x i64 0
store i32 2, * %result.i.i.i, align 4
%63 = load i32, * %result.i.i.i, align 4
store i32 %63, * %61, align 4
%65 = gep inbounds * %stderr.i, 4 x i32 0, 1 x i64 0
%66 = bitcast * %23 to *
%67 = bitcast * %65 to *
memcpy * %67 align 4, * %66 align 4, i64 4
%68 = bitcast * %26 to *
%69 = bitcast * %24 to *
memcpy * %69 align 8, * %68 align 8, i64 8
%73 = bitcast * %stderr.i to *
%74 = bitcast * %20 to *
memcpy * %74 align 4, * %73 align 4, i64 4
%75 = bitcast * %26 to *
%76 = bitcast * %21 to *
memcpy * %76 align 8, * %75 align 8, i64 8
%85 = bitcast * %options.i.i.i to *
%__constexpr_1 = bitcast * @99 to *
memcpy * %85 align 8, * %__constexpr_1 align 8, i64 40
%86 = call i16 @std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).writeAll(nonnull align(4) * %stderr.i, nonnull align(8) * @101)
store i16 %86, * %14, align 2
%87 = icmp ne i16 %86, 0
br i1 %87, label %ErrRetReturn.i.i.i, label %ErrRetContinue.i.i.i
%ErrRetReturn.i.i.i:
%88 = load i16, * %14, align 2
store i16 %88, * %result.i.i1.i, align 2
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%ErrRetContinue.i.i.i:
%97 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 0
%98 = bitcast * %97 to *
%__constexpr_2 = bitcast * @102 to *
memcpy * %98 align 8, * %__constexpr_2 align 8, i64 16
%99 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 16
%100 = bitcast * %99 to *
%__constexpr_3 = bitcast * @103 to *
memcpy * %100 align 8, * %__constexpr_3 align 8, i64 16
%101 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 32
store i2 2, * %101, align 1
%102 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 33
store i8 32, * %102, align 1
%103 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 16
%104 = bitcast * %103 to *
%__constexpr_4 = bitcast * @104 to *
memcpy * %104 align 8, * %__constexpr_4 align 8, i64 16
%105 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 0
%106 = bitcast * %105 to *
%__constexpr_5 = bitcast * @105 to *
memcpy * %106 align 8, * %__constexpr_5 align 8, i64 16
%107 = gep inbounds * %26, 8 x i32 0, 1 x i64 0
%108 = load i64, * %107, align 8
store i64 %108, * %15, align 8
%109 = bitcast * %options.i.i.i to *
%110 = bitcast * %16 to *
memcpy * %110 align 8, * %109 align 8, i64 40
%111 = bitcast * %stderr.i to *
%112 = bitcast * %17 to *
memcpy * %112 align 4, * %111 align 4, i64 4
store i64 %108, * %value.i.i.i.i, align 8
store i64 3, * %max_depth.i.i.i.i, align 8
%119 = load i64, * %value.i.i.i.i, align 8
store i64 %119, * %11, align 8
%120 = bitcast * %options.i.i.i to *
%121 = bitcast * %12 to *
memcpy * %121 align 8, * %120 align 8, i64 40
%122 = bitcast * %stderr.i to *
%123 = bitcast * %13 to *
memcpy * %123 align 4, * %122 align 4, i64 4
store i64 %119, * %value.i.i.i.i.i, align 8
%129 = load i64, * %value.i.i.i.i.i, align 8
store i64 %129, * %8, align 8
%130 = bitcast * %options.i.i.i to *
%131 = bitcast * %9 to *
memcpy * %131 align 8, * %130 align 8, i64 40
%132 = bitcast * %stderr.i to *
%133 = bitcast * %10 to *
memcpy * %133 align 4, * %132 align 4, i64 4
store i64 %129, * %value.i.i.i.i.i.i, align 8
%140 = load i64, * %value.i.i.i.i.i.i, align 8
store i64 %140, * %int_value.i.i.i.i.i.i, align 8
%141 = load i64, * %int_value.i.i.i.i.i.i, align 8
store i64 %141, * %5, align 8
%142 = bitcast * %options.i.i.i to *
%143 = bitcast * %6 to *
memcpy * %143 align 8, * %142 align 8, i64 40
%144 = bitcast * %stderr.i to *
%145 = bitcast * %7 to *
memcpy * %145 align 4, * %144 align 4, i64 4
%146 = call i16 @std.fmt.formatInt(i64 %141, i8 10, i1 0, nonnull align(8) * %options.i.i.i, nonnull align(4) * %stderr.i)
store i16 %146, * %result.i.i.i.i.i.i, align 2
%147 = load i16, * %result.i.i.i.i.i.i, align 2
store i16 %147, * %result.i.i.i.i.i, align 2
%154 = load i16, * %result.i.i.i.i.i, align 2
store i16 %154, * %result.i.i.i.i, align 2
%160 = load i16, * %result.i.i.i.i, align 2
store i16 %160, * %18, align 2
%167 = icmp ne i16 %160, 0
br i1 %167, label %ErrRetReturn2.i.i.i, label %ErrRetContinue3.i.i.i
%ErrRetReturn2.i.i.i:
%168 = load i16, * %18, align 2
store i16 %168, * %result.i.i1.i, align 2
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%ErrRetContinue3.i.i.i:
%177 = call i16 @std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).writeAll(nonnull align(4) * %stderr.i, nonnull align(8) * @106)
store i16 %177, * %19, align 2
%178 = icmp ne i16 %177, 0
br i1 %178, label %ErrRetReturn4.i.i.i, label %ErrRetContinue5.i.i.i
%ErrRetReturn4.i.i.i:
%179 = load i16, * %19, align 2
store i16 %179, * %result.i.i1.i, align 2
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%ErrRetContinue5.i.i.i:
store i16 0, * %result.i.i1.i, align 2
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i:
%196 = phi i16 [ %88, %ErrRetReturn.i.i.i ], [ %168, %ErrRetReturn2.i.i.i ], [ %179, %ErrRetReturn4.i.i.i ], [ 0, %ErrRetContinue5.i.i.i ]
store i16 %196, * %result.i.i, align 2
%197 = load i16, * %result.i.i, align 2
store i16 %197, * %25, align 2
%201 = icmp ne i16 %197, 0
br i1 %201, label %UnwrapErrError.i, label %UnwrapErrOk.i
%UnwrapErrError.i:
%202 = gep inbounds * %held.i, 8 x i32 0, 1 x i64 0
%203 = load *, * %202, align 8
%204 = gep inbounds * %203, 1 x i32 0, 1 x i64 0
store i1 0, * %204, align 1
br label %std.debug.print.exit
%UnwrapErrOk.i:
%210 = gep inbounds * %held.i, 8 x i32 0, 1 x i64 0
%211 = load *, * %210, align 8
%212 = gep inbounds * %211, 1 x i32 0, 1 x i64 0
store i1 0, * %212, align 1
br label %std.debug.print.exit
%std.debug.print.exit:
%221 = gep inbounds * %3, 4 x i32 0, 1 x i64 0
store i32 2, * %result.i.i.i4, align 4
%223 = load i32, * %result.i.i.i4, align 4
store i32 %223, * %221, align 4
%225 = gep inbounds * %stderr.i5, 4 x i32 0, 1 x i64 0
%226 = bitcast * %3 to *
%227 = bitcast * %225 to *
memcpy * %227 align 4, * %226 align 4, i64 4
%230 = bitcast * %stderr.i5 to *
%231 = bitcast * %2 to *
memcpy * %231 align 4, * %230 align 4, i64 4
%235 = bitcast * %options.i.i.i2 to *
%__constexpr_6 = bitcast * @86 to *
memcpy * %235 align 8, * %__constexpr_6 align 8, i64 40
%236 = call i16 @std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).writeAll(nonnull align(4) * %stderr.i5, nonnull align(8) * @87)
store i16 %236, * %1, align 2
%237 = icmp ne i16 %236, 0
br i1 %237, label %ErrRetReturn.i.i.i6, label %ErrRetContinue.i.i.i7
%ErrRetReturn.i.i.i6:
%238 = load i16, * %1, align 2
store i16 %238, * %result.i.i1.i1, align 2
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.9.exit.i
%ErrRetContinue.i.i.i7:
store i16 0, * %result.i.i1.i1, align 2
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.9.exit.i
%std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.9.exit.i:
%245 = phi i16 [ %238, %ErrRetReturn.i.i.i6 ], [ 0, %ErrRetContinue.i.i.i7 ]
store i16 %245, * %result.i.i3, align 2
%246 = load i16, * %result.i.i3, align 2
store i16 %246, * %4, align 2
%249 = icmp ne i16 %246, 0
br i1 %249, label %UnwrapErrError.i8, label %UnwrapErrOk.i9
%UnwrapErrError.i8:
br label %std.debug.dumpCurrentStackTrace.exit
%UnwrapErrOk.i9:
br label %std.debug.dumpCurrentStackTrace.exit
%std.debug.dumpCurrentStackTrace.exit:
store i16 11, * %result, align 2
%256 = load i16, * %result, align 2
ret i16 %256
%OptionalNull.i.i:
call void @std.builtin.default_panic(nonnull align(8) * @19, align(8) * null) noreturn
assume i1 0
}
=>
@stderr_mutex = global 1 bytes, align 1
@63 = constant 16 bytes, align 8
@99 = constant 40 bytes, align 8
@101 = constant 16 bytes, align 8
@102 = constant 16 bytes, align 8
@103 = constant 16 bytes, align 8
@104 = constant 16 bytes, align 8
@105 = constant 16 bytes, align 8
@106 = constant 16 bytes, align 8
@86 = constant 40 bytes, align 8
@87 = constant 16 bytes, align 8
@19 = constant 16 bytes, align 8
@100 = constant 22 bytes, align 1
@84 = constant 49 bytes, align 1
@18 = constant 18 bytes, align 1
define i16 @std.os.unexpectedErrno(i64 %0) {
#init:
%__copy_0 = {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} { { undef, 0, [padding] }, { undef, 0, [padding] }, 2, { 32, undef } }
%__copy_3 = {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} { { undef, 0, [padding] }, { undef, 0, [padding] }, 2, { 32, undef } }
store [22 x i8] { 117, 110, 101, 120, 112, 101, 99, 116, 101, 100, 32, 101, 114, 114, 110, 111, 58, 32, 123, 125, 10, 0 }, * @100, align 1
store {i64, i1, i56} { undef, 0, [padding] }, * @102, align 8
store {i64, i1, i56} { undef, 0, [padding] }, * @103, align 8
store {i64, i1, i56} { undef, 0, [padding] }, * @104, align 8
store {i64, i1, i56} { undef, 0, [padding] }, * @105, align 8
store [18 x i8] { 100, 101, 97, 100, 108, 111, 99, 107, 32, 100, 101, 116, 101, 99, 116, 101, 100, 0 }, * @18, align 1
store {{*}, i1, i56} { undef, 0, [padding] }, * @63, align 8
store [49 x i8] { 85, 110, 97, 98, 108, 101, 32, 116, 111, 32, 100, 117, 109, 112, 32, 115, 116, 97, 99, 107, 32, 116, 114, 97, 99, 101, 58, 32, 100, 101, 98, 117, 103, 32, 105, 110, 102, 111, 32, 115, 116, 114, 105, 112, 112, 101, 100, 10, 0 }, * @84, align 1
store {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_3, * @86, align 8
store {{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_0, * @99, align 8
br label %Entry
%Entry:
%options.i.i.i2 = alloca i64 40, align 8
%stderr.i5 = alloca i64 4, align 4
%1 = alloca i64 40, align 8
%2 = alloca i64 40, align 8
%3 = alloca i64 40, align 8
%options.i.i.i = alloca i64 40, align 8
%4 = alloca i64 40, align 8
%.sroa.6 = alloca i64 7, align 1
%stderr.i = alloca i64 4, align 4
%6 = gep inbounds * @stderr_mutex, 1 x i32 0, 1 x i64 0
%7 = load i1, * %6, align 1
br i1 %7, label %Then.i.i.i, label %Else.i.i.i
%Else.i.i.i:
%8 = gep inbounds * @stderr_mutex, 1 x i32 0, 1 x i64 0
store i1 1, * %8, align 1
br label %std.mutex.Dummy.tryAcquire.exit.i.i
%Then.i.i.i:
%__constexpr_0 = gep inbounds * @63, 16 x i64 0, 1 x i64 0, 1 x i64 0
%.sroa.06.0.copyload = load *, * %__constexpr_0, align 8
%__constexpr_1 = gep inbounds * @63, 16 x i64 0, 1 x i64 8
%.sroa.3.0.copyload = load i1, * %__constexpr_1, align 8
%.sroa.6.0..sroa_idx = gep inbounds * %.sroa.6, 7 x i64 0, 1 x i64 0
%__constexpr_3 = bitcast * @63 to *
%__constexpr_2 = gep inbounds * %__constexpr_3, 1 x i64 9
memcpy * %.sroa.6.0..sroa_idx align 1, * %__constexpr_2 align 1, i64 7
br label %std.mutex.Dummy.tryAcquire.exit.i.i
%std.mutex.Dummy.tryAcquire.exit.i.i:
%.sroa.06.0 = phi * [ %.sroa.06.0.copyload, %Then.i.i.i ], [ @stderr_mutex, %Else.i.i.i ]
%.sroa.3.0 = phi i1 [ %.sroa.3.0.copyload, %Then.i.i.i ], [ 1, %Else.i.i.i ]
br i1 %.sroa.3.0, label %std.mutex.Dummy.acquire.exit.i, label %OptionalNull.i.i
%std.mutex.Dummy.acquire.exit.i:
%.sroa.04.0..sroa_idx = gep inbounds * %stderr.i, 4 x i64 0, 1 x i64 0, 1 x i64 0
store i32 2, * %.sroa.04.0..sroa_idx, align 4
%.sroa.010.0..sroa_idx = gep inbounds * %stderr.i, 4 x i64 0, 1 x i64 0, 1 x i64 0
%.sroa.010.0.copyload = load i32, * %.sroa.010.0..sroa_idx, align 4
%10 = bitcast * %options.i.i.i to *
%__constexpr_4 = bitcast * @99 to *
memcpy * %10 align 8, * %__constexpr_4 align 8, i64 40
%11 = call i16 @std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).writeAll(nonnull align(4) * %stderr.i, nonnull align(8) * @101)
%12 = icmp ne i16 %11, 0
br i1 %12, label %ErrRetReturn.i.i.i, label %ErrRetContinue.i.i.i
%ErrRetReturn.i.i.i:
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%ErrRetContinue.i.i.i:
%14 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 0
%15 = bitcast * %14 to *
%__constexpr_5 = bitcast * @102 to *
memcpy * %15 align 8, * %__constexpr_5 align 8, i64 16
%16 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 16
%17 = bitcast * %16 to *
%__constexpr_6 = bitcast * @103 to *
memcpy * %17 align 8, * %__constexpr_6 align 8, i64 16
%18 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 32
store i2 2, * %18, align 1
%19 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 33
store i8 32, * %19, align 1
%20 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 16
%21 = bitcast * %20 to *
%__constexpr_7 = bitcast * @104 to *
memcpy * %21 align 8, * %__constexpr_7 align 8, i64 16
%22 = gep inbounds * %options.i.i.i, 40 x i32 0, 1 x i64 0
%23 = bitcast * %22 to *
%__constexpr_8 = bitcast * @105 to *
memcpy * %23 align 8, * %__constexpr_8 align 8, i64 16
%24 = bitcast * %options.i.i.i to *
%25 = bitcast * %4 to *
memcpy * %25 align 8, * %24 align 8, i64 40
%.sroa.012.0..sroa_idx = gep inbounds * %stderr.i, 4 x i64 0, 1 x i64 0, 1 x i64 0
%.sroa.012.0.copyload = load i32, * %.sroa.012.0..sroa_idx, align 4
%26 = bitcast * %options.i.i.i to *
%27 = bitcast * %3 to *
memcpy * %27 align 8, * %26 align 8, i64 40
%.sroa.014.0..sroa_idx = gep inbounds * %stderr.i, 4 x i64 0, 1 x i64 0, 1 x i64 0
%.sroa.014.0.copyload = load i32, * %.sroa.014.0..sroa_idx, align 4
%28 = bitcast * %options.i.i.i to *
%29 = bitcast * %2 to *
memcpy * %29 align 8, * %28 align 8, i64 40
%.sroa.015.0..sroa_idx = gep inbounds * %stderr.i, 4 x i64 0, 1 x i64 0, 1 x i64 0
%.sroa.015.0.copyload = load i32, * %.sroa.015.0..sroa_idx, align 4
%30 = bitcast * %options.i.i.i to *
%31 = bitcast * %1 to *
memcpy * %31 align 8, * %30 align 8, i64 40
%.sroa.016.0..sroa_idx = gep inbounds * %stderr.i, 4 x i64 0, 1 x i64 0, 1 x i64 0
%.sroa.016.0.copyload = load i32, * %.sroa.016.0..sroa_idx, align 4
%32 = call i16 @std.fmt.formatInt(i64 %0, i8 10, i1 0, nonnull align(8) * %options.i.i.i, nonnull align(4) * %stderr.i)
%33 = icmp ne i16 %32, 0
br i1 %33, label %ErrRetReturn2.i.i.i, label %ErrRetContinue3.i.i.i
%ErrRetReturn2.i.i.i:
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%ErrRetContinue3.i.i.i:
%35 = call i16 @std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).writeAll(nonnull align(4) * %stderr.i, nonnull align(8) * @106)
%36 = icmp ne i16 %35, 0
br i1 %36, label %ErrRetReturn4.i.i.i, label %ErrRetContinue5.i.i.i
%ErrRetReturn4.i.i.i:
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%ErrRetContinue5.i.i.i:
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i
%std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.15.exit.i:
%39 = phi i16 [ %11, %ErrRetReturn.i.i.i ], [ %32, %ErrRetReturn2.i.i.i ], [ %35, %ErrRetReturn4.i.i.i ], [ 0, %ErrRetContinue5.i.i.i ]
%40 = icmp ne i16 %39, 0
br i1 %40, label %UnwrapErrError.i, label %UnwrapErrOk.i
%UnwrapErrError.i:
%41 = gep inbounds * %.sroa.06.0, 1 x i32 0, 1 x i64 0
store i1 0, * %41, align 1
br label %std.debug.print.exit
%UnwrapErrOk.i:
%43 = gep inbounds * %.sroa.06.0, 1 x i32 0, 1 x i64 0
store i1 0, * %43, align 1
br label %std.debug.print.exit
%std.debug.print.exit:
%.sroa.017.0..sroa_idx = gep inbounds * %stderr.i5, 4 x i64 0, 1 x i64 0, 1 x i64 0
store i32 2, * %.sroa.017.0..sroa_idx, align 4
%.sroa.018.0..sroa_idx = gep inbounds * %stderr.i5, 4 x i64 0, 1 x i64 0, 1 x i64 0
%.sroa.018.0.copyload = load i32, * %.sroa.018.0..sroa_idx, align 4
%46 = bitcast * %options.i.i.i2 to *
%__constexpr_9 = bitcast * @86 to *
memcpy * %46 align 8, * %__constexpr_9 align 8, i64 40
%47 = call i16 @std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).writeAll(nonnull align(4) * %stderr.i5, nonnull align(8) * @87)
%48 = icmp ne i16 %47, 0
br i1 %48, label %ErrRetReturn.i.i.i6, label %ErrRetContinue.i.i.i7
%ErrRetReturn.i.i.i6:
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.9.exit.i
%ErrRetContinue.i.i.i7:
br label %std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.9.exit.i
%std.io.writer.Writer(std.fs.file.File,std.os.WriteError,std.fs.file.File.write).print.9.exit.i:
%49 = phi i16 [ %47, %ErrRetReturn.i.i.i6 ], [ 0, %ErrRetContinue.i.i.i7 ]
%50 = icmp ne i16 %49, 0
br i1 %50, label %UnwrapErrError.i8, label %UnwrapErrOk.i9
%UnwrapErrError.i8:
br label %std.debug.dumpCurrentStackTrace.exit
%UnwrapErrOk.i9:
br label %std.debug.dumpCurrentStackTrace.exit
%std.debug.dumpCurrentStackTrace.exit:
ret i16 11
%OptionalNull.i.i:
call void @std.builtin.default_panic(nonnull align(8) * @19, align(8) * null) noreturn
assume i1 0
}
Transformation doesn't verify!
ERROR: Mismatch in memory
Example:
i64 %0 = any
Source:
{{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_0 = { { any, #x0 (0), poison }, { any, #x0 (0), poison }, #x2 (2, -2), { #x20 (32), < any, any, any, any, any, any > } }
{{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_3 = { { any, #x0 (0), poison }, { any, #x0 (0), poison }, #x2 (2, -2), { #x20 (32), < any, any, any, any, any, any > } }
* %result.i.i1.i1 = pointer(local, block_id=64, offset=0)
* %options.i.i.i2 = pointer(local, block_id=65, offset=0)
* %1 = pointer(local, block_id=66, offset=0)
* %result.i.i3 = pointer(local, block_id=67, offset=0)
* %2 = pointer(local, block_id=68, offset=0)
* %result.i.i.i4 = pointer(local, block_id=69, offset=0)
* %3 = pointer(local, block_id=70, offset=0)
* %stderr.i5 = pointer(local, block_id=71, offset=0)
* %4 = pointer(local, block_id=72, offset=0)
* %result.i.i.i.i.i.i = pointer(local, block_id=73, offset=0)
* %int_value.i.i.i.i.i.i = pointer(local, block_id=74, offset=0)
* %5 = pointer(local, block_id=75, offset=0)
* %6 = pointer(local, block_id=76, offset=0)
* %7 = pointer(local, block_id=77, offset=0)
* %value.i.i.i.i.i.i = pointer(local, block_id=78, offset=0)
* %result.i.i.i.i.i = pointer(local, block_id=79, offset=0)
* %8 = pointer(local, block_id=80, offset=0)
* %9 = pointer(local, block_id=81, offset=0)
* %10 = pointer(local, block_id=82, offset=0)
* %value.i.i.i.i.i = pointer(local, block_id=83, offset=0)
* %result.i.i.i.i = pointer(local, block_id=84, offset=0)
* %11 = pointer(local, block_id=85, offset=0)
* %12 = pointer(local, block_id=86, offset=0)
* %13 = pointer(local, block_id=87, offset=0)
* %value.i.i.i.i = pointer(local, block_id=88, offset=0)
* %max_depth.i.i.i.i = pointer(local, block_id=89, offset=0)
* %result.i.i1.i = pointer(local, block_id=90, offset=0)
* %options.i.i.i = pointer(local, block_id=91, offset=0)
* %14 = pointer(local, block_id=92, offset=0)
* %15 = pointer(local, block_id=93, offset=0)
* %16 = pointer(local, block_id=94, offset=0)
* %17 = pointer(local, block_id=95, offset=0)
* %18 = pointer(local, block_id=96, offset=0)
* %19 = pointer(local, block_id=97, offset=0)
* %result.i.i = pointer(local, block_id=98, offset=0)
* %20 = pointer(local, block_id=99, offset=0)
* %21 = pointer(local, block_id=100, offset=0)
* %self.i.i.i = pointer(local, block_id=101, offset=0)
* %22 = pointer(local, block_id=102, offset=0)
* %self.i.i = pointer(local, block_id=103, offset=0)
* %result.i.i.i = pointer(local, block_id=104, offset=0)
* %held.i = pointer(local, block_id=105, offset=0)
* %23 = pointer(local, block_id=106, offset=0)
* %stderr.i = pointer(local, block_id=107, offset=0)
* %24 = pointer(local, block_id=108, offset=0)
* %25 = pointer(local, block_id=109, offset=0)
* %result = pointer(local, block_id=110, offset=0)
* %26 = pointer(local, block_id=111, offset=0)
* %err = pointer(local, block_id=112, offset=0)
i64 %27 = any
* %28 = pointer(local, block_id=111, offset=0)
* %36 = pointer(non-local, block_id=15, offset=0)
* %38 = pointer(non-local, block_id=15, offset=0)
* %39 = pointer(non-local, block_id=15, offset=0)
i1 %40 = #x0 (0)
* %41 = pointer(non-local, block_id=15, offset=0)
* %42 = pointer(non-local, block_id=15, offset=0)
* %43 = pointer(local, block_id=102, offset=8)
* %44 = pointer(local, block_id=102, offset=0)
* %45 = pointer(local, block_id=102, offset=0)
* %46 = pointer(non-local, block_id=15, offset=0)
* %47 = pointer(local, block_id=102, offset=8)
* %48 = pointer(local, block_id=102, offset=0)
* %49 = pointer(local, block_id=102, offset=0)
* %50 = pointer(local, block_id=102, offset=0)
* %52 = pointer(local, block_id=102, offset=0)
* %__constexpr_0 = pointer(non-local, block_id=1, offset=0)
* %54 = pointer(local, block_id=102, offset=8)
i1 %55 = #x1 (1)
* %56 = pointer(local, block_id=102, offset=0)
* %57 = pointer(local, block_id=102, offset=0)
* %58 = pointer(local, block_id=105, offset=0)
* %61 = pointer(local, block_id=106, offset=0)
i32 %63 = #x00000002 (2)
* %65 = pointer(local, block_id=107, offset=0)
* %66 = pointer(local, block_id=106, offset=0)
* %67 = pointer(local, block_id=107, offset=0)
* %68 = pointer(local, block_id=111, offset=0)
* %69 = pointer(local, block_id=108, offset=0)
* %73 = pointer(local, block_id=107, offset=0)
* %74 = pointer(local, block_id=99, offset=0)
* %75 = pointer(local, block_id=111, offset=0)
* %76 = pointer(local, block_id=100, offset=0)
* %85 = pointer(local, block_id=91, offset=0)
* %__constexpr_1 = pointer(non-local, block_id=2, offset=0)
i16 %86 = #x0000 (0)
i1 %87 = #x0 (0)
i16 %88 = #x0000 (0)
* %97 = pointer(local, block_id=91, offset=0)
* %98 = pointer(local, block_id=91, offset=0)
* %__constexpr_2 = pointer(non-local, block_id=4, offset=0)
* %99 = pointer(local, block_id=91, offset=16)
* %100 = pointer(local, block_id=91, offset=16)
* %__constexpr_3 = pointer(non-local, block_id=5, offset=0)
* %101 = pointer(local, block_id=91, offset=32)
* %102 = pointer(local, block_id=91, offset=33)
* %103 = pointer(local, block_id=91, offset=16)
* %104 = pointer(local, block_id=91, offset=16)
* %__constexpr_4 = pointer(non-local, block_id=6, offset=0)
* %105 = pointer(local, block_id=91, offset=0)
* %106 = pointer(local, block_id=91, offset=0)
* %__constexpr_5 = pointer(non-local, block_id=7, offset=0)
* %107 = pointer(local, block_id=111, offset=0)
i64 %108 = any
* %109 = pointer(local, block_id=91, offset=0)
* %110 = pointer(local, block_id=94, offset=0)
* %111 = pointer(local, block_id=107, offset=0)
* %112 = pointer(local, block_id=95, offset=0)
i64 %119 = any
* %120 = pointer(local, block_id=91, offset=0)
* %121 = pointer(local, block_id=86, offset=0)
* %122 = pointer(local, block_id=107, offset=0)
* %123 = pointer(local, block_id=87, offset=0)
i64 %129 = any
* %130 = pointer(local, block_id=91, offset=0)
* %131 = pointer(local, block_id=81, offset=0)
* %132 = pointer(local, block_id=107, offset=0)
* %133 = pointer(local, block_id=82, offset=0)
i64 %140 = any
i64 %141 = any
* %142 = pointer(local, block_id=91, offset=0)
* %143 = pointer(local, block_id=76, offset=0)
* %144 = pointer(local, block_id=107, offset=0)
* %145 = pointer(local, block_id=77, offset=0)
i16 %146 = #x0000 (0)
i16 %147 = #x0000 (0)
i16 %154 = #x0000 (0)
i16 %160 = #x0000 (0)
i1 %167 = #x0 (0)
i16 %168 = #x0000 (0)
i16 %177 = #x0000 (0)
i1 %178 = #x0 (0)
i16 %179 = #x0000 (0)
i16 %196 = #x0000 (0)
i16 %197 = #x0000 (0)
i1 %201 = #x0 (0)
* %202 = pointer(local, block_id=105, offset=0)
* %203 = pointer(non-local, block_id=15, offset=0)
* %204 = pointer(non-local, block_id=15, offset=0)
* %210 = pointer(local, block_id=105, offset=0)
* %211 = pointer(non-local, block_id=15, offset=0)
* %212 = pointer(non-local, block_id=15, offset=0)
* %221 = pointer(local, block_id=70, offset=0)
i32 %223 = #x00000002 (2)
* %225 = pointer(local, block_id=71, offset=0)
* %226 = pointer(local, block_id=70, offset=0)
* %227 = pointer(local, block_id=71, offset=0)
* %230 = pointer(local, block_id=71, offset=0)
* %231 = pointer(local, block_id=68, offset=0)
* %235 = pointer(local, block_id=65, offset=0)
* %__constexpr_6 = pointer(non-local, block_id=9, offset=0)
i16 %236 = #xffff (65535, -1)
i1 %237 = #x1 (1)
i16 %238 = #xffff (65535, -1)
i16 %245 = #xffff (65535, -1)
i16 %246 = #xffff (65535, -1)
i1 %249 = #x1 (1)
i16 %256 = #x000b (11)
SOURCE MEMORY STATE
===================
NON-LOCAL BLOCKS:
Block 0 > size: 0 align: 1 alloc type: 0
Block 1 > size: 16 align: 8 alloc type: 0
Block 2 > size: 40 align: 8 alloc type: 0
Block 3 > size: 16 align: 8 alloc type: 0
Block 4 > size: 16 align: 8 alloc type: 0
Block 5 > size: 16 align: 8 alloc type: 0
Block 6 > size: 16 align: 8 alloc type: 0
Block 7 > size: 16 align: 8 alloc type: 0
Block 8 > size: 16 align: 8 alloc type: 0
Block 9 > size: 40 align: 8 alloc type: 0
Block 10 > size: 16 align: 8 alloc type: 0
Block 11 > size: 16 align: 8 alloc type: 0
Block 12 > size: 22 align: 1 alloc type: 0
Block 13 > size: 49 align: 1 alloc type: 0
Block 14 > size: 18 align: 1 alloc type: 0
Block 15 > size: 1 align: 1 alloc type: 0
Block 16 > size: 22 align: 8 alloc type: 2
LOCAL BLOCKS:
Block 64 > size: 2 align: 2 alloc type: 1
Block 65 > size: 40 align: 8 alloc type: 1
Block 66 > size: 2 align: 2 alloc type: 1
Block 67 > size: 2 align: 2 alloc type: 1
Block 68 > size: 4 align: 4 alloc type: 1
Block 69 > size: 4 align: 4 alloc type: 1
Block 70 > size: 4 align: 4 alloc type: 1
Block 71 > size: 4 align: 4 alloc type: 1
Block 72 > size: 2 align: 2 alloc type: 1
Block 73 > size: 2 align: 2 alloc type: 1
Block 74 > size: 8 align: 8 alloc type: 1
Block 75 > size: 8 align: 8 alloc type: 1
Block 76 > size: 40 align: 8 alloc type: 1
Block 77 > size: 4 align: 4 alloc type: 1
Block 78 > size: 8 align: 8 alloc type: 1
Block 79 > size: 2 align: 2 alloc type: 1
Block 80 > size: 8 align: 8 alloc type: 1
Block 81 > size: 40 align: 8 alloc type: 1
Block 82 > size: 4 align: 4 alloc type: 1
Block 83 > size: 8 align: 8 alloc type: 1
Block 84 > size: 2 align: 2 alloc type: 1
Block 85 > size: 8 align: 8 alloc type: 1
Block 86 > size: 40 align: 8 alloc type: 1
Block 87 > size: 4 align: 4 alloc type: 1
Block 88 > size: 8 align: 8 alloc type: 1
Block 89 > size: 8 align: 8 alloc type: 1
Block 90 > size: 2 align: 2 alloc type: 1
Block 91 > size: 40 align: 8 alloc type: 1
Block 92 > size: 2 align: 2 alloc type: 1
Block 93 > size: 8 align: 8 alloc type: 1
Block 94 > size: 40 align: 8 alloc type: 1
Block 95 > size: 4 align: 4 alloc type: 1
Block 96 > size: 2 align: 2 alloc type: 1
Block 97 > size: 2 align: 2 alloc type: 1
Block 98 > size: 2 align: 2 alloc type: 1
Block 99 > size: 4 align: 4 alloc type: 1
Block 100 > size: 8 align: 8 alloc type: 1
Block 101 > size: 8 align: 8 alloc type: 1
Block 102 > size: 16 align: 8 alloc type: 1
Block 103 > size: 8 align: 8 alloc type: 1
Block 104 > size: 4 align: 4 alloc type: 1
Block 105 > size: 8 align: 8 alloc type: 1
Block 106 > size: 4 align: 4 alloc type: 1
Block 107 > size: 4 align: 4 alloc type: 1
Block 108 > size: 8 align: 8 alloc type: 1
Block 109 > size: 2 align: 2 alloc type: 1
Block 110 > size: 2 align: 2 alloc type: 1
Block 111 > size: 8 align: 8 alloc type: 1
Block 112 > size: 8 align: 8 alloc type: 1
Target:
{{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_0 = { { #x0000000000000000 (0), #x0 (0), poison }, { any, #x0 (0), poison }, #x2 (2, -2), { #x20 (32), < any, any, any, any, any, any > } }
{{i64, i1, i56}, {i64, i1, i56}, i2, {i8, [6 x i8]}} %__copy_3 = { { #x0000000000000000 (0), #x0 (0), poison }, { any, #x0 (0), poison }, #x2 (2, -2), { #x20 (32), < any, any, any, any, any, any > } }
* %options.i.i.i2 = pointer(local, block_id=64, offset=0)
* %stderr.i5 = pointer(local, block_id=65, offset=0)
* %1 = pointer(local, block_id=66, offset=0)
* %2 = pointer(local, block_id=67, offset=0)
* %3 = pointer(local, block_id=68, offset=0)
* %options.i.i.i = pointer(local, block_id=69, offset=0)
* %4 = pointer(local, block_id=70, offset=0)
* %.sroa.6 = pointer(local, block_id=71, offset=0)
* %stderr.i = pointer(local, block_id=72, offset=0)
* %6 = pointer(non-local, block_id=15, offset=0)
i1 %7 = #x0 (0)
* %8 = pointer(non-local, block_id=15, offset=0)
* %__constexpr_0 = pointer(non-local, block_id=1, offset=0)
* %.sroa.06.0.copyload = any
* %__constexpr_1 = pointer(non-local, block_id=1, offset=8)
i1 %.sroa.3.0.copyload = #x0 (0)
* %.sroa.6.0..sroa_idx = pointer(local, block_id=71, offset=0)
* %__constexpr_3 = pointer(non-local, block_id=1, offset=0)
* %__constexpr_2 = pointer(non-local, block_id=1, offset=9)
* %.sroa.06.0 = pointer(non-local, block_id=15, offset=0)
i1 %.sroa.3.0 = #x1 (1)
* %.sroa.04.0..sroa_idx = pointer(local, block_id=72, offset=0)
* %.sroa.010.0..sroa_idx = pointer(local, block_id=72, offset=0)
i32 %.sroa.010.0.copyload = #x00000002 (2)
* %10 = pointer(local, block_id=69, offset=0)
* %__constexpr_4 = pointer(non-local, block_id=2, offset=0)
i16 %11 = #x0000 (0)
i1 %12 = #x0 (0)
* %14 = pointer(local, block_id=69, offset=0)
* %15 = pointer(local, block_id=69, offset=0)
* %__constexpr_5 = pointer(non-local, block_id=4, offset=0)
* %16 = pointer(local, block_id=69, offset=16)
* %17 = pointer(local, block_id=69, offset=16)
* %__constexpr_6 = pointer(non-local, block_id=5, offset=0)
* %18 = pointer(local, block_id=69, offset=32)
* %19 = pointer(local, block_id=69, offset=33)
* %20 = pointer(local, block_id=69, offset=16)
* %21 = pointer(local, block_id=69, offset=16)
* %__constexpr_7 = pointer(non-local, block_id=6, offset=0)
* %22 = pointer(local, block_id=69, offset=0)
* %23 = pointer(local, block_id=69, offset=0)
* %__constexpr_8 = pointer(non-local, block_id=7, offset=0)
* %24 = pointer(local, block_id=69, offset=0)
* %25 = pointer(local, block_id=70, offset=0)
* %.sroa.012.0..sroa_idx = pointer(local, block_id=72, offset=0)
i32 %.sroa.012.0.copyload = #x00000002 (2)
* %26 = pointer(local, block_id=69, offset=0)
* %27 = pointer(local, block_id=68, offset=0)
* %.sroa.014.0..sroa_idx = pointer(local, block_id=72, offset=0)
i32 %.sroa.014.0.copyload = #x00000002 (2)
* %28 = pointer(local, block_id=69, offset=0)
* %29 = pointer(local, block_id=67, offset=0)
* %.sroa.015.0..sroa_idx = pointer(local, block_id=72, offset=0)
i32 %.sroa.015.0.copyload = #x00000002 (2)
* %30 = pointer(local, block_id=69, offset=0)
* %31 = pointer(local, block_id=66, offset=0)
* %.sroa.016.0..sroa_idx = pointer(local, block_id=72, offset=0)
i32 %.sroa.016.0.copyload = #x00000002 (2)
i16 %32 = #x0000 (0)
i1 %33 = #x0 (0)
i16 %35 = #x0000 (0)
i1 %36 = #x0 (0)
i16 %39 = #x0000 (0)
i1 %40 = #x0 (0)
* %41 = pointer(non-local, block_id=15, offset=0)
* %43 = pointer(non-local, block_id=15, offset=0)
* %.sroa.017.0..sroa_idx = pointer(local, block_id=65, offset=0)
* %.sroa.018.0..sroa_idx = pointer(local, block_id=65, offset=0)
i32 %.sroa.018.0.copyload = #x00000002 (2)
* %46 = pointer(local, block_id=64, offset=0)
* %__constexpr_9 = pointer(non-local, block_id=9, offset=0)
i16 %47 = #xffff (65535, -1)
i1 %48 = #x1 (1)
i16 %49 = #xffff (65535, -1)
i1 %50 = #x1 (1)
TARGET MEMORY STATE
===================
LOCAL BLOCKS:
Block 64 > size: 40 align: 8 alloc type: 1
Block 65 > size: 4 align: 4 alloc type: 1
Block 66 > size: 40 align: 8 alloc type: 1
Block 67 > size: 40 align: 8 alloc type: 1
Block 68 > size: 40 align: 8 alloc type: 1
Block 69 > size: 40 align: 8 alloc type: 1
Block 70 > size: 40 align: 8 alloc type: 1
Block 71 > size: 7 align: 1 alloc type: 1
Block 72 > size: 4 align: 4 alloc type: 1
Mismatch in pointer(non-local, block_id=1, offset=0)
Source value: null, byte offset=0
Target value: pointer(local, block_id=96, offset=0), byte offset=0
But I'm not sure if that is the problem you're seeing (cc @nunoplopes)
Thank you LemonBoy for your work on tracking this down!
This is either over-reduced, or the original IR is already broken, because alive2 says that the transformation is valid:
That's weird: running it with lli
prints two different strings depending on whether SROA is enabled or disabled.
The transform looks OK: alive2 confirms it, and so does the lack of assertions being triggered in LLVM. The problem is only noticeable at runtime: instead of printing what happened to me
you only get a w
. For some reason the optimized code stomps over the length field of the string slice.
Changing %TempRef = type { i64, i1 }
into %TempRef = type { i64, i8 }
makes the problem disappear. It seems to me that the presence of a non-byte-sized field there confuses the pass (or the change simply hides the error).
You need to be careful when reducing test cases to avoid introducing UB. If that happens, anything goes 😀 I'll have a look at the report to see if I spot something.
Ok, I've checked the report posted by @LebedevRI: it's a bug in Alive2 😎 Alive2 isn't doing the right thing for global constants initialized with undef pointers. I changed those to poison and the report goes away. So Alive2 can't find any bug in the repro (the big one). It means one of 3 things:
alive-exec
which can run an LLVM IR function and tell you if it runs into UB. But it's very limited for now: functions can't have arguments and can't call other functions. So not very useful here. So I send the ball back to you guys: it's not clear it's a bug in LLVM. It might just be UB on your side. Someone has to dig in.
@nunoplopes thank you for taking a look!
Ok, I tracked this down as I was wondering if it was a bug in Alive2 or not. There's no bug in LLVM or Alive2. It's a bug in Zig!
TL;DR: You copy a structure with type { i8*, i64 }
to a temporary with type { i64, i1 }
. The string size gets truncated to 1, hence you only see 1 character being printed when using SROA.
Reduced test case:
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
%CallArg = type { %"[]u8" }
%"[]u8" = type { i8*, i64 }
%ExpressionResult = type { i64, i1 }
@0 = internal unnamed_addr constant [20 x i8] c"what happened to me\00", align 1
@1 = internal unnamed_addr constant %CallArg { %"[]u8" { i8* getelementptr inbounds ([20 x i8], [20 x i8]* @0, i64 0, i64 0), i64 19 } }, align 8
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0
define void @main() {
%v.i = alloca %"[]u8", align 8
%w.i = alloca %"[]u8", align 8
%result = alloca %ExpressionResult, align 8
%derp = alloca %ExpressionResult, align 8
; copy @1 -> %v.i ([]u8 -> []u8)
%x3 = getelementptr inbounds %CallArg, %CallArg* @1, i32 0, i32 0
%x7 = bitcast %"[]u8"* %x3 to i8*
%x8 = bitcast %"[]u8"* %v.i to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %x8, i8* %x7, i64 16, i1 false)
; copy %v.i -> %result ([]u8 -> ExpressionResult) BUG!
%x14 = bitcast %ExpressionResult* %result to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %x14, i8* %x8, i64 16, i1 false)
; copy %result -> %w.i
%x18 = bitcast %"[]u8"* %w.i to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %x18, i8* %x14, i64 16, i1 false)
; output %w.i
call void @std.os.write(i32 1, %"[]u8"* %w.i)
ret void
SwitchProng1.i3: ; No predecessors!
%x35 = getelementptr inbounds %ExpressionResult, %ExpressionResult* %derp, i32 0, i32 1
store i1 false, i1* %x35, align 1
ret void
SwitchProng2.i4: ; No predecessors!
%x37 = bitcast %ExpressionResult* %result to i8*
%x38 = bitcast %ExpressionResult* %derp to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %x38, i8* %x37, i64 24, i1 false)
ret void
}
define void @std.os.write(i32 %x0, %"[]u8"* %x1) {
; size
%x2 = getelementptr inbounds %"[]u8", %"[]u8"* %x1, i32 0, i32 1
%x3 = load i64, i64* %x2, align 8
; ptr
%x6 = getelementptr inbounds %"[]u8", %"[]u8"* %x1, i32 0, i32 0
%x7 = load i8*, i8** %x6, align 8
%x10 = ptrtoint i8* %x7 to i64
%x15 = call i64 asm sideeffect "syscall", "={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11},~{memory},~{dirflag},~{fpsr},~{flags}"(i64 1, i64 0, i64 %x10, i64 %x3)
ret void
}
It's a bug in Zig!
:scream:
Thank you for pinpointing the problem. The tagged enum representation in the IR assumes there's always enough padding after { i64, i1 }
so that it has the same shape in memory as { i8*, i64 }
, which also explains why SROA made the problem surface.
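To make the padding assumption concrete, here is a rough Python sketch of the idea. It is only a model, not what SROA literally does: it treats a 16-byte slot as raw bytes and compares a memcpy-style copy with a copy that moves only the declared fields of { i64, iN }. The function names, the pointer value 0x1000, and the choice to zero the padding are all made up for illustration (in real IR the uncopied padding would be undefined rather than zero).

```python
import struct

# Hypothetical model: a 16-byte slot viewed either as a slice {ptr, len}
# or as a temporary {i64 value, N-bit flag} followed by padding bytes.
SLICE = struct.Struct("<QQ")   # models { i8*, i64 } on a 64-bit little-endian target
SLOT_SIZE = SLICE.size         # 16 bytes


def copy_bytewise(src: bytes) -> bytes:
    """memcpy-style copy: every byte, padding included, is preserved."""
    return bytes(src[:SLOT_SIZE])


def copy_fieldwise(src: bytes, flag_bits: int) -> bytes:
    """Copy only the declared fields of {i64, iN}; padding is not carried over."""
    value, flag_byte = struct.unpack_from("<QB", src, 0)
    flag = flag_byte & ((1 << flag_bits) - 1)   # keep only the bits the type declares
    out = bytearray(SLOT_SIZE)                  # assumption: padding starts out as zero
    struct.pack_into("<QB", out, 0, value, flag)
    return bytes(out)


def roundtrip_len(length: int, flag_bits=None) -> int:
    """Store a slice, pass it through the temporary, and read len back."""
    original = SLICE.pack(0x1000, length)       # 0x1000 is a made-up pointer value
    if flag_bits is None:
        temp = copy_bytewise(original)
    else:
        temp = copy_fieldwise(original, flag_bits)
    _ptr, new_len = SLICE.unpack(temp)
    return new_len


if __name__ == "__main__":
    print(roundtrip_len(19))                # byte-wise copy
    print(roundtrip_len(19, flag_bits=1))   # only the fields of {i64, i1}
    print(roundtrip_len(19, flag_bits=8))   # only the fields of {i64, i8}
```

For the 19-character string from the reduced test case, the byte-wise copy keeps len at 19, the { i64, i1 } field-wise copy leaves 1 (only the low bit of 19 survives), and the { i64, i8 } variant keeps 19, which lines up with the earlier observation that switching the field to i8 made the symptom go away.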
it's a bug in Alive2
Talk about killing two birds with one stone! :D
This bug left me thinking. It's debatable whether there's a bug in SROA or not. For load/store instructions, we (LLVM's LangRef) explicitly rule out reading/writing padding bits. But there's no similar wording for memcpy. I guess we need to patch either the documentation or SROA.
This is no longer happening. Could it be possible to have a small test which covers this case to add it to the behavior tests?
I've been hammering at a weird bug for a while and finally got a reduced case. See the code and output. I have a string slice being passed around and at some point the slice value (ptr and/or len?) is getting corrupted.
union(enum)
s are involved but I don't know if that's relevant. The weirdest thing is the problem will shift or go away if you change seemingly innocuous bits of code (see the comments I left throughout the code sample). This makes it a bit tricky to pare the test case down even further.
This problem only happens in ReleaseSmall and ReleaseFast. It has been around since at least 0.6.0. I'm sure the release build of my projects used to work at some point, I'm going to guess it broke some time after 0.5.0?
Apologies for the size of the code snippet, a lot of it is types. Scroll down and start with the main function and it's pretty easy to follow I think.
I'm on a Mac.