Open Quuxplusone opened 12 years ago
Attached y.ll
(4683 bytes, application/octet-stream): test case
There is something very strange in your testcase:
declare i32 @relocate_kernel(i32, i32, i32, i32, i32)
@__LOAD_PHYSICAL_ADDR = external global [0 x i8]
[...]
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %call8, i8* inttoptr (i32 add (i32 add (i32 ptrtoint (i32 (i32, i32, i32, i32, i32)* @relocate_kernel to i32), i32 ptrtoint ([0 x i8]* @__LOAD_PHYSICAL_ADDR to i32)), i32 -1073741824) to i8*), i32 2048, i32 1, i1 false)
What in the world that constant expression argument to memcpy!? The address of
the function relocate_kernel plus the address of external global
__LOAD_PHYSICAL_ADDR plus -2^30 (aka. 0xC0000000)?
(In reply to comment #1)
> There is something very strange in your testcase:
>
> declare i32 @relocate_kernel(i32, i32, i32, i32, i32)
> @__LOAD_PHYSICAL_ADDR = external global [0 x i8]
> [...]
> call void @llvm.memcpy.p0i8.p0i8.i32(i8* %call8, i8* inttoptr (i32 add (i32
> add (i32 ptrtoint (i32 (i32, i32, i32, i32, i32)* @relocate_kernel to i32),
i32
> ptrtoint ([0 x i8]* @__LOAD_PHYSICAL_ADDR to i32)), i32 -1073741824) to i8*),
> i32 2048, i32 1, i1 false)
>
> What in the world that constant expression argument to memcpy!? The address of
> the function relocate_kernel plus the address of external global
> __LOAD_PHYSICAL_ADDR plus -2^30 (aka. 0xC0000000)?
I just find tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %6, i8*
getelementptr inbounds ([3 x i8]* @label416, i64 0, i64 0), i32 3, i32 1, i1
false) nounwind
Where do you see the words relocate, kernel etc.?
Are you sure you opened the right file?
I could reduce it to a very small testcase:
(you can replace the target datalayout by your own)
target datalayout = "e-p:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-
f32:32:32-f64:32:64-f80:32:32-v64:64:64-v128:128:128-a0:0:64"
@char = internal constant [2 x i8] c"g " ; change this string to test the other
case
@hasspace = internal constant [15 x i8] c"it has a space\00"
@nospace = internal constant [16 x i8] c"it has no space\00"
declare void @puts(i8*)
declare i8* @malloc(i32)
define void @main() {
start:
%a = call i8*(i32)* @malloc(i32 4)
store i8 32, i8* %a
br label %test
test:
%loopvar = phi i32 [%loopvar2, %body], [0, %start]
%loopvar2 = add i32 %loopvar, 1
%finished = icmp sgt i32 %loopvar, 2 ; or any other constant or non-constant
br i1 %finished, label %continue, label %body
body:
; sth fancy
%valp = getelementptr [2 x i8]* @char, i32 0, i32 %loopvar
%val = load i8* %valp
%val2 = load i8* %a
%otherbreak = icmp eq i8 %val, %val2
br i1 %otherbreak, label %exception, label %test
continue:
call void(i8*)* @puts(i8* getelementptr([16 x i8]* @nospace, i32 0, i32 0))
ret void
exception:
call void(i8*)* @puts(i8* getelementptr([15 x i8]* @hasspace, i32 0, i32 0))
ret void
}
Ok, using comment 3 as the testcase. This is an ordering issue. "opt -gvn -loop-
unroll -simplifycfg" produces:
define void @main() {
start:
%a = call i8* @malloc(i32 4)
store i8 32, i8* %a
call void @puts(i8* getelementptr inbounds ([15 x i8]* @hasspace, i32 0, i32 0))
ret void
}
The problem is that %a = malloc + store in %start, then there's the load inside
the loop body. GVN isn't fooled at all, but SCEV doesn't know what that load
will load.
Another possible fix would be adding a heap->stack transform then letting
mem2reg mop it up.
> The problem is that %a = malloc + store in %start, then there's the load
inside
> the loop body. GVN isn't fooled at all, but SCEV doesn't know what that load
> will load.
>
Yes, I had to add the malloc instruction to reproduce the above test case.
> Another possible fix would be adding a heap->stack transform then letting
> mem2reg mop it up.
That would fix the problem from comment 3, but the original test case is way
more complex.
opt -O3 -S gives me:
; ModuleID = 'x.ll'
target datalayout = "e-p:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-
f32:32:32-f64:32:64-f80:32:32-v64:64:64-v128:128:128-a0:0:64"
@hasspace = internal constant [15 x i8] c"it has a space\00"
@nospace = internal constant [16 x i8] c"it has no space\00"
declare void @puts(i8* nocapture) nounwind
declare noalias i8* @malloc(i32) nounwind
define void @main() nounwind {
start:
%a = tail call i8* @malloc(i32 4)
store i8 32, i8* %a, align 1
br label %test
test: ; preds = %body, %start
%loopvar = phi i32 [ %loopvar2, %body ], [ 0, %start ]
%finished = icmp sgt i32 %loopvar, 2
br i1 %finished, label %continue, label %body
body: ; preds = %test
%loopvar2 = add i32 %loopvar, 1
%otherbreak = icmp eq i32 %loopvar, 1
br i1 %otherbreak, label %exception, label %test
continue: ; preds = %test
tail call void @puts(i8* getelementptr inbounds ([16 x i8]* @nospace, i64 0, i64 0))
ret void
exception: ; preds = %body
tail call void @puts(i8* getelementptr inbounds ([15 x i8]* @hasspace, i64 0, i64 0))
ret void
}
running opt a second time workarounds the issue. The problem is that the first
case already had two opt runs, so after running opt 4 times the correct result
was produced. Can't opt take a parameter to run itself as often as needed?
y.ll
(4683 bytes, application/octet-stream)