llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.34k stars 12.13k forks source link

LoopRotate causes a miscompilation #42226

Open aykevl opened 5 years ago

aykevl commented 5 years ago
Bugzilla Link 42881
Version 9.0
OS Linux
CC @efriedma-quic

Extended Description

I have the following IR:


target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64" target triple = "arm-none-eabi"

@​_sbss = external global i8 @​_ebss = external global i8

define fastcc void @​runtime.preinit() { entry: br label %for.loop

for.loop: ; preds = %for.body, %entry %0 = phi i32 [ ptrtoint (i8 @​_sbss to i32), %entry ], [ %3, %for.body ] %1 = icmp eq i32 %0, ptrtoint (i8 @​_ebss to i32) br i1 %1, label %for.done, label %for.body

for.body: ; preds = %for.loop %2 = inttoptr i32 %0 to i32 store i32 0, i32 %2, align 4 %3 = add i32 %0, 4 br label %for.loop

for.done: ; preds = %for.loop ret void }


When I run the following command:

opt-9 -S -loop-rotate -o bug.opt.ll bug.ll

the runtime.preinit function is optimized to the following:


define fastcc void @​runtime.preinit() { entry: br label %for.body

for.body: ; preds = %entry, %for.body %0 = phi i32 [ ptrtoint (i8 @​_sbss to i32), %entry ], [ %2, %for.body ] %1 = inttoptr i32 %0 to i32 store i32 0, i32 %1, align 4 %2 = add i32 %0, 4 %3 = icmp eq i32 %2, ptrtoint (i8 @​_ebss to i32) br i1 %3, label %for.done, label %for.body

for.done: ; preds = %for.body ret void }


In other words, it looks like LoopRotate thinks the comparison in the loop header can be done after the first store. I believe this optimization is incorrect, as there is no indication in the IR that it is okay to store to @​_sbss unconditionally. In fact, because @​_sbss and @​_ebss happen to be aliases this causes a loop zeroing the entire address space until it crashes.

Thinking a bit more about this, it seems like LoopRotate introduces an off-by-one error in the loop: before this optimization the loop would zero all memory before @​_ebss while after the optimization it zeroes all memory including the word behind @​_ebss (which lies outside of the range it is supposed to zero).

If this is a correct optimization, please let me know why so I can fix this in the compiler frontend or the runtime library.

Note that this bug is present in both LLVM 9 and LLVM 8 (from apt.llvm.org), so it affects the current stable version as well.

efriedma-quic commented 5 years ago

I think the optimization is correct. You have two distinct variables of type i8*; that means they should have different addresses.

If you use "@_sbss = external global [0 x i8]" or something like that, it should do what you want.

The logic is in areGlobalsPotentiallyEqual in llvm/lib/IR/ConstantFold.cpp.