AliveToolkit / alive2

Automatic verification of LLVM optimizations
MIT License
769 stars 97 forks source link

Odd behavior with function calls, alive says functions don't return or trigger UB #975

Closed jeremy-rifkin closed 10 months ago

jeremy-rifkin commented 10 months ago

Hello, I have two implementations of a factorial function (one recursive and the other not), and src/tgt functions corresponding to the following C code calling each of the factorials

int fn(int num) {
    return num * num + fact(5);
}
define noundef i32 @fact1(i32 noundef %z0) {
bb0:
    %z1 = or i32 %z0, 0
    br label %bb2
bb2:
    %t2 = icmp eq i32 %z1, 0
    br i1 %t2, label %bb3, label %bb4
bb3:
    %z3 = or i32 1, 0
    br label %bb5
bb4:
    %z4 = add nsw i32 %z1, -1
    %z5 = call noundef i32 @fact1(i32 noundef %z4)
    %z6 = or i32 %z5, 0
    %z7 = mul nsw i32 %z1, %z6
    br label %bb5
bb5:
    %z8 = phi i32 [ %z3, %bb3 ], [ %z7, %bb4 ]
    ret i32 %z8
}

define noundef i32 @src(i32 noundef %0) {
    %2 = mul nsw i32 %0, %0
    %3 = tail call noundef i32 @fact1(i32 noundef 5)
    %4 = add nsw i32 %3, %2
    ret i32 %4
}

define dso_local noundef i32 @fact2(i32 noundef %0) local_unnamed_addr #0 {
  br label %2

2:                                                ; preds = %6, %1
  %3 = phi i32 [ 1, %1 ], [ %8, %6 ]
  %4 = phi i32 [ %0, %1 ], [ %7, %6 ]
  %5 = icmp eq i32 %4, 0
  br i1 %5, label %9, label %6

6:                                                ; preds = %2
  %7 = add nsw i32 %4, -1
  %8 = mul nsw i32 %3, %4
  br label %2

9:                                                ; preds = %2
  %10 = mul nsw i32 %3, 1
  ret i32 %10
}

attributes #0 = { mustprogress nofree nosync nounwind willreturn memory(none) uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

; Function Attrs: mustprogress nofree nosync nounwind willreturn memory(none) uwtable
define dso_local noundef i32 @tgt(i32 noundef %0) local_unnamed_addr #0 {
  %2 = mul nsw i32 %0, %0
  %3 = tail call noundef i32 @fact2(i32 noundef 5)
  %4 = add nsw i32 %3, %2
  ret i32 %4
}

Alive2 says the functions do not verify as the source is more defined than the target https://alive2.llvm.org/ce/z/MPLtAa


----------------------------------------
declare i32 @fact1(noundef i32)

define i32 @src(i32 noundef %#0) noundef {
#1:
  %#2 = mul nsw i32 noundef %#0, noundef %#0
  %#3 = call i32 @fact1(noundef i32 5) noundef
  %#4 = add nsw i32 %#3, %#2
  ret i32 %#4
}
=>
declare i32 @fact2(noundef i32)

define i32 @tgt(i32 noundef %#0) nofree noundef willreturn memory(none) {
#1:
  %#2 = mul nsw i32 noundef %#0, noundef %#0
  %#3 = call i32 @fact2(noundef i32 5) nofree noundef willreturn memory(none)
  %#4 = add nsw i32 %#3, %#2
  ret i32 %#4
}
Transformation doesn't verify!

ERROR: Source is more defined than target

Example:
i32 noundef %#0 = #x00000000 (0)

Source:
i32 %#2 = #x00000000 (0)
i32 %#3 = function did not return!

SOURCE MEMORY STATE
===================
NON-LOCAL BLOCKS:
Block 0 >   size: 0 align: 1    alloc type: 0   alive: false    address: 0
Block 1 >   size: 0 align: 1    alloc type: 0   alive: true address: 0

Target:
i32 %#2 = #x00000000 (0)
Function @fact2 triggered UB

I'm confused how either fact1 or fact2 are triggering UB or not returning.

Not a perfect indicator but clang seems to constant propagate them just fine if I do a test such as https://alive2.llvm.org/ce/z/58qw8Z.

Looking at just the factorial implementations alone alive2 says they are equivalent: https://alive2.llvm.org/ce/z/cHHFGs

Am I missing something here, is there an alive2 bug, or is this just a known limitation of analyzing functions with calls in alive2?

regehr commented 10 months ago

So yeah the basic issue here is that Alive does not understand recursion, nor can it reason about unbounded loops

Alive has an unroller that you can try to use, if you unroll the loop as many times as it's going to execute for real, then you should be able to verify your function for small input values. However, the unroller does not understand recursive loops, you'll need to either do that by hand or else perhaps you can get the LLVM inliner to completely unroll the recursion.

nunoplopes commented 10 months ago

Right, recursion is not supported and won't be any time soon, sorry.

jeremy-rifkin commented 10 months ago

Hi, thank you both so much for the replies. It's helpful to know recursion isn't supported (I tried to look for any mention of recursion but didn't see one).

Out of curiosity have you considered doing some sort of bounded inlining for recursion?

And one last quick question, if I replace the recursive factorial with a non-recursive factorial alive2 is still not verifying: https://alive2.llvm.org/ce/z/hZ-jF2. Is this somehow still considered an unbounded loop?

nunoplopes commented 10 months ago

Please note that Alive2 is a compiler verification tool, not a generic prover or a generic program equivalence checker. Also, Alive2 doesn't support inter-procedural optimizations, so inlining is not supported. It's not that we don't want to support it, it's just that it hasn't been done. But supporting inlining is of extremely low priority, as it's not a common source of compiler bugs.

Your example requires inter-procedural reasoning, so it's not supported.

jeremy-rifkin commented 10 months ago

Thank you again! I'm enrolled in a course about formal verification and I've been working on a project related to transformation verification. At the moment I'm trying to understand the capabilities and scopes of current tools.

nunoplopes commented 10 months ago

Alive2 supports all intra-procedural optimizations of LLVM. That's it 🙂 There are some limitations in the set of IR features that is supported (as LLVM IR is huge), of course. There are other tools out there for transformation verification, but they usually target smaller languages. And often they are not bit-precise (they do proofs over rationals instead of over modular arithmetic like Alive2).