AliveToolkit / alive2

Automatic verification of LLVM optimizations
MIT License
758 stars 95 forks source link

Escaped pointers have non-deterministic address #954

Open dzaima opened 10 months ago

dzaima commented 10 months ago

Potentially I'm not understanding alive2's intended behavior, but this:

define i1 @src() {
entry:
  %x = call ptr @malloc(i64 8)
  %y = call ptr @malloc(i64 8)
  %xb = ptrtoint ptr %x to i64
  %yb = ptrtoint ptr %y to i64
  %ybEnd = add i64 %yb, 8
  %cmp = icmp eq i64 %xb, %ybEnd
  ret i1 %cmp
}

define i1 @tgt() {
entry:
  %x = call ptr @malloc(i64 8)
  %y = call ptr @malloc(i64 8)
  %xb = ptrtoint ptr %x to i64
  %yb = ptrtoint ptr %y to i64
  %ybEnd = add i64 %yb, 8
  %cmp = icmp eq i64 %xb, %ybEnd
  ; %iszero = icmp eq i1 %cmp, 0
  ; call void @llvm.assume(i1 %iszero)
  ret i1 0
}

declare void @llvm.assume(i1)
declare ptr @malloc(i64 noundef)
declare void @use(ptr noundef, ptr noundef)

(alive2.llvm.org) is reported as correct, but uncommenting the commented-out assume results in it finding a case of %cmp == 1 even though before it reported %cmp0 as a valid transformation.

nunoplopes commented 10 months ago

This is a tricky case. The address of malloc'ed objects is a non-deterministic value. The compiler is assume whatever it wants, as long as it makes consistent choices. So it may either assume that the allocations are contiguous or not. So both ret false and ret true are valid return values. That's because the addresses are not observable by the program, so anything goes. LLVM will just remove the allocations altogether.

The assume is wrong because you cannot force the comparison to go to one way in the target. You would have to place it in the src program.

Hope this explanation makes sense.

dzaima commented 10 months ago

That makes sense.

Though, at first I had a call void @use(ptr %x, ptr %y) after the malloc calls (https://alive2.llvm.org/ce/z/ERH5Fj), which should allow the program to observe the pointers, e.g. @use could have if ((intptr_t)x != (intptr_t)y+8) abort(); at which point @src/@tgt would be required to return be able to return 1 if use doesn't choose to guarantee always taking the abort path. I don't think it's possible to guarantee the consistent choice across functions?

But yes, very tricky; wouldn't be surprised if I've missed something here too; it's also hitting inter-procedural stuff.

nunoplopes commented 10 months ago

You're right. That second example shouldn't verify for the exact reason you mention.