AliveToolkit / alive2

Automatic verification of LLVM optimizations
MIT License
721 stars 93 forks source link

unexpected memcmp() behavior #1007

Closed regehr closed 5 months ago

regehr commented 5 months ago

I'm getting results that I did not expect from this code:

define i32 @src(ptr, ptr) {
  %3 = tail call i32 @memcmp(ptr %0, ptr %1, i64 1)
  ret i32 %3
}

declare i32 @memcmp(ptr, ptr, i64)

first, the result from alive-tv is that the LLVM middle end is committing a "Target is more poisonous than source" error when in fact the optimized code looks just fine, it's just loading two bytes, zexting them, and returning the difference. I can't make sense of the CEX, it seems to be saying that it's poison to load from index 0 of a memory block of size 1.

the second problem is that with --tgt-is-asm, Alive is giving a value mismatch here

Johns-MacBook-Pro:tmp regehr$ ~/alive2-regehr/build/alive-tv foo.ll --tgt-is-asm

----------------------------------------
declare i32 @memcmp(ptr, ptr, i64)

define i32 @src(ptr %#0, ptr %#1) {
#2:
  %#3 = memcmp ptr %#0, ptr %#1, i64 1
  ret i32 %#3
}
=>
define i32 @src(ptr nocapture nowrite %#0, ptr nocapture nowrite %#1) nofree willreturn asm memory(argmem: read) {
#2:
  %lhsc = load i8, ptr nocapture nowrite %#0, align 1
  %lhsv = zext i8 %lhsc to i32
  %rhsc = load i8, ptr nocapture nowrite %#1, align 1
  %rhsv = zext i8 %rhsc to i32
  %chardiff = sub nsw i32 %lhsv, %rhsv
  ret i32 %chardiff
}
Transformation doesn't verify!

ERROR: Value mismatch

Example:
ptr %#0 = pointer(non-local, block_id=1, offset=4, attrs=4)
ptr %#1 = pointer(non-local, block_id=1, offset=0, attrs=4)

Source:
i32 %#3 = #x80000000 (2147483648, -2147483648)

SOURCE MEMORY STATE
===================
NON-LOCAL BLOCKS:
Block 0 >   size: 0 align: 1    alloc type: 0   alive: false    address: 0
Block 1 >   size: 8 align: 1    alloc type: 0   alive: true address: 4
Contents:
if (and (not (= idx #x0)) (not (= idx #x2)))
  pointer(non-local, block_id=0, offset=1), byte offset=2
else
  if (= idx #x2)
    #x00
  else
    if (and (= idx #x0) (not (= idx #x2)))
      pointer(non-local, block_id=2, offset=2), byte offset=1
    else
      poison

Block 2 >   size: 1 align: 1    alloc type: 0   alive: true address: 12
Contents:
*: #x00

Target:
i8 %lhsc = #x00 (0)
i32 %lhsv = #x00000000 (0)
i8 %rhsc = #x00 (0)
i32 %rhsv = #x00000000 (0)
i32 %chardiff = #x00000000 (0)
Source value: #x80000000 (2147483648, -2147483648)
Target value: #x00000000 (0)

Summary:
  0 correct transformations
  1 incorrect transformations
  0 failed-to-prove transformations
  0 Alive2 errors
Johns-MacBook-Pro:tmp regehr$ 

cc @tanmaytirpankar

aqjune commented 5 months ago

My two cents is that the memcmp'ing location is storing pointer bytes. Loading a pointer byte as an integer type (as tgt does) returns a poison value, that is the reason why the target is more poisonous, maybe.

nunoplopes commented 5 months ago

My two cents is that the memcmp'ing location is storing pointer bytes. Loading a pointer byte as an integer type (as tgt does) returns a poison value, that is the reason why the target is more poisonous, maybe.

Yep, that's correct. I think the bug is in memcmp; the tgt return 0 in asm mode looks correct. memcmp may not be comparing pointer bytes correctly (should compare ptr addresses).

regehr commented 5 months ago

sigh... I looked at the memcmp implementation but spotting a subtle flaw in it is beyond me