AliveToolkit / alive2

Automatic verification of LLVM optimizations
MIT License
777 stars 98 forks source link

Unexpected behavior with proving identical adds are identical #837

Closed jeremy-rifkin closed 2 years ago

jeremy-rifkin commented 2 years ago

Alive2 times out proving the following is correct:

define i32 @src(i32 %a, i32 %b) {
  %add = add i32 %a, %b
  ret i32 %add
}

define i32 @tgt(i32 %a, i32 %b) {
  %add = add i32 %a, %b
  ret i32 %add
}

https://alive2.llvm.org/ce/z/bvGZ8Q

However, interestingly, it does not time out proving the following is correct:

define i33 @src(i33 %a, i33 %b) {
  %add = add i33 %a, %b
  ret i33 %add
}

define i33 @tgt(i33 %a, i33 %b) {
  %add = add i33 %a, %b
  ret i33 %add
}

https://alive2.llvm.org/ce/z/nKhweK

This pattern continues, somehow alive2 is able to prove odd bit-width int adds are equivalent but not even bit-width int adds are equivalent.

I would expect the trivial proof that two identical functions are identical would not require exhausting the input space but I could understand if some limitation of the backend necessitates it. I am very surprised by the behavior with i{2n+1}.

regehr commented 2 years ago

sigh, these are issues we've (and by "we" I mean almost entirely Nuno) have been fighting all the way from the start of alive2. the problem stems from undef, which forces some quantifiers into the mix that are not very friendly, and then Z3 is not particularly great at eliminating the quantifiers, so they make it to the bit blasting phase, and then we get the timeout. the solution appears to be some fairly serious Z3 hacking. another possible solution would be for us to target a solver with a better quantifier elimination subsystem (my students say that CVC5 is pretty strong at this), but this is a fair amount of engineering.

a workaround (not a solution!) is to add -disable-undef-input to the alive-tv command line, in which case the solver queries that alive2 emits are pretty much as simple as you'd expect from reading the formula, and then many refinement checks such as your ones here complete pretty much instantly.

if you're curious, there's more detail on the undef issue in our PLDI paper.

regehr commented 2 years ago

(another way to see the quantifier issues, if you care, is to ask alive2 to dump its Z3 queries as smtlib)

nunoplopes commented 2 years ago

Another option is to just remove undef values from LLVM 🙂 (which is something we are working on actively)

More seriously, there's nothing we can for this particular example right now. I filled a bug report in Z3 (https://github.com/Z3Prover/z3/issues/6299), but I don't have much hope. The quantifier instantiation algorithms of Z3 need a redesign..

jeremy-rifkin commented 2 years ago

Thank you both so much, I look forward to reading the paper!

How can I ask alive2 to dump queries it's feeding into z3?

regehr commented 2 years ago

the --smt-verbose option should do it, but see the --help output for more things you can try

regehr commented 2 years ago

I recommend starting with your query above and trying all 4 combinations of disabling or not undef and poison, this really helps see what alive2 is thinking about

jeremy-rifkin commented 2 years ago

Thank you!