Resolution involving an assumption

BrunoDutertre commented 1 year ago

Here's a small test:

(set-logic QF_UF)
(declare-sort I 0)
(declare-fun o (I I) I)
(declare-fun e () I)
(declare-fun e0 () I)
(assert (or (= e0 (o e0 e0)) (= e (o e0 e))))
(assert (and (not (= e0 (o e0 e0))) (not (= e (o e0 e)))))
(check-sat)

and here's the proof produced by cvc5:

(assume a0 (or (= e0 (o e0 e0)) (= e (o e0 e))))
(assume a1 (and (not (= e0 (o e0 e0))) (not (= e (o e0 e)))))
(step t1 (cl (not (= e (o e0 e)))) :rule and :premises (a1))
(step t2 (cl (not (= e0 (o e0 e0)))) :rule and :premises (a1))
(step t3 (cl (= e (o e0 e))) :rule resolution :premises (a0 t2))
(step t4 (cl) :rule resolution :premises (t1 t3))

carcara complains on this:

carcara check test.prf test.smt2 
[ERROR] checking failed on step 't3' with rule 'resolution': pivot was not eliminated: '(not (= e0 (o e0 e0)))'
invalid

bpandreotti commented 1 year ago

While the Carcara error message could be more helpful, I believe this is actually an error in the proof produced by cvc5.

or terms in assumptions are not considered clauses, but individual or terms. So, the way it's written, t3 is attempting to do resolution between the two unit clauses (cl (or (= e0 (o e0 e0)) (= e (o e0 e)))) and (cl (not (= e0 (o e0 e0)))), which naturally fails.

One way of fixing this proof would be to add a new step that uses the or rule to break the term in a0 into a clause of its arguments. For example (note the new t3.t1 step):

(assume a0 (or (= e0 (o e0 e0)) (= e (o e0 e))))
(assume a1 (and (not (= e0 (o e0 e0))) (not (= e (o e0 e)))))
(step t1 (cl (not (= e (o e0 e)))) :rule and :premises (a1))
(step t2 (cl (not (= e0 (o e0 e0)))) :rule and :premises (a1))
(step t3.t1 (cl (= e0 (o e0 e0)) (= e (o e0 e))) :rule or :premises (a0))
(step t3 (cl (= e (o e0 e))) :rule resolution :premises (t3.t1 t2))
(step t4 (cl) :rule resolution :premises (t1 t3))

HanielB commented 1 year ago

Indeed the cvc5 proof is wrong, as @bpandreotti explains. I believe I fixed this bug in cvc5 a while ago, though. Was it in the newest version of proof-new you had this bug, @BrunoDutertre ? That version produces the proof below, which Carcara validates:

(assume a0 (or (= e0 (o e0 e0)) (= e (o e0 e))))
(assume a1 (and (not (= e0 (o e0 e0))) (not (= e (o e0 e)))))
(step t1 (cl (not (or (= e0 (o e0 e0)) (= e (o e0 e)))) (= e0 (o e0 e0)) (= e (o e0 e))) :rule or_pos)
(step t2 (cl (not (= e0 (o e0 e0)))) :rule and :premises (a1))
(step t3 (cl (not (= e (o e0 e)))) :rule and :premises (a1))
(step t4 (cl (not (or (= e0 (o e0 e0)) (= e (o e0 e))))) :rule resolution :premises (t1 t2 t3) :args ((= e0 (o e0 e0)) true (= e (o e0 e)) true))
(step t5 (cl) :rule resolution :premises (a0 t4) :args ((or (= e0 (o e0 e0)) (= e (o e0 e))) true))

BrunoDutertre commented 1 year ago

The proof was produced with a very recent cvc5:

f859d837dd3c6d055c021bc5cb55f537644bcf1 (HEAD -> main, origin/main)
Author: Andrew Reynolds <andrew.j.reynolds@gmail.com>
Date:   Mon Jan 23 14:25:21 2023 -0600

    Minor simplification to Minisat theory check (#9445)

    Removes an unused enum value for check.

BrunoDutertre commented 1 year ago

Not using the proof-new branch

HanielB commented 1 year ago

I can't find this hash, @BrunoDutertre. At the current main head (cvc5 1.0.4-dev.90.2f859d837d [git 2f859d837d on branch main] assertions:on) the proof generated, with --proof-format-mode=alethe --proof-granularity=theory-rewrite --lang=smt2 --dump-proofs --dag-thresh=0, produces a correct proof (with --proof-alethe-res-pivots there is bug with the order of args vs premises that will be fixed with a PR that is the pipeline):

(assume a0 (or (= e0 (o e0 e0)) (= e (o e0 e))))
(assume a1 (and (not (= e0 (o e0 e0))) (not (= e (o e0 e)))))
(step t1 (cl (not (or (= e0 (o e0 e0)) (= e (o e0 e)))) (= e0 (o e0 e0)) (= e (o e0 e))) :rule or_pos)
(step t2 (cl (not (= e0 (o e0 e0)))) :rule and :premises (a1))
(step t3 (cl (not (= e (o e0 e)))) :rule and :premises (a1))
(step t4 (cl (not (or (= e0 (o e0 e0)) (= e (o e0 e))))) :rule resolution :premises (t1 t2 t3))
(step t5 (cl) :rule resolution :premises (a0 t4))

BrunoDutertre commented 1 year ago

Cut and paste mistake: the commit is 2f859d837dd3c6d055c021bc5cb55f537644bcf1

BrunoDutertre commented 1 year ago

So I used the same cvc5 version as you. To reproduce the bug: add option --simplification=none

cvc5 --dump-proofs --proof-format=alethe --dag-thresh=0 --simplification=none \
   --proof-granularity=theory-rewrite iso_icl762_reduced.smt2 | sed -e 's/unsat//' > proof.txt

carcara check proof.txt iso_icl762_reduced.smt2                                                                      
[ERROR] checking failed on step 't3' with rule 'resolution': pivot was not eliminated: '(or (= e0 (o e0 e0)) (= e (o e0 e)))'
invalid

HanielB commented 1 year ago

Now I can reproduce, thanks! I'll take a look.

HanielB commented 1 year ago

I've submitted a PR to cvc5 to fix the issue: https://github.com/cvc5/cvc5/pull/9458

bpandreotti commented 1 year ago

Closing, as this is not a bug in Carcara

ufmg-smite / carcara

Resolution involving an assumption #4