vprover / vampire

The Vampire Theorem Prover
https://vprover.github.io/
Other
293 stars 51 forks source link

Vampire seems to loop when there is $sum #145

Closed mcoulont closed 3 years ago

mcoulont commented 4 years ago

Hi all

I run vampire_z3_Release_static_master_4764.

Everything is ok until I try a file containing $sum:

tff(anti_sum_2_2_5,conjecture,
    ( $sum(2,2) = 5 )).

There the program runs without end (and my PC makes some noise computing).

Does someone know what's going on ?

KR Marc

quickbeam123 commented 4 years ago

Hey Mark,

the problem is your conjecture is not provable. When all the simple ways of proving it fail, Vampire tries more and more complicated ones until the time-limit is reached. There is currently no mechanism implemented to recognise the situation and abort early. In some sense, for deeper theoretical reasons, there can't be such a mechanism in general anyway...

In any case, what you report seems to be the expected behaviour in the presence of arithmetic and a false conjecture.

Best, Martin

mcoulont commented 4 years ago

Hi Martin

Ok, sorry, I had just not understood what was supposed to happen.

But it happens the same thing with

tff(divisorOf_type,type,(
    divisorOf: ( $int * $int ) > $o 
)).

tff(divisorOf_definition,axiom,(
    ! [X: $int,Y: $int] : (
        divisorOf(X, Y) <=> 
        (? [Z: $int] : $product(X, Z) = Y)
    )
)).

although there is only a type definition and an axiom.

And, by the way, Vampire awards speak by themselves and I'm not aware of the state of the art in automated theorem proving, but is it normal that vampire --mode casc -t 300 fails with the following input (i7, 16 Go RAM):


tff(divisorOf_type,type,(
    divisorOf: ( $int * $int ) > $o 
)).

tff(isPrime_type,type,(
    isPrime: $int > $o
)).

tff(divisorOf_definition,axiom,(
    ! [X: $int,Y: $int] : (
        divisorOf(X, Y) <=> (
            ? [Z: $int] : $product(X, Z) = Y
        )
    )
)).

tff(isPrime_definition,axiom,(
    ! [X: $int] : (
        isPrime(X) <=> (
            $less(1, X)
        &
            ! [Y: $int] : (
                divisorOf(Y, X) => (Y = 1 | Y = X)
            )
        )
    )
)).

tff(everyIntegerDividedByPrime,conjecture,(
    ! [X: $int] : (
        $less(X,2)
    |
        ? [Y: $int] : (isPrime(Y) & divisorOf(Y, X))
    )
)).

KR Marc

quickbeam123 commented 4 years ago

Hi Marc,

in a nutshell, the presence of arithmetic, such as the $int type in your first example, triggers Vampire to add a bunch of axioms to support (incomplete) reasoning in the corresponding theory. Although Vampire then correctly recognizes "divisorOf_definition" as an "unused definition" and removes the formula, the added axioms remain and Vampire then goes on and tries to derive all of their consequences (in the vain hope of discovering the contradiction).

This could be slightly improved in the case of this first example, but not much in general, so we leave it as it is for now. You can see the set of added axioms by runnin with --mode clausify. You can also check what going on when vampire is running by adding --show_preprocessing on / --show_active on.

Your second example is trickier. I agree it should be provable, but is currently beyond the capacity of Vampire. The mentioned automatically added axioms most likely don't suffice and, I believe, induction (in my mind, along the "divisorOf" relation) would be needed to complete the proof. There is some support for induction in Vampire that must be turned on explicitly, but this example would still be most likely beyond current Vampire's capacity. Maybe @selig would be able to comment more on this. In any case, we thank you for the nice example!

Best, Martin

selig commented 4 years ago

For the first example, there used to be a mode where we could show that ground problems were non-theorems by bypassing our usual mechanisms and just asking Z3. This was turned off partly because it looks too much like cheating! But for usability reasons, we could put it back. There's probably something more clever we could do to detect cases where we have a decision procedure to allow us to conclude that we have a non-theorem but this isn't the focus of our work and I don't see it happening very soon.

In general, if a problem containing arithmetic is a non-theorem (for problems with conjectures) or satisfiable (for those without) our expectation is that Vampire diverges.

There is current work on adding integer induction to Vampire (currently we have structural induction for SMT datatypes, which I suppose this problem could be reformulated in terms of using Nat... but I don't think that would help!). But as Martin says, ATPs are typically weak at complex inductive arguments. It's certainly an area where we are working and this will be an interesting example to look at. Let me include @hzzv who is doing the current work on integer induction.

mcoulont commented 4 years ago

Well, thank you very much for the spent time.

Must I close the ticket ?

hzzv commented 4 years ago

Thank you for the examples! As Giles already mentioned, I'm working on integer induction, and I am getting close to what is needed to prove the example with primes. I can update this thread once it's ready.

mcoulont commented 4 years ago

I didn't ask so much but ok, do so.

Thank you very much

hzzv commented 4 years ago

No worries, I was already working on this for some time - and your example helps with the development :)