Closed k00ni closed 5 months ago
Hi Konrad,
I am still very new to the theorem prover world and currently experimenting with Vampire. It's hard to find beginner-friendly resources about TPTP + Vampire :), so I hope I can find some insight here.
Greetings! It's nice to have newcomers. :-)
I assume question_answering might be the function I need here, but it seems it doesn't come into play?
I think the "question answering" mode is probably a distraction for you, unfortunately - let's get back to this.
The use case is translating a few laws/rules to TPTP and using Vampire to answer/verify a "given question". So my TPTP file contains a few axioms and in the end a question. My assumption is that the type = question is required so it returns a boolean.
This is Vampire's default behaviour (no question required, conjecture would be fine here), and it is already giving the answer you want - albeit obtusely.
Your example ?[X]: (p(X) & ~p(X)) is not a theorem. Therefore Vampire doesn't say SZS status Theorem but instead SZS status CounterSatisfiable, which is ATP-speak for "not a theorem". You can read more about the status codes here or here.
If you were to try ?[X]: (p(X) | ~p(X)), Vampire will say SZS status Theorem and give a proof.
What you may find a little surprising is that both a conjecture and its negation can be non-theorems. If you try
fof(test, conjecture, p).
and
fof(test, conjecture, ~p).
Vampire will say that both are not theorems (because they are not).
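To make this concrete, here is a small propositional sketch. It is only an illustration, not how Vampire works internally: with no axioms, a formula is a theorem only if it is true under every interpretation, and for a single proposition p there are just two interpretations to try. The helper name is_valid is made up for this example.

```python
from typing import Callable

# Hypothetical helper for illustration: brute-force both truth values of p.
def is_valid(formula: Callable[[bool], bool]) -> bool:
    """Return True if the formula holds for every truth value of p."""
    return all(formula(p) for p in (False, True))

# p fails when p is False; ~p fails when p is True.
p_is_theorem = is_valid(lambda p: p)
not_p_is_theorem = is_valid(lambda p: not p)

# p | ~p holds under both interpretations, so it is a theorem.
excluded_middle_is_theorem = is_valid(lambda p: p or not p)

print(p_is_theorem, not_p_is_theorem, excluded_middle_is_theorem)
```

Neither p nor ~p is valid on its own, while p | ~p is, which matches the behaviour described above.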
What does the question-answering mode do, then? If you have a theorem and you want to know what term to "plug in" to the conjecture to get a proof, Prolog-style, Vampire can (optionally) do that. For example:
fof(pa, axiom, p(a)).
fof(q, question, ?[X]: p(X)).
is a theorem, but the "answer" is the constant a.
@MichaelRawson thank you for your detailed answer. The link to SZSOntology was helpful. I am still trying to wrap my head around all of this, although it's hard to reconcile TPTP peculiarities, the Vampire software, and logic in general. The kind of output Vampire produces in particular is still odd to me (being a developer who is used to APIs returning "simple" values).
When running Vampire with your example theorem:
fof(pa, axiom, p(a)).
fof(q, question, ?[X]: p(X)).
its output is:
$ vampire data/problem1.tptp
% Running in auto input_syntax mode. Trying TPTP
% Refutation found. Thanks to Tanya!
% SZS status Theorem for problem1
% SZS output start Proof for problem1
1. p(a) [input]
2. ? [X0] : p(X0) [input]
3. ~? [X0] : p(X0) [negated conjecture 2]
4. ! [X0] : ~p(X0) [ennf transformation 3]
5. p(a) [cnf transformation 1]
6. ~p(X0) [cnf transformation 4]
7. $false [resolution 5,6]
% SZS output end Proof for problem1
% ------------------------------
% Version: Vampire 4.8 (commit 8d999c135 on 2023-07-12 16:43:10 +0000)
% Linked with Z3 4.9.1.0 6ed071b44407cf6623b8d3c0dceb2a8fb7040cee z3-4.8.4-6427-g6ed071b44
% Termination reason: Refutation
% Memory used [KB]: 4989
% Time elapsed: 0.074 s
% ------------------------------
% ------------------------------
I don't see the "answer" a you mentioned in the output. What I see based on SZSOntology is:
- SZS status Theorem for problem1 => the input is a valid theorem
- it provides me a proof
- it states $false [resolution 5,6]
The last point is confusing because if a can be used to satisfy the model (so that all axioms are true), why does Vampire report $false (= contradiction)? Or is that a proof by contradiction? That is, it's a theorem, and Vampire gives me a solution by way of a proof by contradiction?
The proof by contradiction is based on the following quote from SZSOntology:
Theorem (THM): All models of Ax are models of C.
- F is valid, and C is a theorem of Ax.
- Possible dataforms are Proofs of C from Ax.
I don't see the "answer" a you mentioned in the output.
Right, this feature has some overhead so it's not on by default. Pass -qa on if you want this - but I emphasize again that if you only want yes/no "is it a theorem?" answers you don't need question-answering.
the input is a valid theorem
Correct.
it provides me a proof
Yes, you are fortunate. ;-) You can suppress this with -p off.
it states $false [resolution 5,6]
All Vampire proofs are by refutation: note that "negated conjecture" earlier in the proof.
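Refutation means: to show that a conjecture C follows from axioms Ax, negate C and show that Ax together with ~C cannot all be true at once. The sketch below illustrates the idea for propositional logic only, where this can be checked by brute force. That restriction is an assumption made for the example (Vampire works on first-order logic and does not enumerate models), and the names entails, axioms and the atoms p and q are made up.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Model = Dict[str, bool]
Formula = Callable[[Model], bool]

# Hypothetical helper for illustration: propositional entailment by refutation.
def entails(atoms: Sequence[str], axioms: Sequence[Formula], conjecture: Formula) -> bool:
    """Return True if no interpretation satisfies all axioms and the negated conjecture."""
    for values in product((False, True), repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(ax(model) for ax in axioms) and not conjecture(model):
            return False  # a model of Ax /\ ~C exists, so C does not follow
    return True  # Ax /\ ~C is unsatisfiable, so C is a theorem of Ax

# Ax = {p => q, p}, C = q
axioms = [lambda m: (not m["p"]) or m["q"], lambda m: m["p"]]

follows = entails(["p", "q"], axioms, lambda m: m["q"])
# Drop the axiom p: now p = False, q = False is a counterexample.
does_not_follow = entails(["p", "q"], [axioms[0]], lambda m: m["q"])

print(follows, does_not_follow)
```

The $false at the end of a Vampire proof plays the role of the "return True" branch here: the negated conjecture could not be made consistent with the axioms.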
You mention a true/false API for Vampire. The space of possible results is a bit richer than true/false (which is why we have the SZS ontology), but an API would be nice - there are serious implementation challenges here, but "at some point" this could happen in principle.
In the meantime, the Z3 SMT solver may work more in the way that you expect ("satisfiable" or "unsatisfiable"), and it has an API.
Thanks again for the quick answer. My focus on true/false as an answer is because in my use case I use another system (PHP) to interact with Vampire. This system has to be able to interpret the result after each Vampire call.
but I emphasize again that if you only want yes/no "is it a theorem?" answers you don't need question-answering.
My mistake, I meant it in the sense of: for a given set of axioms, is the given conjecture "correct" (true, holds, ...)? To be honest, I don't even care whether it's a theorem or not. My sole focus is an (almost) binary answer. I only brought up "question-answering" because I had read about it in the context of LLMs and assumed it was related (because of the question type in TPTP).
On the other hand, it's very interesting to use Vampire to explain why a certain set of axioms is not satisfiable or to even show a proof. Also, I will have a look into Z3 SMT.
All Vampire proofs are by refutation: note that "negated conjecture" earlier in the proof.
Does this mean that if I see negated conjecture and find a $false [resolution ...] in the output, Vampire thinks the model is satisfiable? And does it report "unsatisfiable" if my conjecture leads to a contradiction?
I think there is some confusion here. The way Vampire works is that it loads the axioms Ax and a conjecture C, negates the conjecture to get ~C, then forms the conjunction Ax /\ ~C. If this conjunction implies $false, then C is a consequence, a theorem of Ax.
"correct", "true", "holds" can mean different things depending on the context, but here I'd expect them to be synonymous with "C is a theorem of Ax".
To reiterate, and simplifying only very slightly: if Vampire says SZS status Theorem, the conjecture follows from the axioms. If it says something else, it doesn't.
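For a caller that needs an (almost) binary answer, one option is to read the SZS status line from Vampire's output, in the form shown in the transcript earlier in this thread. The sketch below only parses text. The function names szs_status and conjecture_follows are made up, and returning None for anything other than Theorem or CounterSatisfiable is a design assumption, made so that an undecided run (for example a timeout) is not mistaken for "does not follow".

```python
import re
from typing import Optional

# Matches lines such as "% SZS status Theorem for problem1".
SZS_LINE = re.compile(r"^%\s*SZS status (\w+)", re.MULTILINE)

# Hypothetical helper names; this is an illustration, not part of Vampire.
def szs_status(output: str) -> Optional[str]:
    """Return the first SZS status word in the prover output, or None."""
    match = SZS_LINE.search(output)
    return match.group(1) if match else None

def conjecture_follows(output: str) -> Optional[bool]:
    """True for Theorem, False for CounterSatisfiable, None otherwise."""
    status = szs_status(output)
    if status == "Theorem":
        return True
    if status == "CounterSatisfiable":
        return False
    return None  # undecided, or no status line found

sample = """% Refutation found. Thanks to Tanya!
% SZS status Theorem for problem1
% SZS output start Proof for problem1
"""
result = conjecture_follows(sample)
print(result)
```

The input string would be whatever the calling system captures from Vampire's standard output; the same three-way check is straightforward to port to PHP.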
I am still very new to the theorem prover world and currently experimenting with Vampire. It's hard to find beginner-friendly resources about TPTP + Vampire :), so I hope I can find some insight here.
I would like to give Vampire a problem (problem.tptp) and want it to return true or false (besides the usual output string). In a way, as if I were using an API function that returns a boolean:
The use case is translating a few laws/rules to TPTP and using Vampire to answer/verify a "given question". So my TPTP file contains a few axioms and in the end a question. My assumption is that the type = question is required so it returns a boolean. I've read #367, but was not able to find insight. Here are my attempts and what I got so far. Any tips are very appreciated.
Attempt 1
File problem.tptp looks like:
Output is:
Attempt 2
I used example TPTP code from https://github.com/vprover/vampire/issues/367#issuecomment-1132588963:
Output is:
The output contains 24. $false [resolution 18,23] and my assumption is that the question is answered with false.
Attempt 3 - with program parameters
File problem.tptp looks like:
Vampire is called like:
But it's the same output:
I assume question_answering might be the function I need here, but it seems it doesn't come into play?