Can't get dummy result from FOLTFMResolution

ivanovjaroslaw commented 6 years ago

Hello! I am trying to use FOLTFMResolution for my postgraduate work about "First Order Logic and Resolution Method".

I have some domain, axioms, premise and conclusion for proving (i.e. some hypothesis to get proved theorem). I proved this with my own program and actually by hands (on paper with pencil). But I can't get answer with FOLTFMResolution (for few months; there were both timeout or OutOfMemory).

Only when I did some dirty tweaks of AIMA source code, I got an asnwer. Following tweaks were applied:

reduced "Answer" clauses to only one (instead of keeping more than one "Answer" clauses to suppress dummy variations of same clauses)
ignored clauses with over 5-8 size of literals (it influenced to time of proving; looks incorrect, but I did it for experiment)
suppressed getting factors of KB, query and resolvents (getting just original clauses, not factored; it seems very incorrect, but I did it for experiment)
of course, increased timeout more than 10-180 seconds

It looks very strange that AIMA can't find proof trace without these dirty tricks. I want to figure out, what's happend. Why my dummy hypothesis can't be proved by AIMA?

So, my hypothesis consists following:

Domain - "F", "F_1", "P", "A", "E" - they are predicates
Axiom 1 - "FORALL a,b,c,d,e,f ( (A(a,c,d) AND A(b,e,f) AND E(a,b)) => (E(c,e) AND E(d,f)) )"
Axiom 2 - "FORALL a,b ( F_1(b,a) <=> ( EXISTS c,d (F(c,a) AND A(c,b,d)) ) )"
Premise - "FORALL a,b,c ( (F(a,c) AND F(b,c)) => E(a,b) )"
Conclusion - "FORALL a,b,c ( (F_1(b,a) AND F_1(c,a)) => E(b,c) )"

Without "factoring" there are following initial clauses:

A11 - [~A(v31,v32,v33), ~A(v34,v35,v36), ~E(v31,v34), E(v32,v35)]
A12 - suppressed cause it doesn't participate in proof trace
A21 - [~F_1(v23,v24), F(SF0(v24,v23),v24)]
A22 - [~F_1(v25,v26), A(SF0(v26,v25),v25,SF1(v26,v25))]
P - [~F(v43,v44), ~F(v45,v44), E(v43,v45)]
C1- [Answer0(v46,v47,v48), F_1(SC1,SC0)]
C2 - [Answer0(v49,v50,v51), F_1(SC2,SC0)]
C3 - [~E(SC1,SC2), Answer0(v52,v53,v54)]

So, then you can see the trace of proving of my program (also proved by hands, paper and pencil):

A21 with C1 -> *1
A21 with C2 -> *2
P with 2 -> 10
1 with 10 -> *20
C1 with A22 -> *3
A22 with C2 -> *4
C3 with A11 -> *5
4 with 5 -> *11
3 with 11 -> *21
20 with 21 -> EMPTY CLAUSE (there is an answer) (note that I used 10 and 20 numbers to show that these clauses are generated on first and second iterations of resolution method)

Can you help me to find an issue with AIMA (it looks like there are a lot of useless clauses generated or something else) or advise me some another FOL library with resolution method? Thank you a lot!

ctjoreilly commented 6 years ago

Hi,

I'm afraid I don't have much time to look at this at the moment. Have you tried the alternative theorem prover: FOLOTTERLikeTheoremProver? Also, have you tried increasing the amount of memory used to a couple of gigs? If neither of these work, its likely a bug in the implementations that will need to be tracked down. Would you be able to create a Java test case with your problem and expected answer so it will be easier to reproduce/fix?

Thanks

Ciaran

ctjoreilly commented 6 years ago

Just a note, the TFM in FOLTFMResolution stands for T)wo F)inger M)ethod and is super inefficient (remember resolution approaches are population based so consume a lot of memory if not implemented smartly). It is only intended to be used with simple problems. A resolution theorem prover to help validate your results, which is easy to use can be found here:

https://www.cs.unm.edu/~mccune/mace4/gui/v05.html

Unfortunately, prover 9 is no longer developed due to the author passing.

Best

Ciaran

ivanovjaroslaw commented 6 years ago

@ctjoreilly

Incentive

As I described before, I have my professor's implementation of dummy TFM resolution and it works very fast (with my sample of Domain/Axioms/Hypothesis). But it has lack of "factorization" and it looks like a little bit incorrect.

So I tried to use TFM of AIMA and get nothing. I tried to wait an hour with increasing "timeOut" option - I got OutOfMemory. I tried to force work of GarbageCollector - I got a silent for several hours. So, I tried to modify source code with a lot of dirty hacks and get some result (which I also validated by myself with pen and paper).

OK, you adviced me to use more efficient FOLOTTERLikeTheoremProver. But, moreover, I got just nothing (without breaking timeOut boundary). Just "Not proved".

And I dived into AIMA codebase. And found that there is producing clauses with several dummy "Answer" literals. And I asked myself - what if I reduce several "Answer" literals only to one. Because, in general, we only needs one "Answer" literal to detect that inferenced result is produced by Premise (not directly from axiom set).

Hypothesis

If it's comfortable I provide Domain/Axioms/Hypothesis with actual code with snippet. If you want, I can share actual codebase with you via github or something else.

public static void main(String[] args) {
    FOLOTTERLikeTheoremProver resolution =
            new FOLOTTERLikeTheoremProver(20 * 1000, false);

    // Domain
    FOLDomain domain = new FOLDomain();
    domain.addPredicate("F");
    domain.addPredicate("F_1");
    domain.addPredicate("A");
    domain.addPredicate("E");

    FOLParser parser = new FOLParser(domain);
    FOLKnowledgeBase kb = new FOLKnowledgeBase(domain, resolution);

    // Axioms
    kb.tell(parser.parse(
            "FORALL a,b ( F_1(b,a) <=> ( EXISTS c,d (F(c,a) AND A(c,b,d)) ) )"
    ));
    kb.tell(parser.parse(
            "FORALL a,b,c,d,e,f ( (A(a,c,d) AND A(b,e,f) AND E(a,b)) => (E(c,e) AND E(d,f)) )"
    ));

    // Premise of Hypothesis
    kb.tell(parser.parse(
            "FORALL a,b,c ( (F(a,c) AND F(b,c)) => E(a,b) )"
    ));

    // Conclusion of Hypothesis and starting prove
    InferenceResult answer = kb.ask(parser.parse(
            "FORALL a,b,c ( (F_1(b,a) AND F_1(c,a)) => E(b,c) )"
    ));
    List<Proof> proofs = answer.getProofs();

    if (proofs.size() > 0) {
        System.out.println("Theorem proved");
        for (Proof p : proofs) {
            System.out.print(ProofPrinter.printProof(p));
            System.out.println("");
        }
    } else {
        System.out.println("Theorem did not proved");
    }
}

So, actually this code returns me nothing, "Theorem did not proved" result (moreover with increasing timeOut to 300s., just get nothing before timeOut).

Changes of source code

aima/core/logic/fol/kb/data/Clause.java: 347 line Right after this:

Map<Variable, Term> renameSubstitituon = _standardizeApart
        .standardizeApart(copyRPosLits, copyRNegLits,
                _saIndexical);

I inserted this (sorry, it looks inefficient but did required job):

List<Literal> copyRPosLitsWithoutAnser = new ArrayList<Literal>();
List<Literal> copyRNegLitsWithoutAnser = new ArrayList<Literal>();
boolean answerPicked = false;
for (Literal l : copyRPosLits) {
    String nameOfLiteral = l.getAtomicSentence().getSymbolicName();
    if (nameOfLiteral.contains("Answer")) {
        if (!answerPicked) {
            copyRPosLitsWithoutAnser.add(l);
            answerPicked = true;
        }
    } else {
        copyRPosLitsWithoutAnser.add(l);
    }
}
answerPicked = false;
for (Literal l : copyRNegLits) {
    String nameOfLiteral = l.getAtomicSentence().getSymbolicName();
    if (nameOfLiteral.contains("Answer")) {
        if (!answerPicked) {
            copyRNegLitsWithoutAnser.add(l);
            answerPicked = true;
        }
    } else {
        copyRNegLitsWithoutAnser.add(l);
    }
}

Finally, I change this:

Clause c = new Clause(copyRPosLits, copyRNegLits);

To this:

Clause c = new Clause(copyRPosLitsWithoutAnser, copyRNegLitsWithoutAnser);

Result

With 10s. timeOut I got 7 answers (among these answers I found my manually proved trace).

What do you think about this optimization? Can it be incorrect somehow by Theory?

Thanks!

ivanovjaroslaw commented 6 years ago

Somebody can help? At my opinion, usage of more than one dummy "answer" literal is redundant and removing them can dramatically speed up the performance. Moreover, with this fix we are able to get answer for correct task. Otherwise we may get nothing (but the answer exsists).

aimacode / aima-java