thahmann / macleod

Ontology development environment for Common Logic (CL)

pyparsing dependency #23

Open thahmann opened 6 years ago

thahmann commented 6 years ago

In macleod/Clif.py, there is one remaining dependency on pyparsing (lines 307 and 308), which causes an error unless the pyparsing module is installed. I thought we wanted to remove that dependency.

Can we replace these lines with new parsing functionality?

thahmann commented 6 years ago

I just noticed that we need to remove it, because pyparsing is not available for Python 3.

Fxhnd commented 6 years ago

The PyPI-hosted package for pyparsing does have the Python 3+ tags on it. It did work at some point; not sure what may have changed that would've broken the regular tasks.

100% the correct answer is to replace it with the new parsing scheme so we can bring all the functionality back together in one place. Doing so is going to require extending the logical/ classes, e.g. adding an output flag to the base Logical and then implementing logic in the __repr__() methods of the classes to change how the string representation gets created. Right now everything is plain old infix FOL. This would take care of the majority of the work needed to get valid .tptp and .p9 files.

thahmann commented 6 years ago

I found the PyPI package for Python 3.5 (not 3.6 yet). I will give it a try with that.

As for the long-term plan: don't we need multiple representations? I'm rather thinking of having functions .toTptp() and .toLadr() for each kind of class in the object structure (Logical, Axiom, etc.).

Fxhnd commented 6 years ago

While responding to this I ended up writing a complete to_tptp() method for the Axiom class; it's at the bottom of this response. The only things it probably needs are a few minor tweaks to handle the undocumented intricacies of the TPTP format that I don't remember (e.g. whether variables need to be all caps). I'll add the nested function to the Logical base class and then it should be good. Any volunteers to write the to_ladr()?

I agree with adding the shortcut methods .to_tptp() and .to_ladr(), I'm thinking more in the backend on how to power those methods with minimal code.

logical = Quantifier(['x'], Predicate('Ex', ['x']))
logical.set_format(Logical.TPTP)  # Use some enum defined in the base logical
print(logical)  # Will now produce TPTP format

Since Quantifier, Connective, Negation, Predicate, and Function all subclass Logical, we could implement this by adding def set_format(self, format) and a _format attribute to the base class. Then all we do is extend the __repr__(self) methods that are already implemented to check what our output format is supposed to be.

def __repr__(self):
    if self._format == Logical.TPTP:
        # Process the TPTP format
        ...
    elif self._format == Logical.LADR:
        # Process the LADR (Prover9) format
        ...
    else:
        # Default infix notation we already have
        ...
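
For concreteness, the base-class side of that might look roughly like the sketch below, with plain class attributes standing in for the enum; the _format name, the default values, and the getattr over terms are assumptions for illustration rather than existing macleod code.

class Logical(object):

    # Plain class attributes standing in for "some enum defined in the base logical"
    INFIX = 0
    TPTP = 1
    LADR = 2

    def __init__(self):
        self._format = Logical.INFIX  # default to the existing infix notation

    def set_format(self, format):
        # Single pass down the structure so every nested Logical agrees on the format
        self._format = format
        for term in getattr(self, 'terms', []):
            if isinstance(term, Logical):
                term.set_format(format)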

We could then put the convenience methods to_tptp(self) and to_ladr() on the base class as well. They could just be shortcuts like:

def to_tptp(self):
    original_format = self._format
    self.set_format(Logical.TPTP)
    tptp = repr(self)
    self.set_format(original_format)
    return tptp

The different output formats should be very short, basically one-liners taking advantage of string interpolation. For example, here is the __repr__(self) method of a Disjunction, which takes the prefix object structure and renders it in infix form:

def __repr__(self):
    return "({})".format(" | ".join([repr(t) for t in self.terms]))

By going the __repr__ route we save quite a bit of boilerplate, since we don't have to add new methods to every class and we don't have to worry about maintaining them. The only downside I can think of is that set_format(self, format) will have to do a pass through the structure once. If that ever became a performance issue we could optimize by adding back-references, so __repr__(self) would look at the containing Logical's _format and act accordingly.

I've vacillated quite a bit and can't decide between embedding the translators in the object structure or providing a single utility function for each desired output format. The latter would look something like...

def to_tptp(self):
    """
    Produce a TPTP representation of this axiom.

    :return str tptp, tptp formatted version of this axiom
    """

    def tptp_logical(logical):
        if isinstance(logical, Symbol.Predicate):
            # TODO Does TPTP let you nest functions in predicates?
            return "({}({}))".format(logical.name, ",".join(logical.variables))
        elif isinstance(logical, Symbol.Function):
            return "({}({}))".format(logical.name, ",".join(logical.variables))
        elif isinstance(logical, Negation.Negation):
            return "~ {}".format(tptp_logical(logical.terms[0]))
        elif isinstance(logical, Connective.Conjunction):
            return "({})".format(" & ".join([tptp_logical(t) for t in logical.terms]))
        elif isinstance(logical, Connective.Disjunction):
            return "({})".format(" | ".join([tptp_logical(t) for t in logical.terms]))
        elif isinstance(logical, Quantifier.Universal):
            return "{} {}".format(("! [{}] : " * len(logical.variables)).format(*logical.variables), tptp_logical(logical.terms[0]))
        elif isinstance(logical, Quantifier.Existential):
            return "{} {}".format(("? [{}] : " * len(logical.variables)).format(*logical.variables), tptp_logical(logical.terms[0]))
        else:
            raise ValueError("Not a valid type for TPTP output")

    return "fof(axiom{}, axiom, ( {} ).".format(str(self.id*10),tptp_logical(self.sentence))

... so that works and looks better than expected. How about we just go with that as a method on the Axiom class, Axiom.to_tptp()? It produces output that looks like the following:

fof(axiom10, axiom, ( ! [x] :  ((~ ~ (ZEX(x)) | (Cont(x,x))) & (~ (Cont(x,x)) | ~ (ZEX(x)))) ).
fof(axiom20, axiom, ( ! [x] : ! [y] :  (~ ((Cont(x,y)) & (Cont(y,x))) | (=(x,y))) ).
fof(axiom30, axiom, ( ! [x] : ! [y] : ! [z] :  (~ ((Cont(x,y)) & (Cont(y,z))) | (Cont(x,z))) ).
fof(axiom40, axiom, ( ! [x] : ! [y] :  (~ (ZEX(x)) | (~ (Cont(y,x)) & ~ (Cont(x,y)))) ).
fof(axiom50, axiom, ( ! [x] : ! [y] :  (~ ((ZEX(x)) & (ZEX(y))) | (=(x,y))) ).
thahmann commented 6 years ago

I like having to_tptp in one place; that makes it easier to maintain. Btw, I believe the translation you have does not work: TPTP (or at least some of the provers) requires each variable to have a unique name (you can't use x in two different axioms). That's why the old code has this weird numbering mechanism for variables.

Fxhnd commented 6 years ago

Ok, that's not too bad to fix; we can add a class-visible singleton to serve as the appointed "variable numberer". Are there any other gotchas about the provers that you can think of, so we can get them documented here in the issues?
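
For the numberer, something like a class-level counter could do the job; this is a rough sketch only, with made-up names (VariableNumberer, next_suffix) rather than anything already in macleod:

import itertools

class VariableNumberer(object):
    """Class-visible counter that every Axiom can pull a fresh suffix from."""

    _counter = itertools.count(1)

    @classmethod
    def next_suffix(cls):
        # Each call hands out a new integer, so no two axioms share a suffix
        return next(cls._counter)

# e.g. grab one suffix per axiom and append it to all of that axiom's variables
suffix = VariableNumberer.next_suffix()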

thahmann commented 6 years ago

Great, that is very helpful. All variables must be upper case for TPTP.

The first axiom in the example above has a "~ ~ (". It seems like it is missing a parenthesis between the negation symbols (i.e. whatever you negate needs to be put in parentheses).

Btw, does the double negation just arise from a direct translation from CLIF without any simplification? I hope the CNF parser doesn't spit that out!

Fxhnd commented 6 years ago

I remember wondering about the negation and thought it was weird that it wasn't wrapped. That's an even easier fix!
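
Concretely, the Negation branch of the tptp_logical sketch above would presumably just become something like:

elif isinstance(logical, Negation.Negation):
    # Wrap whatever is being negated in its own set of parentheses
    return "~({})".format(tptp_logical(logical.terms[0]))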

You are correct that the double negation is just the result of the parsing with no simplification applied. The ff-pcnf-translated axioms don't contain nested negations. A negation-simplified version of the sentence (not completely ff-pcnf translated) is also obtainable via the .push_negation() method on an Axiom.

thahmann commented 6 years ago

In ClifModule, we have two calls to get_translated_file: one for the TPTP translation (in the method get_tptp_file_name) and one for the LADR translation. What would it take to replace the TPTP translation with the new method, using the new parser? I'd like to try it out here first, before using it in other places as well.

Fxhnd commented 5 years ago

Updated and added a '-t' or '--tptp' flag on the parser.py script in the bin/ directory.

Sample: python3 parser.py -f some/path/to/file.clif --tptp

rpskillet@Fxhnd[±|DL {2} U:5 ?:3 ✗]:~/Vault/code/github/macleod $ python3 bin/parser.py -f qs/multidim_space_codib/generated/codi_bcont.clif --tptp
fof(axiom10, axiom, ( ! [X1] : ! [Y1] :  (~((BCont(X1,Y1))) | ((Cont(X1,Y1)) & (Inc(X1,Y1)))) )).
fof(axiom20, axiom, ( ! [X2] : ! [Y2] : ! [V2] : ! [Z2] :  (~(((SC(X2,Y2)) & (Min(X2)) & (P(X2,V2)) & (Cont(Y2,V2)) & (Cont(Z2,X2)) & (Cont(Z2,Y2)))) | (BCont(Z2,X2))) )).
fof(axiom30, axiom, ( ! [X3] : ! [Y3] : ! [Z3] : ! [V3] :  (~(((SC(X3,Y3)) & (P(X3,V3)) & (P(Y3,V3)) & (Cont(Z3,X3)) & (Cont(Z3,Y3)) & (Covers(V3,Z3)))) | ~((BCont(Z3,V3)))) )).
fof(axiom40, axiom, ( ! [X4] : ! [Y4] : ! [Z4] :  (~(((BCont(X4,Y4)) & (P(Y4,Z4)) & ! [V4] : ! [W4] :  (~(((P(V4,Z4)) & ~((PO(V4,Y4))) & (P(W4,X4)))) | ~((Cont(W4,V4)))))) | (BCont(X4,Z4))) )).
fof(axiom50, axiom, ( ! [X5] : ! [Y5] : ! [Z5] :  (~(((BCont(X5,Y5)) & (Cont(Z5,X5)))) | (BCont(Z5,Y5))) )).

@thahmann it's going to need some external verification to make sure that the translation is correct. In particular, I'm not sure if there are any extraneous parentheses that might make the provers unhappy. If there were a spec somewhere that explained the TPTP format, that would be useful.

Variables have an integer appended to them that corresponds to that axiom's id, to keep things unique; e.g. axiom10 will get a 1 appended to all of its variables, axiom20 a 2, axiom110 an 11, and so on. Axiom ids are assigned in the order in which they are created. Without translating to FF-PCNF, the ids will match the order in which the axioms appear in the read files (sans the duplicate import error issue). If the -p or --ffpcnf flag is passed, the ids will not match the order found in the file, since lots of intermediate axioms are created during the translation process.
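
In other words, the renaming applied to each variable boils down to something like this (an illustrative sketch, not the exact code in bin/parser.py):

def rename_variable(variable, axiom_id):
    # 'x' in the axiom with id 1 becomes 'X1', 'y' becomes 'Y1', and so on;
    # upper-casing satisfies TPTP and the id suffix keeps names unique across axioms
    return variable.upper() + str(axiom_id)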