aig-upf / tarski

Tarski - An AI Planning Modeling Framework
Apache License 2.0

Rethink builtin symbols #9

Closed - gfrances closed this 5 years ago

gfrances commented 6 years ago

I'm starting to think that it might be more elegant to provide a "core language" module, and then have all the builtins we want to provide available separately (perhaps not monolithically; e.g., the arithmetic function symbols might be loaded only if explicitly required). That is, we create a language and then configure it with the "addons" that we want; each addon provides builtin symbols, etc., both at the syntax and the semantics level. To be thought about.

This is somewhat related to issue #6.

miquelramirez commented 6 years ago

I could see this if we changed the interface of the Term, Variable and Constant objects at runtime to provide the overloads of the Python operators that allow writing expressions like x + 2 with ease. Doing this is feasible, but it seems to me to significantly overcomplicate things with little practical benefit at the moment. As we move on and flesh out the system further, my opinion may of course change :)
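(A minimal sketch of what that runtime intervention would look like - Term and _add here are placeholders, not the actual Tarski interface:)

class Term:
    pass

def _add(lhs, rhs):
    # A real implementation would resolve the builtin "+" symbol of lhs's
    # language and build a compound term with it; here we just tag the pair.
    return ('+', lhs, rhs)

setattr(Term, '__add__', _add)  # attach the overload at runtime

x = Term()
expr = x + 2  # now builds ('+', x, 2) instead of raising TypeError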

miquelramirez commented 6 years ago

Now the language symbols are loaded dynamically, on demand, by importing a specific module. See commit e8c508f5d037748e76793094be4dcdeb90765ce6 for details. Does the current solution look more acceptable?

gfrances commented 6 years ago

Fully agree with your first comment - but now that you've implemented it, I like it! Not that overcomplicated after all, is it? The beauty of Python...

gfrances commented 6 years ago

The more I look at it, the more I like it. Perhaps we should opt for something more explicit, though, as in Z3, along the following lines: when I create a language, I have some kind of simple "configurator method" where I specify exactly and explicitly what kind of "addons" I want: arithmetic, etc. I'll think about it a bit more.

miquelramirez commented 6 years ago

I am not sure I love the idea of a configurator method; using scopes and imports we could get there too, in a more open-ended way and without needing to guess combinations. For instance, if "everything" happens inside a method, then you have

def make_language():
    from tarski.syntax import *  # note: Python 3 only allows wildcard imports at module level

or

def make_language():
    import tsk.syntax
    import tsk.syntax.arithmetic

this is compositional and scalable (by nesting modules). I don't think we would have more than one or two subcategories of syntax elements. What would that simple configurator method look like?
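(If it were anything like the following rough strawman - all names here hypothetical - I am not sure what it buys us over the imports:)

class Language:
    def __init__(self):
        self.symbols = {}

    def load_theory(self, name):
        # Each theory registers its builtin symbols on *this* instance only.
        if name == 'arithmetic':
            self.symbols.update({op: None for op in '+-*/'})

def make_language(theories=()):
    lang = Language()
    for theory in theories:
        lang.load_theory(theory)
    return lang

lang = make_language(theories=('arithmetic',))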

miquelramirez commented 6 years ago

> Fully agree with your first comment - but now that you've implemented it, I like it! Not that overcomplicated after all, is it? The beauty of Python...

Finding that solution took 8 hours... researching discussions and iterating over a number of possible ways to achieve it in a simple manner. My first attempt, which I think was the most elegant, relied on local functions and the notion of closure as implemented in Python:

for sym, method in arith_op_impl.items():
    def arith_term_op(lhs: Term, rhs: Term):
        # BUG: sym is read when arith_term_op is *called*, not when it is
        # defined, so every method ends up resolving the last sym of the loop.
        arith_op_sym = lhs.language.resolve_symbol(sym, lhs.sort, rhs.sort)
        return arith_op_sym(lhs, rhs)
    setattr(Term, method, arith_term_op)

but the binding of local variables to the nested function arith_term_op did not work as I expected: sym ended up pointing to the last sym of the iteration... probably because arith_term_op - a function object - keeps a reference to a variable that was still changing?

Anyway, the result of the above was a completely baffling poltergeist event, so I ended up with a very C++-like solution... oh the irony!

gfrances commented 6 years ago

haha that's a very good one! The function does keep a reference to sym - it's all references in Python, and closures capture variables, not values. You could have just added an extra scope that captures sym "by value", in C++ parlance:

def test_KO():
    all_methods = []
    for i in range(5):
        def inner():
            print("Inner: {}".format(i))
        all_methods.append(inner)
    [m() for m in all_methods]  # prints "Inner: 4" five times: every closure sees the final i

def test_OK():
    all_methods = []

    def inner(i):
        def double_inner():
            print("Inner: {}".format(i))
        return double_inner

    for i in range(5):
        all_methods.append(inner(i))

    [m() for m in all_methods]  # prints "Inner: 0" through "Inner: 4"

Not too elegant perhaps, I agree...
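(Another standard trick, for what it's worth, is the default-argument idiom, which evaluates i once at function-definition time:)

def test_default_arg():
    all_methods = []
    for i in range(5):
        # Default values are evaluated once, at definition time, so each
        # inner() keeps its own copy of the current i.
        def inner(i=i):
            print("Inner: {}".format(i))
        all_methods.append(inner)
    [m() for m in all_methods]  # prints "Inner: 0" through "Inner: 4"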

gfrances commented 6 years ago

> I am not sure I love the idea of a configurator method; using scopes and imports we could get there too, in a more open-ended way and without needing to guess combinations. For instance, if "everything" happens inside a method, then you have
>
>     def make_language():
>         from tarski.syntax import *
>
> or
>
>     def make_language():
>         import tsk.syntax
>         import tsk.syntax.arithmetic
>
> this is compositional and scalable (by nesting modules). I don't think we would have more than one or two subcategories of syntax elements. What would that simple configurator method look like?

Configurator method or not, the main thing I'm concerned about is that the approach above will not let us, e.g., create two languages using different theories in the same script - once I have imported the arithmetic module, all languages I create will have arithmetic. OK, creating several languages will not be our main use case as of now, but it might be an interesting feature in the future, and I think we can get it right with little extra effort. Let me give a twist to what you already created in the next few hours; this should be doable pretty easily.

miquelramirez commented 6 years ago

Hello @gfrances,

aha! that's something I had not contemplated: defining more than one language at the same time. You're totally right... so we'll need something like a configurator method :+1:

miquelramirez commented 6 years ago

I have done a pass on the configurator method idea, see commit: 02b329dc4ea7a5085946392aff8a2d5acbfe6c08

There's an issue with one of the unit tests, which passes when it should fail... I suspect the unit testing framework is doing something weird with the bytecode.

miquelramirez commented 6 years ago

After some more Python hacks I think we're now very close to what you were envisioning, @gfrances; see commits 02b329dc4ea7a5085946392aff8a2d5acbfe6c08, 600a85b9eedf61988323d1731f18fdcba90f08a9 and 4b9dc4f94e85a358ecf659ba60d9e92ba3033c09

gfrances commented 6 years ago

Ok, ok, I see that perhaps my design forces us to go somewhere too hacky for my taste. Nice hack, though - but we should think about whether we want to complicate the design so much... (my mistake, I didn't think enough about the implications of what we wanted...). Having to create classes on the fly might look too weird from the outside as well - i.e., as a user of the library, wouldn't you expect your recently created term to be of type "Term"? But perhaps this is the price we have to pay if we want the ability to generate several languages in the same script - what do you think? I'd say let us think about the pros and cons and make a decision...

gfrances commented 6 years ago

Ok, just realized that the design you had before probably fits much better with our objectives now: we keep the Term hierarchy as before, but the Python magic methods such as __add__, etc., essentially do:
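(a minimal sketch of the idea; apart from resolve_operator and the registry, which are discussed below, the names are placeholders)

class Language:
    def __init__(self):
        self.registry = {}  # symbol -> "expression builder", filled in by theory loading

    def resolve_operator(self, symbol):
        try:
            return self.registry[symbol]
        except KeyError:
            raise LookupError('no theory supporting the construction of "{}" '
                              'expressions has been loaded'.format(symbol))

class Term:
    def __init__(self, language):
        self.language = language

    def __add__(self, rhs):
        # Delegate construction of the compound term to whatever theory
        # registered a builder for "+", if any.
        return self.language.resolve_operator('+')(self, rhs)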

(i.e., I think this is more or less what you had implemented one week ago). The difference being: the method resolve_operator looks up some registry (we had this before as well?), which might get affected through the modular "theory loading" mechanism. By default, if no theory has registered an "expression builder" for a symbol such as +, we just raise an exception making clear that no theory supporting the construction of the expression "+ ... ..." has been loaded; otherwise, we delegate the construction of the formula / term to the registered "theory" module. In other words, we have the same as what you had already implemented, save some minor details :-) One of them would be that the load_module (or load_theory) method takes the language, and loads the theory only for that language. The benefit of this: we don't need Python hacks and on-the-fly classes, etc. What do you think?

miquelramirez commented 6 years ago

Let me reply in two parts.

> Ok, ok, I see that perhaps my design forces us to go somewhere too hacky for my taste. Nice hack, though - but we should think about whether we want to complicate the design so much... (my mistake, I didn't think enough about the implications of what we wanted...).

This is turning out to be quite a fun exercise in OOP/OOD, @gfrances. I have mentioned several times how fond I am of Smalltalk... so while the solution I found looks "hacky" from the point of view of statically typed languages (such as C++ and Java), from the point of view of Smalltalk and Objective-C the criticism would be that the syntax could be more terse.

In any case, I did some research: creating types on the fly has been quite stable for a number of years, and is the basis for things like the namedtuple() factory method of the collections module.
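(Concretely, the language-level hook is the three-argument form of type(); a toy example:)

# Toy example of type(name, bases, namespace), the kind of machinery
# collections.namedtuple relies on to build classes at runtime.
Base = type('Base', (object,), {'greet': lambda self: 'hello'})
Derived = type('Derived', (Base,), {})
assert Derived().greet() == 'hello'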

> Having to create classes on the fly might look too weird from the outside as well - i.e., as a user of the library, wouldn't you expect your recently created term to be of type "Term"? But perhaps this is the price we have to pay if we want the ability to generate several languages in the same script - what do you think?

It may look "weird", but as we have already discussed, the Term hierarchy is meant to be a "closed" one - not for the user to extend, just to use. Therefore, interventions like this one seem to me to be easy to maintain over time.

> I'd say let us think about the pros and cons and make a decision...

I am not entirely sure what the short-term cons of this approach could be; I think it is well hidden (by Python standards, which admittedly are a bit lax). The notebooks illustrate typical use cases, and the "weirdness" of the solution we have found for language modularity is invisible from the point of view captured there.

miquelramirez commented 6 years ago

> The difference being: the method resolve_operator looks up some registry (we had this before as well?),

Yes, that is exactly what it did - it looked up the "type" or "signature" of what was being requested in a symbol table that was an attribute of the language.

> which might get affected through the modular "theory loading" mechanism. By default, if no theory has registered an "expression builder" for a symbol such as +, we just raise an exception making clear that no theory supporting the construction of the expression "+ ... ..." has been loaded; otherwise, we delegate the construction of the formula / term to the registered "theory" module. In other words, we have the same as what you had already implemented, save some minor details :-) One of them would be that the load_module (or load_theory) method takes the language, and loads the theory only for that language. The benefit of this: we don't need Python hacks and on-the-fly classes, etc. What do you think?

The reason the above won't work is that theory loading defines what the interface of the objects needs to be. I prefer to create the methods dynamically rather than have a massive number of overloads on the interface of every subclass of Term. The force you haven't taken into account is that Python doesn't do polymorphism on the operator methods... that's why we had the overloads on every Term subclass.
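(For the record, the underlying Python rule is that special methods are looked up on the type, not on the instance - which is why the overloads have to live on a class somewhere. A quick demonstration, assuming nothing beyond stock Python:)

class T:
    pass

t = T()
t.__add__ = lambda other: 42        # instance-level dunder: "+" ignores it
try:
    t + t
except TypeError:
    print('instance-level __add__ is not seen by the + operator')

T.__add__ = lambda self, other: 42  # class-level dunder: "+" picks it up
print(t + t)                        # 42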

By the way, good point re: load_module actually needing to be load_theory.

miquelramirez commented 6 years ago

My position right now, @gfrances, is that unless we run into trouble, the current solution is the one that tackles all the forces/challenges. Further feedback on the design, at this point, can only be gathered by moving forward and exploring/developing more use cases for Tarski.

I also need Tarski to be operational sooner rather than later :-)

gfrances commented 6 years ago

Related to this, @miquelramirez: I think it'd be good to devote one of the Python operators to effect creation. As of now, it seems to me that we create an effect loc(b) := to by writing loc(b) == to. I don't like this at all; it's confusing and potentially misleading. Of course, we should always be able to create an effect by doing eff = Effect(lhs=loc(b), rhs=to)

but I was thinking that perhaps, using one of the builtin operators, we could also accept eff = loc(b) << to

For reference, from a syntactic point of view, an effect is a tuple <lhs, rhs>, where lhs and rhs are arbitrary terms with matching sorts (up to upcasts). What do you think? I wish there were a := operator, but there is none.
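(Something like this toy sketch is what I have in mind - Term and Effect here are placeholders rather than the real Tarski classes:)

class Effect:
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

class Term:
    def __lshift__(self, rhs):
        # Overloading "<<" makes "loc(b) << to" build an Effect directly.
        return Effect(lhs=self, rhs=rhs)

eff = Term() << Term()  # equivalent to Effect(lhs=..., rhs=...)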

gfrances commented 6 years ago

(Just to contextualize: I'm raising this for the FSTRIPS reader that I'm implementing)

miquelramirez commented 6 years ago

Hello @gfrances,

I am happy to "consume" the << operator (lshift) for that purpose.

Regarding action effects, I was toying with the idea of formalising the constraints on the transition system as LTL. Notebook 008 is a proof of concept showing how we can do "reactive" synthesis for specs of the form C U G - i.e., the notion of valid plans we have now.

Under that formalisation, the effect formula becomes

X(rhs == lhs)

which means

[rhs]^{s'} = [lhs]^{s}

where the subterms of rhs and lhs are interpreted in s.

Using << as shorthand for X is perfectly possible, and avoids creating "effect formulas".

gfrances commented 6 years ago

> [rhs]^{s'} = [lhs]^{s}
>
> where the subterms of rhs and lhs are interpreted in s.

Aha, good point, and correct - but then it's no longer correct to write [rhs]^{s'}, right? Anyway, that was just a minor comment :-) I'll hijack the lshift operator soon!

miquelramirez commented 6 years ago

> Aha, good point, and correct - but then it's no longer correct to write [rhs]^{s'}, right?

What would the notation be like? Something like [f([t_1]^s, ..., [t_n]^s)]^{s'}? I was abusing notation very badly :-)

gfrances commented 6 years ago

hehe yes, why not. You can always extend the definition of denotation to tuples of terms \bar{o}, and then write [f(\bar{o}^s)]^{s'}

Agree it's not too readable either ... :-)

miquelramirez commented 5 years ago

I think we're quite satisfied for now with our current setup.