opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace

Better interoperability with deep learning frameworks #1970

Open noskill opened 5 years ago

noskill commented 5 years ago

This issue is to document and discuss changes necessary to use deep learning frameworks with opencog.

Our use cases:

  1. We would like to be able to pass Python objects, particularly PyTorch tensors, between ExecutionOutputLinks. PyTorch stores in its tensor instances the information necessary for performing backward propagation.

  2. A more convenient API than GroundedSchemaNode for calling object methods.

Example:

Our motivating example is implementing transparent-by-design networks (https://arxiv.org/pdf/1803.05268) with opencog. The idea of this example is answering a question about a picture by applying a series of filters, implemented as PyTorch neural networks. Each network accepts the original picture plus a mask from the previous filter (initially the mask is all zeros) and generates a new mask.

First, I describe the current implementation: the ExecutionOutputLink for the question "What is the large purple object made of?":

https://github.com/singnet/semantic-vision/blob/9ca40eedd78eb6aec7af469defd436eace2c4be5/experiments/opencog/pattern_matcher_vqa/tbd_cog/tbd_helpers.py#L140-L166

    (ExecutionOutputLink
      (GroundedSchemaNode "py:filter")
      (ListLink
        (ConceptNode "material")
        (VariableNode "$X")
        (ExecutionOutputLink
          (GroundedSchemaNode "py:filter")
          (ListLink
            (ConceptNode "color")
            (ConceptNode "purple")
            (ExecutionOutputLink
              (GroundedSchemaNode "py:filter")
              (ListLink
                (ConceptNode "size")
                (ConceptNode "large")
                (ExecutionOutputLink
                  (GroundedSchemaNode "py:init_scene")
                  (ListLink
                    (VariableNode "$Scene")
                  )
                )
              )
            )
          )
        )
      )
    )

Here the pattern matcher grounds VariableNode "$X" to different ConceptNodes representing materials, given the constraint:

(InheritanceLink
      (VariableNode "$X")
      (ConceptNode "material")
)

where filter is a wrapper that calls some PyTorch module object: https://github.com/singnet/semantic-vision/blob/9ca40eedd78eb6aec7af469defd436eace2c4be5/experiments/opencog/pattern_matcher_vqa/tbd_cog/tbd_helpers.py#L389-L404

def filter(filter_type, filter_type_instance, data_atom):
    module_type = 'filter_' + filter_type.name + '[' + filter_type_instance.name + ']'
    module = tbd.function_modules[module_type]
    atomspace = data_atom.atomspace
    key_attention, key_scene, key_shape_attention, key_shape_scene = generate_keys(atomspace)
    feat_input = extract_tensor(data_atom, key_scene, key_shape_scene)
    feat_attention = extract_tensor(data_atom, key_attention, key_shape_attention)
    out = module(feat_input.float(), feat_attention.float())
    set_attention_map(data_atom, key_attention, key_shape_attention, out)
    return data_atom

and init_scene accepts a scene atom and generates a new atom which holds a dummy attention map and the features from the scene. This atom is then reused to pass values between the filters.

There are issues with the current implementation:
a. It requires converting back and forth between PyTorch tensor objects and FloatValues for each ExecutionOutputLink application. See https://github.com/singnet/semantic-vision/blob/9ca40eedd78eb6aec7af469defd436eace2c4be5/experiments/opencog/pattern_matcher_vqa/tbd_cog/tbd_helpers.py#L281
b. This implementation doesn't allow backpropagating the error to the neural network weights, since information is lost in the conversion. A PyTorch Tensor object keeps both the current numeric values and a link to the computation graph, which allows the error to be backpropagated automatically.

PtrValue

Both issues may be addressed by introducing a new value type: PtrValue. To store values for a particular language binding, one would then inherit from the base PtrValue type: for Python there would be a C++ class PythonValue(PtrValue), for Haskell a HaskellValue(PtrValue), etc.

Then extracting a tensor object in "py:filter" will look like:

atom.get_value(PredicateNode("PythonValue"))

and returning a value will be done by creating a new atom to hold it:

layer_result = ConceptNode(gen_random_uuid())
layer_result.set_value(PredicateNode("PythonValue"), PythonValue(tensor_mask))
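The intended semantics can be sketched in plain Python. This is a minimal mock of the proposal, not the actual AtomSpace API: PtrValue, MockAtom, and the key names here are all illustrative stand-ins.

```python
# Minimal pure-Python mock of the proposed PtrValue semantics: an atom
# carries a key -> value table, and a PtrValue holds an opaque reference
# to a host-language object (e.g. a torch.Tensor), so no conversion to
# FloatValue is needed and the autograd graph would survive intact.

class PtrValue:
    def __init__(self, obj):
        self.obj = obj          # opaque pointer to a host-language object

class MockAtom:
    def __init__(self, name):
        self.name = name
        self._values = {}       # key -> attached Value
    def set_value(self, key, value):
        self._values[key] = value
    def get_value(self, key):
        return self._values.get(key)

tensor_mask = ["pretend", "tensor"]          # stand-in for a torch.Tensor
layer_result = MockAtom("layer-result")
layer_result.set_value("PythonValue", PtrValue(tensor_mask))
assert layer_result.get_value("PythonValue").obj is tensor_mask
```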

ExecutionValueLink

In addition to PtrValue we may introduce a new link type, ExecutionValueLink, which would return a PtrValue. This would allow "py:filter" to return PythonValue(tensor_mask) directly.

That addresses use case 1.

To address use case 2, a more convenient API than GroundedSchemaNode for calling object methods:

One way is to use the proposed PtrValue along with a wrapper function.

def callPythonMethod(atom_obj, atom_method_name, *args):
    obj = atom_obj.get_value(PredicateNode("py:callPythonMethod"))
    return getattr(obj, atom_method_name.name)(*args)

Then calling a method will be a bit verbose but quite straightforward:

ExecutionOutputLink(
        GroundedSchemaNode("py:callPythonMethod"),
        ListLink(ConceptNode("FilterRed"),
                 ConceptNode("forward"),
                 ExecutionOutputLink...
                )
)
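The core of callPythonMethod is plain getattr dispatch: look the method up by name on the wrapped object and apply it to the remaining arguments. A self-contained sketch (FilterRed here is a hypothetical stand-in for a PyTorch module wrapper, not code from the repo):

```python
# Sketch of the dispatch inside callPythonMethod: resolve a method by
# name with getattr and call it with the remaining arguments.

class FilterRed:
    def forward(self, pixels):
        # toy "filter": keep only the red pixels
        return [p for p in pixels if p == "red"]

def call_python_method(obj, method_name, *args):
    return getattr(obj, method_name)(*args)

result = call_python_method(FilterRed(), "forward", ["red", "blue", "red"])
assert result == ["red", "red"]
```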

GroundedObjectNode

Another way to address the same issue is to use the LibraryManager to store Python objects. A GroundedObjectNode atom type would register a Python object in the LibraryManager, like:

import torch
GroundedObjectNode("dot", torch.dot)

then calling it with an ExecutionOutputLink or any other executable link:

ExecutionOutputLink
   GroundedSchemaNode("obj: dot.__call__")
   ListLink
      VariableNode("$OtherWeights")
      VariableNode("$SomeWeights")
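The LibraryManager idea can be sketched as a name-to-object registry plus a resolver for the "obj: name.method" spelling. All names below are illustrative, not the actual LibraryManager API:

```python
# Sketch of the GroundedObjectNode / LibraryManager proposal: register an
# object under a name, then resolve "dot.__call__" back to that object's
# attribute and invoke it.

_library = {}

def grounded_object_node(name, obj):
    _library[name] = obj        # register the Python object by name

def execute(spec, *args):
    obj_name, method_name = spec.split(".", 1)   # "dot.__call__" -> ("dot", "__call__")
    return getattr(_library[obj_name], method_name)(*args)

def dot(a, b):
    # stand-in for torch.dot
    return sum(x * y for x, y in zip(a, b))

grounded_object_node("dot", dot)
assert execute("dot.__call__", [1, 2, 3], [4, 5, 6]) == 32
```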
noskill commented 5 years ago

Recursion is common to all functional languages; it is something one expects to find in Atomese instead of a for loop. But an IfThenElseLink would be nice to have: the examples look somewhat awkward, since all the functions used for side effects or computation are forced to be predicates:

A print function returns a truth value: (define (print-stuff) (display "hi there!\n") (stv 1 1))

If true, then assign a variable:

(True (Put
             (State (Anchor "sum-A") (Variable "$x"))
linas commented 5 years ago

The stream.scm example might be a simpler place to start. It started life as a prototype neural-net example, showing how to move neural-net outputs into the atomspace. It used RandomStream as an example of a "typical" neural net. Again, this is just a cut-n-paste from the mailing list, Alexey should have walked you through these examples.

It is simpler to understand than the looping examples, and might be a better place to start.

linas commented 5 years ago

We already have an IfThenElseLink; it is called SequentialOrLink. See above.

linas commented 5 years ago

Re: IfThenElseLink, and "examples look somewhat awkward". Keep in mind:

To recap: Atomese is not meant for humans. It's not a functional programming language. It is a knowledge representation language.

However, it is also very clear that we need to somehow make it easier for humans. We never explained values very well, it seems that they are very hard to understand. Looking at the https://grakn.ai examples also makes it clear that we have ignored the ease-of-use of EvaluationLink/PredicateNode for far too long. Humans are not using the EvaluationLink/PredicateNode combination because it is just too verbose, takes up too much space, requires too much typing.

Right now, it feels like it would be a good idea if we had some higher-level system that was as easy to use as grakn.ai, and hid the messy EvaluationLink/PredicateNode combo from "ordinary users". It would also be a place to add if-then constructs, C++/Python-style for-loops, and the stuff that procedural programmers are used to. Somehow hide values there too, so they don't trip you up as much. Right now, I cannot think of any easy solution for this higher-level, easier-for-humans layer. It's gonna take a lot of work.

ngeiswei commented 5 years ago

The problem with SequentialOrLink is that it returns a TV while IfThenElseLink would return any atom type executed in its branches. That could be worked around I suppose, but the most elegant solution I think is to introduce CondLink or IfThenElseLink.

linas commented 5 years ago

SequentialOrLink is that it returns a TV while IfThenElseLink would return any atom

Ah, OK, yes. Down this path is the tangle that pull req #1996 just scratches the surface of: when are we working with "just TV's", when do we work with "just Atoms", and when with "general Values" (keeping in mind that Atoms are a special case of values).

This helps make it increasingly clear that, for performance reasons (not just usability), we need to have a crisp boolean yes/no inside of the pattern matcher and other places, instead of using TVs for this. That is, IfThenElse is very clearly a crisp-truth-value thing. It's not a fuzzy-logic thing, or a probabilistic-logic thing, or a probabilistic-programming random sampler.

Should we give up on the dream of having AndLink, OrLink, etc. ever being anything other than crisp-truth ? Yes, AndLink is kind-of-like "set intersection", OrLink is like "set union", but none of our code anywhere ever does set intersection to compute the truth value of AndLink.

There's a link, called ValueOfLink, that was added last summer, for the deep-learning API. See https://github.com/opencog/atomspace/blob/master/examples/atomspace/stream.scm

linas commented 5 years ago

SequentialOrLink is that it returns a TV while IfThenElseLink would return any atom

This also exposes the unfulfilled dream of Atomese. The goal of Atomese is NOT to just invent another programming language, badly. The goal is to have something that a reasoning engine can examine, and reason with.

Unless PLN has an axiom that DEFINES what IfThenElse "actually means", unless PLN has a rule that takes IfThenElse as input and spits out something else as output, then it's kind-of pointless to have it. Instead of programming in Atomese, write your programs in C++ or python or whatever.

For PLN, you can substitute MOSES and Reduct. One of the Reduct rules should take IfThenElseLink as input, and do something with it.

For PLN, you can substitute the PatternMiner: If mining reveals that IfThenElse has a large "surprisingness", then do .. something with it.

IfThenElse is a core component of state machines and behavior trees. Yet, oddly enough, none of the four state-machine demos need it: https://github.com/opencog/atomspace/blob/master/examples/pattern-matcher/fsm-basic.scm https://github.com/opencog/atomspace/blob/master/examples/pattern-matcher/fsm-full.scm https://github.com/opencog/atomspace/blob/master/examples/pattern-matcher/fsm-mealy.scm https://github.com/opencog/atomspace/blob/master/examples/pattern-matcher/markov-chain.scm

So, do we really need IfThenElse?

For PLN, substitute the natural-language comprehension and the natural-language generation subsystems. Look at the Sophia-robot late-nite TV show host interview problem again: the TV show host says something to Sophia, and she converts that to Atomese. What computational process allows her to generate a reasonable reply? Where does that process need an IfThenElse link?

linas commented 5 years ago

Meanwhile: this comment: https://github.com/opencog/atomspace/issues/2004#issuecomment-456251375 explains a way of doing what you wanted to do with ValuePtr in a way that would work well with the existing system.

ngeiswei commented 5 years ago

Unless PLN has an axiom that DEFINES what IfThenElse "actually means", unless PLN has a rule that takes IfThenElse as input and spits out something else as output, then it's kind-of pointless to have it.

Absolutely. My plan is to ultimately add tons of axioms about math, and atomese in particular in

https://github.com/opencog/opencog/tree/master/opencog/pln/facts

so that PLN can reason about Atomese programs, in a much more open-ended way than MOSES or the pattern miner.

noskill commented 5 years ago

@linas You just renamed GroundedObjectNode to PythonEvaluationLink. Introducing GroundedObjectNode doesn't solve the problem of passing arbitrary Python objects between ExecutionOutputLinks. I mean, I could wrap every returned object in a new PythonEvaluationLink, but what's the point? We already have PtrValue for that.

noskill commented 5 years ago

Example of why ValuePtr is useful

This is a query to the rule engine with a rewritten fuzzy conjunction formula:

(AndLink
  (EvaluationLink
    (GroundedPredicateNode "py:CogModule.callMethod")
    (ListLink
      (ConceptNode "green") 
      (ConceptNode "call_forward_tv") 
      (ListLink
        (ConceptNode "apple") 
      ) 
    ) 
  ) 
  (InheritanceLink (stv 0.800000 0.990000)
    (ConceptNode "green") 
    (ConceptNode "color")
  ) 
)

Here (ConceptNode "apple") has some data (a PyTorch tensor) attached using a ValuePtr.
(ConceptNode "green") has a PyTorch model attached, which is run on the data from (ConceptNode "apple"). The InheritanceLink also has a ValuePtr mirroring (stv 0.800000 0.990000). The function "call_forward_tv" runs the model and attaches a PyTorch array to the EvaluationLink it is called from. The fuzzy conjunction rule then takes these two arrays, one from the InheritanceLink and one from the EvaluationLink, and computes a new tensor value which is attached to the AndLink.

Now, if we have a training set with correct answers, we can use backpropagation to update both the weights of the model which classifies green objects and the strength of the InheritanceLink stating that green is a color. After the weight update is done, it is possible to update the simple truth value of the InheritanceLink to mirror the changes. Thus we can learn the truth values of InheritanceLinks using backpropagation.
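The data flow described above can be sketched in plain Python, with floats standing in for torch tensors and a dict standing in for value attachment. The min-based conjunction is one common fuzzy-AND formula; the actual formula in the linked examples may differ, and all names here are illustrative:

```python
# Sketch of the value flow: attach/get mimic set_value/get_value on atoms
# (atoms reduced to plain string names), a lambda stands in for a PyTorch
# module, and min() stands in for the fuzzy conjunction formula.

values = {}                                  # (atom, key) -> attached object

def attach(atom, key, obj):
    values[(atom, key)] = obj

def get(atom, key):
    return values[(atom, key)]

attach("apple", "data", 0.7)                 # "apple" carries input data
attach("green", "model", lambda x: x * 0.9)  # "green" carries a model

# call_forward_tv: run the model, attach the result to the EvaluationLink
attach("EvaluationLink", "tv", get("green", "model")(get("apple", "data")))
attach("InheritanceLink", "tv", 0.8)         # mirrors (stv 0.8 0.99)

# the fuzzy conjunction rule combines both and attaches it to the AndLink
attach("AndLink", "tv", min(get("EvaluationLink", "tv"), get("InheritanceLink", "tv")))
assert abs(get("AndLink", "tv") - 0.63) < 1e-9
```

With real torch tensors in place of the floats, the AndLink's value would still reference the computation graph, so a loss on it could backpropagate into both the model weights and the InheritanceLink strength.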

Examples are here: https://github.com/singnet/semantic-vision/tree/master/experiments/opencog/cog_module; they require the singnet atomspace and PyTorch to run. They are all runnable, but still a work in progress.

linas commented 5 years ago

Can you open a new issue that describes what you are actually proposing, and how it works?

The discussion above has gotten very long, and it's impossible to tell which ideas were implemented, which ideas were abandoned, which ideas were rejected.

I'm not sure, but I think that you are proposing that GroundedPredicateNode should support the format:

GroundedPredicateNode "py: instance_of_some_python_class.method_on_that_class"

or maybe

GroundedPredicateNode "pyobj: instance_of_some_python_class % method_on_that_class"

or something like that. That seems reasonable to me. I think that this can be accomplished without having to store any actual C++ pointers inside of GroundedPredicateNode.

This is already "trivially" possible in scheme, because scheme OO-programming objects are just closures (i.e. closures are more-or-less the same thing as objects; they resemble javascript objects a lot more than they resemble python objects). In scheme, the following should work:

ExecutionOutput
     GroundedPredicate "scm:name-of-closure" 
     ListLink
            ConceptNode "name-of-method"
            ConceptNode "... the other args to the method ..."

It should work. If someone wanted to get fancy, they could implement

GroundedPredicate "scm-obj: name-of-closure 'name-of-method" 
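The closures-as-objects point translates directly to Python. A sketch of the same message-passing idiom (all names hypothetical): a closure keeps private state and dispatches on a method name, behaving like an object.

```python
# Python analog of the Scheme message-passing closure idiom: the returned
# function IS the "object"; its first argument names the "method".

def make_counter():
    state = {"n": 0}            # private state captured by the closure
    def dispatch(method, *args):
        if method == "inc":
            state["n"] += args[0] if args else 1
            return state["n"]
        if method == "get":
            return state["n"]
        raise ValueError("unknown method: " + method)
    return dispatch

counter = make_counter()
counter("inc")                  # -> 1
counter("inc", 4)               # -> 5
assert counter("get") == 5
```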
linas commented 5 years ago

Regarding values, what was done for OctoMap was to create a value, called OctoMapValue, that knows which instance of the OctoMap is being queried. In all other respects, it behaves as a FloatValue, i.e. returns x,y,z when queried (which are the x,y,z coordinates of the object in the octomap).

noskill commented 5 years ago

@linas I made a TensorTruthValue which wraps a python object in a truth value, so it retains the torch computation graph and behaves like a truth value from the C++ side - https://github.com/singnet/atomspace/pull/92.

The same can easily be done for FloatValue <-> torch.Tensor. Do you think this design is good enough?
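The shape of the idea can be sketched in plain Python (the real TensorTruthValue lives in C++/Cython in the linked PR; the names and a list standing in for a tensor are illustrative only):

```python
# Sketch: wrap an arbitrary object, e.g. a torch tensor holding
# (strength, confidence), while exposing the TruthValue-style interface.
# The wrapped object is kept as-is, so its autograd graph would survive.

class TensorTruthValue:
    def __init__(self, obj):
        self._obj = obj          # wrapped host-language object

    @property
    def mean(self):
        return float(self._obj[0])

    @property
    def confidence(self):
        return float(self._obj[1])

tv = TensorTruthValue([0.8, 0.99])           # a list standing in for a tensor
assert (tv.mean, tv.confidence) == (0.8, 0.99)
```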

linas commented 5 years ago

It seems reasonable. I suggest the following minor changes:

Other than that, I think I like it.

I might spend the rest of the afternoon surgically removing clone() from truth values -- I really think it's just obsolete/unused/un-needed. I need to look more carefully...

linas commented 5 years ago

@noskill I just removed the clone() method here: #2156 If you make the suggested changes above, I'd be happy to merge this into mainline. It might be better if the code lived in the opencog/cython directory instead of the opencog/atoms/truthvalue directory, but I'm not sure it matters all that much; I guess it's probably OK in either directory.

noskill commented 4 years ago

We published a paper describing this proposal: http://agi-conf.org/2019/wp-content/uploads/2019/07/paper_19.pdf It was implemented in the singnet repository.

It can be reimplemented with this DynamicTruthValue, since it is still a pointer to an arbitrary Python object.