opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace
Other
813 stars 225 forks source link

Pass truth values from GroundedPredicateNodes to rule engine from pattern matcher. #1868

Closed noskill closed 5 years ago

noskill commented 5 years ago

We need to use rule engine along with virtual evaluation links. Current pattern matcher implementation does not store truth value returned by grounded predicate. Instead matched evaluation link keeps its default value of stv(1 0). And here matched means having mean tv more than hardcoded threshold of 0.5.

Example:

(use-modules (opencog logger))

(load-from-path "conjunction-rule-base-config.scm")

(Inheritance (stv 0.99 0.99) (Concept "BB1") (Concept "BoundingBox"))
; default truth value 1.0 0.0 give 1.0 0 for any inference 
(Inheritance (stv 0.99 0.99) (Concept "BB2") (Concept "BoundingBox"))

(define Red (Predicate "Red"))

(define (redness box ) 
    (stv 0.55 (random 1.0)) 
)

(define RedBox1
    (EvaluationLink
         (GroundedPredicateNode  "scm: redness")
         (Variable "$X")
    )
)

(define RedBox2
   (EvaluationLink (stv 0.56 0.7)
         Red
         (ConceptNode "BB2")
   )
)

; need to restrict the search by inheritance link
(define AndRedBox1 (AndLink RedBox1 (InheritanceLink (VariableNode "$X") (ConceptNode "BoundingBox")) ) )
; don't need to restrict the search by inheritance link here
; do it anyway
(define AndRedBox2 (AndLink RedBox2 (InheritanceLink (ConceptNode "BB2") (ConceptNode "BoundingBox"))  ))

(define (run-bc and-box) 
    (define result (conj-bc and-box))
    (display "\n")
    (display result)
    (display "truth value for the first element in the set\n")
    (display (cog-tv (car (cog-outgoing-set result))))
    (display "\n")
    result
)

(ure-logger-set-level! "fine")
(cog-logger-set-level! "fine")
(ure-set-num-parameter conjunction-rule-base "URE:maximum-iterations" 20)
(display "query with virtual evaluation links\n")
(run-bc AndRedBox1)
(display "\n\nquery with real evaluation links\n")
(run-bc AndRedBox2)

==================================================================== Current implementation gives (stv 1.000000 0.000000) for all and links for the first conjunction (run-bc AndRedBox1):

query with virtual evaluation links

(SetLink
   (AndLink
      (EvaluationLink
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
   (AndLink
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB1")
         (ConceptNode "BoundingBox")
      )
      (EvaluationLink
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB1")
      )
   )
)
truth value for the first element in the set
(stv 1.000000 0.000000)

query with real evaluation links

(SetLink
   (AndLink (stv 0.56 0.7)
      (EvaluationLink (stv 0.56 0.7)
         (PredicateNode "Red")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
)
truth value for the first element in the set
(stv 0.560000 0.700000)

One possible fix is to store truth value somewhere at the end of DefaultPatternMatchCB::eval_term

https://github.com/opencog/atomspace/blob/e965598be2cc44fdcfd7d583a73989a4d3b647e1/opencog/query/DefaultPatternMatchCB.cc#L679-L683

Then I get the expected result (stv 0.55 0.77770132)

======================================================================

(SetLink
   (AndLink (stv 0.55 0.77770132)
      (EvaluationLink (stv 0.55 0.77770132)
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
   (AndLink (stv 0.55 0.94014286)
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB1")
         (ConceptNode "BoundingBox")
      )
      (EvaluationLink (stv 0.55 0.94014286)
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB1")
      )
   )
)
truth value for the first element in the set
(stv 0.550000 0.777701)

query with real evaluation links

(SetLink
   (AndLink (stv 0.56 0.7)
      (EvaluationLink (stv 0.56 0.7)
         (PredicateNode "Red")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
)
truth value for the first element in the set
(stv 0.560000 0.700000)
ngeiswei commented 5 years ago

I think @linas is right (I'm actually glad he shed the light on this, cause frankly I was blind).

However I also think that caching (or rather storing) TV in evaluation calls can be a desired feature, I just agree with @linas that this should not be in EvaluationLink.

I think the best would be to create a new link, say called CachedEvaluationLink (for lack of better name), inheriting from EvaluationLink. Such link would have an associated C++ class that would implement its own evaluation method, which would call EvaluationLink::evaluation() + store its TV.

Such CachedEvaluationLink may or may not live inside the atomspace repo. Technically, it could live in another repo. I would think that it is enough of a desired feature to live inside the atomspace repo, but that's another debate.

ngeiswei commented 5 years ago

Maybe better names would be StoreTVEvaluationLink, which itself could be inherited by CachedTVEvaluationLink which would not only store the TV but reuse it if its confidence is above 0.

Or if EvaluationLink turns out to be generalized to work with Value instead of TV, it could be StoreValueEvaluationLink, etc.

ngeiswei commented 5 years ago

I also agree with @linas that the pattern matcher should not require its EvaluationLink virtual links to output TVs, instead they should output boolean (maybe FalseLink or TrueLink, or some to-be-implemented BooleanValue). One could still have grounded schemata or grounded predicates within its pattern matcher query, but before these are used as matching preconditions they should be explicitly be turned into boolean output.

For instance instead of

Bind
  And
    Inheritance X beautiful
    Evaluation
      GroundedPredicate "py:nerdiness"
      X
  Inheritance X beautiful-nerd

we would have

Bind
  And
    Inheritance X beautiful
    GreaterThan
      Number 0.5
      TruthValueOf
        Evaluation
          GroundedPredicate "py:nerdiness"
          X
  Inheritance X beautiful-nerd
noskill commented 5 years ago

I don't understand what this is saying, so I can't answer it.

You said that storing tv in EvaluationLink would bload atomspace, but, at least in our usecase, we need EvaluationLink with tv only during ure's query computation. Anyway Nil's solution with CachedTVEvaluationLink would allow user to choose the desired behavior.

For example, people who want to run neural nets on top of the atomspace really wouldn't need or want the solution you propose.

We wrap neural network in EvaluationLink with GroundedPredicate. We populate atomspace with ConceptNodes with some data for neural network to perform classification.

(cog-execute! (BindLink
  (VariableList
    (TypedVariableLink (VariableNode "$B") (TypeNode "ConceptNode"))
    (TypedVariableLink (VariableNode "$X") (TypeNode "ConceptNode"))
  )
  (AndLink
    (InheritanceLink (VariableNode "$B") (ConceptNode "BoundingBox"))
    (InheritanceLink (VariableNode "$X") (ConceptNode "color"))
    (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork") (ListLink (VariableNode "$B") (ConceptNode "surfboard")) )
    (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork") (ListLink (VariableNode "$B") (VariableNode "$X")) )
  )
  (ListLink (Variable "$B") (Variable "$X") (ConceptNode "surfboard"))
)
)

In order to use rule engine we need this evaluation link

    (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork") (ListLink (VariableNode "$B") (ConceptNode "surfboard")) )

to have a truth value. @linas Perhaps we took an unexpected path to integrate atomspace with neural networks. What is the expected one?

ngeiswei commented 5 years ago

The other option (alternatively to having a CachedEvaluationLink) is to use the URE. Since you're using the URE to begin with it shouldn't be problematic. You could have a rule that assigns the TV on a given grounded predicate evaluation.

For instance, the rule

Evaluation
  <grounded-predicate>
  <args>
|-
Evaluation <TV>
  <grounded-predicate>
  <args>

such that TV is the result of grounded-predicate applied to args.

Let's call this rule set-grounded-predicate-evaluation-tv-rule. Then the job of the URE would be to build the following inference tree

ExecutionOutput
  GroundedSchema "fuzzy-conjunction-introduction-formula"
  List
    And
      ExecutionOutput
        GrouhndedSchema "set-grounded-predicate-evaluation-tv-formula"
        Evaluation
          GroundedPredicate "py:neural-network"
          Concept "BB1"
      Inheritance
        Concept "BB1"
        Concept "red"

See what I mean?

ngeiswei commented 5 years ago

Also, if it wasn't clear, let me remind that cog-evaluate! doesn't create atoms or change TVs, unless this is implemented as a side effect by the user (as I suggested here). cog-execute! on the other hand does create atoms or change TVs. So if you wish to do that in a way that is consistent with the current pattern matcher, you should execute a grounded schema, which is what by proposal above suggests.

noskill commented 5 years ago

I am not that familiar with ure yet, is there a way to force ure to apply this set-grounded-predicate-evaluation-tv-rule to every evaluation link with grounded predicate node it encounters? I mean URE could produce some result without using this rule at all.

For now I am quite happy with setting tv from within grounded predicate. This would allow us, as Alexey noted, to return different value to pattern matcher. This way we can process in URE evaluation links with low associated truth values. But i still think that better solution would be to implement CachedEvaluationLink and to provide the threshold to pattern matcher call. Here "better" means providing more options to change the system's behavior.

ngeiswei commented 5 years ago

@noskill said

This would allow us, as Alexey noted, to return different value to pattern matcher

Oh, you mean that it may return something above 0.5 while setting the TV to its really value, that way you have more control over when the PM would allow a match?

Regarding forcing the URE, I think yes, it's possible to force the URE using inference control rules. But otherwise, what can be done is simply giving the URE enough evaluations and adding preconditions in the PLN conjunction rule to dismiss conjunctions if not all arguments have non null confidence.

ngeiswei commented 5 years ago

Regarding forcing the URE, I think yes, it's possible to force the URE using inference control rules

this would not skip the intermediary inference tree though... It would be nice we have something that allows that. One way actually would be to gather inference trees, wrap them as rules that you'd be able to re-use then to skip inference steps...

noskill commented 5 years ago

@ngeiswei Yes, we want to be able to compute truth value for such and links:

   (AndLink 
      (EvaluationLink (stv 0.35 0.45)
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "B")
      )
      (InheritanceLink (stv 0.45 0.45)
         (ConceptNode "red")
         (ConceptNode "color")
      )
   )

Current pattern matcher implementation would throw out the EvaluationLink with (stv 0.35 0.45), but unexpectedly would not throw out the IheritanceLink. I thought that IheritanceLink wouldn't be stored as well, but since it works, setting tv from within the grounded predicate seems to give same result as would providing threshold for pattern matcher.. I am not sure about other link types..

ngeiswei commented 5 years ago

Interesting, so what I thought was a bug is actually a feature here...

@Necr0x0Der @linas this is another reason to have the pattern matcher support boolean predicate, that what one can choose how what threshold to apply over strength or confidence, instead of having a fixed 0.5 threshold over strength.

linas commented 5 years ago

I think that the BindLink is mal-formed. I believe that what you want is this:

(BindLink
  (VariableList
    (TypedVariableLink (VariableNode "$B") (TypeNode "ConceptNode"))
    (TypedVariableLink (VariableNode "$X") (TypeNode "ConceptNode"))
  )
  (AndLink    ;; This is the premise that must be satisfied for a "match" to occur.
    (InheritanceLink (VariableNode "$B") (ConceptNode "BoundingBox"))
    (InheritanceLink (VariableNode "$X") (ConceptNode "color"))
  )
  ;; Below is the conclusion. The pattern matcher explores all possible values for $B and $X 
  ;; then it plugs them into the below, and then causes "py:runNeuralNetwork" to run.  The 
  ;; "py:runNeuralNetwork" should set the TV, and possibly set other values, e.g. a time-stamp,
  ;; the number of layers in the net, the size of the hill-climbing delta, and your favorite
  ;; hockey team.
  (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork") 
       (ListLink (VariableNode "$B") (VariableNode "$X") (ConceptNode "surfboard")) )

The pattern matcher simply finds all possible variable groundings that satisfy the premise, and places those groundings into the conclusion.

linas commented 5 years ago

Re this comment:

Current pattern matcher implementation would throw out the EvaluationLink with (stv 0.35 0.45), but unexpectedly would not throw out the InheritanceLink.

The pattern matcher does NOT look at truth values on atoms. Instead, it is simply looking to see if an atom exists, or not. If the atom exists, then its a match. If it does not exist, then .. well, it doesn't exist. Existence is just a crisp true/false thing.

There are complications. For example, the GreaterThanLink X Y exists "virtually" whenever X is greater than Y, else it doesn't. Since it is impossible to store the infinite number of all possible X,Y pairs, this link is instead evaluated at match-time, to determine if it "exists", or not.

A "relation", in the strict, mathematical sense, is a function of N arguments, that, when evaluated, returns 1 or 0. Relations never return fractions, floats, or any other kind of value. (off-topic, but important: a "relational algebra" implements a collection of relations. SQL is a practical kind of easy-to-use relational algebra system. Relations are important.)

So: GreaterThan is a relation between numbers that has to be virtual, because there are an infinite quantity of numbers. There are other relations that have to be "vritual", i.e. dynamically-evaluatable at run-time. Today, these are handled with GroundedPredicateNode.

In the future, Perhaps we should instead use a RelationPredicateNode instead of a GroundedPredicateNode. Perhaps it should be a BooleanPredicateNode. Perhaps, instead of writting

(EvaluationLink (GroundedPredicateNode "py:foo") (List ...))

we should instead write

(ExistsLink (BooleanPredicateNode "py:foo") (List ...))

which evaluates to a crisp true/false when it is used in the premise-side of a BindLink. It is an accident of history that we did not do this from the beginning. This accident of history is now causing confusion.

noskill commented 5 years ago

@linas Our intention here is to decrease computational complexity:

  (AndLink
    (InheritanceLink (VariableNode "$B") (ConceptNode "BoundingBox"))
    (InheritanceLink (VariableNode "$X") (ConceptNode "color"))
    (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork") (ListLink (VariableNode "$B") (ConceptNode "surfboard")) )
    (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork") (ListLink (VariableNode "$B") (VariableNode "$X")) )
  )

Here if pattern matcher finds that evaluation link (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork") (ListLink (ConceptNode "BB-11") (ConceptNode "surfboard")) ) doesn't match, it won't consider this (ConceptNode "BB-11") anymore.

If we pass all variables to one evaluation link this optimization won't happen.

linas commented 5 years ago

decrease computational complexity:

Ahh. Well, the pattern matcher is pretty fast: its going to be thousands of times faster than any neural net. Its going to be ten times faster than calling a python function. Doing this will in fact slow things down: the pattern matcher is still forced to explore all possible groundings, because it cannot guess what your "py:runNeuralNetwork" actually does; its a black-box.

noskill commented 5 years ago

I checked and pattern matcher does not call one evaluation link if other doesn't match, it doesn't respect the order in the AndLink though.

linas commented 5 years ago

it doesn't respect the order

The AndLink is an unordered link. SequentialAnd is ordered.

linas commented 5 years ago

... except the pattern matcher doesn't respect SequntialAnd's either. What the AndLink is doing is specifying a set of clauses, which must be satisfied. The clauses form a graph; graphs do not have an "order".

noskill commented 5 years ago

Thanks Linas! SequentialAnd seems to be respected by pattern matcher, at least for Evaluation links. I guess pattern matcher first considers inheritance links, then evaluation links. This can be usefull.

linas commented 5 years ago

Closing; I don't see any work items here, and I think all issues have been resolved.