opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace
Other
813 stars 225 forks source link

Pattern matcher does not work with PlusLink (in case of two variables) #2293

Open astroseger opened 5 years ago

astroseger commented 5 years ago

This issue might be connected to #2284 which has been fixed recently.

The following code throw and exception (but "5" and "6" are expected)

(use-modules (opencog) (opencog exec))
(use-modules (opencog logger))
(NumberNode 1)
(NumberNode 2)
(NumberNode 3)
(NumberNode 4)
(NumberNode 5)
(NumberNode 6)
(define x 
       (GetLink 
            (VariableList 
                     (TypedVariableLink (VariableNode "$X1") (TypeNode "NumberNode"))    
                     (TypedVariableLink (VariableNode "$X2") (TypeNode "NumberNode"))) 
            (EqualLink (PlusLink (VariableNode "$X1") (VariableNode "$X2")) (NumberNode "11"))))

(cog-execute! x)

The following exception is returned:

ERROR: In procedure cog-new-link:
Throw to key `C++-EXCEPTION' with args `("cog-new-link" "Variable not groundable: (VariableNode \"$X2\") ; [1118410847907421454][1]\n\n (/tmp/atomspace-master/opencog/atoms/pattern/PatternLink.cc:809)\nFunction args:\n((VariableList\n   (TypedVariableLink\n      (VariableNode \"$X1\")\n      (TypeNode \"NumberNode\")\n   )\n   (TypedVariableLink\n      (VariableNode \"$X2\")\n      (TypeNode \"NumberNode\")\n   )\n)\n (EqualLink\n   (NumberNode \"11.000000\")\n   (PlusLink\n      (VariableNode \"$X1\")\n      (VariableNode \"$X2\")\n   )\n)\n)")'.
linas commented 5 years ago

Pull req #2294 tightens up the error message.

I'm going to treat this an enhancement request to fix a current limitation. The example is wandering deeper into mathematical optimization territory (https://en.wikipedia.org/wiki/Mathematical_optimization) The current pattern matcher does support certain kinds of simple optimization problems, e.g. those involving greater-than, equal, or any test that returns true/false. Such links are called "virtual", because they might not exist directly in the atomspace.

Such queries are handled by defining a new pattern, with the virtual links removed. This new pattern typically has N disconnected components. Each individual component is grounded, and then these are brought together again, with the virtual link applied to each combination, to accept/reject it. This is a brute-force search: so, for example, if the virtual link is (GreaterThan x y) then it will split into 2 disconnected parts. Suppose there are K x's found, and L y's. Then there will be K*L comparisons, between each x and y. So its a combinatoric explosion.

We could also do the same thing for PlusLink, TimesLink, etc. and I don't think its "all that hard" -- most of the work is done already; we just have to rework the part that splits the graph into components, and broaden the coverage for recombining them. So this is a fixable "bug" (enhancement), although it requires a fair amount of effort to restructure everything.

More generally, one might ask: is this a good idea? For non-numeric functions, yes, it makes sense, because it becomes a discrete programming problem, and the brute-force search is unavoidable and expected. For numeric functions, no so much. For numeric functions, one would want to apply some mathematical optimization algo, and if it was linear, then some linear optimization algo (https://en.wikipedia.org/wiki/Linear_programming) ... but this is now wandering off into territories a bit too far away from the pattern matcher.

One could try to turn the pattern matcher into an SMT solver (Satisfiability Modulo Theories) https://en.wikipedia.org/wiki/Satisfiability_modulo_theories -- the pattern matcher already is that, kind-of, anyway; it just does not clearly state which theories it supports, and does not have any particular boundaries for adding new ones (except with GroundedPredicateNodes and GroundedSchemaNodes, which are a very ugly interface for specifiying theories. Although there are certain Russian singnet employees who are trying to hack on these :-/ I think studying SMT would suggest some cleaner, simpler alternatives to GPN's and GSN's.).

noskill commented 5 years ago

This BindLink works though:

  (InheritanceLink
      (NumberNode 7)
      (ConceptNode "range") 
    ) 

  (InheritanceLink
      (NumberNode 0)
      (ConceptNode "range") 
    ) 
(BindLink
  (VariableList
    (TypedVariableLink
      (VariableNode "X")
      (TypeNode "NumberNode") 
    ) 
    (TypedVariableLink
      (VariableNode "Y")
      (TypeNode "NumberNode")
    ) 
  ) 
  (AndLink
    (InheritanceLink
      (VariableNode "X")
      (ConceptNode "range") 
    ) 
    (InheritanceLink
      (VariableNode "Y")
      (ConceptNode "range") 
    ) 
    (EqualLink
      (NumberNode "7.000000") 
      (PlusLink
        (VariableNode "X")
        (VariableNode "Y")
      ) 
    ) 
  ) 
  (ListLink
    (VariableNode "X")
    (VariableNode "Y") 
  )
)
linas commented 5 years ago

The BindLink works, and is "easy", because the two IneritanceLinks cause specific values to be assigned to X and Y, and only later checks to see if a certain relation holds between them.

The original example has to find and then check all possible values of X and Y. There already is code that does this for GreaterThanLink, just not for PlusLink. For example, the below does work:

(use-modules (opencog) (opencog exec))
(Number 1)
(Number 2)
(Number 3)
(Number 4)
(Number 5)
(Number 6)
(define x 
   (Get     
      (VariableList       
         (TypedVariable (Variable "$X1") (TypeNode 'NumberNode))            
         (TypedVariable (Variable "$X2") (TypeNode 'NumberNode)))             
      (GreaterThan (Variable "$X1") (VariableNode "$X2")) ))

(cog-execute! x)
linas commented 4 years ago

FWIW, the generic theory for this is "satisfiability modulo theories" (SMT) so this is kind-of a request for an SMT subsystem for the AtomSpace.