opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace
Other
826 stars 234 forks source link

Introduce EvaluationOutput link #2915

Open ngeiswei opened 2 years ago

ngeiswei commented 2 years ago

Overview

This is an exploratory proposal to introduce an EvaluationOutput link for predicates, akin to the ExecutionOutput link for schemata.

It comes from the realization that fuzzy/probabilistic predicates as defined in PLN are in fact Boolean predicates with fuzzy/probabilistic believes of their outcomes. A formal definition of what it means is given, followed by all the ramifications that it entails.

Rational

Let's assume that fuzzy/probabilistic predicates are Boolean, meaning that their type signatures are

Domain -> Boolean

Then how can these seemingly crisp predicates be simulatenously fuzzy/probabilistic? The answer is that the fuzzy/probabilistic aspect comes from the degree of beliefs that the output of such predicate over a particular input is True or False.

Definition

To formalize such crisp/fuzzy/probabilistic unification we provide the following definition

Evaluation <TV>
  P
  X

is semantically equivalent to

Execution <TV>
  P
  X
  True

In other words

Evaluation <TV>
  P
  X

means that P(X) is expected to output True with a (second order) probability described by TV.

EvaluationOutputLink

As we know

Execution <TV>
  F
  X
  Y

is the declarative knowledge that F(X)=Y with degree TV, while

ExecutionOutput
  F
  X

represents the output of F(X) (Y in this case, if TV is absolutely true).

Likewise

Evaluation <TV>
  P
  X

is the declarative knowledge that P(X)=True with degree TV, while

EvaluationOutput
  P
  X

represents the output of P(X), True if TV is absolutely true, False if TV is absolutely false, sometimes True or False if TV is neither absolutely true or false.

For instance

Evaluation <0.9, 1>
  Predicate "Tall"
  Concept "John"

means that John is tall with degree 0.9. However

EvaluationOutput
  Predicate "Tall"
  Concept "John"

will return True 90% of the time, and False 10% of the time.

Fuzzy/probabilistic Interpretation

As posited, predicates are crisp, however evaluations can have various degrees of beliefs, due to being unknown, undeterministic or both. As shown above predicates can be combined with the usual connectors Or/And/Not. The resulting predicates are also crisp, however the degree of beliefs are determined according to fuzzy/probabilistic laws.

The formula in Chapter 2 Section 2.4.1.1 of the PLN book

s = Σₓf(B(x), A(x)) / ΣₓA(x)

where A(x) and B(x) represent degrees of beliefs, clearly indicates that these degrees of beliefs are probabilistic as it perfectly follows the definition of a conditional probability

P(B|A) = P(B ⋂ A) / P(A)

and the fuzziness only comes from the law, captured by f in the formula, with which these probabilities are combined, or equivalently how their events intersect. For instance the resulting degree of a conjunction using the product assumes probabilistic independence, while using the minimum assumes that one event completely overlaps the other, etc. In [1] Ben overloads intersection according to multiset semantics to give the traditional (Goedel) fuzzyness a probabilistic interpretation. However I think it is not a good model because such interpretation, by virtue of using multisets instead of sets, deviates from (and, I claim, is unreconcilable with) standard probability theory. It may seem like a harmless deviation, but I suspect the contrary, because the remaining of PLN still relies on standard probability theory and this creates inconsistencies across certain PLN rules. I could futher develop this point, but this is probably better kept for another issue. All we need to know for now is that f can be defined according to assumptions compatible with standard probability theory.

Virtual Clauses

In practice it follows that the use of predicates in the pattern matcher should use EvaluationOutput, not Evaluation. For instance

Get
  X, Y
  And
    Present
      Inheritance
        X
        Y
    EvaluationOutput
      GroundedPredicate "scm: pred"
      List
        X
        Y

represents the query of all X and Y such that

Inheritance
  X
  Y

is present in the atomspace and pred(X, Y) evaluates to true, where pred is a scheme function that returns #t or #f in scheme (or TrueLink/FalseLink in atomese).

LambdaLink

Likewise, the function/predicate constructor needs EvaluationOutput, not Evaluation.

For instance

Lambda <TV>
  X
  And
    EvaluationOutput
      Predicate "Tall"
      X
    EvaluationOutput
      Predicate "Strong"
      X

is semantically equivalent to

And <TV>
  Predicate "Tall"
  Predicate "Strong"

The And inside the lambda link is overloaded for Boolean logic, while the And right above is overloaded for predicates. The end result is the same, if Tall and Strong are fuzzy/probabilistic, so will be their conjunction, and thus their TVs will be equal.

On the contrary, if Evaluation is used instead of EvaluationOutput, then the following lambda

Lambda
  X
  And
    Evaluation
      Predicate "Tall"
      X
    Evaluation
      Predicate "Strong"
      X

is not a predicate but a schema that given X outputs the hypergraph

  And
    Evaluation
      Predicate "Tall"
      X
    Evaluation
      Predicate "Strong"
      X

not True or False.

Perhaps we could introduce an Imperative operator to turn a declarative statement into an imperative, evaluatable one, see the Declarative to Imperative Section below.

Declarative to Imperative

Let us introduce an Imperative operator to convert a declarative statement into an imperative one.

Imperative
  Evaluation
    P
    A

is equivalent to

EvaluationOutput
  P
  A

Now by seeing (pretty much) any link as a declarative Evaluation, we could write for instance

Lambda
  Imperative
    Member <0.5>
      Concept "John"
      Concept "Rocker"

defining a predicate that when executed would return True half of the time, since

Member
  A
  C

could be seen as

Evaluation
  Predicate "Member"
  List
    A
    C

Alternative to Imperative: Omega

An alternative to Imperative is to introduce an Omega link, that turns a evaluation into a predicate going from Ω, the underlying unknown sample space, to Boolean, then

Omega
  Evaluation
    P
    A

is a crisp predicate which is actually fully determined. The catch of course is that we do not know Ω, thus we cannot pass an argument to it, rather we may just evaluate it, whichever way it might be, could be reading a sensor for instance, and get a Boolean value.

This might relates to a currying aspect mentioned in the Temporal Logic Section, where evaluating a PLN predicate outputs an Omega predicate and evaluating an Omega predicate returns a Boolean. Thus an n-ary PLN predicate would have type: Atomⁿ↦Ωᴮ, and an Omega predicate would have type: Ω↦B, where B stands for Boolean.

Agapistic Logic

As a bonus, the clear distinction between declarative and imperative description allows to unambiguously express statements such as

"Tim likes that John likes Marie"

Evaluation
  Predicate "Like"
  List
    Concept "Tim"
    Evaluation
      Predicate "Like"
      List
        Concept "John"
        Concept "Marie"

As opposed to statements such as

"Tim likes True" or "Tim likes False"

which is probably not what we wanted.

In the absence of such Evaluation vs EvaluationOutput distinction, such ambiguity can still be resolved with quotations, so it is more a bonus than a necessity, but still.

Temporal Logic

To timestamp events, AtTime link it typically used (letting aside the debate on Atom vs Value)

AtTime <TV>
  A
  T

which, given a specialized

Predicate "AtTime"

can be defined as

Evaluation <TV>
  Predicate "AtTime"
  List
    A
    T

So far, so good, the problem comes however when we define temporal predicates using AtTime link. For instance, we have traditionally defined a predicate expressing whether John holds a key over time as

Lambda
  Variable "$T"
  AtTime
    Evaluation
      Predicate "Hold"
      List
        Concept "John"
        Concept "Key"
    Variable "$T"

However, as highlighted in the LambdaLink Section, such lambda does not define a predicate but a schema, because it does not output a Boolean.

There are several ways to address that

  1. Introduce AtTimeOutput link, such that
AtTimeOutput
  A
  T

is equivalent to

EvaluationOutput
  Predicate "AtTime"
  List
    A
    T

Then the temporal predicate above would be

Lambda
  Variable "$T"
  AtTimeOutput
    Evaluation
      Predicate "Hold"
      List
        Concept "John"
        Concept "Key"
    Variable "$T"
  1. Introduce a Temporize operator, such that
Temporize
  P

where P is a n-ari predicate, is equivalent to

Lambda
  X₁, ..., Xₙ, T
  AtTimeOutput
    Evaluation
      P
      List X₁ ... Xₙ
    T
  1. Use Imperative described in the Declarative to Imperative Section.

  2. Use Omega described in the Alternative to Imperative: Omega Section.

Higher Order Logic

PLN allows to build higher order predicates such as

Lambda
  X
  And
    Inheritance
      X
      Concept "Tall"
    Member
      Concept "John"
      X

normally corresponding to the predicate that evaluates whether any concept X inherits from Tall and has John as member. The problem again is that

Inheritance
  X
  Concept "Tall"

and

Member
  Concept "John"
  X

are declarative. To correctly formulate that, one could use the Imperative transformer

Lambda
  X
  Imperative
    And
      Inheritance
        X
        Concept "Tall"
      Member
        Concept "John"
        X

or equivalently

Lambda
  X
  And
    Imperative
      Inheritance
        X
        Concept "Tall"
    Imperative
      Member
        Concept "John"
        X

Alternatively, as described in the PLN book, one could use SatisfyingSet, combined with Indicator define here https://wiki.opencog.org/w/IndicatorLink

Indicator
  SatisfyingSet
    X
    And
      Inheritance
        X
        Concept "Tall"
      Member
        Concept "John"
        X

Loopy

Now it gets loopy. By introducing a specialized

Predicate "Execution"

one can conceivably define

Execution <TV>
  S
  I
  O

as equivalent to

Evaluation <TV>
  Predicate "Execution"
  List
    S
    I
    O

which, according to the Definition Section, is equivalent to

Execution <TV>
  Predicate "Execution"
  List
    S
    I
    O
  True

which, according to the definition above, is equivalent to

Evaluation <TV>
  Predicate "Execution"
  List
    Predicate "Execution"
    List
      S
      I
      O
    True

etc. Fortunately it seems no undesirable paradox results from such recursion as the truth value on the outer atom remains unchanged.

Everything Implicit

An alternative is to ignore all of that, to not introduce EvaluationOutput, Imperative or such, and assume that most atoms are evaluatable. If we do that it should at least be clear that the outcome of such evaluation is Boolean.

Concretely it means that calling cog-evaluate! on most atoms results in a Boolean, for instance

(cog-evaluate! (Concept "A" (stv 0.5 1))

returns True half of the time instead of (stv 0.5 1), which is weird.

Conclusion

To sum-up, it seems one can assume fuzzy/probabilistic predicate to be crisp with unknown or undeterministic evaluations captured by truth values. It's unclear at that point what are the best notations to deal with this assumption. EvaluationOutput could be one way, but there might be better ways. Perhaps one might wonder if having underlying crisp predicates is too limiting to begin with. I personally think it is not and if one really wants genuine fuzzy predicates, then one can use Generalized Distributional Truth Values https://github.com/opencog/atomspace/issues/833 or other more sophisticated constructs built on top of this assumption. The great thing about it, is that it relies solely on standard probability theory, nothing more.

References

[1] Ben Goertzel, A Probabilistic Characterization of Fuzzy Set Membership, with Application to Mixed Fuzzy-Probabilistic Inference (2009)

linas commented 2 years ago

I like the Imperative/Omega idea. A better name might be Sample, so that

Sample
   Evaluation <0.9, 1>
      Predicate "Tall"
      Concept "John"

when evaluated, returns true 90% of the time, and false 10% of the time. For comparison, please note that there already exists TruthValueOf so that evaluating

TruthValueOf
   Evaluation <0.9, 1>
      Predicate "Tall"
      Concept "John"

returns <0.9, 1>. (Offtopic: I use this link heavily in my code, to compute vector dot products: I use GetLink to find collections of Atoms, then a combination of TruthValueOf, TimesLink and AccumulateLink to do the actual arithmetic to compute the dot product. I occasionally day-dream about converting this to bytecode, to make it run faster.)

To stay compatible with the existing naming convention, Sample could be renamed to BooleanSampleOf to make it clear that it's returning crisp t/f values.

linas commented 2 years ago

Continuing with the thoughts above; there could be a LikelihoodSampleOf link, so that

LikelihoodSampleOf
   Evaluation <0.9, 1>
      Predicate "Tall"
      Concept "John"

returns a floating point number x with 0 <= x <= 1 with some distribution whose mean would be centered at 0.9. This proposal is flawed, because it needs additional parameters to make it clear what the width of the distribution was ... So lets try to fix this.

Define GaussianSample to have the form

GaussianSample
     Number 8.3
     Number 2.0

so that evaluating the above returns a random number with a normal distribution, mean 8.3 and stddev of 2.0. It could be used as

GaussianSample
      StrengthOf
           Evaluation <0.9, 1>
               Predicate "Tall"
               Concept "John"
      Number 0.33

which would return streams of floating point numbers, with mean 0.9 and stddev of 0.33.

The StrengthOfLink already exists; it just plucks out the first number from a TV. There is also a ConfidenceOf to get the second number. There's also STImportanceOf, LTImportanceof, etc. and these all work and are unit-tested.

The GaussianSampleLink, as described above, could be coded up in a short afternoon, because all of the infrastructure to make it work already exists. Note that there already exists a RandomNumberLink (see https://wiki.opencog.org/w/RandomNumberLink) which samples from a uniform distribution. It was used heavily to draw samples for the Sophia robot movements.

Continuing in this vein, there could be a BooleanRandomLink used like so:

BooleanRandom
     Number 0.7

that would return true 70% of the time, and false 30% of the time. Then, in place of ImperativeLink, (or the SampleLink in the previous comment) you could write

BooleanRandom
    StrengthOf
         Evaluation <0.9, 1>
             Predicate "Tall"
             Concept "John"

which would return true 90% of the time. The BooleanRandomLink could be coded in an afternoon, including git merge, unit tests and wiki pages, mostly because its a cut-n-paste of RandomNumberLink with some minor changes.

Of course, the full suite of arithmetic links should work:

BooleanRandom
    Min
        Number 1.0
        Plus
            StrengthOf
                  Evaluation <0.9, 1>
                     Predicate "Tall"
                     Concept "John"
             Log2
                  StrengthOf
                       Evaluation <0.3, 1>
                        Predicate "Short"
                        Concept "Susan"

because Log2Link and MinLink already exist and work, and PlusLink knows how to add things (Numbers, FloatValues, TV's and so on.) Some history: the infrastructure for this was developed circa 2017, and used heavily to animate the Hanson Robotics Sophia. I did this sitting at home in Cheung Shue Tan, instead of going into the office at HK STP.

linas commented 2 years ago

FYI, there is a demo for the dot-product, here: https://github.com/opencog/atomspace/blob/master/examples/pattern-matcher/dot-product.scm It is used to determine the similarity between a dog and a cat, based on what traits they share in common.

linas commented 2 years ago

I made three comments above, and they were all about the numeric sampling of probability distributions. By contrast, the original proposal is about Logics (Fuzzy, Probabilistic, Temporal) So what do these two have to do with one another?

I want to claim that, by properly encoding the fuzzy sampling, or probabilistic sampling, or temporal sampling, you can thereby encode the axioms and inference rules of different kinds of logics. That is, in order to construct a theorem prover to determine the likelihood of some proposition, such as "if John is tall and Susan is short then the moon is green on Tuesdays", it is enough to encode the formulas as Atomese.

Thus, for fuzzy logic, maybe you would write

And
    BooleanSample
          Evaluation <0.9,1>
              Predicate "tall"
              Concept "John" 
    BooleanSample
          Evaluation <0.3,1>
              Predicate "short"
              Concept "Susan"   

while for probabilistic logic, you would write

Times
    GaussianSample
          Evaluation <0.9,1>
              Predicate "tall"
              Concept "John" 
    GaussianSample
          Evaluation <0.3,1>
              Predicate "short"
              Concept "Susan"   

The theorem prover/aka reasoning system does not need to explicitly encode either PLN or fuzzy logic, or anything else. Instead, it just needs to do basic algebra: add and subtract known "clearbox" functions (instead of black-box GroundedPredicates), and do some symbolic reduction (a la asmoses reduct) on the algebraic expressions. In this example, the product of two gaussians is a gaussian, and you can do this calculation symbolicaly, without ever once having to actually draw any random samples. (As a bonus, you could draw a random sample, if you wanted to; you just don't need to, to arrive at an algebraic conclusion about the moon being green on Tuesdays).

Here's the catch: symbolic reduction of complex algebraic expressions can become hard, and if you need 5 or 10 steps to prove that "the moon is green on Tuesdays", the algebraic expression for that might be irreducible in any meaningful way. (You would have to define a logic which is always reducible, under reduct...)

Because of the reducibility problem, most scientists use Monte Carlo methods. In AI, this means "probabilistic programming". Well, Atomese already has much of the needed infrastructure for probabilistic programming; what is missing is an Atomese->bytecode compiler to make it run fast.

I hope the above was clear.

bgoertzel commented 2 years ago

Nil,

I like your suggestions...

I also though agree w/ LInas's observation that your Imperative construct is basically a special case of a Sample construct... the need for which we have discussed before e.g. in "OpenCoggy probabilistic programming" context

The relation btw sampling and logics that you mention Linas, I believe is totally aligned with my discussion of probabilistic programming, dependent type systems and (e.g. intuitionistic, paraconsistent) logics in https://arxiv.org/pdf/2012.14474.pdf

So Nil, I think these suggestions of yours actually converge super nicely with what I've been thinking in terms of formulating AGI algos as approximate-stochastic-dynamic-programming-ish probabilistic programming on Atomspace metagraph, https://arxiv.org/abs/2102.10581

ben

On Wed, Feb 9, 2022 at 11:17 AM Linas Vepštas @.***> wrote:

I made three comments above, and they were all about the numeric sampling of probability distributions. By contrast, the original proposal about about Logics (Fuzzy, Probabilistic, Temporal) So what do these two have to do with one another?

I want to claim that, by properly encoding the fuzzy sampling, or probabilistic sampling, or temporal sampling, you can thereby encode the axioms and inference rules of different kinds of logics. That is, in order to construct a theorem prover to determine the likelihood of some proposition, such as "if John is tall and Susan is short then the moon is green on Tuesdays", it is enough to encode the formulas as Atomese.

Thus, for fuzzy logic, maybe you would write

And BooleanSample Evaluation <0.9,1> Predicate "tall" Concept "John" BooleanSample Evaluation <0.3,1> Predicate "short" Concept "Susan"

while for probabilistic logic, you would write

Times GaussianSample Evaluation <0.9,1> Predicate "tall" Concept "John" GaussianSample Evaluation <0.3,1> Predicate "short" Concept "Susan"

The theorem prover/aka reasoning system does not need to explicitly encode either PLN or fuzzy logic, or anything else. Instead, it just needs to do basic algebra: add and subtract known "clearbox" functions (instead of black-box GroundedPredicates), and do some symbolic reduction (a la asmoses reduct) on the algebraic expressions. In this example, the product of two gaussians is a gaussian, and you can do this calculation symbolicaly, without ever once having to actually draw any random samples. (As a bonus, you could draw a random sample, if you wanted to; you just don't need to, to arrive at an algebraic conclusion about the moon being green on Tuesdays).

Here's the catch: symbolic reduction of complex algebraic expressions can become hard, and if you need 5 or 10 steps to prove that "the moon is green on Tuesdays", the algebraic expression for that might be irreducible in any meaningful way. (You would have to define a logic which is always reducible, under reduct...)

Because of the reducibility problem, most scientists use Monte Carlo methods. In AI, this means "probabilistic programming". Well, Atomese already has much (most?) of the needed infrastructure for probabilistic programming; what is missing is an Atomese->bytecode compiler to make it run fast.

I hope the above was clear.

— Reply to this email directly, view it on GitHub https://github.com/opencog/atomspace/issues/2915#issuecomment-1034107372, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABNCKXGC3SCSOGRNZ72H7S3U2K4S7ANCNFSM5N5KA5MA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

-- Ben Goertzel, PhD @.***

"My humanity is a constant self-overcoming" -- Friedrich Nietzsche