opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace
Other
820 stars 233 forks source link

ImplicationLink bug, bad interaction with MapLink #822

Closed linas closed 1 year ago

linas commented 8 years ago

See discussion on pull requests #784. I will copy some of that discussion here.

The basic proposal is to maybe do one or more of the following:

linas commented 8 years ago

OK -- now, to explain MapLink and how it currently interacts with ImplicationLink.

MapLink can be thought of in three different ways:

I hope the above makes it clear why MapLink is fun, useful, interesting.

The MapLink unit test fails on this third, last thingy -- combining Map+Implication. There is very little code that uses/needs this, but I believe that the AIML converter (or maybe the OpenPsi subset of it?) does depend on this working correctly. So if we break that, the chatbot would break.

However, there are alternatives. Note that BindLink is "exactly" equal to a GetLink followed by a PutLink. We could "get rid of" BindLink, its kind of a historical accident that happens to be used a lot. Recall that BindLink was originally created as a kind-of ImplicationLink that could be evaluated "on the spot".

Similarly, the combination of MapLink+ImplicationLink could probably be replaced by PutLink+MapLink. That is, just like Put+Get is a kind-of-implication acting on all of atomspace, similarly, Put+Map is a kind-of-implication acting only on its input set/list. Both of these perform a re-write of P(x)->Q(x) where BindLink searches the entire atomspace for x, while Put+Map searches only its input set/list for x.

This we could do this: nuke ImplicationLink.cc entirely, update the MapLinkUTet to use Put+Map instead, and update OpenPsi/AIML to use Put+Map instead of Map+Implication.

We could also create a new link type, e.g. ReWriteLink, that performs the P(x)->Q(x) operation. We could also re-define MapLink so that if it has multiple args, the first arg would be P(x) and the second arg would be Q(x). Any of these approaches seem plausible to me.

At any rate, I do not see why we need ImplicationLink.cc at all. It seems to cause semantic problems.

ngeiswei commented 8 years ago

When used with sugar syntax, Implication becomes a scope link, which is why it inherits (in atom_types.script) from SCOPE_LINK, as defined here http://wiki.opencog.org/wikihome/index.php/ImplicationLink#Definition.

All the C++ code that cast a link into a ScopeLink will thus do so for an Implication, it has to, because it is necessary to know whether a variable is free or not, the pattern matcher uses that for instance. But then we need to account for the cases where ImplicationLink is used in its standard un-sugar form

Implication
  <implicant>
  <implicand>

otherwise the ScopeLink code will possibly consider the implicant as a variable declaration.

Perhaps we should just drop the sugar form of the ImplicationLink, I personally don't mind. The upside is that it makes the definition of ImplicationLink more regular, the downside is that it makes ImplicationLink more verbose sometimes.

ngeiswei commented 8 years ago

Anyway, this certainly can wait a couple of weeks.

linas commented 8 years ago

OK, we can return to this in a few weeks. Its not clear to me why you need it to be scoped. Certainly in classical logic, its never scoped, because that is the job of the quantifiers -- its the quantifiers that perform the scoping. I'm not sure how its handled in Bayesian probability, without thinking about it. Anyway, the wiki page is insane, its got e.g. two lambdas in it, which doesn't even make sense.

ngeiswei commented 8 years ago

It makes sense if you look at it the right way. I know that implications in classical logic are never scoped, but they don't need to be, due to its crisp nature you can always assume an implicit universal quantifier without ruining the semantics of the implication, here you cannot, the conditional probabilistic semantics doesn't allow it, there you need either scope at the level of the implication, or directly use lambda which is closer to the actual semantics.

I guess it's not clearer after having said that, but if not so, trust me its not insane.

linas commented 8 years ago

As mentioned, we can return to this later. But, to me, a statement such as "probability that P(x) implies Q(x) conditioned on x being an element of A" seems to have x be a free variable in P(x)->Q(x) and the scoping is done by "conditioned on x belonging to A" However, I have not tried to use this during reasoning, using either PLN, nor have I thought about how to use in Markov logic; although I would think that scoping is always external in Markov logic, as there, the probability assignments are done by giving different weights to different "interpretations" (crisp true/false assignments to predicates).

Unrelated, I meant that the wiki page has an error in it, there's a thingy with two lambda part-way down.

ngeiswei commented 8 years ago

Again, it's not an error, you can/should have two lambdas, even with completely disjoint domains, that's not inconsistent (what would be inconsistent however would be to have a non-zero TV on the implication in such cases). But, sure, a useful implication would have its two lambdas correspond to variable declarations with at least some common intersection. Note that as soon as the type system will be expressive enough (intuitionistic logic based, etc) the cases where using two difference lambdas is useful will probably increase a lot. But for now it's easier to assume that the two lambdas have equal (or at least alpha-equivalent) variable declarations. See what I mean?

linas commented 8 years ago

OK, we should resume this after the big demo. I cannot even parse what it means to have "two lambdas", it comes off as gobbledy-gook nonsense. Maybe I'm missing something? I mean, one can always alpha-convert the variables -- in fact, one MUST alpha-convert the variables, when they are in conflict, as they would be in this case -- so, with two lambdas, the statement reads as P(x)->Q(y) for any x and y and that cannot possibly be right.

I mean -- try it: suppose P(x) is "x thinks" and Q(y) is "y is a rock" and use a valuation "x is Ben" and "y is a rock" and form "Ben thinks" you get the false conclusion that "rocks are human". It has to be thw same variable in both P and Q, which means just one lambda, not two.

ngeiswei commented 8 years ago

If their domains intersect then you can find common valuations for P and Q inputs, regardless of their variable names, for example P(x=Ben) and Q(y=Ben). Again even if their domains don't intersect you can still calculate Probability(Q|P), but it will obviously be zero.

linas commented 8 years ago

Well, but two different variables means two different valuations; there is nothing anywhere that says "find a common valuation".

In a probabilisitic setting, If the variables P,Q are independent, then Prob(Q|P) = Prob(Q) (which is certainly not zero)

In classical logic, the rule "P(x)->P(y)" is called "the rule of inference", and if P(0) is true, then P(x) is true for all (ordinals) x. Writing it as "P(x)->Q(y)" just means if P(0) is true, then for any y, Q(y) is true.

ngeiswei commented 8 years ago

But we are not talking about predicates P and Q as variables, we're talking about their domain, as long as the variables covering their domain is scoped, their names don't matter, as you know well from alpha-equivalence.

Forget classical logic in that respect, it doesn't work, because in classical logic you have an implicit outer universal quantifier over each formula, the crisp semantics allows that, it is compatible with that. But that doesn't work with PLN that has a probabilistic semantics, that is why in PLN all variables should be scoped, it takes a few more structural inference formulas (as compared to classical logic) to deal with it but that is the only way to remain consistent.

linas commented 8 years ago

I have nothing against making it probabilistic, its just that it doesn't make any sort of sense. I don't know how to interpret the meaning of such expressions. "If pencils are x then songbirds are y" -- let x be the set of all colors and y be the set of all mammals. What can that possibly mean? The probability of a pencil being a yellow #2 is around 0.5 -- how do you use that in a formula to compute the probability that songbirds are mammals? The formula itself seems to be invalid: the probability of P(x)->Q(y) should be zero, as otherwise, you can deduce nonsense.

bgoertzel commented 8 years ago

The probability of

P(x) implies Q(y)

is a bit weird, but if we look at

prob[ P(x) and Q(y) ] / prob[ P(x) ]

and consider both probabilities as expectations over the set of all pairs (x,y) [in the implicit context] it does have a logical sense to it...

-- Ben

On Sun, Jul 10, 2016 at 5:48 AM, Linas Vepštas notifications@github.com wrote:

I have nothing against making it probabilistic, its just that it doesn't make any sort of sense. I don't know how to interpret the meaning of such expressions. "If pencils are x then songbirds are y" -- let x be the set of all colors and y be the set of all mammals. What can that possibly mean? The probability of a pencil being a yellow #2 https://github.com/opencog/atomspace/issues/2 is around 0.5 -- how do you use that in a formula to compute the probability that songbirds are mammals? The formula itself seems to be invalid: the probability of P(x)->Q(y) should be zero, as otherwise, you can deduce nonsense.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/opencog/atomspace/issues/822#issuecomment-231557697, or mute the thread https://github.com/notifications/unsubscribe/AFolXFav_cUD-JEHYZPTZSyJI9FAc97gks5qUBcogaJpZM4JAd3A .

Ben Goertzel, PhD http://goertzel.org

Super-benevolent super-intelligence is the thought the Global Brain is currently struggling to form...

linas commented 8 years ago

Sigh. Ben, that is all very nice, but that is not what is written. The scoping is NOT over pairs of (x,y), the scoping is over independent x,y. If you want to scope over pairs, then write the expression so that it actually does that. What is currently written in the wiki page does not actually scope over pairs! Its scopes over two independent variables x and y, which is not the same thing!

linas commented 8 years ago

Here is another way to look at it: the pair of lambdas, one for x and one for y, is a "cartesian product". A single lambda over the pair (x,y) is the "fiber product" or "pullback". Pullbacks and cartesian products are similar, and, in certain cases, can be identical. For pure lambda calculus, they would be identical. For probability in general, they never are, unless the variables are independent. For the PLN formulas, given the way that you (Ben and Nil) are talking about it, it suddenly occurs to me that PLN is using the fiber product, and not the cartesian product. I think this is the root source of the confusion.

ngeiswei commented 8 years ago

Yes, the scopes on X and Y respectively, not over pairs. Note that

Lambda
  X
  Evaluation
    P
    X

is equivalent to

Lambda
  Y
  Evaluation
    P
    Y

which is equivalent to

P

Likewise

Implication <TV>
   Lambda
      X
      Evaluation
         P
         X
   Lambda
      Y
      Evaluation
         Q
         Y

is equivalent to

Implication <TV>
   Lambda
      X
      Evaluation
         P
         X
   Lambda
      X
      Evaluation
         Q
         X

is equivalent to

Implication <TV>
   P
   Q

P and Q ultimately denote events (elements of PowerSet(Omega), where Omega is the probabilitized universe under consideration), the probability (TV.s, purely extensionally speaking), is the probability of having Q knowing we have P, which is estimated as

[sum_{x in Omega} P(x) and Q(x)] / [sum_{x in Omega} P(x)]

Let me add something else

Lambda
  X
  Evaluation P X

is just to define a predicate, which itself define a set (it's valid inputs), which itself defines an event. Is that clear?

linas commented 8 years ago

Nil, you made exactly one statement that I can agree with, and that was about alpha conversion. Everything else you state is explicitly false!

I have given you multiple statements and proofs, and you have NOT bothered to actually rebut any of them. If you believe that what I said was wrong, explain where, how I was wrong, and which steps are incorrect!

You fail to supply any sort of details, any sort of proofs, any sort of algebra or calculations, and instead, just repeat the wrong results. Instead of repeating the results over and over again, provide actual proofs for your claims. Show your work. Indicate the detailed steps that are taken, how you move from one line to the next. Give the name of the axiom that is being applied at each step.

That is what I have done, yet you ignore my proofs completely, you just gain-say them. Instead of gain-saying, supply your own proofs.

ngeiswei commented 8 years ago

@linas, I'm busy with the demo right now, but let's get to the bottom of this soon after.

ngeiswei commented 8 years ago

@linas do you agree with the following equivalences

Implication <TV>
   Lambda
      X
      Evaluation
         P
         X
   Lambda
      Y
      Evaluation
         Q
         Y

equivalent to

Implication <TV>
   Lambda
      X
      Evaluation
         P
         X
   Lambda
      X
      Evaluation
         Q
         X

equivalent to

Implication <TV>
   P
   Q

assuming P and Q and unary predicates.

?

linas commented 8 years ago

No.

linas commented 8 years ago

Nil, you wrote:

Implication <TV>
   P
   Q

<TV> = [sum_{x in Omega} P(x) and Q(x)] / [sum_{x in Omega} P(x)]

The above seems correct and fine and just right. And then in words you said,

<TV> = Prob (Q given P)

which I also agree with -- it fits perfectly, if I interpret all of the above as standard axiomatic probability theory (per Kolmogorov). However, once you start adding variables with lambdas, that's when things fall apart for me -- I don't understand what any of those variables mean, or what they are doing, or how to interpret them.

ngeiswei commented 8 years ago

OK, so I suppose you do not agree with the following

Lambda
  X
  Evaluation
    P
    X

equivalent to

P

right?

ngeiswei commented 8 years ago

X runs over the elements of Omega. P could be seen as random variable with co-domain [0,1].

linas commented 8 years ago

Re the following: I don't really understand what it even means.

Lambda
  X
  Evaluation
    P
    X

If this was lambda-calculus, then I could see that it is a kind of function that takes an atom as input, and returns a TV as output. I can't see how that could be equivalent to just P alone, because P is just a predicate symbol, and you don't know what its valence is -- it could be a propositional symbol, it could be a binary relation, it could be anything. It feels like you are trying to use the Hintikka lemma or something like that, but I can't tell.

Without the lambda, it seems to make sense, in two different ways:

  Evaluation
    P
    X

can be interpreted as the predicate-logic predicate variable P(x). For the case of probability, I can also interpret the above as "probability of the random variable P" so that

  Evaluation <TV>
    P
    X

has <TV> = [sum_{x in Omega} P(x)]

However, when you add the lambda to that expression, I can't tell what it means, or what its supposed to be. It feels like you are trying to write the following:

SetLink
  X
  Evaluation
    P
    X

which would be the "set of all things that satisfy P" and I can see how the set of all things that satisfy P is the same thing as P itself.

ngeiswei commented 8 years ago

On 07/12/2016 03:57 AM, Linas Vepštas wrote:

Re the following: I don't really understand what it even means.

|Lambda X Evaluation P X |

If this was lambda-calculus, then I could see that it is a kind of function that takes an atom as input, and returns a TV as output.

Yes, that's exactly what it is.

I can't see how that could be equivalent to just |P| alone, because |P| is just a predicate symbol, and you don't know what its valence is -- it could be a propositional symbol, it could be a binary relation, it could be anything.

Agreed, that is why I said, assuming that P is a unary predicate. On the other hand using a Lambda can provide such specification, the type of P could as well.

It feels like you are trying to use the Hintikka lemma or something like that, but I can't tell.

I don't see the relationship with Hintikka lemma.

Without the lambda, it seems to make sense, in two different ways:

|Evaluation P X |

can be interpreted as the predicate-logic predicate variable P(x).

It can be interpreted as the application of a predicate-logic over variable x, but as not the abstraction, for that you need a lambda.

For the case of probability, I can also interpret the above as "probability of the random variable P" so that

|Evaluation P X |

has | = [sum_{x in Omega} P(x)]|

Agreed. Ideally that TV should be on the lambda though (putting it on the application feels a bit like an abuse, albeit a tolerable one).

However, when you add the lambda to that expression, I can't tell what it means, or what its supposed to be.

To build an abstraction, something that is equivalent to P rather than P(x), the difference being that P is a function, while P(x) is the output of P applied to x.

It feels like you are trying to write the following:

|SetLink X Evaluation P X |

which would be the "set of all things that satisfy P" and I can see how the set of all things that satisfy P is the same thing as P itself.

In PLN lingo, this would be a concept and being written as

SatisfyingSet
   Lambda
     X
     Evaluation
       P
       X

or

SatisfyingSet P

I think we're getting closer to an agreement :-)

linas commented 8 years ago

OK, I think I now understand what lambda x P x is. Lets go back to the implication link. Let me develop some notation. Let f stand for lambda x . P x and let g stand for lambda x . Q x Then it seems to me that the correct way to write the implication link, as a function, would be to write:

Lambda  <TV>
     Variable z
     Implication
           Evaluation
               f
               z
           Evaluation
               g
               z

The idea being here that f and g are both being evaluated at the same point. You have to evaluate at the same point, in order to get the formula

<TV> = [sum_{z in Omega} f(z) and g(z)] / [sum_{z in Omega} f(z)]

which, after beta reduction, is

<TV> = [sum_{z in Omega} P(z) and Q(z)] / [sum_{z in Omega} P(z)]

The point here is that the above is a fibered product, not a Cartesian product: P(z) and Q(z) is diagonal in z -- fibration is modding out by the diagonal. Without the fibration, if you wrote it as a pure Cartesian product, the TV formula would be

sum_{x,y in Omega} [P(x) and Q(y)] / [sum_{z in Omega} P(z) 
= 
sum_{x in Omega} sum_{y in Omega}  [P(x) and Q(y)] / [sum_{z in Omega} P(z) 
=
[sum_{x in Omega} P(x)] times  [sum_{y in Omega} Q(y)] / [sum_{z in Omega} P(z) 
=
[sum_{y in Omega} Q(y)]

That is, without the fibration, the P and Q act as independent variables ... in probability notation:

Prob(Q|P)
=
Prob (P and Q) / Prob(P)
=
Prob (P) Prob(Q) / Prob(P)
=
Prob (Q)

whenever Q does not depend on P. But the whole point of the implication link is to have Q depend on P ...

linas commented 8 years ago

You can have two variables x and y, but then you need to multiply by a function D(x,y) which is one whenever x == y and is zero otherwise. This is the "diagonal" function (aka "Dirac delta").

ngeiswei commented 8 years ago

@linas, I don't know what is a fibered product and I don't have time to get deep into category theory, though I'd love to after the demo.

To be honest I don't understand why you are even thinking about a Cartesian product. You have to see Implication as being a higher level operator, that has as domain functions (or predicates). Akin to fold, map, etc.

For instance you may write h = f + g where using + as a higher order function, that takes 2 functions and return a third one defined as

h(x) = f(x) + g(x)

Regardless of what variables you have used to defined g, for instance you might have defined g as

g(y) = exp(y)

and

f(x) = x+42

the variable names don't matter, there are locally scoped anyway.

So

Implication P Q is just exactly like that.

Maybe I missed a deeper side of your comment, but that's all I allow myself to write ATM.

ngeiswei commented 8 years ago

Reading again you comment. I got what you mean by the Diract delta. So let say that yes you are write on that, but this not what the Lambdas are showing, this part (the Diract delta) is occurring inside the calculation of the ImplicationLink TV (by PLN). However the arguments of the Implication are predicates (functions) and to define them well and abstract away their variables, Lambdas are used.

linas commented 8 years ago

Ignore the category-theoretic definition, its complicated; just think back to the plain-old ordinary way of multiplying fibers. Its old-school -- just glue the ends of the fibers together, so that for each pair, the ends happen at the same point. The cartesian product is what you get when you don't glue the ends together. So if you have x and y, the cartesian product is the ordered pair (x,y). The fiber product is the ordered pair (x,x) which equals (y,y) -- where x is glued to y so that when one moves, the other does too. Its like having a zipper and zipping together the two sides into one. In python, the operator for the fiber product is called "zip".

linas commented 1 year ago

Closing, there's nothing actionable in here.