PLN modusPonensRule output TruthValue has incorrect strength and confidence

cosmoharrigan commented 10 years ago

Note the confidence value of the output "Anna has cancer": it is 0.001, even though the EvaluationLink for "Anna smokes" and the ImplicationLink for "If X smokes, then X has cancer" both have strength 1 and confidence 1. That seems to be an issue.

-- Output:
(EvaluationLink (stv 1 0.001248)
  (PredicateNode "cancer")
  (ListLink (stv 1 0)
    (ConceptNode "Anna")))

-- using production rule: ModusPonensRule

-- based on this input:
[(ImplicationLink (stv 1 1)
  (EvaluationLink (stv 1 0)
    (PredicateNode "smokes" (stv 1 0))
    (ListLink (stv 1 0)
      (VariableNode "$X" (stv 1 0))))
  (EvaluationLink (stv 1 0)
    (PredicateNode "cancer" (stv 1 0))
    (ListLink (stv 1 0)
      (VariableNode "$X" (stv 1 0)))))
, (EvaluationLink (stv 1 1)
  (PredicateNode "smokes" (stv 1 0))
  (ListLink (stv 1 0)
    (ConceptNode "Anna" (stv 1 0))))]

I looked at the modusPonensFormula: https://github.com/opencog/opencog/blob/b4af7b039b20ceeb78cbe8e81fe40370aafdec6f/opencog/python/pln/rules/formulas.py#L93

It contains the following calculation:

NotAB = TruthValue(0.2, 1)
return preciseModusPonensFormula([AB, NotAB, A])

There are two problems with the expression TruthValue(0.2, 1):

For strength, what does the arbitrary value "0.2" represent?
A count value of 1 represents a confidence of nearly 0; is this supposed to be TruthValue().confidence_to_count(1) instead?
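The count/confidence mix-up can be reproduced with a minimal sketch. It assumes the mapping confidence = count / (count + K) with K = 800, which is consistent with the 0.001248 confidence in the output above. The function names here are illustrative and are not the cython TruthValue API.

```python
import math

# Assumption: confidence = count / (count + K), with K = 800.
K = 800.0


def count_to_confidence(count, k=K):
    """Map an evidence count to a confidence in [0, 1)."""
    return count / (count + k)


def confidence_to_count(confidence, k=K):
    """Inverse mapping; a confidence of 1 corresponds to unbounded evidence."""
    if confidence >= 1.0:
        return math.inf
    return k * confidence / (1.0 - confidence)


# A count of 1, as in TruthValue(0.2, 1), gives a confidence near 0.
print(round(count_to_confidence(1), 6))
# A confidence of 0.5 needs a count equal to K under this mapping.
print(confidence_to_count(0.5))
```

Under this assumption, passing 1 as the second argument when the constructor expects a count yields almost no confidence, which matches the symptom reported here.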

linas commented 10 years ago

On 16 March 2014 15:33, Cosmo Harrigan notifications@github.com wrote:

A count value of 1 represents a confidence of nearly 0; is this supposed to be TruthValue().confidence_to_count(1) instead?

I suggest that the cython interfaces for truth values be changed so that TruthValue(a,b) has a==strength and b==confidence. Currently, it seems that b==count, which leads to bugs.

This seems to be the way that everyone tends to think of truth values anyway -- as strength+confidence, not strength+count. Change the API to be 'natural' to users.

--linas

cosmoharrigan commented 10 years ago

That may be a good idea. But the C++ interface to SimpleTruthValue also uses the syntax <strength, count> in its constructor. If we change one, we should change the other as well.

cosmoharrigan commented 10 years ago

For reference, section 5.7.1 in the PLN book says this about the Modus Ponens formula:

This is naturally approached in a PLN-deduction-ish way via

P(B) = P(B|A)P(A) + P(B|¬A)P(¬A)

But given the evidence provided, we can’t estimate all these terms, so we have no
choice but to estimate P(B|¬A) in some very crude manner. One approach is to set
P(B|¬A) equal to some “default term probability” parameter; call it c. Then we
obtain the rule

sB = sAB sA + c(1 - sA)
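A minimal sketch of the book's strength rule, working on plain floats rather than TruthValue objects. The function name and the default c = 0.2 (taken from the value in formulas.py) are illustrative assumptions.

```python
def modus_ponens_strength(s_ab, s_a, c=0.2):
    """sB = sAB * sA + c * (1 - sA), with c standing in for P(B|not A)."""
    return s_ab * s_a + c * (1.0 - s_a)


# The "Anna smokes" case from the top of this issue: sAB = 1, sA = 1.
print(modus_ponens_strength(1.0, 1.0))
```

When sA is 1 the c term vanishes, so the made-up parameter only affects the strength when the antecedent is uncertain or false.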

cosmoharrigan commented 10 years ago

Does the ModusPonensRule work the same way for EvaluationLink (fuzzy) and InheritanceLink (probabilistic)?

@jadeoneill how do you think this should be addressed?

edajade commented 10 years ago

The c factor is made up. Please do not give it an infinite confidence, we do not have an infinite amount of evidence that some random number is correct!!!!

Also count is much more natural in PLN terms than "confidence". confidence is not a probability, it's just nonsense. Count is the number of pieces of evidence used, so it actually makes sense to use it!

Currently ModusPonens is just assuming that the probability = the fuzzy value. I'd rather just use probabilities most of the time as they're more useful

edajade commented 10 years ago

On second thoughts your fix seems OK. makeUpCount just takes the lowest count, so having an infinite certainty for a made up factor is actually ok (and will mean the results have a reasonable count based on the count of the input)
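The point about the lowest count can be sketched as follows. This is a hypothetical stand-in based only on the description above ("just takes the lowest count"); the real makeUpCount may differ in its signature and details.

```python
import math


def make_up_count(counts):
    """Hypothetical stand-in: the output count is the smallest input count."""
    return min(counts)


# Illustrative counts for AB and A; the made-up NotAB term gets infinite count.
count_ab, count_a = 50.0, 20.0
count_not_ab = math.inf

print(make_up_count([count_ab, count_not_ab, count_a]))
```

Because the minimum ignores the infinite entry, the output count is bounded by the real evidence for AB and A rather than by the made-up term.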

cosmoharrigan commented 10 years ago

The c factor is made up.

How can we improve that situation?

linas commented 10 years ago

On 16 March 2014 18:15, jadeoneill notifications@github.com wrote:

Also count is much more natural in PLN terms than "confidence". confidence is not a probability, it's just nonsense. Count is the number of pieces of evidence used, so it actually makes sense to use it!

Hmm. For my stuff, neither count nor confidence make much sense ... The raw count is scale-free, you need to convert it to a (relative) entropy to have it make sense.

--linas

cosmoharrigan commented 10 years ago

Update: this issue is still outstanding.

The modusPonensFormula will produce output with a truth value strength of 0.2 from an antecedent with truth value of 0. This can lead to nonsensical results.

If you try to use NotLink with truth value strength 1 to represent the negation of a predicate, this will still happen, because the NotEliminationRule (https://github.com/opencog/opencog/blob/188b8c856f184383df068d4e9b89087fe503ed9e/opencog/python/pln/rules/boolean_rules.py#L270) will simply convert that representation into a representation without the NotLink that has a truth value strength 0, which will then cause the modusPonensFormula to produce an output with truth value strength 0.2.

Example:

If there were non-friendships defined like this:

friends(Bob, Gary) <0, 1>

along with this statement containing an implication and a biconditional: ∀x,y Friends(x,y) ⟹ (Smokes(x) ⟺ Smokes(y))

then the ModusPonensRule would take as input the predicate friends(Bob, Gary) <0, 1>.

The modusPonensFormula then produced a truth value strength of 0.2 for the consequent from the antecedent truth value strength of 0.

That's because the strength calculation is currently implemented like this:

sB = sAB * sA + 0.2 * (1 - sA)

which, in this case, produced:

sB = 0.4 * 0 + 0.2 * 1 = 0.2

causing a new predicate to be formed indicating that Gary smokes.

How should the modusPonensFormula be fixed?

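modusPonensFormula: https://github.com/opencog/opencog/blob/b4af7b039b20ceeb78cbe8e81fe40370aafdec6f/opencog/python/pln/rules/formulas.py#L90
preciseModusPonensFormula: https://github.com/opencog/opencog/blob/b4af7b039b20ceeb78cbe8e81fe40370aafdec6f/opencog/python/pln/rules/formulas.py#L98
symmetricModusPonensFormula: https://github.com/opencog/opencog/blob/b4af7b039b20ceeb78cbe8e81fe40370aafdec6f/opencog/python/pln/rules/formulas.py#L108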

edajade commented 10 years ago

I built a PreciseModusPonensRule that takes (Implication A B) and (Implication (Not A) B) as inputs, i.e. it actually uses sNotAB instead of a made-up parameter (also called c).
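A strength-only sketch of the precise form, assuming it replaces c with the strength of (Implication (Not A) B). The function name is illustrative; the actual preciseModusPonensFormula takes a list of TruthValue objects and also handles counts, which this sketch omits.

```python
def precise_modus_ponens_strength(s_ab, s_not_ab, s_a):
    """sB = sAB * sA + sNotAB * (1 - sA), using a supplied sNotAB."""
    return s_ab * s_a + s_not_ab * (1.0 - s_a)


# The friends(Bob, Gary) <0, 1> case: sAB = 0.4, sA = 0.
print(precise_modus_ponens_strength(0.4, 0.2, 0.0))  # made-up sNotAB = 0.2
print(precise_modus_ponens_strength(0.4, 0.0, 0.0))  # supplied sNotAB = 0
```

With sA = 0 the result equals sNotAB, so the output is only as good as that second implication.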

cosmoharrigan commented 10 years ago

@jadeoneill but if a user uses modusPonensRule, then modusPonensFormula just creates the NotAB truth value of strength 0.2 and then calls preciseModusPonensFormula and passes that made up truth value strength.

So are you suggesting that a user should use preciseModusPonensRule directly, in which case the user would need to ensure that sNotAB is defined in the atomspace in addition to sAB?
And, if so, what is the use of modusPonensRule?

edajade commented 10 years ago

Or we could make a rule that chooses which formula to apply (i.e. it tries to find the extra input if possible using a custom_compute method)
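One way such a chooser might look, as a strength-only sketch. It does not use the custom_compute hook, whose signature is not shown in this thread; the function name, the optional s_not_ab argument, and the returned flag are all hypothetical.

```python
DEFAULT_C = 0.2  # the made-up default discussed in this issue


def choose_modus_ponens_strength(s_ab, s_a, s_not_ab=None, default_c=DEFAULT_C):
    """Use the precise form when sNotAB is known, else fall back to default_c.

    Returns (sB, used_default) so callers can tell which path was taken.
    """
    used_default = s_not_ab is None
    if used_default:
        s_not_ab = default_c
    s_b = s_ab * s_a + s_not_ab * (1.0 - s_a)
    return s_b, used_default


# No (Implication (Not A) B) available: falls back to the default.
print(choose_modus_ponens_strength(0.4, 0.0))
# The extra input was found: the precise form is used.
print(choose_modus_ponens_strength(0.4, 0.0, s_not_ab=0.0))
```

Returning the flag makes it visible when a conclusion rests on the default rather than on evidence in the atomspace.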

cosmoharrigan commented 10 years ago

But what's the use of the modusPonensRule in its current form, containing the arbitrary number "0.2"?

edajade commented 10 years ago

It lets you do more standard-looking inferences that don't require (Implication (Not A) B). I agree that if you can make up (or datamine) both ImplicationLinks then the precise form is a lot better.

linas commented 10 years ago

As a spectator from the distance....

On 24 March 2014 18:50, Cosmo Harrigan notifications@github.com wrote:

@jadeoneill but if a user uses modusPonensRule, then modusPonensFormula just creates the NotAB truth value of strength 0.2 and then calls preciseModusPonensFormula and passes that made up truth value strength.

Bing!

So, that sounds very reasonable to me. This allows the truth value of notAB to evolve over time, getting stronger or weaker, and 0.2 is just some initial guess... which is no crazier than making it be 0.0 or 0.5 or 1.0 initially. In the other email thread, Ben suggests other ways of initializing this, which now sound plausible as long as they're interpreted as "the truth value of notAB".

So are you suggesting that a user should use preciseModusPonensRule directly, in which case the user would need to ensure that sNotAB is defined in the atomspace in addition to sAB?

... or it's created automatically, if not already present?

linas commented 8 years ago

Closing; I think this concerns the now-obsolete python-PLN codebase.
