opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace
Other
811 stars 225 forks source link

Pass truth values from GroundedPredicateNodes to rule engine from pattern matcher. #1868

Closed noskill closed 5 years ago

noskill commented 5 years ago

We need to use rule engine along with virtual evaluation links. Current pattern matcher implementation does not store truth value returned by grounded predicate. Instead matched evaluation link keeps its default value of stv(1 0). And here matched means having mean tv more than hardcoded threshold of 0.5.

Example:

(use-modules (opencog logger))

(load-from-path "conjunction-rule-base-config.scm")

(Inheritance (stv 0.99 0.99) (Concept "BB1") (Concept "BoundingBox"))
; default truth value 1.0 0.0 give 1.0 0 for any inference 
(Inheritance (stv 0.99 0.99) (Concept "BB2") (Concept "BoundingBox"))

(define Red (Predicate "Red"))

(define (redness box ) 
    (stv 0.55 (random 1.0)) 
)

(define RedBox1
    (EvaluationLink
         (GroundedPredicateNode  "scm: redness")
         (Variable "$X")
    )
)

(define RedBox2
   (EvaluationLink (stv 0.56 0.7)
         Red
         (ConceptNode "BB2")
   )
)

; need to restrict the search by inheritance link
(define AndRedBox1 (AndLink RedBox1 (InheritanceLink (VariableNode "$X") (ConceptNode "BoundingBox")) ) )
; don't need to restrict the search by inheritance link here
; do it anyway
(define AndRedBox2 (AndLink RedBox2 (InheritanceLink (ConceptNode "BB2") (ConceptNode "BoundingBox"))  ))

(define (run-bc and-box) 
    (define result (conj-bc and-box))
    (display "\n")
    (display result)
    (display "truth value for the first element in the set\n")
    (display (cog-tv (car (cog-outgoing-set result))))
    (display "\n")
    result
)

(ure-logger-set-level! "fine")
(cog-logger-set-level! "fine")
(ure-set-num-parameter conjunction-rule-base "URE:maximum-iterations" 20)
(display "query with virtual evaluation links\n")
(run-bc AndRedBox1)
(display "\n\nquery with real evaluation links\n")
(run-bc AndRedBox2)

==================================================================== Current implementation gives (stv 1.000000 0.000000) for all and links for the first conjunction (run-bc AndRedBox1):

query with virtual evaluation links

(SetLink
   (AndLink
      (EvaluationLink
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
   (AndLink
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB1")
         (ConceptNode "BoundingBox")
      )
      (EvaluationLink
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB1")
      )
   )
)
truth value for the first element in the set
(stv 1.000000 0.000000)

query with real evaluation links

(SetLink
   (AndLink (stv 0.56 0.7)
      (EvaluationLink (stv 0.56 0.7)
         (PredicateNode "Red")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
)
truth value for the first element in the set
(stv 0.560000 0.700000)

One possible fix is to store truth value somewhere at the end of DefaultPatternMatchCB::eval_term

https://github.com/opencog/atomspace/blob/e965598be2cc44fdcfd7d583a73989a4d3b647e1/opencog/query/DefaultPatternMatchCB.cc#L679-L683

Then I get the expected result (stv 0.55 0.77770132)

======================================================================

(SetLink
   (AndLink (stv 0.55 0.77770132)
      (EvaluationLink (stv 0.55 0.77770132)
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
   (AndLink (stv 0.55 0.94014286)
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB1")
         (ConceptNode "BoundingBox")
      )
      (EvaluationLink (stv 0.55 0.94014286)
         (GroundedPredicateNode "scm: redness")
         (ConceptNode "BB1")
      )
   )
)
truth value for the first element in the set
(stv 0.550000 0.777701)

query with real evaluation links

(SetLink
   (AndLink (stv 0.56 0.7)
      (EvaluationLink (stv 0.56 0.7)
         (PredicateNode "Red")
         (ConceptNode "BB2")
      )
      (InheritanceLink (stv 0.99 0.99)
         (ConceptNode "BB2")
         (ConceptNode "BoundingBox")
      )
   )
)
truth value for the first element in the set
(stv 0.560000 0.700000)
noskill commented 5 years ago

I would keep current behavior as default and pass threshold and flag to enable storing tv to pattern matcher when necessary. @linas @ngeiswei what do you think?

linas commented 5 years ago

Several short remarks: 1) You don't have to declare-before use. The below accomplishes nothing at all:

(Concept "BB1")
(Concept "BB2")
(Concept "BoundingBox")

2) I don't quite understand what you are trying to do, and I am not convinced that you are using the URE correctly. Normally, one defines a set of rules, and then runs the URE. I don't see any rule-set being declared above. I don't see why you even need chaining.

3) The pattern matcher does not store truth values, because it does not compute truth values, because it does not know how. The formulas used to compute truth values have to be specified. The above does not seem to contain any formulas that show how truth values should be computed.

4) ... which is why your example is confusing. You say that the truth value should be this and such, but how can it even be possible to know what it should be? A formula that computes the correct truth value needs to be provided. Without such a formula, its impossible to know what it should be.

linas commented 5 years ago

FWIW, there is an open task to make sure that the URE can work with formulas for any kind of values, and not just truth values. The formulas are "arbitrary", and user-specified. For example, PLN has a certain set of formulas, but one could also use a different set to implement Bayesian nets, etc.

A different open task would be to have the pattern matcher not use truth values at all, but to instead look at a single-bit true/false boolean flag in the atom. That would fully decouple the mechanics of the pattern matcher from the flow of truth through the network.

noskill commented 5 years ago

@linas Formula for computing the truth value is in examples:

https://github.com/opencog/opencog/tree/master/examples/pln/conjunction/conjunction-rule-base-config.scm https://github.com/opencog/opencog/tree/master/examples/pln/propositional/propositional-rule-base-config.scm

linas commented 5 years ago

There are no formulas whatsoever in the code that you just referenced. Those to files are config files, nothing more.

Here is an example of a formula: lines 16-17 of https://github.com/opencog/opencog/blob/master/opencog/pln/rules/propositional/formulas.scm#L17

Here's more: https://github.com/opencog/opencog/blob/master/opencog/pln/rules/propositional/consequent-disjunction-elimination.scm see lines 75-122

Those formulas are specific to PLN. Other rule collections can and do use different formulas.

Necr0x0Der commented 5 years ago

In this example, PLN is used to compute TV of a compound logical expression. It's absolutely valid usage of PLN/URE. And the problem is - if PredicateNodes are used then the output is exactly what should be expected from PLN, but if equivalent GroundedPredicates are used, then the result is incorrect meaning that the computed TVs are lost.

linas commented 5 years ago

Hi @Necr0x0Der OK, so you are claiming that this is a PLN bug. I am not familiar with PLN, so I don't know. At any rate, PLN is not a part of the atomspace, its not a part of the pattern matcher, its not a part of the URE. So it seems like this is not even the correct github repo to report this bug.

I'm going to close this for now. You have to ask Nil to take a look at it.

The point is that PLN is a layer on top of the URE, and the URE is a layer on top of the pattern matcher. Suggesting some bizarro hack to the pattern matcher to fix a bug in PLN just sounds wrong to me. There are other users of the pattern matcher besides the URE, and there are other users of the URE besides PLN. You'd have to make a much much stronger case that the pattern matcher should work in some new or different kind of way.

Necr0x0Der commented 5 years ago

No, you didn't understand. This is not specifically PLN bug. You could call it an URE bug, because whatever rules you will use in URE, (truth) values calculated by GroundedPredicates will be lost during chaining. However, the problem can be easily fixed if Pattern Matcher will store truth values of GroundedPredicates after evaluating them. URE relies on PM as on interpreter of Atomese, so either PM should be extended, or URE should re-evaluate grounded predicates...

linas commented 5 years ago

Look, can you actually explain this in a way that is understandable? To me, this comes of as a big giant ball of confusion about how the system actually works, and I don't feel I have the time to try to figure out who is confused by what. There is nothing that you said, nor is there anything in the initial bug report, that indicates that there is any bug in the atomspace.

ngeiswei commented 5 years ago

@linas I was writing to explain, but I'm thinking maybe it's better to move to singnet instead, and we'll only ping you if it really ends up concerning the pattern matcher.

linas commented 5 years ago

Look, I just now found out about singnet. Cassio made me an explicit promise that there won't be any development happening on singnet, and, as is clearly visible in github, there seems to be significant development happening there, so WTF. I'm really quite rather very unhappy about that.

It's good that there are more developers joining the project, but that means that it is time to start acting in a professional manner. Yes, the chaos of children's playgrounds can be fun, but it is not an effective way to build good software. This is especially critical at this point: if we get over-run by chaos and crazy ad-hoc patches, you'll just get a pile of shitty code. You cannot build a skyscraper out of random pieces of wood.

Let me be clear: this bug report sounds like some random confusion, with some incoherent proposal to change something for no good reason. That's not engineering, that's hacking. Hacked systems never work, they always fail. Millions of developers have been writing software since before the 1970's, and we know what happens to software when one makes random, incoherent hacks. Doing this in the singnet repo is a high-way to failure.

ngeiswei commented 5 years ago

The issue may look confusing but it's not a reason to dismiss it, if we can't sort it out here, then what? We can use emails but github offers a better platform for that so... I'm all for being super authoritative about merging, but not about discussing, otherwise we'll never make progress... OK, sure having that issue closed doesn't mean it cannot be discussed, maybe we can discuss it here while being closed...

Cassio made me an explicit promise that there won't be any development happening on singnet

Are you sure? Maybe there was a misunderstanding there... But yeah of course the plan is to have singnet repos, wholly or partly being merged back to opencog.

ngeiswei commented 5 years ago

sure having that issue closed doesn't mean it cannot be discussed

though I wonder, can people without admin rights post messages on closed issues?

noskill commented 5 years ago

I can

ngeiswei commented 5 years ago

Oh, yeah, of course, if you can create and close you should be able to post to it (it seems closed PRs however cannot receive further commits, but that's independent of rights I suppose).

linas commented 5 years ago

OK, we can discuss, but if the proposed change is to the pattern matcher (which seems to be what the proposal suggests) then the example code should invoke the pattern matcher only, and compare it to the existing pattern matcher behavior.

Somewhat off-topic, but: it has been nagging me for many many years that the pattern matcher touches truth values. I never really liked that. The thought is now crystallizing that atoms could have a single-bit crisp-logic boolean value, and that this bit could be used for all of the basic pattern matching needs. This way, other formulas, other behaviors could be moved to modules, and it would open the possibility of using other value types, besides truth values, with which to perform formula evaluations.

ngeiswei commented 5 years ago

@noskill, small comment (till I get to understand the whole thing)

  1. you can also use the logger to display debug messages, for instance
    (define (redness box ) 
    (display "redness called with: ")
    (display box)
    (display "\n")
    (stv 0.55 (random 1.0)) 
    )

    can be replaced by

    (define (redness box ) 
    (cog-logger-info "redness called with ~a" box)
    (stv 0.55 (random 1.0)) 
    )

    having previously loaded the logger with

    (use-modules (opencog logger))

That message will by default be printed in a opencog.log file, which is good if you don't want it to clobber the stdout, if you'd rather have it printed on stdout you may add

(cog-logger-set-stdout! #t)

See https://wiki.opencog.org/wikihome/index.php/Scheme#Logger for more info about the logger.

ngeiswei commented 5 years ago

"trace" is not a logger level, see https://wiki.opencog.org/wikihome/index.php/Scheme#Change_the_log_level for valid log levels.

ngeiswei commented 5 years ago

@noskill Please, simplify the issue as much as possible, for instance the display messages are not used at all, I guess they were useful to you for debugging, but they do not appear further in the issue so they are useless to the reader (and we can easily add debugging messages back ourselves if we need to).

It also looks like the conjunction could have 2 instead of 3 arguments.

Don't hesitate to simplify names as well, like replacing "BoundingBox" by "A", or "Herring" and "Red" by "P" and "Q", I mean, I don't know, do whatever you think is gonna simplify the issue, make as flowing as possible to read and understand.

ngeiswei commented 5 years ago

Don't hesitate to use markdown goodies, like specifying that the code is in scheme (as I've just done), etc.

noskill commented 5 years ago

Thanks Nil! I simplified the example a bit more, leaving only one predicate - redness. I doesn't like one-letter names though..

pennachin commented 5 years ago

@linas this is not true:

Look, I just now found out about singnet. Cassio made me an explicit promise that there won't be any development happening on singnet, and, as is clearly visible in github, there seems to be significant development happening there, so WTF. I'm really quite rather very unhappy about that.

You saw the plans for the fork, commented on them, and agreed with them in a call we had a couple of months ago.

noskill commented 5 years ago

@ngeiswei I made a unit test for this issue https://github.com/noskill/atomspace/blob/virt-eval-test/tests/query/CogBindTVUTest.cxxtest

ngeiswei commented 5 years ago

I'm still exploring but it seems the problem could be that executing an evaluation link doesn't return the evaluation instance with the TV on it although evaluating that same link does return the correct TV.

So for instance, given

(define (sp x) (stv 0.55 0.55))
(define GP (GroundedPredicate "scm:sp"))
(define A (Concept "A"))
(define E (Evaluation GP A))

The following returns the correct TV

(cog-evaluate! E)
$3 = (stv 0.550000 0.550000)

But the following

(cog-execute! E)

returns E unchanged.

Do we want (cog-execute! E) to modify its TV? My answer: I don't know.

ngeiswei commented 5 years ago

Although, it's true that if one modifies the TV of E at some point (like in DefaultPatternMatchCB::eval_term as suggested by @noskill ) then we don't need to worry about the effect of executing E.

ngeiswei commented 5 years ago

Or do we want (cog-execute! E) to output a TV too? In that case it would mean that the following

(cog-execute! (OutputExecution (GroundedSchema "scm:f") E))

would have f being called with the TV of E, not E itself (let alone E with its correct TV).

ngeiswei commented 5 years ago

@linas you did mention that you wanted to get rid of cog-evaluate! and only use cog-execute!, did you have a specific plan in mind regarding that sort of cases?

linas commented 5 years ago

Due to historical reasons, cog-execute returns an atom, while cog-evaluate returns a truth value.

Since both atoms and truth values are now a special-case of Value, perhaps there is some way to unify them. Exactly how is a rather complicated question.

Regarding the modification of truth values: if PLN wants to modify a truth value, it should do so. I (currently) see no reason to make this the default behavior of any layer of lower-level code.

At any rate, truth values aren't the only values floating around in the system. I want something that is more general, more flexible. Which is why I don't want special-case treatment of truth values in low-level code.

Regarding the modification of truth values in general: the agi-bio guys have a very different problem, which suggests that modifying truth values is ... problematic. Here's the problem: they want to load large datasets of genomic/proteomic data into the atomspace, mark those datasets as read-only, and then run PLN or MOSES or whatever on top of those datasets, changing truth values every which way, and then save the delta-changes.

The datasets need to be read-only, because they are gigabytes in size, and are shared by hundreds of researchers. Each researcher wants to tweak each dataset several different ways. We don't want to make thousands of copies of a 50-million atom dataset, all of which are almost exactly identical, except for the truth value on several 100K of the atoms. Ideally, we want to work with and save only deltas. Exactly how to accomplish this is ... unclear. I have some ideas.

Issue #1855 describes the agi-bio problem.

ngeiswei commented 5 years ago

Interesting, if you replace sp by

(define (sp x)
  (let* ((tv (stv 0.55 0.55))
         (gp (GroundedPredicate "scm:sp"))
         (ev (Evaluation gp x)))
  (cog-set-tv! ev tv)
  tv))

and run

(cog-evaluate! E)

Then it does set the TV of E to the desired value.

If the pattern matcher can properly maintain that TV then I think it would be the way to go. It would also allow to have complete control over what predicate instances get stored, as opposed to if the storage takes place inside DefaultPatternMatchCB::eval_term or such.

If the pattern matcher cannot properly maintain that TV then I think we have a bug to address.

linas commented 5 years ago

Re: this question:

@linas you did mention that you wanted to get rid of cog-evaluate! and only use cog-execute!, did you have a specific plan in mind regarding that sort of cases?

ngeiswei commented 5 years ago

Agreed, @linas, I think it should be up to the grounded predicates to store their TVs if they want to. However it seems the pattern matcher fails to maintain these TVs, and I think this is a bug.

I'm gonna create an issue that demonstrates that...

ngeiswei commented 5 years ago

@linas, when you say "the atomspace is NOT lambda calculus", do you have in mind relational programming languages like miniKanren?

ngeiswei commented 5 years ago

Weird, the unit test I thought would highlight a bug in the pattern matcher passes...

@noskill @Necr0x0Der, it might indicate that this way might work for you. If it doesn't then maybe the unit test is too minimal to capture your problem.

Necr0x0Der commented 5 years ago

Thanks a lot, @ngeiswei ! We will check this carefully. Although I believe that different results coming from the use of GroundedPredicates and Predicates isn't a good design, it would be enough for us if your hack works, and it would be up to you and @linas if to consider the different behavior of two types of predicates during chaining as a bug or a feature.

ngeiswei commented 5 years ago

Yeah, I do feel we probably want a generic mechanism as well, but I guess we'll see, if that need reoccurs sufficiently often we'll have to bite the bullet at once.

Necr0x0Der commented 5 years ago

@ngeiswei , it seems that your hack works although one should be careful. For example, if the predicate is called somewhere with ListLink:

    (EvaluationLink
         (GroundedPredicateNode  "scm: sp")
         (ListLink (Variable "$X")))

then one need to set TV also with 'ListLink: (ev (Evaluation gp (ListLink x))) in your code. Otherwise it will not work, and visa verse. With more careful inspection with @noskill , we found the place (in EvaluationLink.cc), where one can add setTruthValue to make this wrapping unnecessary. Namely, in EvaluationLink::do_eval_scratch we need to replace return do_evaluate(scratch, sna.at(0), args, silent); (line 303) with

        TruthValuePtr tv(do_evaluate(scratch, sna.at(0), args, silent));
        evelnk->setTruthValue(tv);
        return tv;

This will set the latest evaluated truth value for EvaluationLink. This is the same way as done for TRUTH_VALUE_OF_LINK (line 518, although with 'FIXME' comment). I'm not absolutely sure that setTruthValue in do_eval_scratch is better than your workaround. In general, it seems better - less confusing and more convenient, but it prevents your hack from being usable. I mean if you write something like

(define (sp x)
  (let* ((tv (stv 0.55 0.55))
         (gp (GroundedPredicate "scm:sp"))
         (ev (Evaluation gp x)))
  (cog-set-tv! ev (stv 0.75 0.75))
  tv))

then the resulting TV set on EvaluationLink will be 0.55 - not 0.75. I don't know if someone would even need such strange code, but who knows... So, what do you think, should we modify EvaluationLink::do_eval_scratch or use your hack to explicitly set TV for EvaluationLink?

ngeiswei commented 5 years ago

The corner case of setting the evaluation link with a different value than what the grounded predicate outputs seems like it would almost never be useful, in fact it's perhaps a downside of my suggestion, not sure.

The downside of your suggestion is that it affects all evaluations alike, while I suppose in practice one may want to be selective about what to record and what not to record. But maybe it's not a problem in practice. Maybe you want to have a flag, or maybe systematic TV caching is OK.

Hmm, I kinda wonder what happens to temporary evaluations, ones that get evaluated during the course of searching but that never leads to calling the bindlink rewrite term? I suppose these just never get copied over the user atomspace, that irrespectively of whether their TVs have been cached or not.

Anyway, if we are to go that route I think indeed the place you suggest (line 303 of EvaluationLink.cc) is the right one. However I think it's worth thinking a bit more about the implications of this.

@linas, what do you think? If you're not sure what we're talking about, take a look at https://github.com/opencog/atomspace/blob/master/tests/query/BindTVUTest.cxxtest#L79 it should become pretty clear what they're trying to achieve.

Necr0x0Der commented 5 years ago

@ngeiswei , I have had similar thoughts. However, 1) Setting the latest TV is better than setting nothing at all. 2) Links for different groundings should cache different values, e.g. (Evaluation gp (Variable "$X")) can become (Evaluation gp (Concept "A")) or (Evaluation gp (Concept "B")) and their associated cached TVs should be different. We can try to test this. 3) If the grounded predicate is stochastic, then its TVs will change every time it is evaluated. It can be considered as an intended behavior, and the question is not whether to the last cache TV or not, but is when to evaluate the predicate. 4) If the grounded predicate calculates TV based on values associated with the same Atom, then different TVs will not be cached. In this case, either the predicate itself or the inference engine (URE/specific rules?) should implement the caching mechanism. The predicate will simply return the relevant cached value while being executed (so, yeah, grounded predicates will most likely be responsible for this sort of caching unless we will see the common use of some general mechanism). Again, the question is not whether the returned value is set as a TV for EvaluationLink, but when this link is evaluated. So, I guess it is safe and consistent to assign the latest evaluated TV to the EvaluationLink. And when this link is being evaluated during the inference process is another question.

ngeiswei commented 5 years ago

@Necr0x0Der If I understand correctly, what you're saying is

  1. The TV can be safely systematically cached
  2. Whether or not that cached TV should re-used is up to the grounded predicate

I think I agree with that.

ngeiswei commented 5 years ago

According to my runs such change doesn't break unit tests and doesn't introduce slow down. I've nothing against it.

I just think it should by implemented well, that is it should depend on a flag on EvaluationLink, set to true by default, so we can easily disable it if we want to. Should be commented in the code and the header comment of do_eval_scratch, and on the wiki https://wiki.opencog.org/w/EvaluationLink#Grounded_and_Defined_predicates, and, if you care about it ;-), a unit test should be added.

Let's wait for @linas feedback...

ngeiswei commented 5 years ago

The flag could be a macro, but otherwise, frankly, the RAM and CPU overhead should be abysmal anyway.

ngeiswei commented 5 years ago

Yeah, better be an actually class member that at leaves some run-time control over it, but that's a detail... Let's see what @linas has to say first.

Necr0x0Der commented 5 years ago

The flag would be ok. Yes, the unit test (checking different values for different input atoms) is necessary.

linas commented 5 years ago

On Wed, Sep 26, 2018 at 6:21 AM Nil Geisweiller notifications@github.com wrote:

@linas https://github.com/linas, when you say "the atomspace is NOT lambda calculus", do you have in mind relational programming languages like miniKanren?

No. The atomspace is first and foremost a knowledge representation (KR) system, which means that it implements the axioms of "relational algebra". Another very famous system that implements the relational algebra is SQL. Any KR system is going to implement the relational algebra. For example, prolog. The KR-subsystem of prolog is called "datalog", it is what is left over when you rip out inferencing, chaining, logic out from prolog.

The miniKanren project is something else; they are trying to create a pure-scheme implementation of something that looks like prolog. As it clearly shows, implementing logic and inferencing and KR on top of scheme is awkward and not scalable and provides no marvelous revelations, other than that it's an uneasy partnership.

Roughly speaking, its because one is trying to marry together a "no side-effects" style of thinking, with a "nothing except side-effects" style of thinking. More generally, one should step back and ask "why would one ever want to marry KR and lambda calc?" The obvious answer is "cause it's geeky fun". The next question is: "can it be fast, flexible, efficient, scalable, easy-to-use?" and the answer is, so far, seems to be "no", from what I can tell. But maybe I'm wrong. The last question is "what can the atomspace learn from it?" and right now, the answer seems to be "nothing, so far". I haven't seen anything appealing, but maybe I missed something.

The axioms of relational algebra differ from the axioms or lambda calculus, and I want to avoid adding the axioms of lambda calculus to the atomspace, despite the fact that we've already taken most of the steps to get there. For example, there's a problem with quotation -- you wrote most of the code to do quotation, and surely you understand just how ugly and incompatible it is, when used in the context of knowledge representation? There has to be a better way.

linas commented 5 years ago

On Thu, Sep 27, 2018 at 8:30 AM Necr0x0Der notifications@github.com wrote:

So, what do you think, should we modify EvaluationLink::do_eval_scratch or use your hack to explicitly set TV for EvaluationLink?

Guys,

Look, it would be nice if you did not ignore everything that I write, and continue pressing that an ugly hack be implemented. The goal of good design is to have a consistent system that is easy to understand and behaves as expected. Adding ugly hacks to work around bugs located somewhere else is not good design.

linas commented 5 years ago

On Fri, Sep 28, 2018 at 2:20 AM Nil Geisweiller notifications@github.com wrote:

systematic TV caching is OK.

Systematic TV caching is bad for the following reasons:

1) its slow. It forces everyone to pay the price for this, even if they do not need it.

2) it bloats the atoms and the atomspace. The default TV takes zero bytes to store. Any TV that is not the default TV is going to use up space.

3) It bloats disk-usage. If you save the dataset to disk, all of those TV's will get saved, slowing down atom store performance, atom fetch performance.

4) What's the point of caching, if you do not also provide a way of verifying that the cached value is not stale? How do you know that it is still accurate? What happens if that same evaluation link is used in two different queries, running in two different threads? Which query "wins"? Which of the two values is "correct"? How could you ever know?

There are few, very nearly zero users who will even be aware that there is a cached value being stored. Why should everyone pay a penalty price, just because some obscure user might be doing something ill-defined and ambiguous with some computed value?

@linas https://github.com/linas, what do you think?

I want to remove truth values from the pattern matcher. More and more, it is starting to feel like a design mistake.

-- cassette tapes - analog TV - film cameras - you

noskill commented 5 years ago

I want to remove truth values from the pattern matcher. More and more, it is starting to feel like a design mistake.

@linas What then would be difference between atomese and prolog's family of languages? I somehow got into thinking that atomese is an extension of logic programming with probabilistic reasoning..

noskill commented 5 years ago

3) It bloats disk-usage. If you save the dataset to disk, all of those TV's will get saved, slowing down atom store performance, atom fetch performance.

Can't we just use truth value during computation in rule engine? That's our use case. Currently we create child atomspace before request to rule engine, and destroy it afterwards.. There should be easier way to do some requests without bloating of atomspace.

linas commented 5 years ago

What then would be difference between atomese and prolog's family of languages? I somehow got into thinking that atomese is an extension of logic programming with probabilistic reasoning..

Atomese has zero logic programming in it, and zero probabilistic reasoning. Atomese is a knowledge-representation system, plus a collection of tools and layers that allow you to implement logic programming, if you wish. Currently, we do not implement any logic programming system. (That might change someday) We do implement a probabilistic reasoning system, called PLN, but that lives in a different github repo. There's all sorts of neat stuff we should get around to implementing, but it takes .. time and effort.

linas commented 5 years ago

Can't we just use truth value during computation in rule engine? That's our use case. Currently we create child atomspace before request to rule engine, and destroy it afterwards.. There should be easier way to do some requests without bloating of atomspace.

I don't understand what this is saying, so I can't answer it.

I am trying to replace truth values by just-plain values, which are more general and more flexible. However, since the atomspace is legacy code, there's still a lot of truth values everywhere. Removing all of them will take time.

The rule engine ... well ideally, it would accept, as input, a set of rules, a set of formulas, an atomspace containing the rules and formulas, and maybe a different atomspace containing the data. When a rule fires, it should cause a formula to run. The inputs to a formula should be generic values, not truth values, and the outputs should be generic values as well.

PLN is a specific set of rules and formulas, described in the PLN book. Other rules and formulas are possible.

One of the design goals of the atomspace is to allow simple, easy representation of "generic knowledge", and then allow many different kinds of AI algorithms to manipulate that knowledge. Examples of AI algorithms include: rule-engines, parsers, constraint-solvers, logic solvers, search algos, data-mining algos, neural nets, etc.

So far, only a few of these have been implemented. So far, its been kind-of awkward and kind-of slow. Just about any special-case system, e.g. from apache foundation, is going to outperform the atomspace+layers on top of it. The atomspace is trying to be a generic, neutral platform, for any AI algo, which is something no one else attempts to do. Whether th atomspace is successful in doing this ... is open to debate.

I'm pushing back on this bug report because the proposed solution is not generic. For example, people who want to run neural nets on top of the atomspace really wouldn't need or want the solution you propose.