opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace

Replace AtomSpace with AtomSpaceLink #1967

Closed linas closed 1 year ago

linas commented 5 years ago

This is an old idea, but without its own issue. Time to describe it more carefully. This would resolve issues #1855 and #1921. It depends on #1502 being implemented first.

Proposal: Implement a new kind of very special link, called an AtomSpaceLink, that behaves a lot like a MemberLink. Move Values from Atoms to AtomSpaceLinks. Design pros and cons follow.
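To make the proposal concrete, here is a minimal Python sketch of the idea, not the AtomSpace C++ design. It assumes an atom is just a (type, name) pair, that atomspaces are identified by plain strings, and that an AtomSpaceLink is a membership record that also carries the Values. The class and method names (`Membership`, `set_value`, `get_value`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Atom:
    """A bare atom: a type and a name, with no Values attached."""
    type: str
    name: str


@dataclass
class AtomSpaceLink:
    """Hypothetical membership record: one Atom in one AtomSpace, plus its Values."""
    space: str
    atom: Atom
    values: dict = field(default_factory=dict)


class Membership:
    """Toy registry of AtomSpaceLinks, keyed by (space, atom)."""

    def __init__(self):
        self._links = {}

    def add(self, space, atom):
        # Adding an atom to a space means creating its AtomSpaceLink.
        key = (space, atom)
        if key not in self._links:
            self._links[key] = AtomSpaceLink(space, atom)
        return self._links[key]

    def is_member(self, space, atom):
        return (space, atom) in self._links

    def set_value(self, space, atom, key, value):
        self.add(space, atom).values[key] = value

    def get_value(self, space, atom, key, default=None):
        link = self._links.get((space, atom))
        return default if link is None else link.values.get(key, default)


# The same atom holds a different truth value in each atomspace.
cat = Atom("ConceptNode", "cat")
reg = Membership()
reg.set_value("space-1", cat, "tv", (0.9, 0.8))
reg.set_value("space-2", cat, "tv", (0.1, 0.5))
```

In this toy model every value access costs a lookup keyed by both the atomspace and the atom, and every membership costs one extra record, which is the shape of the RAM and access-time concerns raised in this thread.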

Why?

Pros:

Cons:

linas commented 5 years ago

Here's a sketch of one possible implementation:

This seems like the simplest, most straightforward implementation, and should not be hard. It seems like it will definitely use more RAM than the current design. It will slow down the default TV access by a "little bit" (how much?).

How, exactly, does pattern matching work? It would be a real performance bummer if atomspace membership had to be checked for every single atom during a pattern search. Maybe there could be a way of doing a high-speed "get incoming-set-by-atomspace"? That would solve the problem!?
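One way to picture a "get incoming-set-by-atomspace" lookup is an incoming index bucketed by atomspace, so that a search never has to filter links by membership one at a time. The following is only a sketch under assumptions: links are modeled as Python tuples of the form (type, target, target, ...), atomspaces are string ids, and `IncomingIndex` is a hypothetical name, not an existing AtomSpace class.

```python
from collections import defaultdict


class IncomingIndex:
    """Toy incoming-set index: atom -> atomspace id -> set of links."""

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(set))

    def add_link(self, space, link):
        # link is a tuple: (link_type, target_1, target_2, ...)
        for target in link[1:]:
            self._index[target][space].add(link)

    def incoming(self, atom, space):
        # One lookup per level; no per-link membership test is needed.
        by_space = self._index.get(atom)
        if by_space is None:
            return set()
        return set(by_space.get(space, ()))


idx = IncomingIndex()
idx.add_link("space-1", ("InheritanceLink", "cat", "animal"))
idx.add_link("space-2", ("InheritanceLink", "cat", "pet"))
```

The trade-off is the usual one for an index: lookups avoid a scan, but every atom pays for one extra level of bucketing in memory.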

ngeiswei commented 5 years ago

Remove Values from Atoms. Place Values on AtomSpaceLinks.

@linas, could you expand a bit? I understand you'd want the same atom to hold a different value per atomspace, but I don't see how values would be accessed, and at what cost.

Wouldn't it be better to have values remain on atoms but wrapped in a map AtomSpace -> value ?
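A minimal sketch of this alternative, assuming atomspaces are identified by string ids and values by string keys. The class name `AtomWithValueMap` and its methods are hypothetical and only illustrate the shape of the data.

```python
class AtomWithValueMap:
    """Toy atom whose values are kept in a map: atomspace id -> {key: value}."""

    def __init__(self, type_, name):
        self.type = type_
        self.name = name
        self._values = {}

    def set_value(self, space, key, value):
        self._values.setdefault(space, {})[key] = value

    def get_value(self, space, key, default=None):
        # Two dictionary lookups: first the atomspace, then the key.
        return self._values.get(space, {}).get(key, default)

    def spaces_with_values(self):
        return set(self._values)


dog = AtomWithValueMap("ConceptNode", "dog")
dog.set_value("space-1", "tv", (0.7, 0.9))
dog.set_value("space-2", "tv", (0.2, 0.4))
```

Here the atom stays the single point of access, and the cost of a value lookup is one extra map level compared to a design with a single atomspace per atom.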

BTW, this is reminiscent of the old ContextLink; maybe we want to resurrect/improve ContextLink instead of introducing AtomSpaceLink... Just an idea...

linas commented 5 years ago

could you expand a bit?

Is there a specific bullet-point that you'd like more info on? Up top is a list for why this seems like a good idea -- it seems to solve a number of different design issues, all with just "one weird trick". I can explain the issues in greater detail...

better to have values remain on atoms but wrapped in a map AtomSpace -> value

Yes, maybe. I'm adding that to the sketch above. I haven't tried to implement this, because of various uncertainties like this.

ContextLink ... resurrect

We can call the new thing "ContextLink"; I was definitely thinking of that while writing this proposal. As to resurrecting ancient code ... noooo! It was horrible, terrible code, and besides, everything has been redesigned maybe three or four times since that code was removed...everything is now completely different. (Actually, it was called ContextualTruthValue, not ContextLink.)

ngeiswei commented 5 years ago

We can call the new thing "ContextLink"

Actually, we probably shouldn't call it ContextLink because it already has a defined PLN semantics (it is almost like an ImplicationLink or InheritanceLink but different). I think the term AtomSpaceLink is right. Then if it turns out ContextLink and AtomSpaceLink can be unified into something more elegant we can do it later.

linas commented 5 years ago

it turns out ContextLink and AtomSpaceLink can be unified into something more elegant we can do it later.

I am not planning on implementing this any time soon; I have a backlog of unfinished work. I've been thinking about this for maybe a year now, but for some reason, there was no distinct github issue for this. This is part of the eliminate-SetLink and the distributed-atomspace on-top-of some-other graph-DB discussions; they're all interconnected.

linas commented 5 years ago

Edit: so if you want to take some time to explain clearly what ContextLink is, what it should do, this is not a bad time.

ngeiswei commented 5 years ago

The PLN book definition of ContextLink is

ContextLink <TV>
   C
   R A B

is equivalent to

R <TV>
   (A AND C)
   (B AND C)

However it's not true for all R. It works if R is an Implication or Inheritance or such, but doesn't necessarily work if R is say an Evaluation. I suspect one needs to assume that the predicate is https://wiki.opencog.org/w/Soggy_Predicates or something like that.
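The equivalence above can be written as a small rewrite rule. This is a sketch under assumptions: links are modeled as nested Python tuples, the TV is passed along unchanged, and the rewrite is restricted to a whitelist of relation types that stands in for "Implication or Inheritance or such". The name `expand_context` and the whitelist contents are illustrative, not part of PLN.

```python
# Relation types for which the rewrite is assumed to be valid (illustrative).
SAFE_RELATIONS = {"ImplicationLink", "InheritanceLink"}


def expand_context(link, tv):
    """Rewrite ("ContextLink", C, (R, A, B)) with truth value tv
    into (R, (A AND C), (B AND C)) with the same tv."""
    tag, context, relation = link
    if tag != "ContextLink":
        raise ValueError("expected a ContextLink")
    rel_type, a, b = relation
    if rel_type not in SAFE_RELATIONS:
        # e.g. an EvaluationLink: the equivalence does not necessarily hold.
        raise ValueError(f"rewrite not known to be valid for {rel_type}")
    rewritten = (rel_type, ("AndLink", a, context), ("AndLink", b, context))
    return rewritten, tv


ctx = ("ContextLink", "C", ("InheritanceLink", "A", "B"))
expanded, expanded_tv = expand_context(ctx, (0.8, 0.9))
```

Refusing unknown relation types, rather than rewriting them blindly, mirrors the caveat that the equivalence is not true for all R.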

Anyway, since ContextLink is not currently used I haven't felt the need to get to the root of it yet.

linas commented 5 years ago

Hmm. That is over-specific, over-constrained. I'm contemplating:

ContextLink <any-value>
    C
    A

is equivalent to  .... umm, yeah, equivalent to, how shall I explain it:

    (A AND C) <any-value>

that is, 

AndLink <any-value>
    C
    A

except more like

MemberLink <any-value>
    C
    A

Hopefully obvious here is that the concepts of "indicator function", "set membership", "set intersection" and "logical-AND" are all kind-of different notations for saying the same thing. So, in a way, I'm trying to demote the atomspace into being "just another set", so that AtomSpaceLink is a lot like a MemberLink.
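A tiny illustration of that claim, using plain Python sets rather than Atomese. The element names and the two sets are made up for the example.

```python
universe = {"cat", "dog", "rock", "tree"}
A = {"cat", "dog", "tree"}   # some set A
C = {"cat", "dog", "rock"}   # the context C, treated as "just another set"


def indicator(s):
    """Return the indicator function of the set s."""
    return lambda x: x in s


in_a, in_c = indicator(A), indicator(C)

# Three notations for the same thing:
by_intersection = A & C
by_logical_and = {x for x in universe if in_a(x) and in_c(x)}
by_indicator_product = {x for x in universe if int(in_a(x)) * int(in_c(x)) == 1}
```

All three constructions pick out exactly the elements that are members of both A and C.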

But the atomspace is not really "just another set"; it is a universe of all things, but only one universe out of possibly many. That means that it is a universe in a Kripke frame. See https://en.wikipedia.org/wiki/General_frame -- so using the notation of that article, GF=<F,R,V> is a general frame, F is the set of all atomspaces (the set of all contexts), R is the set of rules in the rule engine, and V is the set of values. (Yes, I am violating the strict definition given in that WP article; I'm trying to go for the intended meaning). (Part of the intended meaning is that a "context" is just the local universe in which things currently hold, so that roughly, a context and a local atomspace are the same thing).

I'm less worried about the mathematical preciseness of this, or the need to stick to some specific proof theory; rather I'm just looking for a good, efficient, direct API to multiple atomspaces that resolves the various technical issues we've had, while also providing a good setting for proof theory in general. So I'm willing to say that an AtomSpaceLink is kind-of-like a MemberLink, is kind-of-like a ContextLink, is kind-of-like a frame, etc. as long as the final result eventually morphs into a good, efficient, direct API to multiple atomspaces that resolves the various technical issues we've had.

linas commented 5 years ago

Here's maybe another way I'm trying to think of this. In some variants of the backward chainer, you had these BIT things (backward inference trees? some kind of inference trace?) and, according to my understanding of proof theory, each node in that tree corresponds to a, umm, "judgement" in natural deduction https://en.wikipedia.org/wiki/Natural_deduction or, rather, that subset of the atomspace that is relevant at that particular point in time, for that context of things that have been introduced. And I'm calling the set of those things a "Kripke frame" because they are the set of things that one could possibly infer, given that one has only taken N steps of inference so far. Yes, I am horribly mangling the terminology. And I'm intentionally confusing natural deduction with the sequent calculus (https://en.wikipedia.org/wiki/Sequent_calculus). The reason for the intentional mashup is to make something generic enough to allow all these variants at once (e.g. to allow BIT trees to be efficiently stored, without needing a new C++ structure for them) but mainly to resolve issues #1855 and #1921 with #1502 as a pre-req.

So that https://wiki.opencog.org/w/FilterLink and GetLink become "the same thing". Or perhaps, more accurately, https://wiki.opencog.org/w/MapLink and GetLink become the "same thing": looking at the MapLink wiki page, if one replaces the SetLink by the set of all things in an atomspace, the MapLink just becomes the same thing as a GetLink.
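A toy model of that observation, written in Python rather than Atomese. It assumes atoms are tuples, a pattern is a tuple in which strings starting with "$" are wildcards, and both operations simply return the matching atoms. This simplifies what the real MapLink and GetLink return, and the function names are hypothetical.

```python
def matches(pattern, atom):
    """True if atom has the pattern's shape; "$..." entries match anything."""
    if len(pattern) != len(atom):
        return False
    return all(
        (isinstance(p, str) and p.startswith("$")) or p == a
        for p, a in zip(pattern, atom)
    )


def map_over_set(pattern, atoms):
    """MapLink-like: filter an explicit set of atoms by a pattern."""
    return {a for a in atoms if matches(pattern, a)}


def get_from_atomspace(pattern, atomspace):
    """GetLink-like: the same filter, with the whole atomspace as the set."""
    return map_over_set(pattern, atomspace)


atomspace = {
    ("InheritanceLink", "cat", "animal"),
    ("InheritanceLink", "dog", "animal"),
    ("MemberLink", "cat", "pets"),
}
pattern = ("InheritanceLink", "$X", "animal")
```

Once the atomspace is treated as a set, the only difference left between the two operations is which set they are handed.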

Put a different way: it's somehow clear that we've done the concept of "set" wrong, and this is an attempt to fix this, to turn atomspaces into sets or into contexts.

ngeiswei commented 5 years ago

BTW, most of the complexity of the BIT code goes into sticking rules to inference trees. The Back Inference Tree itself is merely a population of inference trees, i.e. BindLinks. It has some caches as well to avoid reapplying rules etc, which could probably be replaced by Value. So overall I think it shouldn't be hard to have most or all of the data structure in an AtomSpace.

linas commented 2 years ago

FYI, Some of what is suggested here has been implemented:

The above bullets are distinct from what was proposed in this issue. What was proposed was that there would be another class, very distinct from the current AtomSpace in design, that would act as a "wrapper" around an Atom. Such a wrapper does seem to solve some design problems.... the biggest problem is that such a wrapper would seem to chew up more RAM than the current design.

linas commented 1 year ago

Closing. Frames have been implemented. There does not (at this time) seem to be any performance advantage from detaching Values from Atoms; there are several ways in which this seems to make things slower. I've been pondering this idea for years, and it is just not working out.

BTW, as to BIT, I assume that frames can handle 80% of what BIT does. Note also there is now a UnifierLink that wraps the unifier, and also a RuleLink that is "just like BindLink but without the setup/static-analysis overhead" and so I think that some large fraction of what URE does, both forward and backward chaining, can now be done in "pure Atomese", using the UnifyLink with the RuleLink (see one of the demos), and then using the Frames to store intermediate results. Can even mix forward and backward chaining. What is missing would be any kind of weighting to halt exploration of branches that are too low in importance.

I am contemplating creating a CacheProxyNode that would be a modernization of the idea of ECAN. The ProxyNode infrastructure makes this "real easy to do". Also, it's now "trivial" to use values coming from any FloatValue, instead of having to use AttentionValues. Any formula can be attached to this (so not plain ECAN, but anything you can write with PlusLink, etc.)
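As a rough illustration of the kind of policy this describes, here is a toy cache that evicts by an arbitrary formula over per-atom float vectors. It is not the ProxyNode API: the class `FormulaCache`, its methods, and the choice of a sum as the scoring formula (a stand-in for something built from PlusLink) are all assumptions made for the example.

```python
class FormulaCache:
    """Toy bounded cache: keeps the atoms whose float vectors score highest."""

    def __init__(self, capacity, score):
        self.capacity = capacity
        self.score = score      # any function from a tuple of floats to a number
        self._floats = {}       # atom -> tuple of floats (a stand-in for a FloatValue)

    def put(self, atom, floats):
        self._floats[atom] = tuple(floats)
        # Evict the lowest-scoring atoms until the cache fits its capacity.
        while len(self._floats) > self.capacity:
            victim = min(self._floats, key=lambda a: self.score(self._floats[a]))
            del self._floats[victim]

    def __contains__(self, atom):
        return atom in self._floats


# The formula here is a plain sum of the two floats.
cache = FormulaCache(capacity=2, score=lambda v: v[0] + v[1])
cache.put("a", (0.9, 0.1))
cache.put("b", (0.2, 0.1))
cache.put("c", (0.5, 0.5))   # exceeds capacity, so the lowest scorer ("b") is dropped
```

Because the score is just a function argument, swapping in a different formula does not require touching the eviction logic.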

I think I need the CacheProxyNode to manage memory during learning. It has not yet become urgent, but it might get done in the next 6-12 months, maybe. Like I say, it is "easy" because all of the rest of the infrastructure for ProxyNodes is now in place.