Finish implementing ProtoAtom

linas commented 8 years ago

The current ProtoAtom code is experimental and can be found in the ProtoAtom branch. It "works". However, it's unfinished: some serious design issues remain, and some important decisions with large repercussions have to be made. These are difficult decisions, which is why they have not been made.

A ProtoAtom is like an ordinary Atom, but without a TV, AV or UUID. It is meant to store mutable floating-point data (and associate it with an atom). ProtoAtoms cannot be searched for in the AtomSpace indexes, although the atoms and keys that refer to them can be.

Difficult decisions include:

Should Handles point at protoatoms?
How should casting between AtomPtr and ProtoAtomPtr be handled?
How should ProtoAtoms work with Links? I.e. they belong in the outgoing set, right? If so, then various parts of the code have to check for protoatoms, and cast or not cast them to atoms. Should the outgoing set be a vector of protoatom? How would casting work?
Idea: maybe we should also have a special ProtoLink that stores a vector of protoatoms. That way, the usual Links are undisturbed, and the ProtoLink handles the required atom/key linkages to the value. The ProtoLink would need to be stored in an atomtable index for fast lookup.

I like the last idea. However, based on experience, liking an idea, and liking its actual implementation are two very different things.
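One possible reading of the ProtoLink idea is sketched below. Everything here is hypothetical: the class names, the `[atom, key, value]` layout of the outgoing vector, and the `ProtoLinkIndex` standing in for an atomtable index are assumptions made for illustration, not code from the protoatom branch.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified types; not the real AtomSpace classes.
struct ProtoAtom { virtual ~ProtoAtom() = default; };
using ProtoAtomPtr = std::shared_ptr<ProtoAtom>;

// A value: mutable floating-point data, with no TV, AV or UUID.
struct FloatData : ProtoAtom {
    std::vector<double> values;
    explicit FloatData(std::vector<double> v) : values(std::move(v)) {}
};

// Stand-in for an ordinary atom (used here for both atoms and keys).
struct Atom : ProtoAtom {
    std::string name;
    explicit Atom(std::string n) : name(std::move(n)) {}
};

// Assumed layout: outgoing = [atom, key, value].
struct ProtoLink : ProtoAtom {
    std::vector<ProtoAtomPtr> outgoing;
    explicit ProtoLink(std::vector<ProtoAtomPtr> out) : outgoing(std::move(out)) {}
};

// Stand-in for an atomtable index: finds the ProtoLink for an (atom, key) pair.
class ProtoLinkIndex {
    using Inner = std::map<const ProtoAtom*, std::shared_ptr<ProtoLink>>;
    std::map<const ProtoAtom*, Inner> _index;
public:
    void insert(const std::shared_ptr<Atom>& atom,
                const std::shared_ptr<Atom>& key,
                const ProtoAtomPtr& value) {
        _index[atom.get()][key.get()] =
            std::make_shared<ProtoLink>(std::vector<ProtoAtomPtr>{atom, key, value});
    }

    // Returns the stored value, or a null pointer if there is none.
    ProtoAtomPtr lookup(const std::shared_ptr<Atom>& atom,
                        const std::shared_ptr<Atom>& key) const {
        auto outer = _index.find(atom.get());
        if (outer == _index.end()) return nullptr;
        auto inner = outer->second.find(key.get());
        if (inner == outer->second.end()) return nullptr;
        return inner->second->outgoing[2];
    }
};
```

In this sketch ordinary Links never see a protoatom in their outgoing set; only the ProtoLink and its index do.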

linas commented 8 years ago

FWIW the current code is here:

https://github.com/linas/atomspace/tree/protoatom

linas commented 8 years ago

And I pushed a copy to here https://github.com/opencog/atomspace/tree/protoatom for safe-keeping -- since deleting branches in github is real easy, and would be a major loss if it got deleted.

linas commented 8 years ago

FWIW, the protoatom code could be merged into master today, since all unit tests pass. I have refrained from merging because there is a (minor) performance hit in various places: there is extra casting and checking of handles and atom pointers and what-not. Flip side, atomspace insertion code got simpler.

linas commented 8 years ago

Notes:

Casting AtomPtr to NodePtr and similar casts are very expensive (based on reading the disassembly only, not on measurements; this should also be measured).
The above suggests that smart-pointer casts should be avoided at all costs.
The above implies that the ProtoAtom should have dummy virtual methods for everything that a regular Atom or Node or Link might need. The dummies should throw exceptions.
NodePtr, LinkPtr, AtomPtr should all be exactly same as ProtoAtomPtr.
There should be isNode() and isLink() methods to ease decisions, e.g. in the pattern matcher.
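The notes above could be sketched roughly as follows. This is a minimal illustration with simplified, hypothetical classes, not the actual AtomSpace headers; `countNodes` is an invented helper that shows a caller walking a tree without any smart-pointer cast.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

class ProtoAtom;
// Per the notes: one pointer type for everything.
using ProtoAtomPtr = std::shared_ptr<ProtoAtom>;
using AtomPtr = ProtoAtomPtr;
using NodePtr = ProtoAtomPtr;
using LinkPtr = ProtoAtomPtr;

class ProtoAtom {
public:
    virtual ~ProtoAtom() = default;

    // Type queries, so callers never need a dynamic_pointer_cast.
    virtual bool isNode() const { return false; }
    virtual bool isLink() const { return false; }

    // Dummy implementations: throw unless a subclass overrides them.
    virtual const std::string& getName() const {
        throw std::runtime_error("getName() called on a non-Node");
    }
    virtual const std::vector<ProtoAtomPtr>& getOutgoing() const {
        throw std::runtime_error("getOutgoing() called on a non-Link");
    }
};

class Node : public ProtoAtom {
    std::string _name;
public:
    explicit Node(std::string name) : _name(std::move(name)) {}
    bool isNode() const override { return true; }
    const std::string& getName() const override { return _name; }
};

class Link : public ProtoAtom {
    std::vector<ProtoAtomPtr> _outgoing;
public:
    explicit Link(std::vector<ProtoAtomPtr> out) : _outgoing(std::move(out)) {}
    bool isLink() const override { return true; }
    const std::vector<ProtoAtomPtr>& getOutgoing() const override { return _outgoing; }
};

// Hypothetical helper: counts the Nodes reachable from p, using only
// virtual calls on the common base class.
std::size_t countNodes(const ProtoAtomPtr& p) {
    if (p->isNode()) return 1;
    if (!p->isLink()) return 0;
    std::size_t n = 0;
    for (const ProtoAtomPtr& child : p->getOutgoing()) n += countNodes(child);
    return n;
}
```

The cost is that a misuse such as calling `getName()` on a Link shows up as a runtime exception rather than a compile-time error.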

linas commented 8 years ago

Cast of Handle to LinkPtr takes about 140 nanoseconds on my machine, and completely dominates performance of calls to methods on class Node, class Link. In short, something about the implementation of smart pointers in the current gcc code is deeply broken. Am fixing this now.
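A measurement of this kind can be approximated with a small timing loop. The sketch below is an assumption-laden stand-in: it times `std::dynamic_pointer_cast` on a plain `std::shared_ptr` rather than on the real Handle class, and `Atom`, `Link` and `nsPerDynamicCast` are hypothetical names. Results will vary with compiler, flags and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>

// Hypothetical stand-ins for the real classes.
struct Atom { virtual ~Atom() = default; };
struct Link : Atom { std::size_t arity = 2; };

// Returns the average time, in nanoseconds, of one std::dynamic_pointer_cast.
// The checksum is updated on every iteration so that the compiler cannot
// discard the loop body as dead code.
double nsPerDynamicCast(const std::shared_ptr<Atom>& a,
                        std::size_t iters,
                        std::size_t& checksum) {
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i) {
        std::shared_ptr<Link> l = std::dynamic_pointer_cast<Link>(a);
        if (l) checksum += l->arity;
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iters);
}
```

Each successful cast also copies a `shared_ptr`, so the figure includes the reference-count increment and decrement, not just the RTTI lookup.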

inflector commented 8 years ago

Yeah, I was looking at the code for std::dynamic_pointer_cast in the standard C++ library and it was really ugly. Really really ugly. All sorts of tests and branches which kill modern processors.

inflector commented 8 years ago

I've always done the dummy virtual methods approach as it is very cheap. And the check operation, i.e. is_node() or is_link() can be inlined.

It's just one more level of indirection to load the virtual method address before the function call. It's a fixed offset load from the virtual method dispatch table stored at a fixed memory address as a global.

For a small number of virtual functions of a common class the virtual method dispatch table will almost always be in the fastest processor cache, so it will be a fast operation and won't stall the pipeline so much.

linas commented 8 years ago

Yeah, I'm seriously disappointed. I could have sworn it was much faster than it is, but that would have been on PowerPC, too, not Intel. The Intel x86 architecture is kind of nutty. Oh well. I am performing conversions now. I need this for protoatoms, so I can decorate atoms with a similarity measure, so I can do fuzzy matching correctly, so I can do linguistic processing. One step forward, and four steps back; you can't get too far like that.

inflector commented 8 years ago

Actually, I am wrong about inlining the check. For that to work, you need a non-virtual check and an instance flag that is set differently in the initializations of each of the classes. Which probably doesn't make sense here.
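For reference, the non-virtual variant described here might look like the sketch below. The class names and the `Kind` enum are hypothetical; the point is only that each constructor sets an instance flag, so the check is an ordinary member function that the compiler can inline.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classes for illustration only.
class ProtoAtom {
public:
    enum class Kind : std::uint8_t { Value, Node, Link };

    virtual ~ProtoAtom() = default;

    // Non-virtual: these compile down to a load and a compare.
    bool is_node() const { return _kind == Kind::Node; }
    bool is_link() const { return _kind == Kind::Link; }

protected:
    // Each subclass passes its own flag when it is constructed.
    explicit ProtoAtom(Kind k) : _kind(k) {}

private:
    const Kind _kind;
};

class FloatData : public ProtoAtom {
public:
    FloatData() : ProtoAtom(Kind::Value) {}
};

class Node : public ProtoAtom {
public:
    Node() : ProtoAtom(Kind::Node) {}
};

class Link : public ProtoAtom {
public:
    Link() : ProtoAtom(Kind::Link) {}
};
```

The trade-off is an extra field in every instance, plus the discipline of setting it correctly in every constructor.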

linas commented 8 years ago

argh. $%^&* cython. When I change getName(), getOutgoing() to throw, instead of returning bogus null values, cython fails. Why??? It's like everything slowly decays if I don't keep an eye on it.

inflector commented 8 years ago

What error are you getting?

linas commented 8 years ago

unit tests fail. Found it, fixed it. Anyway, this is the wrong place to have open-ended discussions.

linas commented 8 years ago

Some design notes and ideas are documented here: https://github.com/opencog/atomspace/blob/master/opencog/atoms/base/README.md

linas commented 8 years ago

Design trade-offs are discussed at length in opencog/opencog#2333 -- my current favorite idea is to have a ValuationSpace, as described there.

ngeiswei commented 8 years ago

@linas I don't intend to look into this now, ping me if you need me to.

linas commented 7 years ago

Pull request #1147 mostly finishes the task. Some remaining work items:

Provide Python bindings (see #1161) (mostly done).
Provide a thread-safety unit test (semi-done; AtomSpaceAsyncUTest.cxxtest tests truth values, which are built on top of values, and it works. It would be nice to test values directly, so that no one makes an end-run around them for truth values.)
Create a wiki page describing values and how to use them. (Semi-done in http://wiki.opencog.org/w/ProtoAtom)
Create a granular backend API for value save/restore. (Maybe. See "Semantics" section in https://github.com/opencog/atomspace/blob/master/opencog/persist/sql/README.md )

Finished:

Provide a conversion script from the old database format to the new format. (DONE in #1150)
Review and correct thread-safety (DONE in a138de2c85df45e054a3351f3ca653888b80ae0a)
Review and correct opencog/atomspace/README (DONE in 58654c433142ad70c335d2136f65e96dc954ae5e)
Provide Haskell bindings (DONE in #1189)
Provide Scheme bindings (partly done in #1154) (see discussion in #1147 for general idea - the idea(s) there remain to be implemented.) (DONE in #1207)
Move map from ValuationSpace to Atom. (DONE in #1315)
Provide mechanism to remove keys from atoms. (DONE in #1322)
Convert AttentionValues so that they use FloatValue (see #1157) (DONE in #1401)

linas commented 5 years ago

Closing; this is effectively done

Finish implementing ProtoAtom #513