opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace
Other
817 stars 228 forks source link

Finish implementing ProtoAtom #513

Closed linas closed 5 years ago

linas commented 8 years ago

The current ProtoAtom code is experimental and can be found in the ProtoAtom branch. It "works". However its unfinshed, some serious design issues remain and some important decisions with large reprecussions have to be made. These are difficult decisions which is why they have not been made.

A ProtoAtom is like an ordinary Atom, but without a TV, AV or UUID. It is meant to store mutable floating-point data (and associate it with an atom). ProtoAtoms cannot be searched for in the AtomSpace indexes, although the atoms and keys that refer to them can be.

Difficult decisions include:

I like the last idea. However, based on experience, liking an idea, and liking its actual implementation are two very different things.

linas commented 8 years ago

FWIW the current code is here:

https://github.com/linas/atomspace/tree/protoatom

linas commented 8 years ago

And I pushed a copy to here https://github.com/opencog/atomspace/tree/protoatom for safe-keeping -- since deleting branches in github is real easy, and would be a major loss if it got deleted.

linas commented 8 years ago

FWIW, the protoatom code could be merged into master today, since all unit tests pass. I have refrained from merging because there is a (minor) performance hit in various places: there is extra casting and checking of handles and atom pointers and what-not. Flip side, atomspace insertion code got simpler.

linas commented 8 years ago

Notes:

linas commented 8 years ago

Cast of Handle to LinkPtr takes about 140 nanoseconds on my machine, and completely dominates performance of calls to methods on class Node, class Link. In short, something about the implementation of smart pointers in the current gcc code is deeply broken. Am fixing this now.

inflector commented 8 years ago

Yeah, I was looking at the code for std::dynamic_pointer_cast in the standard C++ library and it was really ugly. Really really ugly. All sorts of tests and branches which kill modern processors.

inflector commented 8 years ago

I've always done the dummy virtual methods approach as it is very cheap. And the check operation, i.e. is_node() or is_link() can be inlined.

It's just one more level of indirection to load the virtual method address before the function call. Its a fixed offset load from the virtual method dispatch table stored at a fixed memory address as a global.

For a small number of virtual functions of a common class the virtual method dispatch table will almost always be in the fastest processor cache, so it will be a fast operation and won't stall the pipeline so much.

linas commented 8 years ago

Yeah, I'm seriously disappointed. I could have sworn it was much faster than it is, but that would have been on PowerPC, too, not Intel. The Intel x86 architecture is kind of nutty. Oh well. I am performing conversions now. I need this for protoatoms, so I can decorate atoms with a similarity measure, so I can do fuzzy matching correctly, so I can do linguistic processing. One step forward, and four steps back, you can't get to far like that.

inflector commented 8 years ago

Actually, I am wrong about inlining the check. For that to work, you need a non-virtual check and an instance flag that is set differently in the initializations of each of the classes. Which probably doesn't make sense here.

linas commented 8 years ago

argh. $%^&* cython. When I change getName(), getOutgoing() to throw, instead of returning bogus null values, cython fails. Why??? Its like everything slowly decays if I don't keep an eye on it.

inflector commented 8 years ago

What error are you getting?

linas commented 8 years ago

unit tests fail. Found it, fixed it. Anyway, this is the wrong place to have open-ended discussions.

linas commented 8 years ago

Some design notes and ideas are documented here: https://github.com/opencog/atomspace/blob/master/opencog/atoms/base/README.md

linas commented 8 years ago

See also http://wiki.opencog.org/wikihome/index.php/ProtoAtom

linas commented 8 years ago

Design trade-offs are discussed at length in opencog/opencog#2333 -- my current favorite idea is to have a ValuationSpace, as described there.

ngeiswei commented 8 years ago

@linas I don't intend to look into this now, ping me if you need me to.

linas commented 7 years ago

Pull request #1147 mostly finishes the task. Some remaining work items:

Finished:

linas commented 5 years ago

Closing; this is effectively done