Open ngeiswei opened 6 years ago
SchematizationLink could be understood as something more sophisticated, like turning data into a regressed model (as MOSES would), so to distinguish it from that it could be named TablelizationLink, or likely some better name.
I'm not clear what you are describing here. So let me make 3-4 random remarks. First, there currently exist the following (under-utilized) links:
http://wiki.opencog.org/w/SignatureLink http://wiki.opencog.org/w/ArrowLink
The ArrowLink is meant to describe the inputs and outputs of a function. The SignatureLink is meant to define a signature in general (not just a function signature). It's suitably polymorphic, I think. These links are inspired by, and are intended to capture the essence of, arrows and signatures as described in books on term rewriting (e.g. Baader & Nipkow), model theory (Wilfrid Hodges), proof theory, or logic in general. Wikipedia describes them.
There's some code and some unit tests for them, but not a lot.
When I first read what you wrote, I thought that maybe the SchematizationLink is one or the other, or some combination, of these two links. On closer reading, I see that it's not... see next note...
On second reading, I see this: "probably don't want to query the entire atomspace for ExecutionLink". Well, you don't have to. If you have the SchemaLink, you merely ask for all members of its incoming set that are of type ExecutionLink, either in C++ or in scheme:
(cog-incoming-by-type (Schema "foo") 'ExecutionLink)
and bingo you've got them all.
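To illustrate why this avoids a full scan, here is a minimal sketch in plain Python. It is a toy model, not the AtomSpace API: ToyAtomSpace, add_link and incoming_by_type are hypothetical names, and the only assumption is that each atom keeps a list of the links that directly contain it.

```python
from collections import defaultdict


class ToyAtomSpace:
    """Toy stand-in for an atomspace: a link is a tuple (type, *outgoing)."""

    def __init__(self):
        # atom -> list of links that directly contain that atom
        self.incoming = defaultdict(list)

    def add_link(self, link_type, *outgoing):
        link = (link_type, *outgoing)
        # Register the new link in the incoming set of each member atom.
        for atom in set(outgoing):
            self.incoming[atom].append(link)
        return link

    def incoming_by_type(self, atom, link_type):
        # Only this atom's incoming set is scanned, never the whole space.
        return [link for link in self.incoming[atom] if link[0] == link_type]


space = ToyAtomSpace()
f = ("SchemaNode", "foo")
g = ("SchemaNode", "bar")
space.add_link("ExecutionLink", f, ("NumberNode", "1"), ("NumberNode", "2"))
space.add_link("ExecutionLink", f, ("NumberNode", "3"), ("NumberNode", "4"))
space.add_link("ExecutionLink", g, ("NumberNode", "1"), ("NumberNode", "9"))
space.add_link("ListLink", f, g)

foo_executions = space.incoming_by_type(f, "ExecutionLink")
print(len(foo_executions))
```

The cost of the lookup depends on how many links contain the schema, not on the total number of atoms.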
Also, note that the matrix/vector code does "tableization", at least in the way that works for me. Whereas you have
(ExecutionLink
    (Schema "f")
    <input-1>
    <output-1>)
...
(ExecutionLink
    (Schema "f")
    <input-n>
    <output-n>)
I have
(FooLink
    (BarNode "f")
    <left-1>
    <right-1>)
...
(FooLink
    (BarNode "f")
    <left-n>
    <right-n>)
Much of the code uses the words "row" and "column" for "left" and "right"; it's the same thing. You can also think of each row or each column as a vector, so the matrix is a collection of vectors.
The code is meant to solve the following problems:
1) work very well for extremely sparse matrices e.g. only one-in-a-million non-zero entries.
2) map any kind of atomspace structure into matrix/vector form. Foo and Bar can be anything, and left, right can be anywhere, e.g.
(FooLink
(StuffLink <left-n>)
(Other (Different (BarNode "f") (PlaceLink <right-n>))))
There doesn't even need to be a FooLink -- any pattern match to find left, right will work.
3) provide typical row and column marginal sums, statistics, probabilities, entropies, mean-square lengths, cosine angles, Jaccard distances, etc.
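As a rough illustration of the three points above, here is a sketch in plain Python. It is not the matrix code itself: the (left, right, count) triples and all function names are made up for the example, and the pairs are assumed to have already been extracted by some pattern match.

```python
import math
from collections import defaultdict

# Hypothetical (left, right, count) triples, as if found by a pattern match.
pairs = [
    ("a", "x", 2.0),
    ("a", "y", 1.0),
    ("b", "x", 4.0),
    ("b", "z", 3.0),
]

# Sparse storage: one dict per row, holding only the non-zero entries.
rows = defaultdict(dict)
for left, right, count in pairs:
    rows[left][right] = rows[left].get(right, 0.0) + count


def row_sum(left):
    """Marginal sum over all columns of one row."""
    return sum(rows[left].values())


def column_sum(right):
    """Marginal sum over all rows of one column."""
    return sum(row.get(right, 0.0) for row in rows.values())


def cosine(u, v):
    """Cosine of the angle between two rows, seen as sparse vectors."""
    dot = sum(w * rows[v].get(col, 0.0) for col, w in rows[u].items())
    norm_u = math.sqrt(sum(w * w for w in rows[u].values()))
    norm_v = math.sqrt(sum(w * w for w in rows[v].values()))
    return dot / (norm_u * norm_v)


def jaccard_distance(u, v):
    """One minus the overlap of the non-zero columns of two rows."""
    cols_u, cols_v = set(rows[u]), set(rows[v])
    return 1.0 - len(cols_u & cols_v) / len(cols_u | cols_v)


print(row_sum("a"), column_sum("x"))
print(cosine("a", "b"), jaccard_distance("a", "b"))
```

Nothing here depends on the shape of the original structure; only the extracted left/right pairs matter, which is the point of item 2.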
Last comment: For your data, if you have
(ExecutionLink
    (Schema "f")
    <input-1>
    <output-1>)
...
(ExecutionLink
    (Schema "f")
    <input-n>
    <output-n>)
and if the <input-k> and <output-k> are time-varying, and if you don't need to pattern-match them, then use Values, not Atoms, for them. It's more efficient: it uses (an order of magnitude) less RAM and is (an order of magnitude) faster to modify.
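A toy comparison in plain Python may make the trade-off concrete. This is not the AtomSpace implementation: ToyAtom, ToySpace and the "latest-io" key are invented for the example, and the only assumptions are that every Atom is held in a global index while a Value is attached to an existing Atom under a key and can be overwritten.

```python
class ToyAtom:
    """Toy atom: globally indexed, carries a small key -> value store."""

    def __init__(self, key):
        self.key = key
        self.values = {}


class ToySpace:
    def __init__(self):
        self.index = {}  # every atom lives in this index

    def add(self, key):
        if key not in self.index:
            self.index[key] = ToyAtom(key)
        return self.index[key]


N = 1000

# Approach A: every sample becomes new atoms (input, output, and the link).
as_atoms = ToySpace()
as_atoms.add("f")
for t in range(N):
    as_atoms.add(("input", t))
    as_atoms.add(("output", 2 * t))
    as_atoms.add(("ExecutionLink", "f", ("input", t), ("output", 2 * t)))

# Approach B: one atom, and the sample is a value overwritten in place.
as_values = ToySpace()
f = as_values.add("f")
for t in range(N):
    f.values["latest-io"] = (t, 2 * t)

print(len(as_atoms.index), len(as_values.index))
```

The price of approach B in this model is that the samples are invisible to anything that searches the index, which matches the "if you don't need to pattern-match them" condition.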
The tableization of the matrix code seems interesting, thanks for the feedback.
It turns out such "schematization" won't be needed soon (in as-moses), so this issue will likely remain pending for the next few months.
We likely need (as part of the effort of porting MOSES to the Atomspace) a way to use declarative knowledge about a Schema to actually run it.
For instance, given a mapping between inputs and outputs using http://wiki.opencog.org/w/ExecutionLink defining some schema f, executing f on one of the recorded inputs (via the Atomese interpreter https://github.com/opencog/atomspace/blob/master/opencog/atoms/execution/Instantiator.h#L141) should return the corresponding output.
Some care would be required for dealing with undefined or duplicated values (though I guess it'd be OK, for a start, to just raise an exception when things are ill-defined or undefined, or to leave the body unchanged).
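The "raise an exception" option can be sketched in plain Python as follows. This is only an illustration of the idea, not Atomese: build_partial_function and the two exception classes are hypothetical names, and records are assumed to be simple (input, output) pairs.

```python
class UndefinedInput(KeyError):
    """No record exists for the requested input."""


class ConflictingOutputs(ValueError):
    """Two records give different outputs for the same input."""


def build_partial_function(records):
    """Turn (input, output) records into a callable partial function."""
    table = {}
    for inp, out in records:
        # An exact repeat of a record is harmless; a disagreement is not.
        if inp in table and table[inp] != out:
            raise ConflictingOutputs(f"{inp!r} maps to {table[inp]!r} and {out!r}")
        table[inp] = out

    def run(inp):
        if inp not in table:
            raise UndefinedInput(inp)
        return table[inp]

    return run


f = build_partial_function([(1, 2), (3, 4), (1, 2)])
print(f(1), f(3))
```

The alternative mentioned above, leaving the body unchanged, would amount to returning the unevaluated expression instead of raising UndefinedInput.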
Some care would also be required to not slow down the interpreter every time it encounters a schema, as we probably don't want it to query the entire atomspace for ExecutionLink when that happens, or do we? Maybe one could restrict such behavior to DefinedSchemaNode. Or perhaps ExecutionLink could have a factory that stores all inputs/outputs as values of the considered schema. Or perhaps we could introduce a SchematizationLink that could do just that (i.e. build a partial function from a record of ExecutionLink). Then one would be able to run (DefinedSchema "schematized-of-f") but not (Schema "f").
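A minimal sketch of that last option, in plain Python rather than Atomese: the records are scanned once, when the schema is explicitly schematized, and the interpreter only ever runs names that have been defined this way. ToyInterpreter and its methods are hypothetical; only the "schematized-of-f" naming comes from the proposal above.

```python
class ToyInterpreter:
    """Toy model: execution records plus a registry of defined schemas."""

    def __init__(self):
        self.records = []  # (schema name, input, output)
        self.defined = {}  # defined schema name -> lookup table
        self.scans = 0     # how many times the records were scanned

    def add_execution(self, schema, inp, out):
        self.records.append((schema, inp, out))

    def schematize(self, schema):
        # The only place where all records are scanned.
        # Duplicate handling is left out: the last record for an input wins.
        self.scans += 1
        table = {i: o for s, i, o in self.records if s == schema}
        name = "schematized-of-" + schema
        self.defined[name] = table
        return name

    def execute(self, name, inp):
        # A plain schema name is never looked up in the records.
        if name not in self.defined:
            raise LookupError(f"{name!r} is not a defined schema")
        return self.defined[name][inp]


interp = ToyInterpreter()
interp.add_execution("f", 1, 2)
interp.add_execution("f", 3, 4)
interp.add_execution("g", 1, 9)

name = interp.schematize("f")
outputs = [interp.execute(name, x) for x in (1, 3, 1, 3)]
print(name, outputs, interp.scans)
```

In this model, records added after schematization are not picked up until schematize is called again, which is one of the design questions the factory-based alternative would answer differently.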