Create internal expression representation

wschella commented 6 years ago

Parsing an expression into a usable format (a tree) is delegated to SPARQL.js. This format is great, but it is hard to extend in a typed language. To properly support #1 and remove the need for repeated casting to and from strings, literals, and simple data types, we should create an internal representation of an expression.

This would also make evaluation a lot more flexible. Evaluation is highly recursive, which currently forces us to be very strict about return values at every step, while the exact representation for the end user should only matter at the last step (the expression root, at return time).
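A minimal sketch of the idea, assuming a tiny expression language with integer literals, variables, and addition. All names here (`Expr`, `evaluateInternal`, `ResultLiteral`) are hypothetical and do not come from SPARQL.js; the point is only that the recursive steps work on native values and the external representation is produced once, at the root.

```typescript
const XSD_INTEGER = 'http://www.w3.org/2001/XMLSchema#integer';

// Hypothetical internal tree: values are native numbers, not lexical strings.
type Expr =
  | { kind: 'literal'; value: number }
  | { kind: 'variable'; name: string }
  | { kind: 'add'; left: Expr; right: Expr };

type Bindings = Map<string, number>;

// Every recursive step returns a plain number, so no intermediate
// result is serialized to a string and parsed again.
function evaluateInternal(expr: Expr, bindings: Bindings): number {
  switch (expr.kind) {
    case 'literal':
      return expr.value;
    case 'variable': {
      const bound = bindings.get(expr.name);
      if (bound === undefined) throw new Error(`Unbound variable ?${expr.name}`);
      return bound;
    }
    case 'add':
      return evaluateInternal(expr.left, bindings) + evaluateInternal(expr.right, bindings);
  }
}

// Hypothetical shape handed back to the caller.
interface ResultLiteral {
  value: string;
  datatype: string;
}

// Conversion to the external representation happens once, at the root.
// Assumes integer inputs, so the sum is tagged as xsd:integer.
function evaluate(expr: Expr, bindings: Bindings): ResultLiteral {
  return { value: String(evaluateInternal(expr, bindings)), datatype: XSD_INTEGER };
}
```

Only `evaluate` would need to change if the caller later wants a different output format.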

rubensworks commented 6 years ago

SPARQL Algebra would be the way to go IMO. @joachimvh already has types for these: https://github.com/joachimvh/SPARQLAlgebra.js

joachimvh commented 6 years ago

For the expressions themselves there is still only one type though (Expression), with a string indicating which operator is used. I do actually have a TODO somewhere to maybe add specific types per operator and might look into that if there is a need for it. But I assume this would still require casting, since you don't know in advance which expression might be in there.
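As an illustration of what per-operator types could look like, here is a minimal sketch in which the operator string is a literal type, so TypeScript narrows the union inside a `switch` without explicit casts. The interfaces are hypothetical and are not the types exported by SPARQLAlgebra.js.

```typescript
// Hypothetical per-operator types; the `operator` field is the discriminant.
interface NotExpression {
  operator: '!';
  args: [Expression];
}

interface AndExpression {
  operator: '&&';
  args: [Expression, Expression];
}

interface BoolConstant {
  operator: 'const';
  value: boolean;
}

type Expression = NotExpression | AndExpression | BoolConstant;

function evaluate(expr: Expression): boolean {
  // Checking the literal `operator` narrows `expr` to one member of the union,
  // so `args` has the right arity and `value` is only visible on constants.
  switch (expr.operator) {
    case '!':
      return !evaluate(expr.args[0]);
    case '&&':
      return evaluate(expr.args[0]) && evaluate(expr.args[1]);
    case 'const':
      return expr.value;
  }
}
```

A check on the operator is still needed at runtime, but the compiler tracks the result of that check, which is different from an unchecked cast.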

wschella commented 6 years ago

I feel like SPARQLAlgebra wouldn't cut it. I was really talking about expression-specific types, which would allow us, for example, to place some relevant functions on them (e.g. EBV on Term), or to have a more efficient and elegant way of keeping the datatype information that some builtin functions or evaluator internals need. I'm not quite sure how TypeScript handles extending existing types while still keeping compile-time information.
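A rough sketch of what "EBV on Term" could mean, assuming hypothetical internal term classes (none of these are RDF.js or SPARQL.js types). The rules follow the effective boolean value definition in SPARQL 1.1 for already-parsed values: booleans map to themselves, numbers are false when zero or NaN, strings are false when empty, and other terms raise a type error.

```typescript
// Hypothetical internal term types, each carrying its own EBV logic.
interface Term {
  // Effective boolean value; throws for terms that have none.
  ebv(): boolean;
}

class BooleanLiteral implements Term {
  readonly value: boolean;
  constructor(value: boolean) { this.value = value; }
  ebv(): boolean { return this.value; }
}

class NumericLiteral implements Term {
  readonly value: number;
  constructor(value: number) { this.value = value; }
  // Zero and NaN are false, every other number is true.
  ebv(): boolean { return this.value !== 0 && !Number.isNaN(this.value); }
}

class StringLiteral implements Term {
  readonly value: string;
  constructor(value: string) { this.value = value; }
  // Only the empty string is false.
  ebv(): boolean { return this.value.length > 0; }
}

class NamedNode implements Term {
  readonly iri: string;
  constructor(iri: string) { this.iri = iri; }
  // IRIs have no effective boolean value.
  ebv(): boolean { throw new TypeError(`No EBV for IRI <${this.iri}>`); }
}
```

Because the value is stored in its native type, the datatype information is kept in the class itself and no lexical form has to be re-parsed to compute the EBV.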

wschella commented 6 years ago

Another argument against using RDF.js internally is that in RDF.js a variable is considered a term, while in an expression (or query) it fits much better to consider it an expression but not a term. You cannot evaluate a term any further, but you should evaluate a variable further, e.g. by binding it.

Making this distinction will greatly improve the logic.
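A minimal sketch of that distinction, with hypothetical names: a `Term` is a finished value, an `Expression` is something that can be evaluated to a `Term`, and a variable is an expression that is not itself a term.

```typescript
// Hypothetical types: a Term is a value that cannot be evaluated any further.
interface Term {
  readonly termType: 'literal' | 'namedNode';
  readonly value: string;
}

type Bindings = Map<string, Term>;

// Anything that can appear in an expression tree evaluates to a Term.
interface Expression {
  evaluate(bindings: Bindings): Term;
}

// Wraps a term so it can sit in the tree; evaluation just returns it.
class TermExpression implements Expression {
  private readonly term: Term;
  constructor(term: Term) { this.term = term; }
  evaluate(): Term { return this.term; }
}

// A variable is an Expression but not a Term: it has to be bound first.
class VariableExpression implements Expression {
  readonly name: string;
  constructor(name: string) { this.name = name; }
  evaluate(bindings: Bindings): Term {
    const bound = bindings.get(this.name);
    if (bound === undefined) throw new Error(`Unbound variable ?${this.name}`);
    return bound;
  }
}
```

With this split, a function that expects a `Term` can never be handed an unbound variable by accident, because `VariableExpression` does not satisfy the `Term` interface.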

joachimvh commented 6 years ago

I actually have a similar problem now that I'm adding types in SPARQL algebra and integrating with RDF.js. Everything that makes use of expressions (filters, aggregates, orders, etc.) now has a parameter that looks like expression: Expression|rdfjs.Variable, which is sort of unfortunate. Not sure yet how to solve this cleanly (besides dropping RDF.js, but that would make @rubensworks sad).

But binding values to variables is not limited to variables appearing in expressions. In the end, a triple :a :b ?x. is also just an "expression" returning a set of bindings for the variable ?x. And there it is also just a Term, but it does get used to bind values to it.

rubensworks commented 6 years ago

> besides dropping RDF.js but that would make @rubensworks sad

I'm not against that if there is a good reason for it :-) And this issue and #3 make me think there might be. As long as there is a way for me to convert an RDFJS quad to this format, that's fine by me.

@wschella or @joachimvh, could you look into suggesting a representation that we could use instead of the RDFJS Term? Where possible, appropriate mapping to and from RDFJS terms should be available.

wschella commented 6 years ago

I think we shouldn't worry all too much (not at all, actually). The expression evaluator will still be able to take RDF.js terms in its mapping and SPARQL.js output for its expression, and its own output is essentially just a boolean (a term's EBV). All conversions happen internally, so the exp-eval should still be easily usable as a component. We could maybe export the internal types when everything is stable, to allow for other evaluation algorithms, but currently it's mostly boilerplate to (ab)use the TS type & JS prototype system to 'prepare' expressions and reduce branching during actual evaluation (since one expression will be evaluated multiple times, this seemed like a decent balance to make).
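One way to picture the "prepare once, evaluate many times" trade-off is the sketch below. It uses closures rather than the type and prototype approach mentioned above, and every name (`RawExpr`, `prepare`, `Prepared`) is hypothetical. The string-tagged input is inspected a single time; the returned function does no further dispatch on tags when it is run against each set of bindings.

```typescript
// Hypothetical raw form with string tags, as a parser might produce it.
type RawExpr =
  | { type: 'literal'; value: boolean }
  | { type: 'variable'; name: string }
  | { type: 'operation'; operator: '&&' | '||' | '!'; args: RawExpr[] };

type Bindings = Map<string, boolean>;
type Prepared = (bindings: Bindings) => boolean;

// All branching on `type` and `operator` happens here, once per expression.
function prepare(expr: RawExpr): Prepared {
  switch (expr.type) {
    case 'literal': {
      const value = expr.value;
      return () => value;
    }
    case 'variable': {
      const name = expr.name;
      return (bindings) => {
        const bound = bindings.get(name);
        if (bound === undefined) throw new Error(`Unbound variable ?${name}`);
        return bound;
      };
    }
    case 'operation': {
      const args = expr.args.map((arg) => prepare(arg));
      if (expr.operator === '!') {
        const operand = args[0];
        // Arity is validated at preparation time, not on every evaluation.
        if (operand === undefined || args.length !== 1) {
          throw new Error('! takes exactly one argument');
        }
        return (bindings) => !operand(bindings);
      }
      if (expr.operator === '&&') {
        return (bindings) => args.every((arg) => arg(bindings));
      }
      // Remaining operator is '||'.
      return (bindings) => args.some((arg) => arg(bindings));
    }
  }
}
```

A caller would run `prepare` once per query expression and then apply the resulting function to each incoming set of bindings.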

wschella commented 6 years ago

It's all rdf-js in (in the bindings and SPARQL Algebra), rdf-js out. The internals use a different format.

wschella / Sparqlee

Create internal expression representation #2