comunica / comunica-feature-link-traversal

📬 Comunica packages for link traversal-based query execution

Feature/static filter for the TREE specification #86

Open constraintAutomaton opened 1 year ago

constraintAutomaton commented 1 year ago

Static filter for the TREE specification

The objective of this PR is to implement static SPARQL filters: when the query contains a filter expression, the next nodes described by the current TREE relations that cannot satisfy the filter are not followed. The filters are called static because they are applied before the execution of the query, so the bindings produced during execution have no impact on which nodes are followed.

Implementation

The TREE traversal extractor provides an ITreeNode object containing "all" the relevant metadata.

The FilterNode class provides the solver with the TREE relation, the filter expression, and the effective variable. The solver then interprets the TREE relation and the filter expression as boolean expressions. The solution domain is evaluated first for the filter expression alone, then for the filter expression combined with the TREE relation. If the resulting domain is empty, a false value associated with the relation is inserted in a map in the FilterNode class, as it is impossible that a document matching the filter will be found through that relation. The operation is repeated for all the relations, and the map of results is forwarded to the TREE extractor actor, which prunes the irrelevant links.
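To illustrate the idea, here is a minimal sketch of the pruning step, assuming numeric values and a filter that is a plain conjunction of comparisons on a single variable. The names `Comparison`, `Relation`, `Interval`, `toInterval`, `intersect`, and `pruneRelations` are hypothetical and are not the types or functions used in this PR; the actual solver handles more general boolean expressions.

```typescript
// Hypothetical, simplified model: NOT the types used in this PR.
type Operator = '<' | '<=' | '>' | '>=' | '=';

interface Comparison {
  operator: Operator;
  value: number;
}

// A TREE relation reduced to a comparison plus the URL of the next node.
interface Relation extends Comparison {
  node: string;
}

// A solution domain represented as a single interval over the numbers.
interface Interval {
  lower: number;
  upper: number;
  lowerOpen: boolean;
  upperOpen: boolean;
}

function toInterval({ operator, value }: Comparison): Interval {
  switch (operator) {
    case '<': return { lower: -Infinity, upper: value, lowerOpen: true, upperOpen: true };
    case '<=': return { lower: -Infinity, upper: value, lowerOpen: true, upperOpen: false };
    case '>': return { lower: value, upper: Infinity, lowerOpen: true, upperOpen: true };
    case '>=': return { lower: value, upper: Infinity, lowerOpen: false, upperOpen: true };
    case '=': return { lower: value, upper: value, lowerOpen: false, upperOpen: false };
  }
}

// Returns the intersection of two intervals, or undefined if it is empty.
function intersect(a: Interval, b: Interval): Interval | undefined {
  let lower: number;
  let lowerOpen: boolean;
  if (a.lower > b.lower) { lower = a.lower; lowerOpen = a.lowerOpen; }
  else if (b.lower > a.lower) { lower = b.lower; lowerOpen = b.lowerOpen; }
  else { lower = a.lower; lowerOpen = a.lowerOpen || b.lowerOpen; }

  let upper: number;
  let upperOpen: boolean;
  if (a.upper < b.upper) { upper = a.upper; upperOpen = a.upperOpen; }
  else if (b.upper < a.upper) { upper = b.upper; upperOpen = b.upperOpen; }
  else { upper = a.upper; upperOpen = a.upperOpen || b.upperOpen; }

  if (lower > upper || (lower === upper && (lowerOpen || upperOpen))) {
    return undefined;
  }
  return { lower, upper, lowerOpen, upperOpen };
}

// Maps each next-node URL to true (follow) or false (prune).
function pruneRelations(filter: Comparison[], relations: Relation[]): Map<string, boolean> {
  // Step 1: solution domain of the filter expression alone.
  let filterDomain: Interval | undefined =
    { lower: -Infinity, upper: Infinity, lowerOpen: true, upperOpen: true };
  for (const comparison of filter) {
    if (filterDomain === undefined) { break; }
    filterDomain = intersect(filterDomain, toInterval(comparison));
  }

  // Step 2: domain of the filter combined with each relation.
  const result = new Map<string, boolean>();
  for (const relation of relations) {
    const combined = filterDomain === undefined
      ? undefined
      : intersect(filterDomain, toInterval(relation));
    result.set(relation.node, combined !== undefined);
  }
  return result;
}

// Example: FILTER(?x > 5) against three relations with made-up URLs.
const followMap = pruneRelations(
  [{ operator: '>', value: 5 }],
  [
    { node: 'http://example.org/a', operator: '<=', value: 3 },
    { node: 'http://example.org/b', operator: '>=', value: 8 },
    { node: 'http://example.org/c', operator: '=', value: 5 },
  ],
);
```

In this example only `http://example.org/b` stays in the link queue: the other two relations describe values that can never satisfy `?x > 5`, so their combined domain is empty.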

Limitation

Perspectives

linked to #82

constraintAutomaton commented 1 year ago

Things are still very vague to me as to how everything works. I think adding some more documentation would help a lot.

Also, I'm not convinced yet about the optimize-link-traversal bus, as it seems to be tailored specifically to TREE, and is not link-traversal-wide. Can we move this logic into the TREE-specific actor on the link extraction bus? If not, can we make this optimize bus independent of TREE?

I thought that for other specifications there could be other general optimizations. As filtering is not an operation specific to TREE, there could be other implementations for other specifications, since it is not completely "domain specific".