comunica / comunica-feature-link-traversal

📬 Comunica packages for link traversal-based query execution

Adapt query plan based on content policies #48

Open rubensworks opened 2 years ago

rubensworks commented 2 years ago

Issue type:


Description:

Some content policies could declare that certain triples always occur together in the same file.

For example, for the following query, the data about a movie and its watch actions will always exist in the same file:

https://comunica.github.io/comunica-feature-link-traversal-web-clients/builds/solid-default/#datasources=https%3A%2F%2Fdrive.verborgh.org%2Fmovies%2F&query=PREFIX%20schema%3A%20%3Chttps%3A%2F%2Fschema.org%2F%3E%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Fmovie%20a%20schema%3AMovie.%0A%20%20%3Faction%20a%20schema%3AWatchAction%3B%0A%20%20%20%20%20%20%20%20%20%20schema%3Aobject%20%3Fmovie.%0A%7D&solidIdp=https%3A%2F%2Fdrive.verborgh.org%2F

Given this assumption, the query plan could be optimized (adaptively) so that the join between the movie and watch action triple patterns is restricted to the contents of a single file, since it is guaranteed that watch actions about movie X will only be found in the file about movie X.
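To make the idea concrete, here is a minimal sketch of a join that is restricted to one document at a time. It is not Comunica code: the `SourcedQuad` shape, the helper names, and the `https://example.org/movies/` documents are all assumptions made for illustration only.

```typescript
// Hypothetical quad shape: a triple plus the document (file) it was read from.
interface SourcedQuad {
  subject: string;
  predicate: string;
  object: string;
  source: string;
}

interface Match {
  movie: string;
  action: string;
}

const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
const SCHEMA = 'https://schema.org/';

// Evaluate the three triple patterns of the query over one set of quads.
function joinMoviesAndActions(quads: SourcedQuad[]): Match[] {
  const movies = new Set(quads
    .filter(q => q.predicate === RDF_TYPE && q.object === SCHEMA + 'Movie')
    .map(q => q.subject));
  const actions = new Set(quads
    .filter(q => q.predicate === RDF_TYPE && q.object === SCHEMA + 'WatchAction')
    .map(q => q.subject));
  return quads
    .filter(q => q.predicate === SCHEMA + 'object' && actions.has(q.subject) && movies.has(q.object))
    .map(q => ({ movie: q.object, action: q.subject }));
}

// Policy-aware plan: join within each document separately, relying on the
// (assumed) guarantee that a movie and its watch actions share one file.
function joinPerDocument(quads: SourcedQuad[]): Match[] {
  const bySource = new Map<string, SourcedQuad[]>();
  for (const quad of quads) {
    const group = bySource.get(quad.source) || [];
    group.push(quad);
    bySource.set(quad.source, group);
  }
  const results: Match[] = [];
  bySource.forEach(group => {
    for (const match of joinMoviesAndActions(group)) {
      results.push(match);
    }
  });
  return results;
}

// Illustrative data: two hypothetical files, each describing one movie.
function movieFile(name: string): SourcedQuad[] {
  const source = 'https://example.org/movies/' + name + '.ttl';
  const movie = source + '#movie';
  const action = source + '#watch';
  return [
    { subject: movie, predicate: RDF_TYPE, object: SCHEMA + 'Movie', source },
    { subject: action, predicate: RDF_TYPE, object: SCHEMA + 'WatchAction', source },
    { subject: action, predicate: SCHEMA + 'object', object: movie, source },
  ];
}

const exampleQuads: SourcedQuad[] = movieFile('a').concat(movieFile('b'));
const perDocumentMatches = joinPerDocument(exampleQuads);
const globalMatches = joinMoviesAndActions(exampleQuads);
```

When the co-location assumption holds, `perDocumentMatches` and `globalMatches` contain the same bindings, but the per-document variant never has to compare watch actions from one file against movies from another, and each file's results can be emitted as soon as that file has been fetched.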

In an abstract sense, we could see this as a restricted Web API or query view that can query data about a given movie and its watch actions. The (adaptive) query planner should then take this restriction into account.

Related to #46, but focuses on query planning instead of discovery/prioritization.


Another concrete use case is that of Solid's type index, such as this query: https://comunica.github.io/comunica-feature-link-traversal-web-clients/builds/solid-default/#transientDatasources=https%3A%2F%2Frubensworks.solidcommunity.net%2Fprofile%2Fcard&query=PREFIX%20bookm%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F01%2Fbookmark%23%3E%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Fbookmark%20a%20bookm%3ABookmark%3B%0A%20%20%20%20bookm%3AhasTopic%20%3Ftopic.%0A%7D

If one or more triple patterns refer to a type index entry, the query planner may want to prioritize those for earlier joins, as they may be more selective.
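As a rough illustration of that prioritization, the sketch below moves type patterns whose class appears in a type index to the front of the join order. The `Pattern` shape, the `prioritizeTypeIndexPatterns` helper, and the set of registered classes are hypothetical; a real planner would read the registrations from the type index document and would likely combine this with cardinality estimates.

```typescript
// Hypothetical triple pattern shape; variables are written as '?name'.
interface Pattern {
  subject: string;
  predicate: string;
  object: string;
}

const TYPE_PREDICATE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
const BOOKM = 'http://www.w3.org/2002/01/bookmark#';

// True if the pattern is `?s rdf:type <C>` and class C is registered in the type index.
function matchesTypeIndex(pattern: Pattern, registeredClasses: Set<string>): boolean {
  return pattern.predicate === TYPE_PREDICATE && registeredClasses.has(pattern.object);
}

// Put type-index-backed patterns first, keeping the original relative order
// within both groups (a stable partition rather than a full sort).
function prioritizeTypeIndexPatterns(patterns: Pattern[], registeredClasses: Set<string>): Pattern[] {
  const indexed = patterns.filter(p => matchesTypeIndex(p, registeredClasses));
  const others = patterns.filter(p => !matchesTypeIndex(p, registeredClasses));
  return indexed.concat(others);
}

// Assumption for illustration: the type index registers bookm:Bookmark.
const registered = new Set<string>([BOOKM + 'Bookmark']);

// The two patterns of the bookmark query, deliberately listed in the "wrong" order.
const bookmarkPatterns: Pattern[] = [
  { subject: '?bookmark', predicate: BOOKM + 'hasTopic', object: '?topic' },
  { subject: '?bookmark', predicate: TYPE_PREDICATE, object: BOOKM + 'Bookmark' },
];

const orderedPatterns = prioritizeTypeIndexPatterns(bookmarkPatterns, registered);
```

Here `orderedPatterns` starts with the `?bookmark a bookm:Bookmark` pattern, so the join begins with the pattern the type index can answer directly.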

github-actions[bot] commented 2 years ago

Thanks for the suggestion!

github-actions[bot] commented 2 years ago

Thanks for reporting!