Open ricroberts opened 6 years ago
We should probably also support passing FROM and FROM NAMED on the request as parameters (which would mean that we don't need to parse the query for getting the modified date for caching purposes)
We should definitely do this, with the SPARQL 1.1 Query Protocol parameters named-graph-uri (maps to FROM NAMED) & default-graph-uri (maps to FROM). These need to be supported on all SPARQL query endpoints (e.g. draftsets/live etc).
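As a rough illustration of what a client request carrying these parameters could look like, here is a minimal Python sketch. The endpoint URL and the helper names build_query_url and dataset_from_url are made up for this example and are not part of drafter; only the parameter names query, default-graph-uri and named-graph-uri come from the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(endpoint, query, default_graphs=(), named_graphs=()):
    """Build a SPARQL 1.1 Protocol GET URL with an explicit RDF dataset.

    Each parameter may be repeated, once per graph URI.
    """
    params = [("query", query)]
    params += [("default-graph-uri", g) for g in default_graphs]  # maps to FROM
    params += [("named-graph-uri", g) for g in named_graphs]      # maps to FROM NAMED
    return endpoint + "?" + urlencode(params)


def dataset_from_url(url):
    """Read the dataset back from the request URL, without parsing the query."""
    params = parse_qs(urlsplit(url).query)
    return (params.get("default-graph-uri", []),
            params.get("named-graph-uri", []))


url = build_query_url(
    "http://localhost:3001/v1/sparql/live",  # hypothetical endpoint URL
    "SELECT * WHERE { ?s ?p ?o } LIMIT 10",
    default_graphs=["http://example.org/graph/vocab"],  # hypothetical graph
    named_graphs=["http://my-dataset/graph"],
)
default_graphs, named_graphs = dataset_from_url(url)
print(default_graphs, named_graphs)
```

The second helper shows the point made in the issue: the server can learn which graphs a request touches from the parameters alone, without inspecting the query text.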
Additionally we should support the common case of providing better caching/hinting on the default graph via queries that touch vocab graphs, e.g. you might have a query like this (pseudo sparql):
SELECT * WHERE {
  GRAPH <http://my-dataset/graph> {
    ?ds a qb:DataSet ;
        qb:structure/qb:component/qb:codeList/skos:member ?scheme .
  }
  ?scheme rdfs:label ?lbl .
}
In which case it would be good to set a special hint on the request with the query params ?drafter-named-graph-uri and ?drafter-default-graph-uri. These would essentially be the same as the SPARQL 1.1 ones, except that the drafter variations will expand virtual URIs which have special meaning, e.g. the URI <http://publishmydata.com/drafter/graph/all-vocabs> would be expanded into the set of all vocab graph URIs. Similarly we may support virtual URIs for drafter-graphs:all-datasets, drafter-graphs:all-ontologies etc.
The set of URIs for FROM and FROM NAMED would then be intersected with those allowed for the endpoint and used to calculate modified times for stasher query caching.
URIs supplied on the SPARQL 1.1 *-uri parameters would also be honored in a similar way, but not subject to expansion.
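The expansion and intersection steps described above could be sketched roughly as follows. This is Python for illustration only, not drafter's implementation. Every function name, the example.org graph URIs and the in-memory tables are assumptions, and for brevity the FROM and FROM NAMED sets are merged into one set when computing the time.

```python
from datetime import datetime

ALL_VOCABS = "http://publishmydata.com/drafter/graph/all-vocabs"


def expand(uris, virtual_graphs):
    """Replace each virtual URI by the graphs it stands for; keep others as-is."""
    expanded = set()
    for uri in uris:
        expanded |= virtual_graphs.get(uri, {uri})
    return expanded


def cache_modified_time(standard_uris, drafter_uris, virtual_graphs,
                        allowed, modified_times):
    """Latest modified time over the requested graphs the endpoint allows.

    standard_uris: values of the SPARQL 1.1 *-uri params (never expanded).
    drafter_uris:  values of the drafter-* params (virtual URIs expanded).
    Returns None when no graph qualifies; the caller is then assumed to
    fall back to the modified time of the whole endpoint.
    """
    requested = set(standard_uris) | expand(drafter_uris, virtual_graphs)
    visible = requested & allowed  # intersect with the endpoint's graphs
    times = [modified_times[g] for g in visible if g in modified_times]
    return max(times) if times else None


# Hypothetical data for illustration.
virtual_graphs = {ALL_VOCABS: {"http://example.org/graph/vocab-a",
                               "http://example.org/graph/vocab-b"}}
allowed = {"http://my-dataset/graph", "http://example.org/graph/vocab-a"}
modified_times = {
    "http://my-dataset/graph": datetime(2019, 1, 10),
    "http://example.org/graph/vocab-a": datetime(2019, 3, 2),
    "http://example.org/graph/vocab-b": datetime(2019, 6, 1),
}

result = cache_modified_time(["http://my-dataset/graph"], [ALL_VOCABS],
                             virtual_graphs, allowed, modified_times)
print(result)
```

In this example vocab-b has the newest modified time but is not allowed on the endpoint, so it does not affect the cache key. A virtual URI passed on the standard parameters is treated as an ordinary graph URI and simply drops out at the intersection step.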
The motivation for this is:
- With a drafter-* variant of the SPARQL parameters we can introduce our own special URI semantics, and remain compatible with non-drafter endpoints used in dev, e.g. when running against a raw stardog, as stardog will ignore the extra parameters.
- GRAPH restrictions could be used too, but we think using a special query parameter is a better way to hint things, as the GRAPH approach won't work for setting the default graph, and may result in sub-optimal query plans when using hacks like: VALUES ?g { ,,, }.

Also we need to fix this issue before doing this one:
If you have a query like this:
We don't know what graph the labels might come from, so with the new caching approach, this query would need to use the modified time of the whole endpoint in the cache key. But we could do a pre-query to find a list of all the graphs which contain vocabs or geography data.
One option would be to use a VALUES clause
... but we know stardog isn't very good at optimising these queries.
A better alternative might be to use FROM or FROM NAMED:
Drafter should:
use the latest modified time of (1 intersect 2) and 3 as the modified time on the cache key.
We should probably also support passing FROM and FROM NAMED on the request as parameters (which would mean that we don't need to parse the query for getting the modified date for caching purposes)