mdorf closed 4 years ago
Another sample query sent to 4store when executing the call:
paging = LinkedData::Models::Class.in(sub).include(:prefLabel, :synonym).page(page, size)
SELECT DISTINCT ?id ?prefLabel ?synonym FROM <http://data.bioontology.org/ontologies/CSV_TEST_BRO/submissions/1> WHERE { ?id a <http://www.w3.org/2002/07/owl#Class> .
OPTIONAL { ?id ?rewrite0 ?prefLabel . }
OPTIONAL { ?id ?rewrite1 ?synonym . }
FILTER(?id = <http://bioontology.org/ontologies/Activity.owl#Activity> ||
?id = <http://bioontology.org/ontologies/Activity.owl#Biospecimen_Management> || ...)
FILTER(?rewrite0 = <http://data.bioontology.org/metadata/def/prefLabel> ||
?rewrite0 = <http://www.w3.org/2004/02/skos/core#prefLabel>)
FILTER(?rewrite1 = <http://www.geneontology.org/formats/oboInOwl#hasExactSynonym> ||
?rewrite1 = <http://purl.obolibrary.org/obo/synonym> ||
?rewrite1 = <http://www.geneontology.org/formats/oboInOwl#hasBroadSynonym> ||
?rewrite1 = <http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym> ||
?rewrite1 = <http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym> ||
?rewrite1 = <http://www.w3.org/2004/02/skos/core#altLabel>) }
According to Gary King, a developer from AllegroGraph, the above query is constructed incorrectly. Explanation below:
The query looks like:
SELECT DISTINCT ?id ?prefLabel ?synonym FROM <http://data.bioontology.org/ontologies/CSV_TEST_BRO/submissions/1> WHERE { ?id a <http://www.w3.org/2002/07/owl#Class> .
OPTIONAL { ?id ?rewrite0 ?prefLabel . }
OPTIONAL { ?id ?rewrite1 ?synonym . }
FILTER(?id = <http://bioontology.org/ontologies/Activity.owl#Activity> ||
## ## -- snip -- ##
?id = <http://bioontology.org/ontologies/BiomedicalResourceOntology.owl#Biomedical_Supply_Resource>)
FILTER(?rewrite0 = <http://data.bioontology.org/metadata/def/prefLabel> ||
?rewrite0 = <http://www.w3.org/2004/02/skos/core#prefLabel>)
FILTER(?rewrite1 = <http://www.geneontology.org/formats/oboInOwl#hasExactSynonym> ||
## ## -- snip -- ##
?rewrite1 = <http://www.w3.org/2004/02/skos/core#altLabel>) }
Because the FILTERs are outside the OPTIONALs, they are applied to every row returned. That is, only rows where ?rewrite0 is in its list and ?rewrite1 is in its list will be returned, so the query will return NO results where ?rewrite0 or ?rewrite1 is NULL.
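The effect described above can be reproduced on a toy data set. The following is only a sketch in plain Ruby, not a SPARQL engine: the triples, the `optional` helper, and the single-predicate filter lists are made up for illustration, and OPTIONAL is modeled as a simple left join.

```ruby
PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
ALT_LABEL  = "http://www.w3.org/2004/02/skos/core#altLabel"
RDF_TYPE   = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Made-up triples: :a has both attributes, :b only a prefLabel, :c neither.
TRIPLES = [
  [:a, RDF_TYPE, :Class], [:a, PREF_LABEL, "A"], [:a, ALT_LABEL, "Alpha"],
  [:b, RDF_TYPE, :Class], [:b, PREF_LABEL, "B"],
  [:c, RDF_TYPE, :Class]
].freeze

# Models OPTIONAL { ?id ?pred ?value } as a left join. `allowed`, when given,
# plays the role of a FILTER placed inside the OPTIONAL group.
def optional(rows, pred_key, value_key, allowed: nil)
  rows.flat_map do |row|
    matches = TRIPLES.select do |s, p, _|
      s == row[:id] && (allowed.nil? || allowed.include?(p))
    end
    if matches.empty?
      [row.merge(pred_key => nil, value_key => nil)] # keep the row, variables unbound
    else
      matches.map { |_, p, o| row.merge(pred_key => p, value_key => o) }
    end
  end
end

# Models the required pattern: ?id a owl:Class
def class_rows
  TRIPLES.select { |_, p, o| p == RDF_TYPE && o == :Class }.map { |s, _, _| { id: s } }
end

# FILTERs at the top level: every joined row must satisfy both of them.
def filters_outside
  rows = optional(class_rows, :rewrite0, :prefLabel)
  rows = optional(rows, :rewrite1, :synonym)
  rows.select { |r| r[:rewrite0] == PREF_LABEL && r[:rewrite1] == ALT_LABEL }
end

# FILTERs inside each OPTIONAL: they only narrow what the OPTIONAL may match.
def filters_inside
  rows = optional(class_rows, :rewrite0, :prefLabel, allowed: [PREF_LABEL])
  optional(rows, :rewrite1, :synonym, allowed: [ALT_LABEL])
end

def ids(rows)
  rows.map { |r| r[:id] }.uniq
end
```

In this model, `ids(filters_outside)` contains only the class that has both a prefLabel and a synonym, while `ids(filters_inside)` contains all three classes, with the missing attributes left as `nil`.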
What you need to do is make sure that the FILTERs are applied only inside each OPTIONAL. For example, this query will do what you want:
SELECT DISTINCT ?id ?prefLabel ?synonym FROM <http://data.bioontology.org/ontologies/CSV_TEST_BRO/submissions/1> WHERE { ?id a <http://www.w3.org/2002/07/owl#Class> .
OPTIONAL { ?id ?rewrite0 ?prefLabel .
FILTER(?rewrite0 = <http://data.bioontology.org/metadata/def/prefLabel> ||
?rewrite0 = <http://www.w3.org/2004/02/skos/core#prefLabel>) }
OPTIONAL { ?id ?rewrite1 ?synonym .
FILTER(?rewrite1 = <http://www.geneontology.org/formats/oboInOwl#hasExactSynonym> ||
## ## -- snip -- ##
?rewrite1 = <http://www.w3.org/2004/02/skos/core#altLabel>) }
FILTER(?id = <http://bioontology.org/ontologies/Activity.owl#Activity> ||
## ## -- snip -- ##
?id = <http://bioontology.org/ontologies/BiomedicalResourceOntology.owl#Biomedical_Supply_Resource>) }
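For illustration, a query of this shape could be assembled along the following lines. This is a minimal sketch and not the actual Goo or sparql-client code: `optional_with_filter` and `build_class_query` are hypothetical names, and the FILTER on ?id is left out for brevity.

```ruby
# Hypothetical helper (not part of Goo or sparql-client): builds one OPTIONAL
# group whose predicate FILTER lives inside the group's braces.
def optional_with_filter(pred_var, value_var, predicates)
  conditions = predicates.map { |uri| "?#{pred_var} = <#{uri}>" }.join(" || ")
  "OPTIONAL { ?id ?#{pred_var} ?#{value_var} . FILTER(#{conditions}) }"
end

# Hypothetical builder: `attributes` maps an attribute name to the list of
# predicate URIs that may supply it, e.g. { prefLabel: [...], synonym: [...] }.
def build_class_query(graph, attributes)
  optionals = attributes.each_with_index.map do |(name, predicates), i|
    optional_with_filter("rewrite#{i}", name, predicates)
  end
  selected = attributes.keys.map { |name| "?#{name}" }.join(" ")
  <<~SPARQL
    SELECT DISTINCT ?id #{selected}
    FROM <#{graph}>
    WHERE {
      ?id a <http://www.w3.org/2002/07/owl#Class> .
      #{optionals.join("\n  ")}
    }
  SPARQL
end

puts build_class_query(
  "http://data.bioontology.org/ontologies/CSV_TEST_BRO/submissions/1",
  prefLabel: ["http://data.bioontology.org/metadata/def/prefLabel",
              "http://www.w3.org/2004/02/skos/core#prefLabel"],
  synonym:   ["http://www.w3.org/2004/02/skos/core#altLabel"]
)
```

Keeping each FILTER in the same string as its OPTIONAL makes it hard to emit one without the other, which is the property the corrected query relies on.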
This prompted changes in both the Goo and sparql-client projects. Extensive testing for backward compatibility is required. See: https://github.com/ncbo/goo/commit/8e88ac4bf79a66f1c1cdd66101e2e0070b547342#diff-0ce3c3d4c71d49a8d57dd6864ef8ca4f and https://github.com/ncbo/sparql-client/compare/master...ncbo:allegrograph_testing#diff-372c8098811915fcf8c2ac7020553f8d
One of our most heavily used API calls that returns paged data yields incorrect results in AllegroGraph:
The first call results in the following SPARQL query:
The second call results in this SPARQL query:
In 4store, both of these queries return an identical number of rows; they differ only in the attributes selected for each record. In SQL terms, that would equate to a LEFT OUTER JOIN query.
Unfortunately, AllegroGraph is treating this as an INNER JOIN query, with the results varying depending on which OPTIONAL attributes are selected. In the second case, it only selects classes that contain both prefLabel(s) and synonym(s). I am not sure which back end is at fault here, but the presence of the OPTIONAL construct tells me that perhaps 4store is doing the right thing.