eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/
BSD 3-Clause "New" or "Revised" License

redundant output with CONSTRUCT query #515

Closed abrokenjester closed 7 years ago

abrokenjester commented 8 years ago

(Migrated from https://openrdf.atlassian.net/browse/SES-925)

OUTPUT: 5 times same info

... werkgever</skos:prefLabel> </rdf:Description>
.... werkgever</skos:prefLabel> </rdf:Description>
... werkgever</skos:prefLabel> </rdf:Description>
... werkgever</skos:prefLabel> </rdf:Description>
... werkgever</skos:prefLabel> </rdf:Description>
</rdf:RDF>

CONSTRUCT query used:

query=PREFIX ...
CONSTRUCT {
  <http://standaarden.overheid.nl/vac/terms/Doelgroep> ?p ?v.
  ?l1 a <http://standaarden.overheid.nl/vac/terms/Doelgroep>.
  ?l2 skos:inScheme <http://standaarden.overheid.nl/vac/terms/Doelgroep>.
  ?l3 overheid:inScheme <http://standaarden.overheid.nl/vac/terms/Doelgroep>.
  ?l1 skos:prefLabel ?label.
  ?l2 skos:prefLabel ?label.
  ?l3 skos:prefLabel ?label.
  ?l1 overheid:startDate ?start.
  ?l1 overheid:endDate ?end.
  <http://standaarden.overheid.nl/vac/terms/Doelgroep> rdfs:isDefinedBy <http://standaarden.overheid.nl/vac/terms/Doelgroep.rdf> .
  <http://standaarden.overheid.nl/vac/terms/Doelgroep> rdfs:isDefinedBy <http://standaarden.overheid.nl/vac/terms/Doelgroep.n3> .
  <http://standaarden.overheid.nl/vac/terms/Doelgroep> <http://xmlns.com/foaf/0.1/page> <http://standaarden.overheid.nl/vac/terms/Doelgroep.html> .
  <http://standaarden.overheid.nl/vac/terms/Doelgroep> dcterms:isPartOf <http://standaarden.overheid.nl/owms/terms/OWMSdataset> .
  <http://standaarden.overheid.nl/vac/terms/Doelgroep.rdf> dcterms:publisher overheid:ICTU .
  <http://standaarden.overheid.nl/vac/terms/Doelgroep.rdf> dcterms:rights <http://en.wikipedia.org/wiki/WP:GFDL>.
}
WHERE {
  <http://standaarden.overheid.nl/vac/terms/Doelgroep> ?p ?v.
  OPTIONAL {
    ?l1 a <http://standaarden.overheid.nl/vac/terms/Doelgroep>.
    ?l1 skos:prefLabel ?label.
    OPTIONAL { ?l1 overheid:startDate ?start. }
    OPTIONAL { ?l1 overheid:endDate ?end. }
  }
  OPTIONAL {
    ?l2 skos:inScheme <http://standaarden.overheid.nl/vac/terms/Doelgroep>.
    ?l2 skos:prefLabel ?label.
  }
  OPTIONAL {
    ?l3 overheid:inScheme <http://standaarden.overheid.nl/vac/terms/Doelgroep>.
    ?l3 skos:prefLabel ?label.
  }
}

pulquero commented 8 years ago

Won't fix: CONSTRUCT doesn't dedup.

jakubklimek commented 7 years ago

@pulquero Based on the discussion in https://github.com/eclipse/rdf4j/issues/857 and the SPARQL specification, it seems that CONSTRUCT SHOULD deduplicate. Therefore, I vote for reopening this issue.

pulquero commented 7 years ago

That conclusion was based on the outcome of another Jira issue, let me find it...

pulquero commented 7 years ago

OK, I can't find it, but the discussion following it went along these lines: it would be a performance hit if CONSTRUCT had to dedup, and what needs to be deduplicated is the serialization/materialization of the CONSTRUCT result, not the output from the API. Basically, the spec defines how you can 'see' the result, and that is the only way you can tell whether it is deduplicated or not. So the result must be deduplicated at that point, but that doesn't mean it needs to be deduplicated any earlier.

Even if we do dedup it, there is always going to be a point in the code where it isn't deduplicated. The issue is that the contract of the API doesn't meet the spec's requirements for a serialization of RDF, and I don't think it was ever intended to.
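To make the distinction concrete, here is a minimal sketch in plain Java, deliberately not using the RDF4J API. It instantiates a one-pattern template once per solution, which is one plausible way repeated triples like these arise, and then takes the set union only at the point where the result is materialized. `ConstructSketch`, `Triple`, `instantiate`, `asGraph` and the sample bindings are all hypothetical names and data made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a CONSTRUCT evaluation; not RDF4J code.
final class ConstructSketch {
    /** A triple as plain strings; records get value-based equals/hashCode. */
    record Triple(String s, String p, String o) {}

    /** Instantiates the template "?l skos:prefLabel ?label" once per solution row. */
    static List<Triple> instantiate(List<Map<String, String>> solutions) {
        List<Triple> out = new ArrayList<>();
        for (Map<String, String> row : solutions) {
            out.add(new Triple(row.get("l"), "skos:prefLabel", row.get("label")));
        }
        return out;
    }

    /** Set union of the instantiated triples, keeping first-seen order. */
    static Set<Triple> asGraph(List<Triple> triples) {
        return new LinkedHashSet<>(triples);
    }

    public static void main(String[] args) {
        // Made-up bindings: two (?p) matches joined with two labelled resources
        // give four rows, but the template only uses ?l and ?label.
        List<Map<String, String>> solutions = List.of(
            Map.of("p", "rdf:type", "l", "ex:a", "label", "werkgever"),
            Map.of("p", "rdfs:label", "l", "ex:a", "label", "werkgever"),
            Map.of("p", "rdf:type", "l", "ex:b", "label", "burger"),
            Map.of("p", "rdfs:label", "l", "ex:b", "label", "burger"));
        List<Triple> raw = instantiate(solutions);
        System.out.println(raw.size() + " emitted, " + asGraph(raw).size() + " distinct");
    }
}
```

The raw list is what a streaming API would hand out; the set is what a reader of the serialized graph would be entitled to see. In RDF4J itself, collecting the result into a `Model` (for example via `QueryResults.asModel`) should give comparable set semantics, but that is worth checking against the version in use.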

jakubklimek commented 7 years ago

Sure, dedup always means a performance penalty, but I still think the behaviour should either be clearly specified, along with some kind of helper that does the deduplication, or CONSTRUCT should deduplicate by itself.
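Such a helper could be as small as a wrapper that filters repeats while the result is still being streamed. The following is only a sketch under assumptions: `DedupIterator` is a hypothetical name, it is written against `java.util.Iterator` rather than any RDF4J iteration type, and it relies on the elements having a value-based `equals`/`hashCode`.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical helper: lazily drops elements that were already returned.
final class DedupIterator<T> implements Iterator<T> {
    private final Iterator<? extends T> source;
    // Every distinct element is remembered, so memory grows with the result size.
    private final Set<T> seen = new HashSet<>();
    private T next;
    private boolean hasNext;

    DedupIterator(Iterator<? extends T> source) {
        this.source = source;
        advance();
    }

    /** Moves to the next element not seen before, if there is one. */
    private void advance() {
        hasNext = false;
        next = null;
        while (source.hasNext()) {
            T candidate = source.next();
            if (seen.add(candidate)) { // add() returns false for a repeat
                next = candidate;
                hasNext = true;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public T next() {
        if (!hasNext) {
            throw new NoSuchElementException();
        }
        T result = next;
        advance();
        return result;
    }
}
```

The `seen` set is where the performance penalty shows up: the wrapper keeps streaming, but it has to hold every distinct statement in memory until the iteration ends.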

Nevertheless, if I run this experiment in the RDF4J Workbench, I get the same result, i.e. 4 results, 2 and 2 duplicated, which should definitely not be happening.