Open aaj3f opened 5 months ago
I don't think this is a bug but intended behavior. When `group-by` is used in a query, the select expression selects from the set of groups of results, not the individual results within each group. If two groups of solutions are different, then `select-distinct` will treat them as different and include both, irrespective of the repetition within each group.
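To make the distinction concrete, here is a minimal Python sketch of the semantics described above. It is not Fluree's implementation: the sample solutions and the helper names (`group_by_type`, `select_distinct`, `distinct_within_groups`) are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical (type, predicate) solutions produced by a where clause.
solutions = [
    ("ex:Person", "schema:name"),
    ("ex:Person", "schema:name"),
    ("ex:Person", "schema:age"),
    ("ex:Org", "schema:name"),
    ("ex:Org", "schema:name"),
]

def group_by_type(rows):
    """Collect every predicate under its type, keeping repeats."""
    groups = defaultdict(list)
    for type_, pred in rows:
        groups[type_].append(pred)
    return list(groups.items())

def select_distinct(grouped):
    """Drop result rows that are identical as a whole; never looks inside a group."""
    seen, out = set(), []
    for type_, preds in grouped:
        key = (type_, tuple(preds))
        if key not in seen:
            seen.add(key)
            out.append((type_, preds))
    return out

def distinct_within_groups(grouped):
    """Remove repeated values inside each group, preserving first-seen order."""
    return [(type_, list(dict.fromkeys(preds))) for type_, preds in grouped]

grouped = group_by_type(solutions)
row_distinct = select_distinct(grouped)           # repeats inside groups survive
value_distinct = distinct_within_groups(grouped)  # what the issue is asking for
```

In this sketch `row_distinct` still contains `schema:name` twice under `ex:Person`, because the two result rows differ from each other as wholes, while `value_distinct` holds each predicate once per type.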
I think we'd need to add a new `distinct` aggregate function modifier to support this behavior, because changing the behavior of `select-distinct` would break the model of how it should work in other places. We already support a similar behavior with `count-distinct`, so we'd need to generalize this to allow `distinct` to act as its own aggregate function. Then, this query would look like this:
{
  "@context": {
    "f": "https://ns.flur.ee/ledger#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sh": "http://www.w3.org/ns/shacl#",
    "skos": "http://www.w3.org/2008/05/skos#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.org/",
    "schema": "http://schema.org/"
  },
  "from": "fluree-jld/369435906933189",
  "where": {
    "@id": "?id",
    "@type": "?type",
    "?p": "?o"
  },
  "select": [
    "?type",
    "(distinct ?p)"
  ],
  "groupBy": "?type"
}
The SPARQL spec describes a similar mechanism where `DISTINCT` is used as a modifier of the input of any aggregate function, including `GROUP_CONCAT`, which is the "default" aggregation in FlureeQL if no other aggregation is specified for a group.
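As a rough illustration of `DISTINCT` acting as a modifier on an aggregate's input, the following Python sketch deduplicates a group's values before handing them to whichever aggregate function is chosen. The `aggregate` and `group_concat` helpers are hypothetical names for this example and do not correspond to any Fluree or SPARQL engine API.

```python
def aggregate(values, fn, distinct=False):
    """Apply an aggregate function to a group's values.

    When distinct is True, duplicates are removed from the input first
    (first-seen order is kept), mirroring how a DISTINCT modifier
    filters what the aggregate receives.
    """
    if distinct:
        values = list(dict.fromkeys(values))
    return fn(values)

def group_concat(values, separator=" "):
    """Join values into one string, in the spirit of SPARQL's GROUP_CONCAT."""
    return separator.join(values)

# Hypothetical values bound to ?p within a single group.
preds = ["schema:name", "schema:name", "schema:age"]

plain_count = aggregate(preds, len)                          # counts repeats
distinct_count = aggregate(preds, len, distinct=True)        # like count-distinct
distinct_concat = aggregate(preds, group_concat, distinct=True)
```

The point of the design is that the deduplication step is independent of the aggregate itself, so the same modifier generalizes from `count-distinct` to any other aggregation.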
Description
When using `selectDistinct` with `groupBy`, the grouped items are not unique sets.
For example, without `groupBy`, this query returns unique values for `?p`
But when using `groupBy`, this query returns 2+ duplicate values for `?p`
Steps to Reproduce
Submit this /create txn
Issue this query
Notice duplicates like the following in the query results: