ftsrg / ingraph

Incremental view maintenance for openCypher graph queries.
http://docs.inf.mit.bme.hu/ingraph/
Eclipse Public License 1.0

How to calculate and produce count of empty aggregations #318

Open szarnyasg opened 6 years ago

szarnyasg commented 6 years ago

Handling aggregations incrementally is tricky. For example, the simple query of

MATCH (n)
RETURN count(n)

should return a single row (containing 0) for an empty database. So the philosophical question is: for an empty aggregation set, do we return a 0 or nothing?

The issue can be demonstrated with a PostgreSQL console:

postgres=#
  select count(a)
  from (select 1 as a) as subq
  where a = 2;

 count 
-------
     0
(1 row)

However, if we group by a, the query returns no rows at all:

postgres=#
  select count(a)
  from (select 1 as a) as subq
  where a = 2
  group by a;

 count 
-------
(0 rows)

Or, if you think grouping by a is ugly, we can group by b instead:

postgres=#
  select count(a)
  from (select 1 as a, 1 as b) as subq
  where a = 2
  group by b;

 count 
-------
(0 rows)

The issue of handling nulls for OPTIONAL MATCH clauses is also related.
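
The three console sessions above can be replayed without a PostgreSQL server. The following is a minimal sketch that assumes SQLite (through Python's standard sqlite3 module) behaves like PostgreSQL for these particular queries; the helper name run is made up for illustration.

```python
import sqlite3


def run(sql):
    """Run one query against a fresh in-memory SQLite database and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# No GROUP BY: the whole (empty) input forms a single group.
no_group_by = run(
    "select count(a) from (select 1 as a) as subq where a = 2"
)

# GROUP BY a: there are no input rows, hence no groups and no output rows.
group_by_a = run(
    "select count(a) from (select 1 as a) as subq where a = 2 group by a"
)

# GROUP BY b: same outcome, grouping on a different column.
group_by_b = run(
    "select count(a) from (select 1 as a, 1 as b) as subq where a = 2 group by b"
)

print(no_group_by)  # a single row holding the count 0
print(group_by_a)   # no rows
print(group_by_b)   # no rows
```

So the difference is not the count itself but whether a group exists at all: without GROUP BY there is always exactly one group, even over empty input.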

szarnyasg commented 6 years ago

It's worth checking the Postgres docs:

If a query contains aggregate function calls, but no GROUP BY clause, grouping still occurs: the result is a single group row (or perhaps no rows at all, if the single row is then eliminated by HAVING). The same is true if it contains a HAVING clause, even without any aggregate function calls or GROUP BY clause.

Source: end of Section 7.2.3, https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-GROUP

via @jmarton

szarnyasg commented 6 years ago

Related literature (thanks to @bergmanngabor): https://dl.acm.org/citation.cfm?id=137852

szarnyasg commented 6 years ago

This causes BI Q7 to break.

A simplified version shows the issue:

MATCH (message2:Message)
OPTIONAL MATCH (message2:Message)<-[like:LIKES]-(p3:Person)
RETURN message2.id AS m, count(like) AS likes

ingraph results: List(ArrayBuffer((likes,1), (m,44)), ArrayBuffer((likes,1), (m,88)))
neo4j results: List(ArrayBuffer((likes,1), (m,44)), ArrayBuffer((likes,1), (m,88)), ArrayBuffer((likes,0), (m,99)))
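
To see why the neo4j result contains the row for message 99, the query can be modelled as a left outer join followed by a grouped count that ignores nulls. This is only a sketch of the expected semantics, not of either engine's implementation. The message ids come from the results above; the person ids and the helper names optional_match and count_likes are invented for illustration.

```python
messages = [44, 88, 99]
likes = [(1, 44), (2, 88)]  # (person id, message id); hypothetical sample data


def optional_match(messages, likes):
    """Left outer join: each message yields at least one row; like is None if unmatched."""
    rows = []
    for m in messages:
        matched = [(m, like) for like in likes if like[1] == m]
        rows.extend(matched or [(m, None)])
    return rows


def count_likes(rows):
    """Group by message id and count only non-null likes, as count(like) does."""
    counts = {}
    for m, like in rows:
        counts.setdefault(m, 0)  # the group exists even if every like is null
        if like is not None:
            counts[m] += 1
    return counts


result = count_likes(optional_match(messages, likes))
print(result)  # {44: 1, 88: 1, 99: 0}
```

Message 99 forms a group because OPTIONAL MATCH keeps its row with a null like, and count(like) over that group is 0 rather than missing.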

szarnyasg commented 6 years ago

The problem can be pinpointed to the following condition in AggregationNodes:

if (oldValues != newValues)

Of course, this line is there for a reason...
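
The thread does not show the code surrounding that condition, so the following is only a guess at how a change-only check could drop the zero-count group. CountNode and its emit_new_groups flag are hypothetical and are not ingraph's API. The toy node keeps a count per group and forwards a value downstream only when it differs from the previous one. A group whose first tuple carries a null goes from an implicit 0 to 0, so nothing is emitted.

```python
class CountNode:
    """Toy incremental count(value) grouped by key; not ingraph's actual code."""

    def __init__(self, emit_new_groups):
        self.emit_new_groups = emit_new_groups
        self.counts = {}  # group key -> current count
        self.output = {}  # what a downstream consumer has been told so far

    def insert(self, key, value):
        is_new_group = key not in self.counts
        old = self.counts.get(key, 0)
        new = old + (0 if value is None else 1)  # count() ignores nulls
        self.counts[key] = new
        # Emit on change; optionally also when a group appears for the first time.
        if old != new or (self.emit_new_groups and is_new_group):
            self.output[key] = new


rows = [(44, "like-a"), (88, "like-b"), (99, None)]  # hypothetical input tuples

change_only = CountNode(emit_new_groups=False)
with_new_groups = CountNode(emit_new_groups=True)
for key, value in rows:
    change_only.insert(key, value)
    with_new_groups.insert(key, value)

print(change_only.output)      # {44: 1, 88: 1}
print(with_new_groups.output)  # {44: 1, 88: 1, 99: 0}
```

In this sketch the change-only check is what suppresses redundant updates, which is presumably why such a condition exists. Treating the first appearance of a group as a change is one possible way to keep that benefit and still produce the zero row.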

szarnyasg commented 5 years ago

Currently, this breaks BI Q21.