apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0
5.24k stars 1.23k forks source link

Multistage engine and multivalue column semantics incompatibilities #12306

Open gyorfimi opened 5 months ago

gyorfimi commented 5 months ago

Multistage engine and multivalue column semantics incompatibilities

In Pinot filtering on a multivalue column with = operator actually a contains operator, aggregation to a multivalue column is actually an unnest operation.

It works well in 1.0 engine even with filtering the result (multivalue column) in the HAVING clause.

However, in 2.0 engine, this irregular behavior is not compatible with relation algebra semantics. That's why one should use ArrayToMv function, which tries to break the border between "normal" field semantics and multivalue field semantics.

In most cases it works well, but there are some cases when it doesn't work as expected.

Example:

select id, mvc from (select id, arraytomv(mv) as mvc from mytable group by id, mvc) embd where mvc = 'a'

(assuming that mvc is a multivalue column)

If there is a row in mytable with mvc containing a and b, the result will contain two rows with the same id and mvc = a and b respectively (even though the where clause is applied).

The main cause of this behavior is that relation algebra transformation pushes down the where clause to the inner query like this: select id, mvc from (select id, arraytomv(mv) as mvc from mytable where arraytomv(mv) = 'a' group by id, mvc)

In normal field semantics, this query will return only one row with mvc = a. However, in multivalue field semantics, it will return two rows with mvc = a and b respectively.

So a complete rework of multivalue operators is needed. Multivalue is actually a set, not a value, so for example an UNNEST(...) operator is suggested (Calcite supports it), and for filter mv = SOME(...) is suggested (Calcie supports it too).

For fast fix of this issue, it should be prevented to push down filter on the multivalue-column group by query to the inner query.

walterddr commented 5 months ago

could you attach the explain plan result for this query.

This is actually a problem b/c the mv column is not exactly semantically "relational algebra" it is a short-handed representation of saying "it should behave as an array (with some extra op) when xxx and should behave as a set (with some extra op) when yyy"

so there's really no convention or compatibility fix.

for this specific query

select id, mvc from (
  select id, arraytomv(mv) as mvc from mytable group by id, mvc) embd 
where mvc = 'a'

the where clause is not because it is not suppose to be pushed into the inner query, but the question on whether we should applied it as a filter before/after the group-by.

agree we definitely need some redesign to make sure this behavior is clear. even if it means some of the existing syntax wont make sense. for example: in this case, UNNEST must be explicitly specify and mvc should be treated as normal ARRAY type. then the confusion should go away