Closed saurabhlambe closed 2 months ago
Can this be replicated in any quickstart? Can you also specify the Pinot version where the error is present and the types and indexes involved in the query/table?
It seems the issue was a bit more complex than expected. There is a DISTINCT_COUNT_HLL star-tree index on that column, and its pre-aggregation was done with the default log2m. When a query touches one segment whose HLL is read from the star-tree index and another whose HLL is computed at runtime with a different log2m, the two sketches cannot be merged and the error is thrown.
I think Pinot should detect the discrepancy in the arguments and not use the star-tree index in this case. What do you think, @Jackie-Jiang?
Yes. This requires bigger changes. Basically, the star-tree should store the extra arguments in its metadata and match the whole aggregation. Currently it only stores the main column and aggregation type, which causes this problem.
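To illustrate why mixing the two sources fails, here is a minimal sketch. `ToyHll` is a hypothetical stand-in, not Pinot's or any library's HLL class; it models only the fact that a sketch built with a given log2m has 2^log2m registers, and that merging is a per-register maximum. The value 8 for the pre-aggregated sketch is an assumption based on the report that log2m = 8 works.

```java
// Hypothetical toy model of an HLL sketch: only the register layout matters here.
final class ToyHll {
    final int log2m;
    final byte[] registers;

    ToyHll(int log2m) {
        this.log2m = log2m;
        this.registers = new byte[1 << log2m]; // 2^log2m registers
    }

    // Merging takes the per-register maximum, which is only defined
    // when both sketches have the same number of registers.
    ToyHll merge(ToyHll other) {
        if (other.log2m != this.log2m) {
            throw new IllegalArgumentException(
                "cannot merge sketches with log2m " + log2m + " and " + other.log2m);
        }
        ToyHll out = new ToyHll(log2m);
        for (int i = 0; i < registers.length; i++) {
            out.registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        ToyHll fromStarTree = new ToyHll(8); // assumed: pre-aggregated with the default
        ToyHll atRuntime = new ToyHll(12);   // computed from the query's log2m argument
        try {
            fromStarTree.merge(atRuntime);
        } catch (IllegalArgumentException e) {
            System.out.println("merge failed: " + e.getMessage());
        }
    }
}
```

Two sketches with the same log2m merge fine in this model; only the mixed case throws, which matches the behavior described above.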
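A rough sketch of the idea, not the actual change: `AggSpec`, `matchesLoosely`, and `matchesFully` are hypothetical names invented for illustration and do not correspond to Pinot classes. The loose check compares only function and column, as described above; the full check also compares log2m, so the star-tree is skipped when the query asks for a different value.

```java
// Hypothetical illustration of matching a query aggregation against
// what a star-tree index has pre-aggregated. Not Pinot's real API.
final class StarTreeMatch {
    // One aggregation: function type, column, and the extra log2m argument.
    static final class AggSpec {
        final String function;
        final String column;
        final int log2m;

        AggSpec(String function, String column, int log2m) {
            this.function = function;
            this.column = column;
            this.log2m = log2m;
        }
    }

    // Behavior described in the thread: only function and column are compared.
    static boolean matchesLoosely(AggSpec stored, AggSpec query) {
        return stored.function.equals(query.function)
            && stored.column.equals(query.column);
    }

    // Suggested behavior: the extra argument has to match as well.
    static boolean matchesFully(AggSpec stored, AggSpec query) {
        return matchesLoosely(stored, query) && stored.log2m == query.log2m;
    }

    public static void main(String[] args) {
        AggSpec stored = new AggSpec("DISTINCTCOUNTHLL", "col", 8);
        AggSpec query = new AggSpec("DISTINCTCOUNTHLL", "col", 12);
        System.out.println("loose match: " + matchesLoosely(stored, query));
        System.out.println("full match:  " + matchesFully(stored, query));
    }
}
```

With the full check, a query using a non-default log2m would fall back to computing the HLL at runtime for every segment, so all partial results share the same register count.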
Will be fixed by https://github.com/apache/pinot/pull/13835.
Example query:
As per the Pinot docs, the DISTINCTCOUNTHLL function takes 2 arguments, with log2m being optional. When log2m is 8, the query runs correctly; with a different value, it throws the following error: