apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Increase DEFAULT_VALUE_PRUNER_IN_PREDICATE_THRESHOLD or derive the value with some simple policy #7913

Open mqliang opened 2 years ago

mqliang commented 2 years ago

For IN predicates, ColumnValueSegmentPruner prunes segments based on the column's min/max values and its Bloom Filter.

However, we skip pruning when there are too many values in the IN predicate; currently we skip it when the IN predicate has more than 10 items: https://github.com/apache/pinot/blob/12edbdb1e521e0d9ee35a2d6ef2fed172e5beaa1/pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java#L287

The value 10 here is quite empirical. IIUC, the rationale is that the default FPP (false positive probability) of the Bloom Filter is 0.05: https://github.com/apache/pinot/blob/7e9ca6a5a4afe0d4e283ac1307c45430e474cbf2/pinot-spi/src/main/java/org/apache/pinot/spi/config/table/BloomFilterConfig.java#L28

This means that even if a segment does not contain a given value, the Bloom Filter has a 5% probability of falsely answering that the segment contains it.

1 - 0.95 ^ 10 ≈ 0.40

So if the IN predicate has 10 values, even if a segment does NOT contain any of the 10 values, the Bloom Filter has a ~40% probability of falsely answering that the segment can NOT be pruned. Let's say we have a query set:

Say we have a table that does not contain any of the 10 values in the IN predicate:

I think that is why the Bloom Filter does not help much (in terms of decreasing query latency) when the IN predicate has more than 10 values.
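The arithmetic above can be sketched as a standalone snippet (illustrative only, not Pinot code; the 0.05 comes from the default FPP linked above):

```java
public class BloomPruneMath {
    // Probability that a Bloom Filter with false-positive rate fpp reports
    // "maybe present" for at least one of n values that are all absent,
    // i.e. the segment survives pruning even though it contains none of
    // the n IN-clause values.
    static double pruneFailureProbability(double fpp, int n) {
        return 1.0 - Math.pow(1.0 - fpp, n);
    }

    public static void main(String[] args) {
        double fpp = 0.05; // Pinot's default Bloom Filter FPP
        for (int n : new int[]{1, 5, 10, 20, 90}) {
            System.out.printf("n=%2d -> %.2f%n", n, pruneFailureProbability(fpp, n));
        }
        // n=10 yields ~0.40: a 40% chance the segment is NOT pruned even
        // though none of the 10 values are present.
    }
}
```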

But if we look at this problem from a system-resource-usage point of view: a Pinot segment is usually 100MB to 500MB, and loading a segment from disk into memory and doing a full segment scan is much, much more expensive than consulting the Bloom Filter (thousands, even tens of thousands of times more expensive, IMO).

For example: if we assume that loading a segment from disk into memory and doing a full segment scan is 100 times more expensive than consulting the Bloom Filter, we should set DEFAULT_VALUE_PRUNER_IN_PREDICATE_THRESHOLD to 90 (1 - 0.95 ^ 90 ≈ 0.99). Let me elaborate:

In conclusion, increasing DEFAULT_VALUE_PRUNER_IN_PREDICATE_THRESHOLD beyond 10 does not help much in terms of reducing query latency, but it would be very helpful for decreasing overall system resource usage.
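The break-even point above can be derived with a back-of-the-envelope cost model (a hypothetical sketch under the stated 100x assumption, not Pinot code):

```java
public class PrunerThresholdModel {
    // Hypothetical model: treat one Bloom Filter consultation as 1 cost unit
    // and a full segment load + scan as costRatio units. For an n-value IN
    // predicate none of whose values are in the segment, the prune succeeds
    // with probability (1 - fpp)^n. Pruning is worthwhile while the expected
    // saving (1 - fpp)^n * costRatio is at least the cost of consulting the
    // filter, i.e. while (1 - fpp)^n * costRatio >= 1.
    static int deriveThreshold(double fpp, double costRatio) {
        // Largest integer n satisfying (1 - fpp)^n * costRatio >= 1.
        return (int) Math.floor(Math.log(costRatio) / -Math.log(1.0 - fpp));
    }

    public static void main(String[] args) {
        // With fpp = 0.05 and a 100x scan-to-check cost ratio, the break-even
        // threshold lands at roughly 90, matching the figure above.
        System.out.println(deriveThreshold(0.05, 100.0));
    }
}
```

Under these assumptions the threshold scales with the logarithm of the cost ratio, so even a conservative estimate of the scan cost pushes it well past 10.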

In addition, MinMaxRange pruning does not have the false-positive problem that the Bloom Filter does, so perhaps we should always enable MinMaxRange pruning.

cc @mcvsubbu @siddharthteotia @GSharayu

Jackie-Jiang commented 2 years ago

The problem here is that:

- If a segment cannot be pruned, we pay the overhead of parsing all the values in the IN clause and applying each of them to the bloom filter (assuming comparing against the min/max values is very cheap).
- If a segment can be pruned, then even if we skip the pruning because there are too many values in the IN clause, the segment will still be quickly skipped after looking up the values in the dictionary and finding there is no match.

So I don't think the assumption of a full table scan holds here.
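The dictionary argument can be illustrated with a simplified sketch (assuming a dictionary-encoded column whose sorted dictionary supports binary search; not actual Pinot internals):

```java
import java.util.Arrays;

public class DictionarySkipSketch {
    // Simplified sketch: for a dictionary-encoded column, checking whether
    // any IN value exists in the segment is a binary search per value over
    // the sorted dictionary. A segment containing none of the values is
    // discovered (and skipped) without ever scanning the column data.
    static boolean segmentMayMatch(int[] sortedDictionary, int[] inValues) {
        for (int v : inValues) {
            if (Arrays.binarySearch(sortedDictionary, v) >= 0) {
                return true; // at least one value present; evaluate further
            }
        }
        return false; // no IN value in the dictionary: segment skipped cheaply
    }

    public static void main(String[] args) {
        int[] dictionary = {3, 7, 12, 25, 40}; // toy sorted dictionary
        System.out.println(segmentMayMatch(dictionary, new int[]{1, 2, 8})); // false
        System.out.println(segmentMayMatch(dictionary, new int[]{1, 25}));   // true
    }
}
```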