apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

support filter with predicate in BetaRowset #1652

Closed imay closed 4 years ago

imay commented 5 years ago

Support predicate filter in BetaRowset iterator.

It would be even better if we could also support lazy materialization.

gaodayue commented 5 years ago

If you mean filtering rows by a column's zonemap, that is already supported.

imay commented 5 years ago

@gaodayue

> If you mean filtering rows by a column's zonemap, that is already supported.

I mean the vectorized predicate filter. In AlphaRowset, this logic lives in ColumnData; we should support it in SegmentIterator so that data is filtered as early as possible.
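
To make the idea concrete, here is a minimal sketch of a vectorized predicate filter over one batch of column values. It is not the actual Doris code: `Int32Column`, `SelectionVector`, `GreaterThanPredicate` and `select_all` are hypothetical names made up for illustration. The predicate runs over the whole batch and compacts a selection vector, so later stages only touch surviving rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one batch of decoded values from a single column.
using Int32Column = std::vector<int32_t>;
// Selection vector: indices of the rows in the batch that are still alive.
using SelectionVector = std::vector<uint16_t>;

// Hypothetical predicate "column > threshold", evaluated over a whole batch.
struct GreaterThanPredicate {
    int32_t threshold;

    // Keeps only the selected rows whose value passes the predicate.
    // The selection vector is compacted in place.
    void evaluate(const Int32Column& column, SelectionVector* sel) const {
        size_t kept = 0;
        for (uint16_t row : *sel) {
            assert(row < column.size());
            if (column[row] > threshold) {
                (*sel)[kept++] = row;
            }
        }
        sel->resize(kept);
    }
};

// Builds the initial selection vector that selects every row in the batch.
SelectionVector select_all(size_t num_rows) {
    SelectionVector sel(num_rows);
    for (size_t i = 0; i < num_rows; ++i) {
        sel[i] = static_cast<uint16_t>(i);
    }
    return sel;
}
```

Several predicates can be chained by passing the same selection vector through each `evaluate` call in turn, so every predicate only looks at rows that survived the previous one.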

gaodayue commented 5 years ago

I see, thanks

kangpinghuang commented 5 years ago

I will work on this feature.

kangpinghuang commented 5 years ago

I will split this job into several steps:

  1. Move the original v1 predicate logic to segment v2 so that this feature is available quickly.
  2. Refactor the current column predicate framework. Column predicates currently operate on row blocks; I will change them to evaluate on column cells. This will be the basis for the following steps.
  3. Implement lazy materialization to optimize data reads.
  4. Push predicate evaluation down to the page decoder, which helps in some situations, e.g. dict-encoded pages.
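
As a rough illustration of the lazy materialization step, the sketch below reads the predicate column first and then decodes the other column only for the rows that matched. `ColumnReader`, `read_all`, `read_rows` and `lazy_scan` are hypothetical names for this example, not Doris APIs, and the in-memory `stored` vector only stands in for encoded pages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical column reader. It counts decoded values so that the saving
// from lazy materialization is visible.
template <typename T>
struct ColumnReader {
    std::vector<T> stored;      // stands in for encoded pages on disk
    size_t values_decoded = 0;  // number of values decoded so far

    // Decodes every row of the column.
    std::vector<T> read_all() {
        values_decoded += stored.size();
        return stored;
    }

    // Decodes only the rows with the given row ids.
    std::vector<T> read_rows(const std::vector<size_t>& rows) {
        std::vector<T> out;
        out.reserve(rows.size());
        for (size_t r : rows) {
            assert(r < stored.size());
            out.push_back(stored[r]);
        }
        values_decoded += rows.size();
        return out;
    }
};

// Step 1: read only the predicate column and collect the matching row ids.
// Step 2: materialize the other column for those row ids only.
std::vector<std::string> lazy_scan(ColumnReader<int32_t>& key_col,
                                   ColumnReader<std::string>& value_col,
                                   const std::function<bool(int32_t)>& pred) {
    std::vector<int32_t> keys = key_col.read_all();
    std::vector<size_t> matched;
    for (size_t i = 0; i < keys.size(); ++i) {
        if (pred(keys[i])) {
            matched.push_back(i);
        }
    }
    return value_col.read_rows(matched);
}
```

With a selective predicate, `value_col.values_decoded` stays far below the row count, which is the saving this step is after.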

Further work, to be done in the next phase, includes:

  1. Support more complex predicate evaluation in v2, such as OR/NOT predicates.
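
One possible shape for OR/NOT support is a small predicate tree that evaluates to a per-row mask, since OR cannot be expressed by only shrinking a selection vector. This is a sketch under that assumption; `Predicate`, `ComparePredicate`, `OrPredicate`, `NotPredicate` and `RowMask` are hypothetical names, not existing Doris classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

using Int32Column = std::vector<int32_t>;
using RowMask = std::vector<uint8_t>;  // 1 = row passes, 0 = row is filtered

// Hypothetical base class: a predicate produces one mask entry per row.
struct Predicate {
    virtual ~Predicate() = default;
    virtual RowMask evaluate(const Int32Column& column) const = 0;
};

// Leaf predicate: applies a per-value test to every row.
struct ComparePredicate : Predicate {
    std::function<bool(int32_t)> test;
    explicit ComparePredicate(std::function<bool(int32_t)> t) : test(std::move(t)) {}

    RowMask evaluate(const Int32Column& column) const override {
        RowMask mask(column.size());
        for (size_t i = 0; i < column.size(); ++i) {
            mask[i] = test(column[i]) ? 1 : 0;
        }
        return mask;
    }
};

// OR: a row passes if it passes either child.
struct OrPredicate : Predicate {
    std::unique_ptr<Predicate> left, right;
    OrPredicate(std::unique_ptr<Predicate> l, std::unique_ptr<Predicate> r)
        : left(std::move(l)), right(std::move(r)) {}

    RowMask evaluate(const Int32Column& column) const override {
        RowMask a = left->evaluate(column);
        RowMask b = right->evaluate(column);
        assert(a.size() == b.size());
        for (size_t i = 0; i < a.size(); ++i) {
            a[i] |= b[i];
        }
        return a;
    }
};

// NOT: a row passes if it fails the child.
struct NotPredicate : Predicate {
    std::unique_ptr<Predicate> child;
    explicit NotPredicate(std::unique_ptr<Predicate> c) : child(std::move(c)) {}

    RowMask evaluate(const Int32Column& column) const override {
        RowMask mask = child->evaluate(column);
        for (uint8_t& bit : mask) {
            bit = bit ? 0 : 1;
        }
        return mask;
    }
};
```

Note that this sketch ignores NULL handling, which a real implementation of NOT would need to define.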

EmmyMiao87 commented 5 years ago

#1775