What changes were proposed in this pull request?
This PR adds a check command for inspecting the index of a specified column.
The filtering effect can be tested by specifying one of the following types.
check --type stat - Only use column statistics.
check --type bloom-filter - Only use bloom filter.
check --type predicate - Used in combination with column statistics and bloom filter.
Why are the changes needed?
ORC can generate bloom filter indexes for multiple specified columns, but it lacks a convenient tool to verify how effective those bloom filters are.
Parquet has a similar command: PARQUET-2138 (Add ShowBloomFilterCommand to parquet-cli).
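To illustrate what "verifying the effect" means, here is a minimal Python sketch of how the three check types could differ for a single row group. It is not ORC's implementation: the `BloomFilter` class, the `check` function, the hash scheme, and the sample data are all hypothetical, and it assumes that the predicate type skips a row group when either the statistics or the bloom filter rules the value out.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative bloom filter (hypothetical; not ORC's on-disk format)."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, value):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = True

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


def check(values, target, check_type):
    """Return True if a row group holding `values` must be read for `target`."""
    # Column statistics: only the min/max range is known.
    stat_ok = min(values) <= target <= max(values)
    # Bloom filter: built from every value in the row group.
    bloom = BloomFilter()
    for v in values:
        bloom.add(v)
    bloom_ok = bloom.might_contain(target)
    if check_type == "stat":
        return stat_ok
    if check_type == "bloom-filter":
        return bloom_ok
    if check_type == "predicate":
        # Assumption: both indexes must allow the value for the group to be read.
        return stat_ok and bloom_ok
    raise ValueError(f"unknown check type: {check_type}")


row_group = [10, 20, 30, 40]
print(check(row_group, 30, "predicate"))  # value is present, so the group must be read
print(check(row_group, 99, "stat"))       # outside [10, 40], so statistics can skip it
print(check(row_group, 25, "stat"))       # inside the range, so statistics alone cannot skip
```

The last call shows the case a bloom filter is meant to help with: a value that lies inside the min/max range but is not actually stored. Whether the bloom filter excludes it depends on the filter's size and hash collisions, which is the effect a check tool lets you observe.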
How was this patch tested?
Added unit tests.
Was this patch authored or co-authored using generative AI tooling?
No