Open asfimport opened 7 years ago
Daniel Lemire: Relevant blog post:
Deepak Majeti / @majetideepak: I looked at the code and the blog briefly. The current implementation works for dictionary indices that are bit-packed. It will have to be extended to support the RLE/bit-packed hybrid encoding currently used by parquet-cpp to encode dictionary index values. Encoding details here: https://github.com/apache/parquet-cpp/blob/master/src/parquet/util/rle-encoding.h#L33
I guess the RLE encoding of indices will further improve performance, since a run of identical indices needs only one dictionary lookup and does not require the costly gather instruction.
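To make the point about runs concrete, here is a minimal scalar sketch. It assumes the hybrid layout described in the Parquet format documentation: a ULEB128 run header whose low bit selects RLE (0) or bit-packed (1), an RLE value stored in ceil(bit_width / 8) little-endian bytes, and bit-packed runs made of groups of 8 values packed from the least significant bit. `DecodeDictionaryIndices` is a hypothetical name for illustration, not a parquet-cpp function, and the input is assumed to be well formed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not parquet-cpp API): decodes RLE/bit-packed hybrid
// dictionary indices and materializes the corresponding dictionary values.
// Assumes well-formed input and 0 <= bit_width <= 32.
template <typename T>
std::vector<T> DecodeDictionaryIndices(const std::vector<uint8_t>& buf,
                                       int bit_width,
                                       const std::vector<T>& dict,
                                       size_t num_values) {
  assert(bit_width >= 0 && bit_width <= 32);
  std::vector<T> out;
  out.reserve(num_values);
  size_t pos = 0;
  const size_t value_bytes = (static_cast<size_t>(bit_width) + 7) / 8;
  const uint64_t mask = (uint64_t{1} << bit_width) - 1;

  while (out.size() < num_values && pos < buf.size()) {
    // Read the ULEB128 run header.
    uint32_t header = 0;
    int shift = 0;
    while (pos < buf.size()) {
      const uint8_t b = buf[pos++];
      header |= static_cast<uint32_t>(b & 0x7F) << shift;
      if ((b & 0x80) == 0) break;
      shift += 7;
    }

    if ((header & 1) == 0) {
      // RLE run: one index repeated (header >> 1) times.
      const size_t run_len = header >> 1;
      uint32_t index = 0;
      for (size_t i = 0; i < value_bytes && pos < buf.size(); ++i) {
        index |= static_cast<uint32_t>(buf[pos++]) << (8 * i);
      }
      // Single dictionary lookup, then a plain fill: no gather needed.
      const T value = dict.at(index);
      const size_t n = std::min(run_len, num_values - out.size());
      out.insert(out.end(), n, value);
    } else {
      // Bit-packed run: (header >> 1) groups of 8 indices each.
      const size_t count = static_cast<size_t>(header >> 1) * 8;
      uint64_t bits = 0;
      int nbits = 0;
      for (size_t i = 0; i < count; ++i) {
        while (nbits < bit_width && pos < buf.size()) {
          bits |= static_cast<uint64_t>(buf[pos++]) << nbits;
          nbits += 8;
        }
        const uint32_t index = static_cast<uint32_t>(bits & mask);
        bits >>= bit_width;
        nbits -= bit_width;
        // One lookup per index: this is the part a SIMD decoder would
        // implement with a gather. Padding values past num_values are dropped.
        if (out.size() < num_values) out.push_back(dict.at(index));
      }
    }
  }
  return out;
}
```

For example, with `bit_width = 2` and the dictionary `{10, 20, 30, 40}`, the bytes `0x0A 0x02 0x03 0xE4 0xE4` encode an RLE run of five copies of index 2 followed by one bit-packed group `0,1,2,3,0,1,2,3`. Only the second run needs a lookup per value.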
See discussion in
https://github.com/apache/parquet-cpp/pull/140
and experiments from Daniel Lemire in
https://github.com/lemire/dictionary
Reporter: Wes McKinney / @wesm
Assignee: Deepak Majeti / @majetideepak
Note: This issue was originally created as PARQUET-684. Please see the migration documentation for further details.