Open asfimport opened 7 years ago
Uwe Korn / @xhochy:
We can add bzip2 to Parquet, but this will only change the compression; it won't have any effect on splittability. By design, Parquet files are always splittable, independent of the compression algorithm used. In particular, GZIP-compressed Parquet files are also splittable. In your case, it is probably easier to stick with GZIP than to implement bzip2 in Parquet.
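To illustrate why splittability does not depend on the codec, here is a minimal sketch in Python. It is a toy model, not the actual Parquet layout: the names write_blocks and read_block and the (offset, length) index are hypothetical stand-ins for the idea that Parquet compresses data in independent units and records where each one lives. It uses zlib (the DEFLATE algorithm that gzip is built on), a codec whose stream is not splittable on its own.

```python
import zlib


def write_blocks(rows, rows_per_block=1000):
    """Compress each block of rows independently and record its byte range."""
    payload = bytearray()
    index = []  # (offset, length) per block; a stand-in for footer metadata
    for start in range(0, len(rows), rows_per_block):
        raw = "\n".join(rows[start:start + rows_per_block]).encode("utf-8")
        compressed = zlib.compress(raw)
        index.append((len(payload), len(compressed)))
        payload += compressed
    return bytes(payload), index


def read_block(payload, index, block_no):
    """Decompress one block without touching any other block."""
    offset, length = index[block_no]
    raw = zlib.decompress(payload[offset:offset + length])
    return raw.decode("utf-8").split("\n")


rows = [f"row-{i}" for i in range(5000)]
payload, index = write_blocks(rows)

# A worker assigned only the third block can read it directly.
third_block = read_block(payload, index, 2)
```

Because every block is a complete compressed stream of its own, a reader can start at any recorded offset, so the work can be split across tasks whichever codec was used inside the blocks.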
Still, it would be nice to see whether bzip2 would improve performance over the currently implemented GZIP/Snappy/Brotli codecs.
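A rough way to get a first impression is to compare the codecs on sample data outside of Parquet. The sketch below uses only Python's standard zlib and bz2 modules, so it covers DEFLATE (as used by GZIP) and bzip2 but not Snappy or Brotli. The measure helper and the synthetic sample are assumptions made for illustration; numbers from real Parquet pages may differ considerably.

```python
import bz2
import time
import zlib


def measure(name, compress, data):
    """Return the compression ratio and wall-clock time for one codec."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start
    return {
        "codec": name,
        "ratio": len(data) / len(compressed),
        "seconds": elapsed,
    }


# Synthetic, repetitive CSV-like sample; replace with representative data.
sample = "\n".join(
    f"user-{i % 97},{i * 31 % 1009}" for i in range(50000)
).encode("utf-8")

results = [
    measure("gzip (zlib)", zlib.compress, sample),
    measure("bzip2", bz2.compress, sample),
]

for r in results:
    print(f"{r['codec']}: ratio={r['ratio']:.2f}, seconds={r['seconds']:.4f}")
```

This measures only compression of a single buffer; decompression speed, which usually matters more for query workloads, would need a separate timing.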
Hi,
I have a requirement to implement Parquet with bzip2 compression because it's splittable. Right now, we can't specify bzip2 in Pig.
SET parquet.compression none/gzip/SNAPPY;
Is there any way to compress with bzip2 on top of Parquet?
Reporter: Rajasekhar Konda
Note: This issue was originally created as PARQUET-1011. Please see the migration documentation for further details.