apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[C++][Parquet] Disable LZ4 codec #42871

Open asfimport opened 5 years ago

asfimport commented 5 years ago

As discussed in https://issues.apache.org/jira/browse/PARQUET-1241, parquet-cpp's LZ4 codec is not compatible with Hadoop and parquet-mr. We must disable the codec until the compatibility issues are resolved.

Reporter: Deepak Majeti / @majetideepak

Note: This issue was originally created as PARQUET-1515. Please see the migration documentation for further details.

asfimport commented 5 years ago

Antoine Pitrou / @pitrou: The irony is that fastparquet switched to LZ4 block compression to ensure compatibility with parquet-cpp:

https://github.com/dask/fastparquet/pull/315

asfimport commented 3 years ago

Antoine Pitrou / @pitrou: I don't think we're going to disable it soon, but we'll be implementing the new LZ4_RAW codec in PARQUET-1998.
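
For readers new to this issue: the thread does not spell out what the incompatibility is. One common understanding, stated here as an assumption rather than something confirmed above, is that Hadoop's LZ4 codec wraps each raw LZ4 block in an 8-byte prefix (big-endian decompressed size, then big-endian compressed size), whereas a raw block carries no such prefix. The sketch below only illustrates that framing. It treats the compressed block as opaque bytes and does not perform any LZ4 compression, and the helper names `hadoop_frame` and `split_hadoop_frames` are hypothetical, not part of parquet-cpp, Arrow, or Hadoop.

```python
import struct

# Assumed Hadoop-style prefix: two big-endian uint32 values,
# (decompressed size, compressed size). This layout is an assumption
# for illustration; it is not taken from the thread above.
_PREFIX = struct.Struct(">II")


def hadoop_frame(raw_block: bytes, decompressed_size: int) -> bytes:
    """Wrap an already-compressed raw LZ4 block in the assumed prefix."""
    return _PREFIX.pack(decompressed_size, len(raw_block)) + raw_block


def split_hadoop_frames(buf: bytes) -> list[tuple[int, bytes]]:
    """Split a buffer of concatenated frames into (decompressed_size, block) pairs."""
    frames = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < _PREFIX.size:
            raise ValueError("truncated frame prefix")
        decompressed_size, compressed_size = _PREFIX.unpack_from(buf, offset)
        offset += _PREFIX.size
        block = buf[offset:offset + compressed_size]
        if len(block) != compressed_size:
            raise ValueError("truncated frame body")
        frames.append((decompressed_size, block))
        offset += compressed_size
    return frames


if __name__ == "__main__":
    # Placeholder bytes standing in for a real compressed block.
    block = b"\x01\x02\x03"
    framed = hadoop_frame(block, decompressed_size=10)
    print(framed.hex())
    print(split_hadoop_frames(framed))
```

Under this assumption, a reader expecting the prefixed layout would misinterpret the first 8 bytes of a raw block as sizes, and a reader expecting a raw block would try to decode the prefix as LZ4 data, which would explain why the two writers cannot read each other's pages.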