Open agrawalreetika opened 2 months ago
Velox uses the Arrow Parquet Writer. I see that there is an option to specify V1 (see https://github.com/apache/arrow/blob/main/cpp/src/parquet/properties.h). Let's add it to Velox. Can you point me to a test for V1 vs V2?
Fix in progress - https://github.com/facebookincubator/velox/pull/9700
Native worker not writing Parquet data files for WriterVersion v1 (PARQUET_1_0)
Your Environment
Expected Behavior
When
set session hive.parquet_writer_version='PARQUET_1_0';
Parquet data should be written in format_version 1
Current Behavior
Even when setting
set session hive.parquet_writer_version='PARQUET_1_0';
Parquet data is written in format_version: 2.6
Possible Solution
Steps to Reproduce
Sample Output of Parquet File -
Screenshots (if appropriate)
Context
It looks like the session property parquet_writer_version is not honored in Prestissimo. The same setting works fine with the Java Parquet writer.
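For anyone who wants to check which version a written data file carries, here is a minimal sketch in Python using only the standard library. It is not part of Presto or Velox, and the function name read_parquet_format_version is made up for this example. It assumes an unencrypted file whose footer starts with the Thrift compact encoding of FileMetaData.version (field 1, i32), which is how writers normally serialize it; if that assumption does not hold, it raises instead of guessing. The integer it returns is the raw footer value: 1 for PARQUET_1_0 and 2 for any 2.x writer, which is presumably what tools report as 2.6.

```python
import struct

MAGIC = b"PAR1"  # magic bytes at both ends of an unencrypted Parquet file


def read_parquet_format_version(data: bytes) -> int:
    """Return the raw integer `version` stored in a Parquet footer.

    Hypothetical helper for illustration only. Assumes `version` is the
    first field serialized in the Thrift-compact FileMetaData struct.
    """
    if len(data) < 12 or data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("not an unencrypted Parquet file")

    # Layout of the tail: <footer bytes> <4-byte LE footer length> "PAR1"
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    start = len(data) - 8 - footer_len
    if start < 4:
        raise ValueError("footer length exceeds file size")
    footer = data[start:len(data) - 8]

    # Thrift compact field header: (field-id delta 1 << 4) | type i32 (5)
    if not footer or footer[0] != 0x15:
        raise ValueError("expected FileMetaData.version as the first field")

    # The value is a zigzag-encoded varint.
    raw = 0
    shift = 0
    for byte in footer[1:]:
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    else:
        raise ValueError("truncated varint in footer")

    return (raw >> 1) ^ -(raw & 1)  # zigzag decode
```

To use it, read a data file produced by the native worker in binary mode and pass the bytes in, for example read_parquet_format_version(open(path, "rb").read()), where path is a data file of your own. The sketch loads the whole file into memory, so for large files it would be better to seek to the end and read only the footer.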