apache / parquet-java

Apache Parquet Java
https://parquet.apache.org/
Apache License 2.0

PARQUET-1126: Write unencrypted Parquet files without Hadoop #1376

Open dlvenable opened 1 week ago

dlvenable commented 1 week ago

Even when you want to write an unencrypted Parquet file without Hadoop, the existing code still uses Hadoop to try to get encryption properties.

https://github.com/apache/parquet-java/blob/fbe13d89ae4193be12c164d4bb5342c5eba3963f/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetWriter.java#L388-L393

However, if the encryption properties are null, there is no need to go through Hadoop at all. The existing code also calls a helper method in ParquetOutputFormat. This class inherits from Hadoop's FileOutputFormat, so calling the method at all requires Hadoop classes. To resolve this, I moved the helper into a package-private EncryptionPropertiesHelper class.
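To illustrate the general idea, here is a minimal, self-contained sketch. It is not the code in this PR: `EncryptionLookupSketch`, `EncryptionProps`, `PropsHelper`, and the configuration key are hypothetical stand-ins, and a plain `Map` replaces the real configuration object. The point it shows is that a helper living in its own class, with no framework superclass, can be called without pulling in that framework, and that a missing setting simply means "write unencrypted".

```java
import java.util.HashMap;
import java.util.Map;

public class EncryptionLookupSketch {

    /** Hypothetical stand-in for a set of encryption properties. */
    public static final class EncryptionProps {
        public final String factoryName;

        EncryptionProps(String factoryName) {
            this.factoryName = factoryName;
        }
    }

    /**
     * Hypothetical stand-in for a standalone helper class.
     * It extends nothing, so loading it needs no framework classes.
     */
    static final class PropsHelper {
        // Assumed configuration key, for illustration only.
        static final String FACTORY_KEY = "example.encryption.factory";

        /** Returns null when no encryption factory is configured. */
        static EncryptionProps create(Map<String, String> conf) {
            String factory = conf.get(FACTORY_KEY);
            if (factory == null) {
                return null; // nothing configured: the file stays unencrypted
            }
            return new EncryptionProps(factory);
        }
    }

    /** Describes how a write would proceed for the given configuration. */
    public static String describeWrite(Map<String, String> conf) {
        EncryptionProps props = PropsHelper.create(conf);
        return props == null ? "unencrypted" : "encrypted:" + props.factoryName;
    }

    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        System.out.println(describeWrite(plain));

        Map<String, String> secured = new HashMap<>();
        secured.put(PropsHelper.FACTORY_KEY, "MyFactory");
        System.out.println(describeWrite(secured));
    }
}
```

In this sketch the caller only ever touches the small helper class, so the unencrypted path has no dependency on any larger class hierarchy.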

Make sure you have checked all steps below.

Jira

Tests

testLoadEncPropertiesFactoryParquetConfiguration

Commits

Style

Documentation