Open Selfeer opened 1 week ago
As far as I can tell, only dictionary encoding and byte stream split encoding can be controlled explicitly, through ParquetProperties: https://github.com/apache/parquet-java/blob/master/parquet-column/src/main/java/org/apache/parquet/column/ParquetProperties.java
All other encodings appear to be chosen implicitly from the WriterVersion, inside DefaultValuesWriterFactory: https://github.com/apache/parquet-java/blob/master/parquet-column/src/main/java/org/apache/parquet/column/values/factory/DefaultValuesWriterFactory.java
I’m working on a tool that generates Parquet files from a file definition provided in JSON, using the parquet-java library. Is it possible to request a specific encoding for individual columns when generating the file?
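For context, here is a rough model of how I read the version-driven selection in DefaultValuesWriterFactory. It is a standalone sketch, not parquet-java code: the class, enum, and method names are made up for illustration, it ignores the byte stream split option, and it only covers the encoding used when dictionary encoding is disabled or falls back. The mapping is my understanding of the current master branch and may differ between releases.

```java
// Hypothetical, simplified model of the default encoding choice.
// Nothing here is a parquet-java class; all names are illustrative.
class EncodingSketch {

    enum WriterVersion { PARQUET_1_0, PARQUET_2_0 }

    enum PrimitiveType {
        BOOLEAN, INT32, INT64, INT96, FLOAT, DOUBLE, BINARY, FIXED_LEN_BYTE_ARRAY
    }

    // Returns the name of the non-dictionary encoding that (as I read the
    // factory) is picked for a column of the given type and writer version.
    static String fallbackEncoding(WriterVersion version, PrimitiveType type) {
        if (version == WriterVersion.PARQUET_1_0) {
            // The v1 writer appears to use PLAIN for every type.
            return "PLAIN";
        }
        switch (type) {
            case BOOLEAN:
                return "RLE";
            case INT32:
            case INT64:
                return "DELTA_BINARY_PACKED";
            case BINARY:
            case FIXED_LEN_BYTE_ARRAY:
                return "DELTA_BYTE_ARRAY";
            default:
                // INT96, FLOAT and DOUBLE stay PLAIN unless byte stream
                // split is enabled, which this sketch does not model.
                return "PLAIN";
        }
    }

    public static void main(String[] args) {
        // Print the full version/type table.
        for (WriterVersion v : WriterVersion.values()) {
            for (PrimitiveType t : PrimitiveType.values()) {
                System.out.println(v + " " + t + " -> " + fallbackEncoding(v, t));
            }
        }
    }
}
```

If this reading is right, the only per-column choice available today is whether dictionary encoding is on, and everything else follows from the writer version and the column's physical type, which is what prompts the question above.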