jerolba / parquet-carpet

Java Parquet serialization and deserialization library using Java 17 Records
Apache License 2.0

Support converting java field names to parquet columns as snake_case #17

Closed jerolba closed 9 months ago

jerolba commented 9 months ago

Add support for automatically mapping Java field names to Parquet column names in snake_case format.

This Java record:

record Data(long someIdValue, String companyCode) { }

would generate this Parquet schema:

message Data {
  required int64 some_id_value;
  optional binary company_code (STRING);
}
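
To illustrate the name mapping shown above, here is a minimal sketch of a camelCase to snake_case conversion. It is not taken from the library: the class name SnakeCaseSketch and the method toSnakeCase are hypothetical, and the rule used (insert an underscore before every uppercase letter except the first character, then lowercase it) is an assumption about what the strategy could do.

```java
// Hypothetical helper, not part of parquet-carpet: illustrates one possible
// camelCase -> snake_case rule for deriving column names from field names.
final class SnakeCaseSketch {
    private SnakeCaseSketch() {}

    static String toSnakeCase(String fieldName) {
        StringBuilder sb = new StringBuilder(fieldName.length() + 4);
        for (int i = 0; i < fieldName.length(); i++) {
            char c = fieldName.charAt(i);
            if (Character.isUpperCase(c)) {
                // Separate words with '_' unless the uppercase letter is the first character.
                if (i > 0) {
                    sb.append('_');
                }
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("someIdValue")); // some_id_value
        System.out.println(toSnakeCase("companyCode")); // company_code
    }
}
```

With this simple rule, runs of uppercase letters are split letter by letter (for example, "userID" becomes "user_i_d"), so a real strategy would probably need to decide how to treat acronyms.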

Proposed API: add a new method to the CarpetWriter class that configures the strategy used to map fields to columns:

var writer = new CarpetWriter<>(file, Data.class).columnNamingStrategy(ColumnNamingStrategy.SNAKE_CASE);

Other strategies can be added if needed.