apache / parquet-format

Apache Parquet Format
https://parquet.apache.org/
Apache License 2.0

[T2] Wide column metadata improvements #253

Open alkis opened 1 month ago

alkis commented 1 month ago
  1. Make ColumnMetaData.type optional
  2. Make ColumnMetaData.path_in_schema optional
  3. Add ColumnMetaData.schema_index, the ordinal of the FileMetaData.schema element this column corresponds to. This allows a sparse representation of the columns in a row group.
  4. Deprecate ColumnMetaData.encoding_stats and replace it with ColumnMetaData.is_fully_dict_encoded.
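
To illustrate how a reader might use schema_index once type and path_in_schema are optional, here is a minimal Python sketch. The dataclasses only mirror the field names listed above; they are not the Thrift definitions. The fallback rule in `resolve` (take the type and path from FileMetaData.schema when the column omits them) and the helper names `paths_by_index` and `resolve` are assumptions made for illustration, not part of this PR. The sketch relies on FileMetaData.schema being a depth-first flattening of the schema tree with the root first and `num_children` set on group nodes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SchemaElement:
    # Simplified stand-in for a FileMetaData.schema entry.
    name: str
    type: Optional[str] = None  # physical type; None for group nodes
    num_children: int = 0


@dataclass
class ColumnMetaData:
    # Only the fields discussed in this PR; everything else is omitted.
    schema_index: int
    type: Optional[str] = None
    path_in_schema: Optional[List[str]] = None
    is_fully_dict_encoded: Optional[bool] = None


def paths_by_index(schema: List[SchemaElement]) -> Dict[int, List[str]]:
    """Map each schema ordinal to its path (root excluded)."""
    paths: Dict[int, List[str]] = {0: []}
    # Each stack entry is [group name, children still to visit].
    stack = [[None, schema[0].num_children]]
    for i, el in enumerate(schema[1:], start=1):
        paths[i] = [name for name, _ in stack[1:]] + [el.name]
        stack[-1][1] -= 1
        if el.num_children > 0:
            stack.append([el.name, el.num_children])
        # Leave every group whose children have all been visited.
        while stack and stack[-1][1] == 0:
            stack.pop()
    return paths


def resolve(col: ColumnMetaData,
            schema: List[SchemaElement]) -> Tuple[Optional[str], List[str]]:
    """Hypothetical fallback: use the schema when the column omits a field."""
    el = schema[col.schema_index]
    col_type = col.type if col.type is not None else el.type
    if col.path_in_schema is not None:
        path = col.path_in_schema
    else:
        path = paths_by_index(schema)[col.schema_index]
    return col_type, path


# Schema: root { a: INT32, b { c: BYTE_ARRAY } }
schema = [
    SchemaElement("root", num_children=2),
    SchemaElement("a", type="INT32"),
    SchemaElement("b", num_children=1),
    SchemaElement("c", type="BYTE_ARRAY"),
]

# Sparse row group: only column b.c is listed; column a is left out.
row_group_columns = [ColumnMetaData(schema_index=3, is_fully_dict_encoded=True)]
resolved = [resolve(c, schema) for c in row_group_columns]
print(resolved)
```

In this sketch the row group carries a single ColumnMetaData entry, and its type and path are recovered from the schema rather than stored again per column chunk.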

ref Parquet Metadata evolution
