apache / parquet-format

Apache Parquet Format
https://parquet.apache.org/
Apache License 2.0

Clarify parquet encoding #295

Closed (asfimport closed this issue 8 months ago)

asfimport commented 9 months ago

Dictionary page format: the entries in the dictionary - in dictionary order - using the plain encoding.

The dictionary entries are not sorted (or at least not always sorted).
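
As an illustration of "dictionary order" versus sorted order, here is a minimal sketch. It assumes a writer that assigns dictionary indices in first-occurrence order and a column of INT32 values, which plain encoding stores as 4-byte little-endian integers. The helper names `build_dictionary` and `plain_encode_int32` are hypothetical and are not part of any Parquet library.

```python
import struct


def build_dictionary(values):
    """Assign indices in first-occurrence order (one possible writer strategy)."""
    index_of = {}
    entries = []   # dictionary entries, in dictionary order
    indices = []   # per-value index into `entries`
    for v in values:
        if v not in index_of:
            index_of[v] = len(entries)
            entries.append(v)
        indices.append(index_of[v])
    return entries, indices


def plain_encode_int32(entries):
    """Plain-encode INT32 entries: 4 bytes each, little-endian, back to back."""
    return b"".join(struct.pack("<i", e) for e in entries)


entries, indices = build_dictionary([30, 10, 30, 20, 10])
dictionary_page_body = plain_encode_int32(entries)

print(entries)                      # [30, 10, 20]
print(indices)                      # [0, 1, 0, 2, 1]
print(dictionary_page_body.hex())   # 1e0000000a00000014000000
```

The entries come out as 30, 10, 20, which is dictionary order but not sorted order, so a reader in this scenario cannot assume the dictionary page is sorted.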

There is no padding between values (except for the last byte, which is padded with 0s).
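
The padding rule can be sketched as follows. This issue does not say which encoding or bit order the sentence applies to, so the example assumes fixed-width values packed back to back starting from the least significant bit of each byte. The function name `pack_bits` is hypothetical.

```python
def pack_bits(values, bit_width):
    """Pack fixed-width values back to back with no padding between them.

    Assumption: values fill each byte from its least significant bit upward.
    Only the unused high bits of the final byte are left as 0.
    """
    acc = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << bit_width):
            raise ValueError(f"value {v} does not fit in {bit_width} bits")
        acc |= v << (i * bit_width)
    total_bits = len(values) * bit_width
    n_bytes = (total_bits + 7) // 8   # round up to whole bytes
    return acc.to_bytes(n_bytes, "little")


# Five 3-bit values use 15 bits, so the 16th bit is zero padding.
packed = pack_bits([0, 1, 2, 3, 4], 3)
print(packed.hex())   # 8846
```

Eight 3-bit values would fill exactly 3 bytes, so in that case no padding bits are needed at all.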

Minor change.

Please correct me if I got anything wrong.

Reporter: Letian Jiang / @letian-jiang
Assignee: Letian Jiang / @letian-jiang

Externally tracked issue: https://github.com/apache/parquet-format/pull/217

Note: This issue was originally created as PARQUET-2362. Please see the migration documentation for further details.