apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[parquet] Fix nested array/map has no id in parquet files #4513

Closed tsreaper closed 1 week ago

tsreaper commented 1 week ago

Purpose

In #4362 we started writing field ids into parquet files. However, that PR fails to add ids for nested array/map types (for example, ARRAY(MAP(INT, ARRAY(INT)))). This PR fixes the bug.
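To illustrate why nesting matters, here is a minimal Python sketch. It is not Paimon's code and does not reproduce Paimon's id scheme. It assumes a toy type model (tuples such as ("ARRAY", elem) and ("MAP", key, value)) and a simple counter for ids; the names assign_ids and field_ids are hypothetical. The point is only that the walk must recurse so that array elements, map keys, and map values receive an id at every depth, not just at the top level.

```python
from itertools import count

# Hypothetical type model (an assumption for illustration only):
#   "INT"                 -> primitive type
#   ("ARRAY", elem)       -> array with one element type
#   ("MAP", key, value)   -> map with key and value types


def assign_ids(data_type, path, ids, out):
    """Recursively record an id for every nested element, key, and value."""
    if isinstance(data_type, str):
        return  # primitive: nothing nested to label
    kind = data_type[0]
    if kind == "ARRAY":
        children = [("element", data_type[1])]
    elif kind == "MAP":
        children = [("key", data_type[1]), ("value", data_type[2])]
    else:
        raise ValueError(f"unsupported type: {kind}")
    for name, child in children:
        child_path = f"{path}.{name}"
        out[child_path] = next(ids)                # id for this nested node
        assign_ids(child, child_path, ids, out)    # descend into deeper levels


def field_ids(field_name, data_type, start=1000):
    """Return a mapping from nested path to id (counter-based, hypothetical)."""
    out = {}
    assign_ids(data_type, field_name, count(start), out)
    return out


# The example type from the description: ARRAY(MAP(INT, ARRAY(INT)))
nested = ("ARRAY", ("MAP", "INT", ("ARRAY", "INT")))
result = field_ids("f0", nested)
for nested_path, field_id in result.items():
    print(nested_path, field_id)
```

A non-recursive version would stop after labeling f0.element and leave the map key, the map value, and the inner array element without ids, which is the kind of gap this PR describes.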

Tests

Unit tests.

API and Format

The field ids of array elements, map keys, and map values are changed.

Documentation

No documentation is needed.