apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[parquet] Fix that cannot read parquet ROW<DECIMAL> data #4533

Closed yuzelin closed 1 week ago

yuzelin commented 1 week ago

Purpose

In NestedColumnReader#readRow, we need to know the length of the nested vectors. Normally the length can be obtained from AbstractHeapVector, but ParquetDecimalVector is not an AbstractHeapVector, so reading ROW<DECIMAL> data fails. This PR fixes that.
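
To illustrate the kind of problem described above, here is a minimal sketch and not the actual change in this PR. All types in it (ColumnVector, HeapVector, HeapBytesVector, DecimalWrapperVector) and the lengthOf helper are hypothetical stand-ins, not Paimon classes. It assumes the decimal vector wraps an inner heap vector and that the length can be resolved by unwrapping it.

```java
// Hypothetical stand-ins for illustration only; these are NOT Paimon's real classes.
final class NestedLengthSketch {

    /** Marker interface for any column vector. */
    interface ColumnVector {}

    /** Heap-backed vectors know their own length. */
    abstract static class HeapVector implements ColumnVector {
        private final int len;

        HeapVector(int len) {
            this.len = len;
        }

        int getLen() {
            return len;
        }
    }

    /** A concrete heap vector. */
    static final class HeapBytesVector extends HeapVector {
        HeapBytesVector(int len) {
            super(len);
        }
    }

    /** Wrapper that would decode decimals; note it is NOT a HeapVector. */
    static final class DecimalWrapperVector implements ColumnVector {
        private final ColumnVector inner;

        DecimalWrapperVector(ColumnVector inner) {
            this.inner = inner;
        }

        ColumnVector getInner() {
            return inner;
        }
    }

    /** Resolves the length, unwrapping the decimal wrapper when needed. */
    static int lengthOf(ColumnVector vector) {
        if (vector instanceof HeapVector) {
            return ((HeapVector) vector).getLen();
        }
        if (vector instanceof DecimalWrapperVector) {
            // Assumption: the wrapper delegates to an inner vector that carries the length.
            return lengthOf(((DecimalWrapperVector) vector).getInner());
        }
        throw new IllegalArgumentException(
                "Cannot determine length of " + vector.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        ColumnVector plain = new HeapBytesVector(3);
        ColumnVector decimal = new DecimalWrapperVector(new HeapBytesVector(3));
        System.out.println(lengthOf(plain));
        System.out.println(lengthOf(decimal));
    }
}
```

A length lookup that only checks for the heap base type would reject the wrapper. Handling the wrapper explicitly is one way to make both cases work.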

Also removes two unused classes: RowColumnReader and RowPosition.

Tests

API and Format

Documentation