apache / iceberg-go

Apache Iceberg - Go
https://iceberg.apache.org/
Apache License 2.0

feat(table): Initial implementation of Reading Data #185

Closed zeroshade closed 1 week ago

zeroshade commented 3 weeks ago

This provides an initial implementation for reading data from Iceberg tables as Arrow record batches or tables. Reads are parallelized and streamed. Records are returned in a consistent order, and scans support filtering and row limits.
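
To illustrate how parallel reads can still yield a consistent order while honoring a row limit, here is a minimal sketch. It is not the code in this PR: `batch`, `task`, and `streamOrdered` are hypothetical names, plain `[]int` slices stand in for Arrow record batches, and filtering is left out for brevity.

```go
package main

import "fmt"

// batch is a hypothetical stand-in for an Arrow record batch.
type batch []int

// task is a hypothetical stand-in for reading all batches of one data file.
type task func() []batch

// streamOrdered runs every task concurrently but emits batches in task order.
// It stops once limit rows have been emitted; a negative limit means no limit.
func streamOrdered(tasks []task, limit int) <-chan batch {
	// One buffered channel per task keeps results separated by source,
	// and lets each worker finish even if the consumer stops early.
	results := make([]chan []batch, len(tasks))
	for i, t := range tasks {
		results[i] = make(chan []batch, 1)
		go func(t task, out chan<- []batch) {
			out <- t()
		}(t, results[i])
	}

	out := make(chan batch)
	go func() {
		defer close(out)
		remaining := limit
		// Draining the channels in task order fixes the output order,
		// regardless of which worker finishes first.
		for _, ch := range results {
			for _, b := range <-ch {
				if limit >= 0 {
					if remaining == 0 {
						return
					}
					if len(b) > remaining {
						b = b[:remaining] // truncate the final batch to the limit
					}
					remaining -= len(b)
				}
				out <- b
			}
		}
	}()
	return out
}

func main() {
	tasks := []task{
		func() []batch { return []batch{{1, 2}, {3}} },
		func() []batch { return []batch{{4, 5, 6}} },
	}
	for b := range streamOrdered(tasks, 4) {
		fmt.Println(b)
	}
}
```

In this sketch each task returns all of its batches at once; a real reader would more likely stream batches per file, but the ordering and limit logic would be similar.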

The underlying file interactions are abstracted behind interfaces in an internal package, so that support for ORC and Avro files can be added later alongside the Parquet implementation.
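
As a rough illustration of that kind of abstraction, the sketch below selects a format implementation by file extension. The names `fileFormat`, `parquetFormat`, `formats`, and `formatFor` are hypothetical and are not taken from the internal package in this PR, whose interfaces may look quite different.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// fileFormat is a hypothetical interface that each supported format implements.
// A real interface would also expose methods for opening and reading files.
type fileFormat interface {
	name() string
}

// parquetFormat is the only implementation registered in this sketch.
type parquetFormat struct{}

func (parquetFormat) name() string { return "parquet" }

// formats maps a file extension to its implementation. Adding ORC or Avro
// later would mean registering another entry here, with no change to callers.
var formats = map[string]fileFormat{
	".parquet": parquetFormat{},
}

// formatFor returns the implementation for path, or an error when the
// extension has no registered format.
func formatFor(path string) (fileFormat, error) {
	ext := strings.ToLower(filepath.Ext(path))
	f, ok := formats[ext]
	if !ok {
		return nil, fmt.Errorf("unsupported file format %q", ext)
	}
	return f, nil
}

func main() {
	for _, p := range []string{"data/part-0.parquet", "data/part-1.orc"} {
		f, err := formatFor(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(p, "->", f.name())
	}
}
```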

This PR also adds integration tests to verify that reads work correctly.

zeroshade commented 3 weeks ago

CC @Fokko @nastra

Fokko commented 1 week ago

Let's get this in, thanks again @zeroshade