Netflix / iceberg

Iceberg is a table format for large, slow-moving tabular data
Apache License 2.0

Add split offsets to manifest files #5

Closed. rdblue closed this issue 5 years ago

rdblue commented 6 years ago

Instead of storing a single HDFS block size for each data file, Iceberg should store a list of split offsets. That would make split planning more precise, because splits could be aligned to Parquet row group or ORC stripe offsets without reading file footers.
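
To illustrate the idea, here is a minimal sketch of how a planner might use such a list. It is not Iceberg's implementation: the function name `plan_splits`, its parameters, and the greedy packing strategy are all assumptions made for this example. It only shows that, given offsets recorded in the manifest, split boundaries can be chosen without opening the data file.

```python
from typing import List, Tuple


def plan_splits(file_length: int,
                split_offsets: List[int],
                target_size: int) -> List[Tuple[int, int]]:
    """Hypothetical planner: return (start, length) splits whose boundaries
    fall only on the recorded offsets (e.g. row group or stripe starts)."""
    # Keep only distinct offsets that fall inside the file, in ascending order.
    offsets = sorted({o for o in split_offsets if 0 <= o < file_length})
    if not offsets:
        # No usable offsets recorded: treat the whole file as one split.
        return [(0, file_length)] if file_length > 0 else []

    # Each unit runs from one offset to the next; the last unit ends at EOF.
    ends = offsets[1:] + [file_length]
    splits: List[Tuple[int, int]] = []
    start = offsets[0]  # bytes before the first offset (e.g. a header) are skipped
    for unit_start, unit_end in zip(offsets, ends):
        # Close the current split before a unit that would push it past the
        # target size, unless the split is still empty.
        if unit_start > start and unit_end - start > target_size:
            splits.append((start, unit_start - start))
            start = unit_start
    splits.append((start, file_length - start))
    return splits


# Example: four units starting at bytes 4, 104, 204 and 304 in a 400-byte file.
print(plan_splits(400, [4, 104, 204, 304], target_size=200))
```

In this sketch every split starts on a recorded offset, so no row group or stripe is cut in half, and a unit larger than the target size simply becomes its own split.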

rdblue commented 5 years ago

This issue has moved to the ASF repo: https://github.com/apache/incubator-iceberg/issues/37