apache / iceberg-python

Apache PyIceberg
https://py.iceberg.apache.org/
Apache License 2.0

Implement Sorted Writes #871

Open vinjai opened 3 months ago

vinjai commented 3 months ago

Implements: https://github.com/apache/iceberg-python/issues/271

vinjai commented 2 months ago

This PR covers:

  1. Writing sorted datasets to partitioned and non-partitioned Iceberg tables.
  2. Generating manifests with the correct sort-order-id.
  3. Integration tests that check the sorted output matches the ordering Spark produces.
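
For readers unfamiliar with what a sorted write has to do before rows reach a data file, here is a minimal pure-Python sketch of ordering rows by several sort fields, each with its own direction and null placement. `SortKey`, `compare_rows`, and `sort_rows` are hypothetical names made up for this illustration; they are not the PyIceberg API and not the code in this PR.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class SortKey:
    """Hypothetical stand-in for one field of a table sort order."""
    column: str
    ascending: bool = True
    nulls_first: bool = True


def compare_rows(a: Row, b: Row, keys: List[SortKey]) -> int:
    """Compare two rows field by field; the first differing field decides."""
    for key in keys:
        x, y = a[key.column], b[key.column]
        if x is None and y is None:
            continue
        # Null placement is independent of the sort direction.
        if x is None:
            return -1 if key.nulls_first else 1
        if y is None:
            return 1 if key.nulls_first else -1
        if x != y:
            result = -1 if x < y else 1
            return result if key.ascending else -result
    return 0


def sort_rows(rows: List[Row], keys: List[SortKey]) -> List[Row]:
    """Return a new list of rows ordered by the given sort keys (stable)."""
    return sorted(rows, key=cmp_to_key(lambda a, b: compare_rows(a, b, keys)))


rows = [
    {"id": 2, "name": "b"},
    {"id": None, "name": "a"},
    {"id": 1, "name": "c"},
    {"id": 2, "name": "a"},
]
# Sort by id ascending with nulls first, then by name descending.
sorted_rows = sort_rows(rows, [SortKey("id"), SortKey("name", ascending=False)])
```

In a partitioned table the same ordering would be applied within each partition's rows before they are written, which is an assumption about the general approach rather than a description of this PR's implementation.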

Decisions taken:

What is not in the scope of this PR?

vinjai commented 2 months ago

@Fokko This PR is ready for review