apache / iceberg-rust

Apache Iceberg
https://rust.iceberg.apache.org/
Apache License 2.0

feat: support append data file and add e2e test #349

Open ZENOTME opened 2 months ago

ZENOTME commented 2 months ago

This PR completes https://github.com/apache/iceberg-rust/issues/345.

  1. It adds the FastAppendAction to commit data files.

The design is based on https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotProducer.java.

I implemented a SnapshotProduceAction, which accepts a Vec<ManifestFile> and a Summary, generates a new snapshot from them, and applies that snapshot to the tx.

FastAppendAction reuses SnapshotProduceAction and exposes its own interface for processing the added data files.

In the future, we can reuse SnapshotProduceAction to implement more append actions with different commit semantics as described in https://github.com/apache/iceberg-rust/issues/348.
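To make the layering concrete, here is a minimal sketch of how a FastAppendAction could delegate to a SnapshotProduceAction. It is not the code in this PR: every type below (`DataFile`, `ManifestFile`, `Summary`, `Snapshot`, `Transaction`) is a simplified, hypothetical stand-in, as are the method names `add_data_files` and `apply`, the manifest file naming, and the sequential snapshot ids. Only the summary keys (`operation`, `added-data-files`, `added-records`) follow the Iceberg table spec.

```rust
use std::collections::HashMap;

// NOTE: all types here are simplified, hypothetical stand-ins for illustration,
// not the iceberg-rust API.

#[derive(Debug, Clone, PartialEq)]
struct DataFile {
    path: String,
    record_count: u64,
}

#[derive(Debug, Clone, PartialEq)]
struct ManifestFile {
    path: String,
    added_files: Vec<DataFile>,
}

type Summary = HashMap<String, String>;

#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    snapshot_id: u64,
    parent_id: Option<u64>,
    manifests: Vec<ManifestFile>,
    summary: Summary,
}

/// Stand-in for the transaction ("tx") that collects new snapshots.
#[derive(Debug, Default)]
struct Transaction {
    snapshots: Vec<Snapshot>,
}

impl Transaction {
    fn current_snapshot(&self) -> Option<&Snapshot> {
        self.snapshots.last()
    }
}

/// Shared building block: turns manifests plus a summary into a new snapshot.
struct SnapshotProduceAction {
    tx: Transaction,
}

impl SnapshotProduceAction {
    fn new(tx: Transaction) -> Self {
        Self { tx }
    }

    fn apply(mut self, new_manifests: Vec<ManifestFile>, summary: Summary) -> Transaction {
        let parent = self.tx.current_snapshot();
        let parent_id = parent.map(|s| s.snapshot_id);
        // Fast append: keep the parent's manifests untouched and add the new ones.
        let mut manifests = parent.map(|s| s.manifests.clone()).unwrap_or_default();
        manifests.extend(new_manifests);
        // Assumption: sequential ids, purely to keep the sketch deterministic.
        let snapshot_id = parent_id.map_or(1, |id| id + 1);
        self.tx.snapshots.push(Snapshot { snapshot_id, parent_id, manifests, summary });
        self.tx
    }
}

/// Append-specific interface layered on top of SnapshotProduceAction.
struct FastAppendAction {
    inner: SnapshotProduceAction,
    added: Vec<DataFile>,
}

impl FastAppendAction {
    fn new(tx: Transaction) -> Self {
        Self { inner: SnapshotProduceAction::new(tx), added: Vec::new() }
    }

    fn add_data_files(mut self, files: impl IntoIterator<Item = DataFile>) -> Self {
        self.added.extend(files);
        self
    }

    fn apply(self) -> Transaction {
        let records: u64 = self.added.iter().map(|f| f.record_count).sum();
        let mut summary = Summary::new();
        summary.insert("operation".to_string(), "append".to_string());
        summary.insert("added-data-files".to_string(), self.added.len().to_string());
        summary.insert("added-records".to_string(), records.to_string());
        // One new manifest holds every file added by this action (hypothetical naming).
        let manifest = ManifestFile {
            path: format!("manifest-{}.avro", self.inner.tx.snapshots.len() + 1),
            added_files: self.added,
        };
        self.inner.apply(vec![manifest], summary)
    }
}
```

With this split, another append variant would only need to build its own manifest list and summary before handing them to SnapshotProduceAction; the snapshot bookkeeping stays in one place.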

  2. It sets up the initial e2e test for writing data files.

Please let me know if anything in this design can be improved or if I have missed anything.

ZENOTME commented 2 months ago

cc @liurenjie1024 @Fokko @Xuanwo

ZENOTME commented 1 month ago

Hi, I have tried to fix this PR. Some things may not be fixed well yet:

  1. https://github.com/apache/iceberg-rust/pull/349#discussion_r1580444775 I'm not sure whether my understanding is correct.
  2. TODO, we can do this in a later PR: https://github.com/apache/iceberg-rust/pull/349#discussion_r1580571634

Please let me know if there are other things I missed and need to fix. cc @Fokko