Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0

Iceberg Support Roadmap #2458

Open kevinzwang opened 4 months ago

kevinzwang commented 4 months ago

Features

Bugs and Testing

corleyma commented 1 week ago

@kevinzwang I am a bit confused about what copy-on-write writing represents in this set of features. Typically (at least coming from a Spark lens), copy-on-write is the default behavior for Iceberg v2 tables. Given that Daft supports basic and partitioned writes, what does it mean to say Daft does not support either COW or MOR writing? It has to be one or the other, right?

kevinzwang commented 1 week ago

> @kevinzwang I am a bit confused about what copy-on-write writing represents in this set of features. Typically (at least coming from a Spark lens), copy-on-write is the default behavior for Iceberg v2 tables. Given that Daft supports basic and partitioned writes, what does it mean to say Daft does not support either COW or MOR writing? It has to be one or the other, right?

Hi @corleyma! COW and MOR are two different strategies we would need to support in order to do upsert (merge) operations in Iceberg. We currently only support append or overwrite for writing to Iceberg tables, which do not need either.

I've updated the roadmap to better reflect this.
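
To make the distinction concrete, here is a minimal toy sketch in plain Python. It is not Daft's or Iceberg's implementation: `ToyTable`, its methods, and the `(key, value)` row layout are hypothetical names invented for illustration. The sketch models a table as a list of immutable "data files" plus a set of position deletes. Append and overwrite never need to change rows inside an existing file, while an upsert has to choose between rewriting the affected files (COW) or recording deletes that are applied at read time (MOR).

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical toy table: data files plus (file_index, row_index) deletes."""
    data_files: list = field(default_factory=list)       # each file: list of (key, value)
    position_deletes: set = field(default_factory=set)   # rows hidden at read time

    def append(self, rows):
        # Append: add one new file; existing files are untouched.
        self.data_files.append(list(rows))

    def overwrite(self, rows):
        # Overwrite: replace the whole table; no row-level matching is needed.
        self.data_files = [list(rows)]
        self.position_deletes = set()

    def upsert_cow(self, rows):
        # Copy-on-write: rewrite every file that contains a matching key.
        updates = dict(rows)
        matched = set()
        for i, f in enumerate(self.data_files):
            live = [(k, v) for j, (k, v) in enumerate(f)
                    if (i, j) not in self.position_deletes]
            hits = {k for k, _ in live if k in updates}
            if hits:
                # The rewritten file holds only live rows, so its deletes are dropped.
                self.data_files[i] = [(k, updates.get(k, v)) for k, v in live]
                self.position_deletes = {(fi, j) for fi, j in self.position_deletes
                                         if fi != i}
                matched |= hits
        inserts = [(k, v) for k, v in updates.items() if k not in matched]
        if inserts:
            self.data_files.append(inserts)

    def upsert_mor(self, rows):
        # Merge-on-read: leave old files as they are, mark stale rows as deleted,
        # and write all incoming rows to a new file.
        updates = dict(rows)
        for i, f in enumerate(self.data_files):
            for j, (k, _) in enumerate(f):
                if k in updates:
                    self.position_deletes.add((i, j))
        self.data_files.append(list(updates.items()))

    def read(self):
        # Readers merge data files with deletes to get the current rows.
        return sorted((k, v)
                      for i, f in enumerate(self.data_files)
                      for j, (k, v) in enumerate(f)
                      if (i, j) not in self.position_deletes)


def build_example():
    # Two appended files, as an append-only writer would produce.
    t = ToyTable()
    t.append([(1, "a"), (2, "b")])
    t.append([(3, "c")])
    return t


cow, mor = build_example(), build_example()
cow.upsert_cow([(2, "B"), (4, "d")])
mor.upsert_mor([(2, "B"), (4, "d")])
# Both strategies give readers the same rows; they differ in what gets written.
assert cow.read() == mor.read()
```

In this toy model the COW table ends up with a rewritten first file and no deletes, while the MOR table keeps its first file unchanged and carries one position delete. That trade-off (more work at write time versus more work at read time) only arises once row-level updates are supported, which is why append and overwrite need neither strategy.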