delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

2023 H1 Roadmap #1128

Closed: wjones127 closed this 1 year ago

wjones127 commented 1 year ago

Work committed to

These are projects current contributors are working on.

Projects seeking contributors

In addition to smaller issues labelled good-first-issue, these are some larger projects where we could use some help. Most of them will be implemented as part of the operations module in the Rust source and can later be exposed to Python and other bindings.

MrPowers commented 1 year ago

This looks great! Really excited!

Some blog post ideas:

Let me know if I should make issues for the blog posts. I'm fine tracking them elsewhere too. I'll want delta-rs community reviews, but we can just do those in the Slack chat. Thanks for putting this together.

saivarunk commented 1 year ago

@MrPowers I'm interested in taking up the Delta Lake + AWS Lambda blog post. Can you help me out with the process?

ion-elgreco commented 1 year ago

@wjones127 maybe a silly question, but why would you still need the Operations API that only uses DataFusion (in Rust) after introducing the ADBC API?

From the design document I can see any query engine can potentially be used with ADBC.

FlavioDiasPs commented 1 year ago

Why implement optimize and zorder when Databricks is going in the opposite direction with Liquid Clustering? By the time delta-rs implements this, Databricks will have made Liquid Clustering the default.

ion-elgreco commented 1 year ago

> Why implement optimize and zorder when Databricks is going in the opposite direction with Liquid Clustering? By the time delta-rs implements this, Databricks will have made Liquid Clustering the default.

But they are already implemented in delta-rs.

andreale28 commented 1 year ago

> Why implement optimize and zorder when Databricks is going in the opposite direction with Liquid Clustering? By the time delta-rs implements this, Databricks will have made Liquid Clustering the default.

The delta-rs team actually implemented these two features before the announcement of Delta 3.0 and Liquid Clustering. To be honest, Delta 3.0 and Liquid Clustering came out kind of unexpectedly.

sim-san commented 3 months ago

@rtyler Do you plan to support Generated Columns (Writer Version 4) in delta-rs?