pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Write support for Apache Iceberg #14610

Open randypitcherii opened 8 months ago

randypitcherii commented 8 months ago

Description

I love the lazy reading for Iceberg. It's great.

If Polars supported writing back to an Iceberg catalog, that would make it a really powerful tool working alongside the SQL engines and Spark DataFrames I'm usually stuck using with Iceberg.

Thanks!

alexander-beedie commented 8 months ago

Good timing on this issue, as I see pyiceberg write support just made its public debut; congrats @Fokko! Not sure how it might best integrate via Polars, but could certainly be interesting ;)

randypitcherii commented 8 months ago

Great news!

I spoke with Fokko today (he rules) and he showed me that Polars will let me get an Arrow table. With that, pyiceberg allegedly will write to any arbitrary Iceberg target : )

So maybe I don't need built-in write support in Polars if it's that easy. I'm going to give it a go and see.
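For anyone following along, here is a rough sketch of that route, not a tested recipe. It assumes a pyiceberg release with write support, an Iceberg table that already exists, and an Arrow schema compatible with the table's schema. The helper name `append_to_iceberg`, the catalog name `"default"`, and the table identifier `"demo.events"` are made up for illustration; `DataFrame.to_arrow()`, `load_catalog`, `Catalog.load_table`, and `Table.append` are the actual Polars and pyiceberg calls.

```python
def append_to_iceberg(df, iceberg_table):
    """Append a Polars DataFrame to an already-loaded pyiceberg table.

    Hypothetical helper: `df` is expected to be a polars.DataFrame and
    `iceberg_table` a pyiceberg Table. Only `df.to_arrow()` and
    `iceberg_table.append()` are used.
    """
    arrow_table = df.to_arrow()        # Polars -> pyarrow.Table
    iceberg_table.append(arrow_table)  # pyiceberg appends the rows to the table
    return arrow_table


def example():
    # Imported here so the helper above has no hard dependency on either package.
    import polars as pl
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog("default")          # assumed catalog name
    table = catalog.load_table("demo.events")  # assumed table identifier
    df = pl.DataFrame({"id": [1, 2], "name": ["a", "b"]})
    append_to_iceberg(df, table)
```

Swapping `append` for `overwrite` should replace the table contents instead of adding to them.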

I also took a peek at the write_delta function for inspiration but it was wayyyy over my head.