dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0
2.76k stars, 182 forks

Native Iceberg Support #851

Open sh-rp opened 11 months ago

sh-rp commented 11 months ago

Feature description

Native Iceberg support via PyIceberg, so that dlt can work with multiple bucket hosts and query engines.

Are you a dlt user?

Yes, I'm already a dlt user.

Use case

No response

Proposed solution

No response

Related issues

No response

btelFD commented 9 months ago

Would love that!

shohamyamin commented 6 months ago

@sh-rp any update on that feature?

sh-rp commented 5 months ago

@shohamyamin right now we are not working on this, but we now have support for Delta tables, which might also work for some users :)

mike-luabase commented 5 months ago

I'm guessing you'd need/want partitioned writes, which should be released in PyIceberg 0.7.0: https://github.com/apache/iceberg-python/issues/208

thenaturalist commented 1 day ago

@sh-rp any update on this?

Given the importance dlt places on openness and portability: Apache Iceberg has a considerably more open and less Databricks-dependent history and ecosystem, so shouldn't it be better aligned with dlt's values and vision?