Closed nicor88 closed 1 year ago
@Tomme could you have a look??
a drop statement on an Iceberg table leads to deleting all the data in S3
What do you mean by "all data"? Does this imply any risk of losing data of other tables/relations? What would be the desired outcome of dropping the relation?
@rumbin dropping a table in the current adapter setup means: looking up the table location path, cleaning it up via the S3 APIs, then finally dropping the table in Athena. See implementation here.
When working with Iceberg, a simple `drop table` will delete the data in the table's specified location; that's what I mean by "all data". Also, as we are talking about materialization=table here: in the dbt world, a table materialization drops the original table and re-creates it from scratch, so drop is the first basic primitive when working with dbt.
I re-worked the comment saying:
a drop statement on an Iceberg table automatically deletes the data in S3 at the specified location
Thanks for clarifying 😊 I know what DROP means, but with your initial description I got confused
@rumbin well, in the Athena world a drop statement on a table is not like in Postgres/Redshift/Snowflake, but with Iceberg it should get closer to that. Anyhow, the Iceberg helper adds a simple macro that just drops the table, without calling the S3 path pruning.
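The two drop paths described in this thread can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `drop_relation`, `get_location`, `delete_s3_prefix` and `run_sql` are hypothetical names, and the helpers are passed in as plain callables so the sketch runs without AWS access.

```python
from typing import Callable


def drop_relation(
    table: str,
    is_iceberg: bool,
    get_location: Callable[[str], str],
    delete_s3_prefix: Callable[[str], None],
    run_sql: Callable[[str], None],
) -> None:
    """Drop a table, pruning its S3 location first unless it is Iceberg.

    All helper callables are hypothetical stand-ins for adapter internals.
    """
    if not is_iceberg:
        # Non-Iceberg path: look up the table location and clean it up
        # via S3 before dropping the table in Athena.
        delete_s3_prefix(get_location(table))
    # Iceberg path: per the thread, the DROP statement itself removes the
    # data at the table location, so no S3 pruning is called.
    run_sql(f"DROP TABLE IF EXISTS {table}")
```

With fake helpers that only record their calls, the non-Iceberg path issues an S3 cleanup followed by the `DROP TABLE`, while the Iceberg path issues only the `DROP TABLE`.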
I see. Thanks again for explaining.
Closing as added to dbt-athena-community
What
This PR introduces Iceberg as a table materialization.
As Iceberg doesn't support CTAS, the implementation does the following:
Notes
`adapter.drop_relation` doesn't work with Iceberg: a drop statement on an Iceberg table automatically deletes the data in S3 at the specified location.
Because of this, the adapter adds a unique UUID to the final table location. This helps in case of a rename statement (e.g. you want to promote the table to the one used by analysts/reporting after running some dbt tests), as it avoids collisions when the table is recreated. It's possible to disable this behaviour using `strict_location=True`, which is the default.
Models used to test
Without partitions
With partitions
With external location
With different data types
Table properties
Not strict location
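The test models themselves are not reproduced here. As a rough illustration of the location behaviour described in the Notes, the sketch below shows one way a UUID suffix could be appended when the location is not strict. The function name `table_location` and the `base/schema/table` path layout are assumptions made for this example, not the adapter's actual implementation.

```python
import uuid


def table_location(
    base: str, schema: str, table: str, strict_location: bool = True
) -> str:
    """Build an S3 location for a table (hypothetical path layout).

    With strict_location=True the path is deterministic; otherwise a
    random UUID is appended so a recreated table never reuses the path
    of a previously renamed one.
    """
    location = f"{base.rstrip('/')}/{schema}/{table}"
    if not strict_location:
        # Unique suffix: avoids collisions when the table is recreated.
        location = f"{location}/{uuid.uuid4()}"
    return location + "/"
```

With the default `strict_location=True`, repeated calls return the same path; with `strict_location=False`, each call yields a different location under the same table prefix.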