Open krifra1234 opened 2 months ago
Thanks for reporting. Need to rethink this.
Hi, I am also facing the same issue. Another effect is that after the alter table cluster by, it runs "optimize table".
@sundeep1687 you can skip optimize by setting DATABRICKS_SKIP_OPTIMIZE=true
@krifra1234 is the alter operation slow? The alternative is querying for metadata to decide whether to do it or not, and that is not particularly fast.
The alter operation is almost instant, but the optimize afterwards takes a long time. Do you mean something like this in the config? If you have an example, please share.
{{ config(
    materialized="incremental",
    incremental_strategy='replace_where',
    incremental_predicates=[lookback_predicate],
    liquid_clustered_by='request_date_local',
    DATABRICKS_SKIP_OPTIMIZE=true,
    tags=["cvs"]
) }}
I mean set it as an environment variable. It might also work as a dbt variable, but not as config.
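For anyone launching dbt from a script, here is a minimal sketch of setting the environment variable for a single dbt invocation. Only the variable name `DATABRICKS_SKIP_OPTIMIZE` and the value `true` come from this thread; the helper names `build_dbt_env` and `run_dbt` are hypothetical, and the sketch assumes the `dbt` executable is on `PATH`.

```python
import os
import subprocess


def build_dbt_env(base_env=None):
    """Return a copy of the environment with the skip-optimize variable set.

    The input mapping is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Variable name and value as given in this thread.
    env["DATABRICKS_SKIP_OPTIMIZE"] = "true"
    return env


def run_dbt(select, base_env=None):
    """Hypothetical helper: run `dbt run --select <select>` with the variable set."""
    cmd = ["dbt", "run", "--select", select]
    return subprocess.run(cmd, env=build_dbt_env(base_env), check=True)
```

In a shell, exporting the variable before calling `dbt run` should have the same effect. It is an environment setting for the dbt process and does not belong inside `config(...)`.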
Hi @benc-db - can you please confirm that setting DATABRICKS_SKIP_OPTIMIZE=true as a dbt var will not interfere with the table property autoOptimize.optimizeWrite=true? I understand that autoOptimize kicks in at write time and will probably work as-is, but I wanted to be 100% sure. We would like to avoid the explicit optimize on the table if autoOptimize is working.
That variable only affects whether we call optimize explicitly.
Describe the bug
When using liquid clustering, the cluster columns in the Delta Lake table are updated every time the dbt model is run, even if the cluster columns have not changed in the config.
Steps To Reproduce
Create a dbt model with the following config: materialized='incremental', incremental_strategy='append', liquid_clustered_by=['columnname']
Run the dbt model multiple times and look for the operation "CLUSTER BY" in the Delta Lake table history. Find the column "Operation Parameters" and you will see something similar to this for every run:
{ "oldClusteringColumns": "columnname", "newClusteringColumns": "columnname" }
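The check described in the steps above can be scripted. Here is a minimal sketch that scans table history rows for "CLUSTER BY" entries whose old and new clustering columns are identical. It assumes the rows have already been fetched (for example from `DESCRIBE HISTORY`) into dictionaries with `operation` and `operationParameters` keys; the function name `find_noop_cluster_by` and the sample rows in the usage note are hypothetical.

```python
import json


def find_noop_cluster_by(history_rows):
    """Return the CLUSTER BY history rows that did not change the clustering columns."""
    noop = []
    for row in history_rows:
        if row.get("operation") != "CLUSTER BY":
            continue
        params = row.get("operationParameters") or {}
        # Some clients return the parameters as a JSON string instead of a mapping.
        if isinstance(params, str):
            params = json.loads(params)
        old = params.get("oldClusteringColumns")
        new = params.get("newClusteringColumns")
        if old is not None and old == new:
            noop.append(row)
    return noop
```

Given a history containing one `CLUSTER BY` row with `"oldClusteringColumns": "columnname"` and `"newClusteringColumns": "columnname"`, the function returns that row. On a table affected by this bug, it should return one row per dbt run.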
Expected behavior
I would expect the CLUSTER BY operation not to run when the cluster columns are not changed.
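To illustrate the expected behavior, here is a minimal sketch that emits `ALTER TABLE ... CLUSTER BY` only when the desired columns differ from the current ones. This is not how dbt-databricks is implemented. The helper names are hypothetical, the current columns are assumed to have been read from table metadata beforehand (the extra query mentioned earlier in the thread), and column names are assumed to be case-insensitive.

```python
def normalize_columns(columns):
    """Turn a comma-separated string or a list into a clean list of column names."""
    if isinstance(columns, str):
        columns = columns.split(",")
    # Assumption: column names compare case-insensitively; backticks are stripped.
    return [c.strip().strip("`").lower() for c in columns if c.strip()]


def cluster_by_statement(table, current, desired):
    """Return the ALTER statement to run, or None when nothing changed."""
    current_cols = normalize_columns(current)
    desired_cols = normalize_columns(desired)
    if current_cols == desired_cols:
        return None  # clustering is already as configured, so skip the ALTER
    if not desired_cols:
        # An empty list means clustering should be removed.
        return f"ALTER TABLE {table} CLUSTER BY NONE"
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(desired_cols)})"
```

With `current=["columnname"]` and `desired="columnname"`, the function returns `None`, so no "CLUSTER BY" entry would be written to the table history. The trade-off raised in the thread still applies: reading the current columns costs a metadata query on every run.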
Screenshots and log output
System information
The output of dbt --version:
The operating system you're using: Microsoft Windows 11 Enterprise
The output of python --version: Python 3.10.14
Additional context