databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0

Allow users to opt out of optimize calls on liquid-clustering/zordering #703

Open benc-db opened 5 months ago

benc-db commented 5 months ago

Describe the feature

Users want the capability to schedule their own optimize runs.

Describe alternatives you've considered

Currently dbt-databricks runs optimize after every call to merge in data, but some users would rather schedule it out of band.

Who will this benefit?

Users who update incremental tables frequently and want to batch optimize.

NodeJSmith commented 3 months ago

Isn't this handled by --vars '{DATABRICKS_SKIP_OPTIMIZE: true}'? If we create another way to do it I'd be all for it, as the current vars argument is a pain, especially in a Databricks workflow with a dbt task, but functionality-wise I thought we had this already.

benc-db commented 3 months ago

Oh yeah! I had forgotten that existed, but I do think it would be better to opt out on the config level.

Bazsy commented 2 months ago

I support this being on the config level, especially as Databricks now offers predictive optimization, which runs optimize automatically as needed.

mmansikka commented 1 month ago

Config level would be a lot easier for larger projects running many models with different requirements and/or writing to different catalogs where predictive optimization is enabled on only some of them.
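
To make the request concrete, here is a minimal sketch of how a per-model opt-out could take precedence over the existing project variable. It is not the adapter's implementation: the config key `skip_optimize` and the function `should_run_optimize` are hypothetical names invented for illustration, and only `DATABRICKS_SKIP_OPTIMIZE` comes from this thread.

```python
from typing import Any, Mapping

# DATABRICKS_SKIP_OPTIMIZE is the project variable mentioned in this thread.
VAR_NAME = "DATABRICKS_SKIP_OPTIMIZE"
# Hypothetical: "skip_optimize" is an assumed model-level config key, not an existing one.
CONFIG_KEY = "skip_optimize"


def should_run_optimize(
    model_config: Mapping[str, Any], project_vars: Mapping[str, Any]
) -> bool:
    """Return True if optimize should run after the merge for this model."""
    # A model-level setting, when present, wins over the project-wide variable.
    if CONFIG_KEY in model_config:
        return not bool(model_config[CONFIG_KEY])
    # Otherwise fall back to the variable; the default is to run optimize.
    return not bool(project_vars.get(VAR_NAME, False))
```

With this ordering, a project could keep optimize on by default and switch it off only for models that write to catalogs where predictive optimization is enabled, or do the reverse by setting the variable globally and re-enabling optimize on selected models.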