databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0
226 stars 119 forks source link

Execute adding column comments/constraints as a single query #845

Open jsingh-cw opened 1 day ago

jsingh-cw commented 1 day ago

Currently, if I provide column comments or NOT NULL constraints for 100 columns in a model, dbt-databricks adapter ends up running 200 separate queries resulting in significantly long total model run time. Ideally, all comments should be added via single ALTER query and constraints should be added via second ALTER query.

This will benefit model run time.

Please see attached log to see how individual SQL queries added to the overall model runtime. Table creation/data load takes less than 30 seconds but column comments and NOT NULL constraints are added to one column per SQL query and takes ~7 mins to complete.

debug_run_86601.log

image

benc-db commented 12 hours ago

I'm working on this as we speak. There is no bulk alter for column comments, but I'm reshaping materializations to set comments at create time. This means separating create and insert of data (because you can't set comments with 'create table as select' :), but given how much users have complained about comment performance, I expect it will be a net win.