dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0
405 stars 227 forks

Ct 1873/support insert overwrite #700

Closed VersusFacit closed 1 year ago

VersusFacit commented 1 year ago

resolves #600 Closes #430

Description

The insert_overwrite incremental strategy can be used with Delta tables, so we no longer need to raise an exception saying that it is unsupported. The test case that treats it as an invalid strategy needs to be changed into a positive example of the functionality at work.
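For reference, a minimal sketch of the kind of model configuration this change is meant to allow. The model name, the upstream `events` model, and the `date_day` partition column are hypothetical, chosen only for illustration; the config keys are the standard dbt-spark ones.

```sql
-- models/events_by_day.sql (hypothetical model name)
{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    file_format='delta',
    partition_by=['date_day']
  )
}}

select
    date_day,
    count(*) as event_count
from {{ ref('events') }}  -- 'events' is a hypothetical upstream model
{% if is_incremental() %}
-- On incremental runs, only rebuild the most recent partitions
where date_day >= date_add(current_date(), -1)
{% endif %}
group by date_day
```

Before this change, combining `incremental_strategy='insert_overwrite'` with `file_format='delta'` raised a validation error, which is the exception referred to above.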

This PR looks ready to go. I've moved the community PR to this branch so that we avoid the CI issue that was blocking their process and caused this to fall onto the backlog.

Checklist

VersusFacit commented 1 year ago

It appears that, for whatever reason, you cannot use these Delta overwrite strategies while the database is configured with an access control list:

spark.databricks.acl.dfAclsEnabled true

When I flip this to false, Delta overwrites work!! But grants STOP working:

org.apache.spark.SparkException: Trying to perform permission action on Hive Metastore

Likewise, if I turn it back on, I run into the really weird, cryptic errors that I’ve been dealing with for a while on the database cluster for Delta dynamic overwrites.

I only found this out through an obscure unanswered question on the Databricks forum, plus a dbt user who was struggling to solve this in the community Slack five months ago with no real resolution.

This may be a bug on Databricks' side, or something about the implementation of a Databricks environment may prevent these options from being enabled at the same time. Either way, as far as I can tell, what is happening is not documented. If there ends up being a solution, I'd love to push that in and get this merged in the future.


Note: setting the partition overwrite option to "dynamic" in the cluster setup does not resolve this issue.
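For anyone trying to reproduce this, a rough sketch of how the two settings could be inspected from a SQL session. This assumes the "partition overwrite option" is Spark's `spark.sql.sources.partitionOverwriteMode`; the exact key is not named in this thread, so treat that as an assumption. The ACL flag is normally set at the cluster level and may not be changeable per session.

```sql
-- Show the current value of the ACL flag mentioned above
SET spark.databricks.acl.dfAclsEnabled;

-- Show the current partition overwrite mode
-- (assumed key; the thread does not name it explicitly)
SET spark.sql.sources.partitionOverwriteMode;

-- Session-level equivalent of the cluster setting that did not help
SET spark.sql.sources.partitionOverwriteMode = dynamic;
```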

nssalian commented 1 year ago

The solution looks alright from glancing through. I wonder if we should document those parameters for users somewhere so they are able to troubleshoot if needed.

VersusFacit commented 1 year ago

I'll clean up the commit history before I merge