dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0

[ADAP-405] [Feature] Use server_side_parameters as SparkSession config in Spark "session" mode #690

Closed alarocca-apixio closed 1 year ago

alarocca-apixio commented 1 year ago

Is this your first time submitting a feature request?

Describe the feature

The server_side_parameters setting already exists in the Spark connection configuration, but it is currently ignored in Spark "session" mode. We can pass these parameters into SessionConnectionWrapper and then, when building the SparkSession, apply each one as a Spark config option:

for k, v in server_side_parameters.items():
    builder = builder.config(k, v)
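To make the loop above concrete, here is a minimal, self-contained sketch. It is not the actual dbt-spark implementation: `apply_server_side_parameters` and `FakeBuilder` are hypothetical names introduced only for illustration, and `FakeBuilder` stands in for PySpark's `SparkSession.builder` so the example runs without a Spark installation. The parameter values are arbitrary examples.

```python
from typing import Any, Dict, Optional


class FakeBuilder:
    """Hypothetical stand-in for SparkSession.builder (assumption, for illustration only)."""

    def __init__(self) -> None:
        self.options: Dict[str, Any] = {}

    def config(self, key: str, value: Any) -> "FakeBuilder":
        # Mirrors the chaining style of SparkSession.builder.config(key, value):
        # record the option and return the builder.
        self.options[key] = value
        return self


def apply_server_side_parameters(
    builder: Any, server_side_parameters: Optional[Dict[str, Any]] = None
) -> Any:
    """Apply every entry of server_side_parameters to the builder as a config option."""
    # Treat a missing mapping as "no extra configuration".
    for key, value in (server_side_parameters or {}).items():
        builder = builder.config(key, value)
    return builder


# Example values only; any Spark option could be supplied here.
params = {"spark.sql.shuffle.partitions": "8", "spark.app.name": "dbt-session"}
builder = apply_server_side_parameters(FakeBuilder(), params)
print(builder.options)
```

With PySpark installed, the same helper could presumably be given `SparkSession.builder` in place of `FakeBuilder()`, followed by `.getOrCreate()` to create the session with those options applied.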

Describe alternatives you've considered

We've tried Spark via Thrift, but this doesn't support all of our necessary use cases, such as configuring KMS encryption for the jobs.

Who will this benefit?

This will let users control how the SparkSession is created by injecting any desired parameters into it.

Are you interested in contributing this feature?

Yes

Anything else?

No response

Fleid commented 1 year ago

Hey @alarocca-apixio, could you please check this discussion? I'm trying to gather all the threads on that topic in one place. Please let me know if the plan works for you before we decide whether to move forward with this specific issue.

Fokko commented 1 year ago

@alarocca-apixio @Fleid Any progress on this? I would love to see this added. Happy to help with this one; I just commented on the PR.

Fleid commented 1 year ago

I asked @JCZuurmond to weigh in on the topic, but we haven't had time to catch up. Any update on your side, Cor?

JCZuurmond commented 1 year ago

See PR for update 👆