databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0

Databricks OAuth secret not working as expected in 1.8 #761

Open markus-sh-lftt opened 3 months ago

markus-sh-lftt commented 3 months ago

Describe the bug

We sometimes run dbt as a service principal from the command line. Our profiles.yml then looks something like this:

project:
  outputs:
    o1:
      auth_type: oauth
      client_id: [sp_id]
      client_secret: "{{ env_var('DB_CLIENT_SECRET') }}"
      type: databricks
      [...]

The value of DB_CLIENT_SECRET has been a Databricks OAuth secret, as per the docs. This worked fine with dbt-databricks>=1.7, but after upgrading to 1.8(.5) it fails with:

Runtime Error
  invalid_client: AADSTS7000215: Invalid client secret provided. Ensure the secret being sent in the request is the client secret value, not the client secret ID, for a secret added to app '[sp_id]'.

And if I instead supply the client secret that I set up in Microsoft Entra ID it works again.

What is the intended usage? Do I need to switch from a Databricks OAuth secret to the client secret from Microsoft Entra ID?

benc-db commented 3 months ago

This is a legit bug that we aim to fix shortly, but what's surprising to me is the idea that it was introduced in 1.8...I'm reasonably sure we had this same bug in 1.7. Which version of 1.7?

markus-sh-lftt commented 3 months ago

Yeah you're right, it's there in 1.7 too it seems. But using the versions I happened to have installed I'm not able to reproduce the issue:

❯ dbt --version
Core:
  - installed: 1.7.13
  - latest:    1.8.5  - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - databricks: 1.7.3 - Update available!
  - spark:      1.7.1 - Update available!

Using these I am able to reproduce it even on dbt-databricks 1.7:

❯ dbt --version
Core:
  - installed: 1.7.18
  - latest:    1.8.5  - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - databricks: 1.7.17 - Update available!
  - spark:      1.7.1  - Update available!

I tried a few different combinations of dbt-databricks and dbt-core, and you probably know better than I do, but I think it starts to break after your 1.7.3. Not all versions of dbt-core 1.7 work, though, even though dbt-spark says ">=1.7.0,<1.8.0", so it's hard for me to see a pattern :)

But also: What is the intended behavior? Is it the Databricks OAuth secrets that I should be using whenever this is fixed?

benc-db commented 3 months ago

When it's fixed, you will have the option of using either a Databricks OAuth secret or an AAD secret. The reason it could work in old versions of 1.7.x is that we didn't use to pin our version of the Databricks SDK. We started pinning when the SDK updated and broke our tests. It broke them because we were running on Azure with an AAD secret, and the SDK started using the Databricks OAuth secret by default. So, on an old 1.7, you can install Databricks SDK 0.18.0 or higher, which works with a Databricks OAuth secret.
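
Since `dbt --version` does not report the Databricks SDK version, a small script can show which SDK a given environment actually resolved. The following is a minimal sketch, not part of dbt-databricks: the 0.18.0 threshold is taken from the comment above and has not been checked against the SDK release notes, the PyPI package name `databricks-sdk` is assumed, and the helper names are hypothetical.

```python
from importlib import metadata

# Threshold quoted in the comment above; treat it as an assumption.
MIN_SDK_FOR_DATABRICKS_OAUTH_SECRET = (0, 18, 0)


def parse_version(version: str) -> tuple:
    """Turn '0.18.0' into (0, 18, 0); a non-numeric suffix such as 'rc1' is ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def sdk_meets_threshold(version: str) -> bool:
    """True if the version is at or above the assumed 0.18.0 threshold."""
    return parse_version(version) >= MIN_SDK_FOR_DATABRICKS_OAUTH_SECRET


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the three versions that matter for this issue.
    for name in ("dbt-core", "dbt-databricks", "databricks-sdk"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
    sdk = installed_version("databricks-sdk")
    if sdk is not None:
        print("SDK >= 0.18.0:", sdk_meets_threshold(sdk))
```

Running this in the same virtual environment as dbt, alongside `dbt --version`, gives the full version combination to include when reporting whether the Databricks OAuth secret works.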

benc-db commented 3 months ago

So, long story short, I'm working on some stability work right now (ensuring that new versions of the Databricks runtime don't break dbt users), and the next thing I'll work on after that is getting both ways of doing OAuth on Azure to work.

markus-sh-lftt commented 1 week ago

Hey @benc-db when do you think this issue will be addressed?

benc-db commented 1 week ago

Hi, apologies for the long delay. We had been waiting on a fix in the Databricks SDK, which I believe has landed. Now I'm waiting for @eric-wang-1990 (my OAuth expert :) ) to free up so he can finish off the work on our side.