databricks / cli

cli commands unexpectedly inferring host from bundle #1358

Open brookpatten opened 5 months ago

brookpatten commented 5 months ago

Describe the issue

When using the CLI from a folder that contains a bundle, the CLI unexpectedly infers host configuration from the bundle, even when a default profile is set in .databrickscfg.

Steps to reproduce the behavior

Assume an empty Python template bundle with a dev target like this:

bundle:
  name: my_project

include:
  - resources/*.yml

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://wrongurl.azuredatabricks.net

Consider a CI/CD script that does something like this:

databricks configure --profile foo --host https://correcturl.azuredatabricks.net <<< "pat#####"
databricks clusters list -p foo

This yields: Error: cannot resolve bundle auth configuration: config host mismatch: profile uses host https://correcturl.azuredatabricks.net/, but CLI configured to use https://wrongurl.azuredatabricks.net/

To make this work, you have to add a bundle target and make it match, which is unexpected:

targets:
  correcturl:
    mode: development
    workspace:
      host: https://correcturl.azuredatabricks.net

databricks configure --profile foo --host https://correcturl.azuredatabricks.net <<< "pat#####"
databricks clusters list -t correcturl
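
Another workaround, implied by the directory-based inference but not verified in this thread, is to run the command from a directory outside the bundle root, so there is no databricks.yml for the CLI to pick up:

# untested sketch: with no bundle config in or above the working directory,
# the CLI should fall back to the profile from .databrickscfg
cd "$(mktemp -d)"
databricks clusters list -p foo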

Expected Behavior

A few different options; I'm not sure which is "correct":

1) I spent quite a lot of time trying to figure out how the "wrongurl" host was getting picked up by the CLI. Maybe I'm dumb, but I did not expect commands like databricks clusters to infer config from the bundle.

2) If the CLI is inferring config from the bundle, I think it should say so in its output, so the source is clear. I spent way too much time debugging .databrickscfg and environment variables trying to figure out where the host was coming from.

3) Maybe it would make sense to embed more "infer from bundle" commands within databricks bundle and leave databricks <command> to stick to .databrickscfg / env vars, etc.

Actual Behavior

The CLI inferred the host from the bundle found in the current directory.

OS and CLI version

Tried on both v0.209.1 and v0.217.1

Is this a regression?

I didn't check, but I suspect this behavior changed when bundles were added.

Some additional background: in this case the CI/CD is deploying a bundle, but as part of the deployment we connect to another workspace and use databricks-connect to execute some DDL updates to keep a legacy Hive metastore up to date. This connect step was running in a separate GitHub Actions job, isolated from the bundle deploy, so it was very unexpected that the CLI was still picking up the bundle host.

andrewnester commented 5 months ago

Hi! Though confusing, this behaviour is by design when using bundles, to optimise the DABs experience, e.g. creating clusters for your bundle jobs, or listing pipeline or job runs for your bundle, etc.
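
For example, inside a bundle directory, non-bundle commands accept the bundle's target via --target/-t and resolve the workspace from it (using the dev target from the repro above; a later comment in this thread shows the same pattern working):

# resolves host and auth from the bundle's "dev" target
databricks clusters list --target dev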

I spent quite a lot of time trying to figure out how the "wrongurl" host was getting picked up by the CLI.... If it is inferring config from the bundle, I think it should output that it is doing so, so it's clear.

You can use the databricks auth describe command to help figure out where certain config values are coming from. In this case it will indicate that the host value comes from the bundle.
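
Run inside the bundle directory from the repro above, its output would look something like this (values illustrative; the format matches the real output shown in a later comment):

$ databricks auth describe
Host: https://wrongurl.azuredatabricks.net
Authenticated with: pat
-----
Current configuration:
  ✓ host: https://wrongurl.azuredatabricks.net (from bundle)
  ✓ token: ******** (from ~/.databrickscfg config file)
  ✓ auth_type: pat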

brookpatten commented 5 months ago

I get what you're saying, but when I'm specifying -p profile there is no ambiguity about what I'm trying to do. It's unintuitive that the CLI prefers inferred bundle config over an explicit flag. Explicit flags should take precedence.

brookpatten commented 5 months ago

One additional point which may contribute to confusion: the Databricks VS Code extension does not appear to follow this same logic of inferring from the bundle. If behaviour were consistent across tools it would make more sense.

NodeJSmith commented 4 months ago

FWIW, I've raised a similar issue after spending a considerable amount of time troubleshooting a similar auth problem.

I do not agree that the current behavior is optimal: if I provide a command line argument, it should be used, regardless of the environment. I can't always control the environment, but I can control what arguments I provide to the CLI binary. If I can't trust that the arguments I'm providing to the CLI are being used, it's hard to ever feel confident using the CLI.

In fact, I currently have a shell function that just unsets all of my Databricks variables that I call whenever I am having issues with the CLI auth. I'm about to make another one to delete databricks.yml and the .databricks directory created by the VS Code extension so I can switch profiles when in a project directory. I appreciate the idea behind inferring values from the environment, but it shouldn't come at the expense of what the user directly provides.
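
A minimal sketch of such a reset function, assuming the standard DATABRICKS_* auth variables (trim the list to your environment):

# unset the common Databricks auth variables so the CLI falls back to
# .databrickscfg; in bash, `unset "${!DATABRICKS@}"` clears every
# variable whose name starts with DATABRICKS in one go
databricks_reset() {
  unset DATABRICKS_HOST DATABRICKS_TOKEN DATABRICKS_CONFIG_PROFILE \
        DATABRICKS_CONFIG_FILE DATABRICKS_AUTH_TYPE \
        DATABRICKS_METADATA_SERVICE_URL
}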

brookpatten commented 4 months ago

I currently have a shell function that just unsets all of my Databricks variables that I call whenever I am having issues with the CLI auth

Same.

The fact that these even exist is a good indicator that the current behavior is unintuitive.

dernat71 commented 1 month ago

+1! I manage my authentication via environment variables and now, in a bundle-enabled project, even simple commands like databricks clusters list fail, saying that I need to specify a target (Error: please specify target). Bundles are a nice addition, but they shouldn't interfere with your authentication context.
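
A minimal reproduction of what this describes might look like (host and token values are placeholders):

# auth supplied purely via environment variables
export DATABRICKS_HOST=https://adb-0000000000000000.0.azuredatabricks.net
export DATABRICKS_TOKEN=dapi-placeholder
# run from inside a bundle-enabled project
databricks clusters list
# Error: please specify target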

This becomes very messy and counter-intuitive when coupled with databricks-connect. What should I use now: profiles, targets, ...?

NodeJSmith commented 1 month ago

@andrewnester is there any expectation that Databricks will be working on improving this user experience?

Also, is this really a good item to mark as "good first issue"? This is more of a design decision about how authentication should be architected; I don't see how anyone could move on it without Databricks deciding how auth should work. And if the decision is that it shouldn't change, that should be stated and these issues probably closed, and we'll all just continue to hack at the CLI and SDK to make them work in a semi-intuitive way on our own.

andrewnester commented 1 month ago

@NodeJSmith sorry for not sharing the updates here. We discussed this internally, and the expected order / flow of auth that @brookpatten pointed out makes sense; that's what we agreed internally to implement. So yes, we aim to improve the experience. Unfortunately, it has fallen off our radar at the moment due to other tasks that took priority. I can't give an ETA right now, but considering the impact of this issue we'll try to prioritise it.

Once again thanks for the feedback and discussion.

cc @pietern

gardnmi commented 4 weeks ago

Here is my user experience, if it helps with prioritization. It's very frustrating and, to me, feels broken. Having different behavior depending on bundles and the VS Code extension is not a pleasant experience, and it's why a lot of people are coming up with hacks to work around these issues.

My setup has a bundle project and two profiles (dev, staging) in a .databrickscfg file, tested with and without the Databricks VS Code extension.
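
A sketch of that config file (hosts and tokens are placeholders):

[dev]
host  = https://adb-dev-0000000000000000.0.azuredatabricks.net
token = dapi-dev-placeholder

[staging]
host  = https://adb-staging-0000000000000000.0.azuredatabricks.net
token = dapi-staging-placeholder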

Project Contains a Bundle

$ databricks auth describe

Host: xxxx
User: xxxx
Authenticated with: pat
-----
Current configuration:
  ✓ host: xxxx (from bundle)
  ✓ token: ******** (from /workspaces/xxxx/.databricks/.databrickscfg config file)
  ✓ profile: dev (from bundle)
  ✓ config_file: /workspaces/xxxx/.databricks/.databrickscfg (from DATABRICKS_CONFIG_FILE environment variable)
  ✓ auth_type: pat

$ databricks clusters list --profile dev ✅ works
$ databricks clusters list --target dev ✅ works
$ databricks bundle validate --target dev ✅ works

$ databricks clusters list --profile staging ❌ doesn't work
$ databricks clusters list --target staging ✅ works
$ databricks bundle validate --target staging ✅ works

Same Project, but with the VS Code Databricks Extension Installed and Authenticated

$ databricks auth describe

Host: xxxx
User: xxxx
Authenticated with: metadata-service
-----
Current configuration:
  ✓ host: xxxx (from bundle)
  ✓ metadata_service_url: ******** (from DATABRICKS_METADATA_SERVICE_URL environment variable)
  ~ token: ******** (from /workspaces/xxxx/.databricks/.databrickscfg config file, not used for auth type metadata-service)
  ✓ profile: dev (from bundle)
  ✓ config_file: /workspaces/xxxx/.databricks/.databrickscfg (from DATABRICKS_CONFIG_FILE environment variable)
  ✓ auth_type: metadata-service (from DATABRICKS_AUTH_TYPE environment variable)

$ databricks clusters list --profile dev ✅ works
$ databricks clusters list --target dev ✅ works
$ databricks bundle validate --target dev ✅ works

$ databricks clusters list --profile staging ❌ doesn't work
$ databricks clusters list --target staging ❌ doesn't work
$ databricks bundle validate --target staging ❌ doesn't work

zaplapl commented 3 weeks ago

Me too; I'm upset that this is the current behaviour.

Please change it.

MerreM commented 3 weeks ago

I am trying to run a Databricks bundle as a service principal on Azure, and cannot for the life of me force the databricks bundle run stage to use the auth reported by databricks auth describe.

Setting -p <profile-name> had it complaining about conflicts.