Open Pim-Mostert opened 3 months ago
Databricks CLI depends on Databricks Go SDK which recently added support for OIDC, see this:
The configuration you need to provide, though, is ACTIONS_ID_TOKEN_REQUEST_URL and ACTIONS_ID_TOKEN_REQUEST_TOKEN.
Please try changing your GitHub Actions setup to use these variables and see if it works.
@andrewnester Thanks for your reply. I'm not using GitHub Actions, though, but Azure DevOps Pipelines. It appears your solution applies specifically to GitHub Actions (see e.g. https://library.tf/providers/microsoft/azuredevops/latest/docs/guides/authenticating_service_principal_using_an_oidc_token). For Azure Pipelines, the above page mentions the variables ARM_TENANT_ID, ARM_CLIENT_ID, and ARM_OIDC_TOKEN. These are indeed the ones I have tried, and they do not work.
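To check which of the two sets of variables a pipeline step can actually see, a small diagnostic along these lines may help. This is only a sketch: the function name report_oidc_env is hypothetical, the variable names are the ones mentioned in this thread, and a variable being set says nothing about whether the Go SDK will accept it.

```shell
# Hypothetical helper: report which OIDC-related environment variables are set.
# It only inspects the environment; it does not attempt any authentication.
# The body runs in a subshell so the loop variable does not leak to the caller.
report_oidc_env() (
  for name in ACTIONS_ID_TOKEN_REQUEST_URL ACTIONS_ID_TOKEN_REQUEST_TOKEN \
              ARM_TENANT_ID ARM_CLIENT_ID ARM_OIDC_TOKEN; do
    # printenv only sees exported variables; an empty value counts as missing.
    if [ -n "$(printenv "$name")" ]; then
      echo "$name=set"      # never print the value itself, tokens are secrets
    else
      echo "$name=missing"
    fi
  done
)
```

Calling report_oidc_env in the same step that runs the databricks command prints one line per variable, which makes it easy to see whether the GitHub Actions variables, the ARM_ variables, or neither are present.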
Ah, indeed, I see. In this case, the Go SDK we rely on for authentication does not yet support OIDC for Azure Pipelines. I'm moving this issue to the Go SDK as a feature request.
It also seems to be related to this feature request: https://github.com/databricks/databricks-sdk-go/issues/495
@Pim-Mostert What is surprising is that CLI commands work for you. Could you run this command with the --log-level TRACE flag and provide the output?
databricks experiments list-experiments --log-level TRACE
@andrewnester Sure:
2024-08-27T11:01:55.1358321Z ------------- List experiments -------------
2024-08-27T11:01:55.1480761Z 11:01:55 INFO start pid=1874 version=0.227.0 args="databricks, experiments, list-experiments, --log-level, TRACE"
2024-08-27T11:01:55.1486300Z 11:01:55 DEBUG Found bundle root at /home/vsts/work/1/s (file /home/vsts/work/1/s/databricks.yml) pid=1874
2024-08-27T11:01:55.1489766Z 11:01:55 DEBUG Apply pid=1874 mutator=load
2024-08-27T11:01:55.1494173Z 11:01:55 INFO Phase: load pid=1874 mutator=load
2024-08-27T11:01:55.1497169Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq
2024-08-27T11:01:55.1501801Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=EntryPoint
2024-08-27T11:01:55.1508168Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=scripts.preinit
2024-08-27T11:01:55.1512861Z 11:01:55 DEBUG No script defined for preinit, skipping pid=1874 mutator=load mutator=seq mutator=scripts.preinit
2024-08-27T11:01:55.1516555Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=ProcessRootIncludes
2024-08-27T11:01:55.1520538Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=ProcessRootIncludes mutator=seq
2024-08-27T11:01:55.1524682Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=VerifyCliVersion
2024-08-27T11:01:55.1528227Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=EnvironmentsToTargets
2024-08-27T11:01:55.1531938Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=InitializeVariables
2024-08-27T11:01:55.1541062Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=DefineDefaultTarget(default)
2024-08-27T11:01:55.1544626Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=LoadGitDetails
2024-08-27T11:01:55.1550812Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=PythonMutator(load)
2024-08-27T11:01:55.1554405Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=validate:unique_resource_keys
2024-08-27T11:01:55.1558241Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=SelectDefaultTarget
2024-08-27T11:01:55.1561794Z 11:01:55 DEBUG Apply pid=1874 mutator=load mutator=seq mutator=SelectDefaultTarget mutator=SelectTarget(dev)
2024-08-27T11:01:55.1566466Z 11:01:55 TRACE Loading config via environment pid=1874 sdk=true
2024-08-27T11:01:55.1569451Z 11:01:55 TRACE Loading config via resolve-profile-from-host pid=1874 sdk=true
2024-08-27T11:01:55.1573469Z 11:01:55 TRACE Attempting to configure auth: pat pid=1874 sdk=true
2024-08-27T11:01:55.1575987Z 11:01:55 TRACE Attempting to configure auth: basic pid=1874 sdk=true
2024-08-27T11:01:55.1579964Z 11:01:55 TRACE Attempting to configure auth: oauth-m2m pid=1874 sdk=true
2024-08-27T11:01:55.1582931Z 11:01:55 TRACE Attempting to configure auth: databricks-cli pid=1874 sdk=true
2024-08-27T11:01:55.1586293Z 11:01:55 DEBUG Running command: /usr/local/bin/databricks auth token --host https://adb-XXX.azuredatabricks.net pid=1874 sdk=true
2024-08-27T11:01:55.1708113Z 11:01:55 TRACE Attempting to configure auth: metadata-service pid=1874 sdk=true
2024-08-27T11:01:55.1708871Z 11:01:55 TRACE Attempting to configure auth: github-oidc-azure pid=1874 sdk=true
2024-08-27T11:01:55.1709975Z 11:01:55 DEBUG Missing cfg.ActionsIDTokenRequestURL, likely not calling from a Github action pid=1874 sdk=true
2024-08-27T11:01:55.1710641Z 11:01:55 TRACE Attempting to configure auth: azure-msi pid=1874 sdk=true
2024-08-27T11:01:55.1711667Z 11:01:55 TRACE Attempting to configure auth: azure-client-secret pid=1874 sdk=true
2024-08-27T11:01:55.1712348Z 11:01:55 TRACE Attempting to configure auth: azure-cli pid=1874 sdk=true
2024-08-27T11:01:55.1713620Z 11:01:55 DEBUG Running command: az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d --output json --tenant XXX pid=1874 sdk=true
2024-08-27T11:01:55.8769849Z 11:01:55 INFO Refreshed OAuth token for 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d for tenant XXX from Azure CLI, which expires on 2024-08-27 12:01:54.000000 pid=1874 sdk=true
2024-08-27T11:01:55.8777969Z 11:01:55 DEBUG Running command: az account get-access-token --resource https://management.core.windows.net/ --output json --tenant XXX pid=1874 sdk=true
2024-08-27T11:01:56.3681975Z 11:01:56 INFO Refreshed OAuth token for https://management.core.windows.net/ for tenant XXX from Azure CLI, which expires on 2024-08-27 12:01:51.000000 pid=1874 sdk=true
2024-08-27T11:01:56.3688434Z 11:01:56 INFO Using Azure CLI authentication with AAD tokens pid=1874 sdk=true
2024-08-27T11:01:57.6325404Z 11:01:57 DEBUG GET /api/2.0/mlflow/experiments/list
2024-08-27T11:01:57.6326186Z < HTTP/2.0 200 OK
2024-08-27T11:01:57.6326679Z < {
2024-08-27T11:01:57.6326983Z < "experiments": [
2024-08-27T11:01:57.6327560Z < {
REDACTED
2024-08-27T11:01:57.6433239Z < "... (5 additional elements)"
2024-08-27T11:01:57.6433371Z < ]
2024-08-27T11:01:57.6433481Z < } pid=1874 sdk=true
2024-08-27T11:01:57.6433605Z [
2024-08-27T11:01:57.6433699Z {
2024-08-27T11:01:57.6433962Z "artifact_location":XXX
2024-08-27T11:01:57.6434160Z "creation_time": XXX
2024-08-27T11:01:57.6434310Z "experiment_id": XXX
2024-08-27T11:01:57.6434477Z "last_update_time": XXX
2024-08-27T11:01:57.6434691Z "lifecycle_stage": XXX
2024-08-27T11:01:57.6435001Z "name": XXX
2024-08-27T11:01:57.6435201Z "tags": [
2024-08-27T11:01:57.6435303Z {
2024-08-27T11:01:57.6435453Z "key": "mlflow.experiment.sourceName",
2024-08-27T11:01:57.6435772Z "value": XXX
2024-08-27T11:01:57.6435968Z },
2024-08-27T11:01:57.6436080Z {
2024-08-27T11:01:57.6436193Z "key": "mlflow.ownerId",
2024-08-27T11:01:57.6436341Z "value": "587479253565293"
2024-08-27T11:01:57.6436459Z },
2024-08-27T11:01:57.6436639Z {
2024-08-27T11:01:57.6436756Z "key": "mlflow.ownerEmail",
2024-08-27T11:01:57.6436923Z "value": "XXX"
2024-08-27T11:01:57.6437069Z },
2024-08-27T11:01:57.6437166Z {
2024-08-27T11:01:57.6437299Z "key": "mlflow.experimentType",
2024-08-27T11:01:57.6437433Z "value": "NOTEBOOK"
2024-08-27T11:01:57.6437559Z }
2024-08-27T11:01:57.6437654Z ]
2024-08-27T11:01:57.6437763Z },
... REDACTED
2024-08-27T11:01:57.6483974Z ]
2024-08-27T11:01:57.6484127Z 11:01:57 INFO completed execution pid=1874 exit_code=0
Ah, I see. CLI auth works because it eventually falls back to the azure-cli auth type rather than the OIDC one, so it might not be what you expect anyway.
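The order of attempts is easier to see when the trace is filtered down to the auth lines. The sketch below assumes the two line shapes visible in the trace above ("Attempting to configure auth: <type>" and "INFO Using ... pid=..."); the function name summarize_auth_chain is hypothetical, and other CLI versions may log differently.

```shell
# Hypothetical helper: summarize the auth chain from a TRACE log read on stdin.
# Assumes the line shapes seen in the trace in this thread:
#   "... TRACE Attempting to configure auth: <type> pid=..."
#   "... INFO Using <description> pid=..."
summarize_auth_chain() {
  sed -n \
    -e 's/.*Attempting to configure auth: \([^ ]*\).*/tried: \1/p' \
    -e 's/.* INFO \(Using .*\) pid=.*/selected: \1/p'
}
```

With the trace saved to a file (trace.log is a placeholder name), `summarize_auth_chain < trace.log` lists each auth type that was tried, followed by the one that was finally selected.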
So to summarise:
The fact that the CLI can authenticate using the azure-cli type but bundles fail to do so is a separate issue, and might be related to a miss on the bundles side where we don't pass all the necessary env variables. If this is an issue for you, please feel free to open a separate ticket for it in the Databricks CLI repo.
Thank you!
It's not an issue for me right now, but I expect it will be in the near future (when my company disables the old service connection). I've opened a new issue: https://github.com/databricks/cli/issues/1722
Please let me know if you need more information.
Thanks!
Describe the issue
I want to deploy a Databricks Asset Bundle from an Azure Pipeline using databricks cli. While authentication for the cli itself seems to work, the actual deployment does not. It appears that the underlying Terraform provider is not able to authenticate.
The issue in particular appears to arise from our DevOps service connection. The service connection is configured for Workload Identity Federation. When I try an old service connection that authenticates using client credentials, the deployment succeeds.
I suspect the bug may be fixed by simply upgrading the version of Terraform that databricks cli uses under the hood. Currently it uses Terraform 1.5.5. Newer versions of Terraform seem to support the Workload Identity Federation flow. See https://developer.hashicorp.com/terraform/language/settings/backends/azurerm, but note how version 1.5.x of that same page makes no mention of Workload Identity Federation.
Relevant documentation:
Configuration
I have tried various combinations of the ARM_ environment variables above, but I couldn't find a working combination.
What did work was using a service principal service connection, in combination with:
Steps to reproduce the behavior
Expected Behavior
The deployment of the asset bundle should succeed.
Actual Behavior
The following error appears in the pipeline's log:
Note that the listing of experiments works fine:
OS and CLI version
Output by the Azure pipeline:
Databricks CLI: v0.227.0
OS: Ubuntu (Microsoft-hosted agent, latest version)
Is this a regression?
I don't know, I'm new to Databricks.
Debug Logs
See attachment. debug_logs.txt