databricks / databricks-sdk-py

Databricks SDK for Python (Beta)
https://databricks-sdk-py.readthedocs.io/
Apache License 2.0

`account_client.workspaces.list()` fails with `Databricks - Sign in` #332

Open stefanringeis opened 1 year ago

stefanringeis commented 1 year ago

Description

Listing workspaces using the AccountClient fails.

Reproduction

import logging
from databricks.sdk import AccountClient

logging.basicConfig(level=logging.DEBUG)

account_client = AccountClient(
    auth_type='azure-cli',
    host='https://accounts.azuredatabricks.net/',
    account_id='<redacted>'
)

# Works
# print(account_client.workspaces.get('<redacted>'))

# Fails
print(account_client.workspaces.list())

Expected behavior

Return the list of all workspaces.

Debug Logs

DEBUG:databricks.sdk:Ignoring pat auth, because azure-cli is preferred
DEBUG:databricks.sdk:Ignoring basic auth, because azure-cli is preferred
DEBUG:databricks.sdk:Ignoring metadata-service auth, because azure-cli is preferred
DEBUG:databricks.sdk:Ignoring oauth-m2m auth, because azure-cli is preferred
DEBUG:databricks.sdk:Ignoring azure-client-secret auth, because azure-cli is preferred
DEBUG:databricks.sdk:Attempting to configure auth: azure-cli
INFO:databricks.sdk:Using Azure CLI authentication with AAD tokens
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): accounts.azuredatabricks.net:443
DEBUG:urllib3.connectionpool:https://accounts.azuredatabricks.net:443 "GET /api/2.0/accounts/<redacted>/workspaces HTTP/1.1" 303 0
DEBUG:urllib3.connectionpool:https://accounts.azuredatabricks.net:443 "GET /login?account_id=<redacted>&next_url=%2Fapi%2F2.0%2Faccounts%2F<redacted>%2Fworkspaces HTTP/1.1" 200 438
DEBUG:databricks.sdk:GET /login?account_id=<redacted>&next_url=/api/2.0/accounts/<redacted>/workspaces
< 200 OK
< [non-JSON document of 727 bytes]
Traceback (most recent call last):
  File "/<redacted>/list_workspaces.py", line 16, in <module>
    print(account_client.workspaces.list())
  File "/<redacted>/.venv/lib/python3.10/site-packages/databricks/sdk/service/provisioning.py", line 2037, in list
    res = self._api.do('GET', f'/api/2.0/accounts/{self._api.account_id}/workspaces', headers=headers)
  File "/<redacted>/.venv/lib/python3.10/site-packages/databricks/sdk/core.py", line 1021, in do
    raise self._make_nicer_error(message=message) from None
databricks.sdk.core.DatabricksError: Databricks - Sign in

I have debugged the REST API call and this is the 200 response:

b'<!doctype html><html lang="en"><head><meta charset="utf-8"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="Databricks Sign in"/><title>Databricks - Sign in</title><script id="Cookiebot" src="https://consent.cookiebot.com/uc.js" data-cbid="459f54ba-f28b-4a56-ab47-7af5ef8b04b4" data-blockingmode="auto" type="text/javascript" defer="defer"></script><link rel="icon" href="/favicon.ico"><script defer="defer" src="/static/js/login.0e28aa5a.js"></script><link href="/static/css/login.fc121a26.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="login"></div></body></html>'

Other Information

Additional context

This CLI bug is related: https://github.com/databricks/cli/issues/579

mgyucht commented 1 year ago

Thank you for reporting this. This is an issue with the underlying API. I've followed up with the team responsible for this API to understand why 1) AAD authentication is not accepted and 2) the endpoint redirects you to the login page rather than responding with an error.

nfx commented 1 year ago

Azure Databricks doesn't support workspace listing.

stefanringeis commented 1 year ago

The official Azure REST API: https://docs.databricks.com/api/azure/account/introduction

@nfx Do you know if workspace listing in Azure Databricks is in development or do I need to reach out to Azure directly? I am working on the deployment automation around Unity Catalog and this API would be quite handy.

nfx commented 1 year ago

@stefanringeis when I last checked with the teams that own the account console APIs, a few weeks ago, there were no plans to support https://docs.databricks.com/api/account/workspaces/list for Azure in the near term.

To list Databricks workspaces on Azure, the only way is to use the Azure Resource Manager APIs. Note that one Azure Active Directory tenant maps to one Databricks account. The flow is:

  1. List all Azure Subscriptions that are available for the current tenant and get their IDs
  2. List all Azure Databricks Workspaces by supplying that Subscription ID
  3. Fetch the workspaceUrl and workspaceId from returned items

We're doing the same thing in downstream projects - https://github.com/databrickslabs/ucx/pull/264/files - and do not yet know if this particular workaround deserves to get into the SDK layer.
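
The three-step flow above can be sketched roughly as follows. This is a minimal illustration, not the SDK's or ucx's implementation: the helper names (`arm_get`, `paged`, `list_azure_databricks_workspaces`) are hypothetical, the `api-version` values are assumptions that should be checked against the current Azure Resource Manager reference, and the caller is assumed to already hold an ARM access token.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterator, List, Optional

ARM = "https://management.azure.com"
# Assumption: these api-version values are accepted by ARM; verify before use.
SUBSCRIPTIONS_API = "2020-01-01"
DATABRICKS_API = "2018-04-01"


def arm_get(token: str) -> Callable[[str], dict]:
    """Return a fetcher that GETs an ARM URL with a bearer token and parses JSON."""
    def fetch(url: str) -> dict:
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch


def paged(fetch: Callable[[str], dict], url: Optional[str]) -> Iterator[dict]:
    """Yield items from an ARM list response, following nextLink if present."""
    while url:
        page = fetch(url)
        yield from page.get("value", [])
        url = page.get("nextLink")


def list_azure_databricks_workspaces(fetch: Callable[[str], dict]) -> List[Dict[str, Optional[str]]]:
    workspaces = []
    # Step 1: list the subscriptions visible to the current identity.
    for sub in paged(fetch, f"{ARM}/subscriptions?api-version={SUBSCRIPTIONS_API}"):
        sub_id = sub["subscriptionId"]
        # Step 2: list Azure Databricks workspaces in that subscription.
        url = (f"{ARM}/subscriptions/{sub_id}/providers/"
               f"Microsoft.Databricks/workspaces?api-version={DATABRICKS_API}")
        for ws in paged(fetch, url):
            # Step 3: pick workspaceUrl and workspaceId from each item.
            props = ws.get("properties", {})
            workspaces.append({
                "subscription_id": sub_id,
                "name": ws.get("name"),
                "workspace_url": props.get("workspaceUrl"),
                "workspace_id": props.get("workspaceId"),
            })
    return workspaces
```

With the Azure CLI logged in, a token could be obtained with `az account get-access-token --resource https://management.azure.com/ --query accessToken -o tsv` and passed as `list_azure_databricks_workspaces(arm_get(token))`. The fetcher is injected so the traversal logic can be exercised without network access.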

Sanket-Mehta commented 1 year ago

A suggestion, @nfx, @mgyucht & team: if this isn't supported on Azure, could the error say so more clearly? Please see if this is possible. I ran into the same issue and ended up here.

The "Databricks - Sign in" error message isn't very helpful for understanding what's going on.

Thanks!
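
For illustration, here is a rough sketch of the kind of check that could produce a clearer message. It is not how the SDK behaves today; `LoginRedirectError` and `check_not_login_page` are hypothetical names, and the rule (an `/api/` request that ends up on `/login` with an HTML body) is an assumption drawn only from the debug logs in this issue.

```python
from urllib.parse import urlsplit


class LoginRedirectError(Exception):
    """Raised when an API call was redirected to the HTML sign-in page."""


def check_not_login_page(requested_path: str, final_url: str, content_type: str) -> None:
    """Raise a descriptive error if an API request ended on the login page.

    requested_path: path originally requested, e.g. /api/2.0/accounts/<id>/workspaces
    final_url:      URL of the final response after redirects
    content_type:   Content-Type header of the final response
    """
    final_path = urlsplit(final_url).path
    is_html = content_type.lower().startswith("text/html")
    if requested_path.startswith("/api/") and final_path == "/login" and is_html:
        raise LoginRedirectError(
            f"{requested_path} was redirected to the sign-in page; "
            "the endpoint may not accept this authentication method "
            "or may not be supported on this cloud."
        )
```

A caller would run this check on the final response before attempting to parse it as JSON, so the user sees the reason instead of the page title.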
