databricks / databricks-sdk-py

Databricks SDK for Python (Beta)
https://databricks-sdk-py.readthedocs.io/
Apache License 2.0

Better error message when private link enabled workspaces reject requests #647

Closed mgyucht closed 1 month ago

mgyucht commented 1 month ago

Changes

This PR ports https://github.com/databricks/databricks-sdk-go/pull/924 to the Python SDK.

When a user tries to access a Private Link-enabled workspace that has no public internet access from a network other than the one its VPC endpoint belongs to, the Private Link backend redirects the request to the login page instead of rejecting it outright. The login page is HTML, not JSON, so the SDK cannot parse it, which results in this error message (shown here from the Go SDK, via the CLI):

$ databricks current-user me
Error: unexpected error handling request: invalid character '<' looking for beginning of value. This is likely a bug in the Databricks SDK for Go or the underlying REST API. Please report this issue with the following debugging information to the SDK issue tracker at https://github.com/databricks/databricks-sdk-go/issues. Request log:
GET /login.html?error=private-link-validation-error:<WSID>
> * Host: 
> * Accept: application/json
> * Authorization: REDACTED
> * Referer: https://adb-<WSID>.azuredatabricks.net/api/2.0/preview/scim/v2/Me
> * User-Agent: cli/0.0.0-dev+5ed10bb8ccc1 databricks-sdk-go/0.39.0 go/1.22.2 os/darwin cmd/current-user_me auth/pat
< HTTP/2.0 200 OK
< * Cache-Control: no-cache, no-store, must-revalidate
< * Content-Security-Policy: default-src *; font-src * data:; frame-src * blob:; img-src * blob: data:; media-src * data:; object-src 'none'; style-src * 'unsafe-inline'; worker-src * blob:; script-src 'self' 'unsafe-eval' 'unsafe-hashes' 'report-sample' https://*.databricks.com https://databricks.github.io/debug-bookmarklet/ https://widget.intercom.io https://js.intercomcdn.com https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js https://databricks-ui-assets.azureedge.net https://ui-serving-cdn-testing.azureedge.net https://uiserviceprodwestus-cdn-endpoint.azureedge.net https://databricks-ui-infra.s3.us-west-2.amazonaws.com 'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=' 'sha256-YOlue469P2BtTMZYUFLupA2aOUsgc6B/TDewH7/Qz7s=' 'sha256-Lh4yp7cr3YOJ3MOn6erNz3E3WI0JA20mWV+0RuuviFM=' 'sha256-0jMhpY6PB/BTRDLWtfcjdwiHOpE+6rFk3ohvY6fbuHU='; report-uri /ui-csp-reports; frame-ancestors *.vocareum.com *.docebosaas.com *.edx.org *.deloitte.com *.cloudlabs.ai *.databricks.com *.myteksi.net
< * Content-Type: text/html; charset=utf-8
< * Date: Fri, 17 May 2024 07:47:38 GMT
< * Server: databricks
< * Set-Cookie: enable-armeria-workspace-server-for-ui-flags=false; Max-Age=1800; Expires=Fri, 17 May 2024 08:17:38 GMT; Secure; HTTPOnly; SameSite=Strict
< * Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< * X-Content-Type-Options: nosniff
< * X-Ui-Svc: true
< * X-Xss-Protection: 1; mode=block
< <!doctype html>
< <html>
<  <head>
<   <meta charset="utf-8">
<   <meta http-equiv="Content-Language" content="en">
<   <title>Databricks - Sign In</title>
<   <meta name="viewport" content="width=960">
<   <link rel="icon" type="image/png" href="https://databricks-ui-assets.azureedge.net/favicon.ico">
<   <meta http-equiv="content-type" content="text/html; charset=UTF8">
<   <script id="__databricks_react_script"></script>
<   <script>window.__DATABRICKS_SAFE_FLAGS__={"databricks.infra.showErrorModalOnFetchError":true,"databricks.fe.infra.useReact18":true,"databricks.fe.infra.useReact18NewAPI":false,"databricks.fe.infra.fixConfigPrefetch":true},window.__DATABRICKS_CONFIG__={"publicPath":{"mlflow":"https://databricks-ui-assets.azureedge.net/","dbsql":"https://databricks-ui-assets.azureedge.net/","feature-store":"https://databricks-ui-assets.azureedge.net/","monolith":"https://databricks-ui-assets.azureedge.net/","jaws":"https://databricks-ui-assets.azureedge.net/"}}</script>
<   <link rel="icon" href="https://databricks-ui-assets.azureedge.net/favicon.ico">
<   <script>
<   function setNoCdnAndReload() {
<       document.cookie = `x-databricks-cdn-inaccessible=true; path=/; max-age=86400`;
<       const metric = 'cdnFallbackOccurred';
<       const browserUserAgent = navigator.userAgent;
<       const browserTabId = window.browserTabId;
<       const performanceEntry = performance.getEntriesByType('resource').filter(e => e.initiatorType === 'script').slice(-1)[0]
<       sessionStorage.setItem('databricks-cdn-fallback-telemetry-key', JSON.stringify({ tags: { browserUserAgent, browserTabId }, performanceEntry}));
<       window.location.reload();
<   }
< </script>
<   <script>
<   // Set a manual timeout for dropped packets to CDN
<   function loadScriptWithTimeout(src, timeout) {
<      return new Promise((resolve, reject) => {
<         const script = document.createElement('script');
<           script.defer = true;
<           script.src = src;
<           script.onload = resolve;
<           script.onerror = reject;
<           document.head.appendChild(script);
<           setTimeout(() => {
<               reject(new Error('Script load timeout'));
<           }, timeout);
<       });
<   }
<   loadScriptWithTimeout('https://databricks-ui-assets.azureedge.net/static/js/login/login.8a983ca2.js', 10000).catch(setNoCdnAndReload);
< </script>
<  </head>
<  <body class="light-mode">
<   <uses-legacy-bootstrap>
<    <div id="login-page"></div>
<   </uses-legacy-bootstrap>
<  </body>
< </html>
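
The same failure mode can be reproduced in Python with only the standard library. This is a minimal sketch, not the SDK's code: `login_page` is a shortened stand-in for the HTML body above, and `try_parse` is a hypothetical helper.

```python
import json

# Shortened stand-in for the login page body returned with HTTP 200.
login_page = "<!doctype html>\n<html><head><title>Databricks - Sign In</title></head></html>"


def try_parse(body: str):
    """Return (parsed, None) on success or (None, error) if body is not JSON."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        return None, exc


parsed, err = try_parse(login_page)
# The parser fails on the very first character, '<', just as in the Go log.
print(type(err).__name__, "at position", err.pos)
```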

To address this, I added a check to the error mapper that inspects whether the request was redirected to the login page with the private link validation error code (the error=private-link-validation-error query parameter visible in the request log above). If so, we return a custom error, PrivateLinkValidationError, with error code PRIVATE_LINK_VALIDATION_ERROR. It inherits from PermissionDenied and carries a mock 403 status code.
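
A minimal sketch of such a check, assuming the final URL after redirects is available to the error mapper. The class names mirror those in this PR, but the classes below are local stand-ins rather than the SDK's own, and `map_private_link_error` is a hypothetical helper whose matching rules are inferred from the request log.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


class PermissionDenied(Exception):
    """Local stand-in for the SDK's PermissionDenied error (assumption)."""

    def __init__(self, message: str, error_code: str, status_code: int):
        super().__init__(message)
        self.error_code = error_code
        self.status_code = status_code


class PrivateLinkValidationError(PermissionDenied):
    """Signals that a request was redirected to the Private Link login page."""


def map_private_link_error(final_url: str) -> Optional[PrivateLinkValidationError]:
    """Return an error if final_url is the Private Link login redirect, else None.

    Hypothetical helper: final_url is the URL the HTTP client ended up at
    after following redirects.
    """
    parsed = urlparse(final_url)
    if parsed.path != "/login.html":
        return None
    errors = parse_qs(parsed.query).get("error", [])
    if not any(e.startswith("private-link-validation-error:") for e in errors):
        return None
    return PrivateLinkValidationError(
        "The requested workspace has Private Link enabled and is not "
        "accessible from the current network.",
        error_code="PRIVATE_LINK_VALIDATION_ERROR",
        status_code=403,  # mock status: the real response was 200 OK with HTML
    )
```

Because the error subclasses PermissionDenied, callers that already catch PermissionDenied would handle it without any changes.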

After this change, users will see an error message like this:

databricks.sdk.errors.private_link.PrivateLinkValidationError: The requested workspace has Azure Private Link enabled and is not accessible from the current network. Ensure that Azure Private Link is properly configured and that your device has access to the Azure Private Link endpoint. For more information, see https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/private-link-standard#authentication-troubleshooting.

The error message is tailored to the cloud, which is inferred from the request URI, so that users are pointed to the appropriate documentation.
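
One way the cloud could be inferred from the request URI is by matching the host suffix. The sketch below is illustrative only: the Azure suffix comes from the request log above, while the AWS and GCP suffixes, the service names for those clouds, and both function names are assumptions.

```python
from urllib.parse import urlparse

# Host suffix -> name of the private connectivity service. Only the Azure
# suffix appears in the request log; the AWS and GCP entries are assumptions.
_SERVICE_BY_HOST_SUFFIX = {
    ".azuredatabricks.net": "Azure Private Link",
    ".gcp.databricks.com": "Private Service Connect",
    ".cloud.databricks.com": "AWS PrivateLink",
}


def private_link_service(request_url: str) -> str:
    """Pick the service name for the cloud that hosts request_url."""
    host = urlparse(request_url).hostname or ""
    for suffix, service in _SERVICE_BY_HOST_SUFFIX.items():
        if host.endswith(suffix):
            return service
    return "Private Link"  # generic fallback for unrecognized hosts


def build_message(request_url: str) -> str:
    """Build a cloud-specific message in the style of the one shown above."""
    service = private_link_service(request_url)
    return (f"The requested workspace has {service} enabled and is not "
            "accessible from the current network.")
```

A per-cloud documentation link could be attached through the same lookup table.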

Tests

Unit tests cover the private link error message mapping. To manually test this, I created a private link workspace in Azure, created an access token, restricted access to the workspace, then ran the last_job_runs.py example using the host & token:

/Users/miles/databricks-cli/.venv/bin/python /Users/miles/databricks-sdk-py/examples/last_job_runs.py 
2024-05-17 11:43:32,529 [databricks.sdk][INFO] loading DEFAULT profile from ~/.databrickscfg: host, token
Traceback (most recent call last):
  File "/Users/miles/databricks-sdk-py/examples/last_job_runs.py", line 20, in <module>
    for job in w.jobs.list():
  File "/Users/miles/databricks-sdk-py/databricks/sdk/service/jobs.py", line 5453, in list
    json = self._api.do('GET', '/api/2.1/jobs/list', query=query, headers=headers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/miles/databricks-sdk-py/databricks/sdk/core.py", line 131, in do
    response = retryable(self._perform)(method,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/miles/databricks-sdk-py/databricks/sdk/retries.py", line 54, in wrapper
    raise err
  File "/Users/miles/databricks-sdk-py/databricks/sdk/retries.py", line 33, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/miles/databricks-sdk-py/databricks/sdk/core.py", line 245, in _perform
    raise self._make_nicer_error(response=response) from None
databricks.sdk.errors.private_link.PrivateLinkValidationError: The requested workspace has Azure Private Link enabled and is not accessible from the current network. Ensure that Azure Private Link is properly configured and that your device has access to the Azure Private Link endpoint. For more information, see https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/private-link-standard#authentication-troubleshooting.

github-actions[bot] commented 1 month ago

This PR breaks backwards compatibility for databrickslabs/blueprint downstream. See build logs for more details.

Running from downstreams #123

codecov-commenter commented 1 month ago

Codecov Report

Attention: Patch coverage is 97.91667%, with 1 line in your changes missing coverage. Please review.

Project coverage is 57.66%. Comparing base (fabe7c4) to head (6422cb5).

| Files | Patch % | Lines |
|---|---|---|
| databricks/sdk/core.py | 66.66% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #647      +/-   ##
==========================================
+ Coverage   57.62%   57.66%   +0.03%
==========================================
  Files          47       48       +1
  Lines       32650    32680      +30
==========================================
+ Hits        18815    18844      +29
- Misses      13835    13836       +1
```
