databricks / databricks-sdk-go

Databricks SDK for Go
https://docs.databricks.com/dev-tools/sdk-go.html
Apache License 2.0
48 stars 41 forks source link

Better error message when private link enabled workspaces reject requests #924

Closed mgyucht closed 4 months ago

mgyucht commented 4 months ago

Changes

When a user tries to access a Private Link-enabled workspace configured with no public internet access from a different network than the VPC endpoint belongs to, the Private Link backend redirects the user to the login page, rather than outright rejecting the request. The login page, however, is not a JSON document and cannot be parsed by the SDK, resulting in this error message:

$ databricks current-user me
Error: unexpected error handling request: invalid character '<' looking for beginning of value. This is likely a bug in the Databricks SDK for Go or the underlying REST API. Please report this issue with the following debugging information to the SDK issue tracker at https://github.com/databricks/databricks-sdk-go/issues. Request log:
GET /login.html?error=private-link-validation-error:<WSID>
> * Host: 
> * Accept: application/json
> * Authorization: REDACTED
> * Referer: https://adb-<WSID>.azuredatabricks.net/api/2.0/preview/scim/v2/Me
> * User-Agent: cli/0.0.0-dev+5ed10bb8ccc1 databricks-sdk-go/0.39.0 go/1.22.2 os/darwin cmd/current-user_me auth/pat
< HTTP/2.0 200 OK
< * Cache-Control: no-cache, no-store, must-revalidate
< * Content-Security-Policy: default-src *; font-src * data:; frame-src * blob:; img-src * blob: data:; media-src * data:; object-src 'none'; style-src * 'unsafe-inline'; worker-src * blob:; script-src 'self' 'unsafe-eval' 'unsafe-hashes' 'report-sample' https://*.databricks.com https://databricks.github.io/debug-bookmarklet/ https://widget.intercom.io https://js.intercomcdn.com https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js https://databricks-ui-assets.azureedge.net https://ui-serving-cdn-testing.azureedge.net https://uiserviceprodwestus-cdn-endpoint.azureedge.net https://databricks-ui-infra.s3.us-west-2.amazonaws.com 'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=' 'sha256-YOlue469P2BtTMZYUFLupA2aOUsgc6B/TDewH7/Qz7s=' 'sha256-Lh4yp7cr3YOJ3MOn6erNz3E3WI0JA20mWV+0RuuviFM=' 'sha256-0jMhpY6PB/BTRDLWtfcjdwiHOpE+6rFk3ohvY6fbuHU='; report-uri /ui-csp-reports; frame-ancestors *.vocareum.com *.docebosaas.com *.edx.org *.deloitte.com *.cloudlabs.ai *.databricks.com *.myteksi.net
< * Content-Type: text/html; charset=utf-8
< * Date: Fri, 17 May 2024 07:47:38 GMT
< * Server: databricks
< * Set-Cookie: enable-armeria-workspace-server-for-ui-flags=false; Max-Age=1800; Expires=Fri, 17 May 2024 08:17:38 GMT; Secure; HTTPOnly; SameSite=Strict
< * Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< * X-Content-Type-Options: nosniff
< * X-Ui-Svc: true
< * X-Xss-Protection: 1; mode=block
< <!doctype html>
< <html>
<  <head>
<   <meta charset="utf-8">
<   <meta http-equiv="Content-Language" content="en">
<   <title>Databricks - Sign In</title>
<   <meta name="viewport" content="width=960">
<   <link rel="icon" type="image/png" href="https://databricks-ui-assets.azureedge.net/favicon.ico">
<   <meta http-equiv="content-type" content="text/html; charset=UTF8">
<   <script id="__databricks_react_script"></script>
<   <script>window.__DATABRICKS_SAFE_FLAGS__={"databricks.infra.showErrorModalOnFetchError":true,"databricks.fe.infra.useReact18":true,"databricks.fe.infra.useReact18NewAPI":false,"databricks.fe.infra.fixConfigPrefetch":true},window.__DATABRICKS_CONFIG__={"publicPath":{"mlflow":"https://databricks-ui-assets.azureedge.net/","dbsql":"https://databricks-ui-assets.azureedge.net/","feature-store":"https://databricks-ui-assets.azureedge.net/","monolith":"https://databricks-ui-assets.azureedge.net/","jaws":"https://databricks-ui-assets.azureedge.net/"}}</script>
<   <link rel="icon" href="https://databricks-ui-assets.azureedge.net/favicon.ico">
<   <script>
<   function setNoCdnAndReload() {
<       document.cookie = `x-databricks-cdn-inaccessible=true; path=/; max-age=86400`;
<       const metric = 'cdnFallbackOccurred';
<       const browserUserAgent = navigator.userAgent;
<       const browserTabId = window.browserTabId;
<       const performanceEntry = performance.getEntriesByType('resource').filter(e => e.initiatorType === 'script').slice(-1)[0]
<       sessionStorage.setItem('databricks-cdn-fallback-telemetry-key', JSON.stringify({ tags: { browserUserAgent, browserTabId }, performanceEntry}));
<       window.location.reload();
<   }
< </script>
<   <script>
<   // Set a manual timeout for dropped packets to CDN
<   function loadScriptWithTimeout(src, timeout) {
<      return new Promise((resolve, reject) => {
<         const script = document.createElement('script');
<           script.defer = true;
<           script.src = src;
<           script.onload = resolve;
<           script.onerror = reject;
<           document.head.appendChild(script);
<           setTimeout(() => {
<               reject(new Error('Script load timeout'));
<           }, timeout);
<       });
<   }
<   loadScriptWithTimeout('https://databricks-ui-assets.azureedge.net/static/js/login/login.8a983ca2.js', 10000).catch(setNoCdnAndReload);
< </script>
<  </head>
<  <body class="light-mode">
<   <uses-legacy-bootstrap>
<    <div id="login-page"></div>
<   </uses-legacy-bootstrap>
<  </body>
< </html>

To address this, I add one additional check in the error mapper logic to inspect whether the user was redirected to the login page with the private link validation error response code. If so, we return a synthetic error with error code PRIVATE_LINK_VALIDATION_ERROR that inherits from ErrPermissionsDenied and has a mock 403 status code.

After this change, users will see an error message like this:

panic: The requested workspace has Azure Private Link enabled and is not accessible from the current network. Ensure that Azure Private Link is properly configured and that your device has access to the Azure Private Link endpoint. For more information, see https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/private-link-standard#authentication-troubleshooting.

The error message is tuned to the specific cloud so that we can redirect users to the appropriate documentation, the cloud being inferred from the request URI.

As part of this, I made a small refactor of environments code into a separate package so it can be used by both the config and apierr packages.

Tests

Unit tests cover the private link error message mapping. To manually test this, I created a private link workspace in Azure, created an access token, restricted access to the workspace, then ran the default-auth example using the host & token:

$ go run ./examples/default-auth
panic: The requested workspace has Azure Private Link enabled and is not accessible from the current network. Ensure that Azure Private Link is properly configured and that your device has access to the Azure Private Link endpoint. For more information, see https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/private-link-standard#authentication-troubleshooting.

goroutine 1 [running]:
main.main()
        /Users/miles/databricks-sdk-go/examples/default-auth/main.go:17 +0x138
exit status 2
codecov-commenter commented 4 months ago

Codecov Report

Attention: Patch coverage is 62.74510% with 19 lines in your changes are missing coverage. Please review.

Project coverage is 7.14%. Comparing base (6106c39) to head (c7f3f54).

Files Patch % Lines
common/environment/environments.go 28.00% 17 Missing and 1 partial :warning:
config/config.go 50.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #924 +/- ## ======================================== + Coverage 7.12% 7.14% +0.02% ======================================== Files 279 281 +2 Lines 64446 64475 +29 ======================================== + Hits 4590 4609 +19 - Misses 59546 59558 +12 + Partials 310 308 -2 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.