The Security Analysis Tool (SAT) analyzes a customer's Databricks account and workspace security configurations and provides recommendations that help them follow Databricks' security best practices. When a customer runs SAT, it compares their workspace configurations against a set of security best practices and delivers a report.
Setting up Security Analysis Tool (SAT) in the environment with Private VPC, no public internet, accessing Account APIs through PrivateLink #97
We are encountering difficulties while setting up SAT in our development environment with a VPC that restricts public internet access. Here's a summary of our configuration:
• Cloud Provider: AWS
• Authentication: Service Principal
• Network: VPC with Frontend/Backend PrivateLinks
• Library Installation: Offline Wheel packages deployed directly to the cluster
• Secrets: Managed via Databricks CLI
• Database: Unity Catalog
• Cluster: Single user for running SAT notebooks
Issues:
Account Connection Timeout: During execution of the security_analysis_initializer notebook, the final cell (referencing security-analysis-tool/notebooks/Setup/1. list_account_workspaces_to_conf_file) hits a connection timeout when attempting to connect to accounts.cloud.databricks.com for account-level setup. The specific error is:
HTTPSConnectionPool(host='accounts.cloud.databricks.com', port=443): Max retries exceeded with url: /oidc/accounts//v1/token (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f90fcef31c0>, 'Connection to accounts.cloud.databricks.com timed out
When skipping account-level setup, subsequent steps in the initializer fail due to the absence of global temporary views, which are typically created during the first setup process.
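To separate a pure network-path problem from an authentication problem, it can help to test TCP reachability from a notebook cell on the same cluster before running the initializer. The following is a minimal sketch, not part of SAT: the helper name `can_connect` and the default timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it only checks that the TCP
    handshake completes and does not perform TLS or authentication.
    """
    try:
        # create_connection resolves the name and attempts the connection;
        # DNS failures, refusals, and timeouts all surface as OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("accounts.cloud.databricks.com")` returning `False` from the cluster would be consistent with the `ConnectTimeoutError` above, meaning the request never reaches the account console at all.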
Attempted Solution:
We tried using the PrivateLink VPC endpoint and its DNS name for account-level API calls with a curl command. With certificate verification disabled (-k), the call returned no token or response; with verification enabled, it failed to retrieve the SSL certificate.
Here's the command for reference:
curl --request POST -k \
-d '{
"vpc_endpoint_name": "Databricks backend endpoint",
"region": "eu-west-1",
"aws_vpc_endpoint_id": "<>"
}' \
--url https://ireland.privatelink.cloud.databricks.com/oidc/accounts//v1/token \
--user "$CLIENT_ID:$CLIENT_SECRET" \
--data 'grant_type=client_credentials&scope=all-apis'
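One detail of the command above: curl merges repeated `-d`/`--data` options into a single body joined with `&`, so the JSON fragment (which looks like a VPC endpoint registration payload) is sent together with the form fields. An OAuth client-credentials token request normally carries only the form-encoded fields plus HTTP Basic authentication. The sketch below builds such a request in Python without sending it. The function name `build_token_request` and the `account_id` parameter are placeholders for illustration, since the account ID is elided in the original URL, and whether the PrivateLink hostname serves this endpoint at all is not established here.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(base_url, account_id, client_id, client_secret):
    """Build (but do not send) a client-credentials token request.

    Hypothetical helper: base_url and account_id are supplied by the caller.
    """
    url = f"{base_url.rstrip('/')}/oidc/accounts/{account_id}/v1/token"
    # Form-encoded body only; no JSON payload is mixed in.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # Equivalent of curl's --user "$CLIENT_ID:$CLIENT_SECRET".
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {credentials}")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request
```

The returned object could then be sent with `urllib.request.urlopen(request, timeout=10)` from a host that can actually reach the chosen base URL.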
We would appreciate instructions on how to call the account APIs through PrivateLink without public internet access, if this is possible at all. Please assess on your end whether this can be achieved.