databricks-industry-solutions / security-analysis-tool

Security Analysis Tool (SAT) analyzes a customer's Databricks account and workspace security configurations and provides recommendations that help them follow Databricks' security best practices. When a customer runs SAT, it compares their workspace configurations against a set of security best practices and delivers a report.

[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate on Azure #174

Closed ArcTheMaster closed 1 week ago

ArcTheMaster commented 4 weeks ago

The initializer notebook on Azure is failing after the first steps.

When it reaches this part:

status1 = dbutils.notebook.run(
    f"{basePath()}/notebooks/Setup/1. list_account_workspaces_to_conf_file", 3000
)

The flow breaks. The specific error happens when 1. list_account_workspaces_to_conf_file calls accounts_bootstrap (see the screenshot below).

[image]

Digging into the underlying error, it appears the connection check to the workspace is refused due to an SSL: CERTIFICATE_VERIFY_FAILED error.

2024-10-31 19:33:51,799 - _profiler_ - INFO - in _update_token_master
2024-10-31 19:33:52,521 - _profiler_ - ERROR - Unsuccessful connection. Verify credentials.
Traceback (most recent call last):
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connectionpool.py", line 711, in urlopen
    self._prepare_proxy(conn)
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1007, in _prepare_proxy
    conn.connect()
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
                ^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/ssl.py", line 1075, in _create
    self.do_handshake()
  File "/usr/lib/python3.11/ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/databricks/python/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='management.azure.com', port=443): Max retries exceeded with url: /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/.ipykernel/1836/command-671944820365018-2760754777", line 3, in <module>
    is_successful_acct = db_client.test_connection(master_acct=True)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/core/dbclient.py", line 150, in test_connection
    results = requests.get(f'{self._url}/subscriptions/{self._subscription_id}/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01',
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/adapters.py", line 517, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='management.azure.com', port=443): Max retries exceeded with url: /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))
2024-10-31 19:33:52,524 - _profiler_ - ERROR - HTTPSConnectionPool(host='management.azure.com', port=443): Max retries exceeded with url: /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))
[same traceback as above, repeated by the logger]

Currently, dbclient.test_connection (like every other method in the class that issues requests through urllib3) does not support passing verify=False.

https://github.com/databricks-industry-solutions/security-analysis-tool/blob/61b912254b04590b8c3f26b0f67b76cc3da2ffc1/src/securityanalysistoolproject/core/dbclient.py#L145

The class should give users the choice of turning the SSL certificate verification flag on or off.
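
For illustration, a minimal sketch of what that could look like (the constructor and config keys here are hypothetical; the real SatDBClient reads its configuration differently):

import requests

class SatDBClient:
    """Sketch only: a 'verify' knob threaded through to requests.

    verify may be True (the default), False (skip TLS verification), or a
    path to a corporate CA bundle file, which is the safer option behind a
    TLS-intercepting proxy.
    """

    def __init__(self, json_conf, verify=True):
        self._url = json_conf['url']
        self._subscription_id = json_conf['subscription_id']
        self._token = json_conf['token']
        self._verify = verify

    def test_connection(self, master_acct=True):
        results = requests.get(
            f"{self._url}/subscriptions/{self._subscription_id}"
            "/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01",
            headers={"Authorization": f"Bearer {self._token}"},
            verify=self._verify,  # forwarded to urllib3's certificate check
            timeout=60,
        )
        return results.ok

Pointing verify at the proxy's CA bundle keeps certificate checking enabled while still working behind TLS interception, which is usually preferable to verify=False.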

arunpamulapati commented 3 weeks ago

Tagging @ramdaskmdb.

ramdaskmdb commented 3 weeks ago

These kinds of issues usually occur due to a firewall or proxy that adds some self-signed or expired certs to the chain. Can you run this from within a cell in your notebook and let us know?

%sh
openssl s_client -showcerts -connect management.azure.com:443 
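
For reference, roughly the same check can be run from Python if the openssl output is hard to capture (a standard-library sketch, plus certifi, which requests uses as its default CA bundle):

import socket
import ssl

import certifi  # requests verifies against this CA bundle by default

# Which CA bundle would requests use?
print("certifi bundle:", certifi.where())

# Attempt a verified TLS handshake; this raises SSLCertVerificationError
# if the chain (possibly rewritten by a proxy) cannot be validated.
ctx = ssl.create_default_context()
with socket.create_connection(("management.azure.com", 443), timeout=10) as sock:
    with ctx.wrap_socket(sock, server_hostname="management.azure.com") as tls:
        cert = tls.getpeercert()
        print("issuer:", cert["issuer"])
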
arunpamulapati commented 3 weeks ago

Thanks Ram. @ArcTheMaster You can also use https://github.com/databricks-industry-solutions/security-analysis-tool/blob/main/notebooks/diagnosis/sat_diagnosis_azure.py to test.

ArcTheMaster commented 3 weeks ago

Hi @arunpamulapati @ramdaskmdb ,

Thank you both for your replies and the information you shared.

I added the openssl command inside the sat_diagnosis_azure.py notebook to see if everything works. It passed, and that notebook was already mostly successful; only the last commands were returning an SSL error.

There are now no more errors for this diagnosis notebook. Unfortunately, the accounts_bootstrap one is still failing.

2024-11-04 13:13:30,909 - _profiler_ - INFO - in _update_token_master
2024-11-04 13:13:31,656 - _profiler_ - INFO - Error. Either the credentials have expired or the credentials don't have proper permissions. Re-verify secrets
2024-11-04 13:13:31,657 - _profiler_ - INFO - Forbidden
2024-11-04 13:13:31,657 - _profiler_ - INFO - [Squid proxy error page, abridged]
ERROR: The requested URL could not be retrieved

The following error was encountered while trying to retrieve the URL:
https://management.azure.com/subscriptions/xxxxxxxxx-xxxxx-xxxxx-xxxxx-xxxxxxxxxxxxx/providers/Microsoft.Databricks/workspaces?

Access Denied.

Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.

Your cache administrator is webmaster.

Generated Mon, 04 Nov 2024 13:13:31 GMT by 93e9a03bd7bb (squid/5.7)

2024-11-04 13:13:31,658 - _profiler_ - ERROR - Exception encountered
Traceback (most recent call last):
  File "/root/.ipykernel/1959/command-9139779880878-1615707143", line 4, in <module>
    is_successful_acct = db_client.check_connection(master_acct=True)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.ipykernel/1959/command-9139779881055-3150357518", line 39, in check_connection
    raise Exception(f'Test connection failed {results.reason}')
Exception: Test connection failed Forbidden

The credentials are the same ones used with the diagnosis notebook, so it should work. Do you have any ideas?

ramdaskmdb commented 3 weeks ago

Sounds like your service principal may not have access to the management API / accounts API, based on this:

The following error was encountered while trying to retrieve the URL: https://management.azure.com/subscriptions/xxxxxxxxx-xxxxx-xxxxx-xxxxx-xxxxxxxxxxxxx/providers/Microsoft.Databricks/workspaces?

Access Denied.

Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.
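
One way to test that hypothesis outside of SAT (a sketch assuming the azure-identity package and the same service principal SAT uses; the bracketed values are placeholders):

import requests
from azure.identity import ClientSecretCredential

# Acquire an ARM token with the same service principal SAT uses
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)
token = credential.get_token("https://management.azure.com/.default").token

# Hit the same endpoint SAT's test_connection calls
resp = requests.get(
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01",
    headers={"Authorization": f"Bearer {token}"},
    timeout=60,
)
print(resp.status_code, resp.reason)

A 403 here, with SAT out of the picture, would point at the role assignment (or, as it turned out in this thread, at a proxy ACL) rather than at SAT itself.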


ArcTheMaster commented 3 weeks ago

@ramdaskmdb the error does not make sense, as the access works during the sat_diagnosis_azure notebook. The credentials are the same.

Where do those permissions need to be aligned? I tried using an Azure IAM custom role that grants read permissions on the Microsoft.Databricks provider, but this is not solving the issue.

[image]

Do you know where, more specifically, I need to adjust the permissions?

arunpamulapati commented 3 weeks ago

A call may be better. Can you share a few EST-friendly days and times that work best for you, and send an email to Ram (ramdas.murali@databricks.com) and me (arun@databricks.com)?

ArcTheMaster commented 3 weeks ago

@arunpamulapati email sent to you both.

At the same time, I made progress in debugging. To make it short, I was setting proxies, and those were blocking some flows. But now, still inside the accounts_bootstrap notebook, a timeout happens at this location:

bootstrap('acctworkspaces', acct_client.get_workspace_list)
2024-11-04 20:39:25,693 - _profiler_ - INFO - in _update_token_master
Cancelled
Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to address=(host=consolidated-eastus2c3-prod-metastore-1.mysql.database.azure.com)(port=3306)(type=master) : Socket fail to connect to host:consolidated-eastus2c3-prod-metastore-1.mysql.database.azure.com, port:3306. connect timed out
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:197)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1404)
    at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:635)
    at org.mariadb.jdbc.MariaDbConnection.newConnection(MariaDbConnection.java:150)
    at org.mariadb.jdbc.Driver.connect(Driver.java:89)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
    at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:341)
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:506)
    ... 118 more
Caused by: java.sql.SQLNonTransientConnectionException: Socket fail to connect to host:consolidated-eastus2c3-prod-metastore-1.mysql.database.azure.com, port:3306. connect timed out
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:188)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createSocket(AbstractConnectProtocol.java:262)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:534)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1399)
    ... 125 more
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:613)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createSocket(AbstractConnectProtocol.java:254)
    ... 127 more
24/11/04 20:44:41 WARN HiveClientImpl: HiveClient got thrift or connection reset exception, destroying client and retrying (22 tries remaining)
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1169)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1154)
    at org.apache.spark.sql.hive.client.Shim_v0_12.databaseExists(HiveShim.scala:620)
    at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$databaseExists$1(HiveClientImpl.scala:456)
    at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
    at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:353)
    at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$retryLocked$1(HiveClientImpl.scala:252)
    at org.apache.spark.sql.hive.client.HiveClientImpl.synchronizeOnObject(HiveClientImpl.scala:290)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:244)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:333)
    at org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:456)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.$anonfun$databaseExists$1(PoolingHiveClient.scala:321)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.$anonfun$databaseExists$1$adapted(PoolingHiveClient.scala:320)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.withHiveClient(PoolingHiveClient.scala:149)
    at org.apache.spark.sql.hive.client.PoolingHiveClient.databaseExists(PoolingHiveClient.scala:320)
    at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:292)
    at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
    at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
    at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$2(HiveExternalCatalog.scala:156)
    at org.apache.spark.sql.hive.HiveExternalCatalog.maybeSynchronized(HiveExternalCatalog.scala:117)
    at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$1(HiveExternalCatalog.scala:155)
    at com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:411)
    at com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:397)
    at com.databricks.spark.util.SparkDatabricksProgressReporter$.withStatusCode(ProgressReporter.scala:34)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:154)
    at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:292)
    at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:311)
    at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:306)
    at org.apache.spark.sql.internal.SharedState.$anonfun$globalTempViewManager$1(SharedState.scala:395)
    at org.apache.spark.sql.internal.SharedState.$anonfun$globalTempViewExternalCatalogNameCheck$1(SharedState.scala:365)
    at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.sql.internal.SharedState.globalTempViewExternalCatalogNameCheck(SharedState.scala:365)
    at org.apache.spark.sql.internal.SharedState.globalTempViewManager$lzycompute(SharedState.scala:395)
    at org.apache.spark.sql.internal.SharedState.globalTempViewManager(SharedState.scala:391)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.$anonfun$hiveCatalog$2(HiveSessionStateBuilder.scala:76)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.globalTempViewManager$lzycompute(SessionCatalog.scala:649)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.globalTempViewManager(SessionCatalog.scala:649)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.getRawGlobalTempView(SessionCatalog.scala:1362)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.getGlobalTempView(SessionCatalog.scala:1369)
    at org.apache.spark.sql.catalyst.catalog.DelegatingSessionCatalog.getGlobalTempView(DelegatingSessionCatalog.scala:278)
    at org.apache.spark.sql.catalyst.catalog.DelegatingSessionCatalog.getGlobalTempView$(DelegatingSessionCatalog.scala:277)
    at com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.getGlobalTempView(ManagedCatalogSessionCatalog.scala:89)
    at org.apache.spark.sql.internal.CatalogImpl.dropGlobalTempView(CatalogImpl.scala:761)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
    at py4j.Gateway.invoke(Gateway.java:306)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1165)
    ... 55 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
    ... 60 more

@arunpamulapati @ramdaskmdb Why does SAT need to make this connection to port 3306 on host consolidated-eastus2c3-prod-metastore-1.mysql.database.azure.com? A clarification would be appreciated, because right now this flow is not authorized inside our company for security reasons.

ramdaskmdb commented 3 weeks ago

Hi, can you click on this in your UC and tell us what you see? [image: image.png]



ArcTheMaster commented 3 weeks ago

@ramdaskmdb sorry, but we don't see the image. Can you post your comment again, please? I cannot take a look at the UC location without the info. Thanks.

ramdaskmdb commented 3 weeks ago

[image: hms]

ArcTheMaster commented 3 weeks ago

Sorry for the late reply, @ramdaskmdb.

Looking into the UC, it times out.

[image]

Any suggestions on that one? For information, we do not use the legacy Hive metastore in our context; an external metastore is used instead.

I asked for the Azure firewall to be opened for all the Azure metastore endpoints on eastus2, as listed in the docs (eastus2 metastore). I will run the test again once those are open.
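
Once the rules are open, a quick reachability probe from a notebook can confirm the path (a sketch, with the host taken from the stack trace above):

import socket

HOST = "consolidated-eastus2c3-prod-metastore-1.mysql.database.azure.com"

try:
    # A plain TCP connect distinguishes a firewall drop (timeout) from a
    # MySQL-level refusal (immediate error) without needing credentials.
    with socket.create_connection((HOST, 3306), timeout=5):
        print("TCP connection to the metastore host succeeded")
except OSError as exc:
    print(f"TCP connection failed: {exc}")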

ArcTheMaster commented 2 weeks ago

@arunpamulapati @ramdaskmdb I finally got to the full end of the notebook. In addition to the firewall rules, though, I had to modify multiple notebooks to get these results, all Azure related.

As a summary, this default part does not work properly when working on Azure cloud + AKV secrets:

# COMMAND ----------

import requests
from core import parser as pars
from core.dbclient import SatDBClient

if cloud_type == 'azure':  # use client secret
    client_secret = dbutils.secrets.get(json_['master_name_scope'], json_['client_secret_key'])
    json_.update({'token': 'dapijedi', 'client_secret': client_secret})
elif cloud_type == 'aws' and json_['use_sp_auth'].lower() == 'true':
    client_secret = dbutils.secrets.get(json_['master_name_scope'], json_['client_secret_key'])
    json_.update({'token': 'dapijedi', 'client_secret': client_secret})
    mastername = ' '  # this will not be present when using SPs
    masterpwd = ' '   # we still need to send an empty user/pwd
    json_.update({'token': 'dapijedi', 'mastername': mastername, 'masterpwd': masterpwd})
else:  # populate the master key for the accounts API
    mastername = dbutils.secrets.get(json_['master_name_scope'], json_['master_name_key'])
    masterpwd = dbutils.secrets.get(json_['master_pwd_scope'], json_['master_pwd_key'])
    json_.update({'token': 'dapijedi', 'mastername': mastername, 'masterpwd': masterpwd})

db_client = SatDBClient(json_)

It is used in the following notebooks:

  • 3. test_connections.py
  • 5. import_dashboard_template_lakeview.py
  • 5. import_dashboard_template.py
  • 6. configure_alerts_template.py
  • accounts_bootstrap.py
  • sat_diagnosis_azure.py
  • workspace_bootstrap.py

To finish, I noticed that SAT cannot build the full dashboard from one workspace by collecting the other workspaces' info. To be more specific: workspace 1 hosts the SAT instance and collects workspace 1 data, but does not get workspace 2 and workspace 3 info. See the error below (wrong credentials, even with the proper token and creds):

Access token: <myaccesstoken_workspace1.12>
2024-11-08 20:36:08,383 - _profiler_ - ERROR - Unsuccessful https://adb-1.12.azuredatabricks.net/ 1.12 workspace connection. Verify credentials.
Traceback (most recent call last):
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connectionpool.py", line 403, in _make_request
    self._validate_conn(conn)
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1053, in _validate_conn
    conn.connect()
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
                ^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/ssl.py", line 1075, in _create
    self.do_handshake()
  File "/usr/lib/python3.11/ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:992)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/databricks/python/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='adb-1.12.azuredatabricks.net', port=443): Max retries exceeded with url: /api/2.0/clusters/spark-versions (Caused by SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:992)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/.ipykernel/2062/command-3919299609755321-2924787437", line 15, in test_connection
    is_successful_ws = db_client.test_connection(master_acct=accounts_test)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/core/dbclient.py", line 157, in test_connection
    results = requests.get(f'{self._url}/api/2.0/clusters/spark-versions',
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/databricks/python/lib/python3.11/site-packages/requests/adapters.py", line 517, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='adb-1.12.azuredatabricks.net', port=443): Max retries exceeded with url: /api/2.0/clusters/spark-versions (Caused by SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:992)')))
2024-11-08 20:36:08,387 - _profiler_ - INFO - https://adb-1.12.azuredatabricks.net/ workspace connection. Connection Status: False
Access token: <myaccesstoken_workspace2.1>
2024-11-08 20:36:09,503 - _profiler_ - INFO - https://adb-2.1.azuredatabricks.net/ 2.1 : Connection successful!
2024-11-08 20:36:09,504 - _profiler_ - INFO - https://adb-2.1.azuredatabricks.net/ workspace connection. Connection Status: True
[('1.12', False), ('2.1', True)]

I will open a new case for this one.

ramdaskmdb commented 2 weeks ago

The Hive metastore connection has to work as a prerequisite. It looks like port 3306 may be blocked in your case. Without SAT in the picture, you should be able to click on the hive_metastore button in UC without any errors.
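
A notebook-level equivalent of that click (listing schemas forces the cluster to open its JDBC connection to the consolidated metastore, so this tests the flow without SAT):

# Times out if the port 3306 flow to the consolidated Hive metastore is blocked
display(spark.sql("SHOW SCHEMAS IN hive_metastore"))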


ArcTheMaster commented 2 weeks ago

@ramdaskmdb exactly, and this is what I shared: once the firewalls were opened, it worked.

But this does not answer my second point about accessing the other workspaces from the SAT instance running on workspace 1. VNet peering is enabled between the workspaces, so SAT should be able to reach the other workspaces in the same Azure subscription. The error is either an authentication failure or a TLS error (the same credentials are used, though the token is different).
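
A direct probe from the SAT cluster could separate the network path from the credentials (a sketch; the hostname and token stay anonymized as in the logs above):

import requests

# Same endpoint SAT's test_connection uses. An SSLZeroReturnError here points
# at the network path (a proxy or firewall closing the TLS session), while a
# 401/403 response would point at the token instead.
resp = requests.get(
    "https://adb-1.12.azuredatabricks.net/api/2.0/clusters/spark-versions",
    headers={"Authorization": "Bearer <workspace-1.12-token>"},
    timeout=30,
)
print(resp.status_code, resp.reason)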

ramdaskmdb commented 2 weeks ago

Hi Pierre, this URL does not look right to me:

HTTPSConnectionPool(host='adb-1.12.azuredatabricks.net', port=443)


ArcTheMaster commented 2 weeks ago

Hello @ramdaskmdb, I anonymized the workspace name on purpose before posting the info in the comment, as GitHub is public. The workspace name is perfectly normal in the original stack trace.

ArcTheMaster commented 1 week ago

Considering that the SSL error comes from a network path constraint bound to our own company context, I am closing this ticket.