databricks / databricks-sql-python

Databricks SQL Connector for Python
Apache License 2.0

CERTIFICATE_VERIFY_FAILED while connecting to the databricks sql endpoint using python through M2M auth #439

Open saurabhdataeng opened 2 months ago

saurabhdataeng commented 2 months ago

main.py content

from dotenv import load_dotenv, dotenv_values
load_dotenv()
from databricks.sdk.core import Config, oauth_service_principal
from databricks import sql
import os

server_hostname = os.getenv("DATABRICKS_HOST")

def credential_provider():
  config = Config(
    host          = f"https://{server_hostname}",
    client_id     = os.getenv("DATABRICKS_CLIENT_ID"),
    client_secret = os.getenv("DATABRICKS_CLIENT_SECRET"))
  print(config)
  return oauth_service_principal(config)

with sql.connect(server_hostname      = server_hostname,
                 http_path            = f'{os.getenv("DATABRICKS_HTTP_PATH")}',
                 credentials_provider = credential_provider,
                 _tls_no_verify       = True,
                 ) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1 as a")
        print(cursor.fetchall())
        cursor.close()

connection.close()

.env content

DATABRICKS_HOST="adb-XXXXXXXXXXXXX.azuredatabricks.net"

DATABRICKS_CLIENT_ID="XXXXXXX"

DATABRICKS_CLIENT_SECRET="XXXXXXX"

DATABRICKS_HTTP_PATH="/sql/1.0/warehouses/XXXXXXXXXX"

ERROR

Traceback (most recent call last):
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/test.py", line 28, in <module>
    credential_provider()
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/test.py", line 21, in credential_provider
    config = Config(
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/.venv/lib/python3.8/site-packages/databricks/sdk/config.py", line 127, in __init__
    raise ValueError(message) from e
ValueError: default auth: oauth-m2m: HTTPSConnectionPool(host='adb-XXXXXXXXXXXX.azuredatabricks.net', port=443): Max retries exceeded with url: /oidc/.well-known/oauth-authorization-server (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)'))). Config: host=https://adb-XXXXXXXXXXX.azuredatabricks.net, client_id=XXXXXXXXXXX, client_secret=***. Env: DATABRICKS_HOST, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET
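The traceback above fails on the OIDC discovery request made while `Config` is being built, before `sql.connect` is reached, so the connector's `_tls_no_verify` flag would not be expected to affect it. A "self signed certificate in certificate chain" message often points to a TLS-inspecting proxy whose root CA is not in the trust store Python is using, though that is only a guess for this environment. The sketch below is a diagnostic aid under that assumption, using only the standard library. The helper names are hypothetical, and the environment-variable lookup mirrors how `requests` resolves a CA bundle; whether the SDK's HTTP client honors those variables here is assumed, not confirmed by the thread.

```python
import os
import socket
import ssl
from typing import Mapping, Optional


def discovery_url(server_hostname: str) -> str:
    """Build the URL that the traceback shows failing during oauth-m2m setup."""
    return f"https://{server_hostname}/oidc/.well-known/oauth-authorization-server"


def ca_bundle_from_env(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the CA bundle path that requests would read from the environment.

    requests checks REQUESTS_CA_BUNDLE first and falls back to CURL_CA_BUNDLE.
    """
    return env.get("REQUESTS_CA_BUNDLE") or env.get("CURL_CA_BUNDLE") or None


def handshake_ok(server_hostname: str,
                 cafile: Optional[str] = None,
                 timeout: float = 10.0) -> bool:
    """Attempt a verified TLS handshake on port 443.

    Returns False when certificate verification fails, which is the
    condition reported as CERTIFICATE_VERIFY_FAILED.
    """
    # With cafile=None the system default trust store is loaded;
    # otherwise only the certificates in cafile are trusted.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((server_hostname, 443), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=server_hostname):
                return True
    except ssl.SSLCertVerificationError:
        return False
```

If `handshake_ok(host)` returns False but `handshake_ok(host, cafile=...)` returns True with the organization's root certificate, exporting that file's path as `REQUESTS_CA_BUNDLE` before running the script is one thing to try. Note that the standard-library trust store used here can differ from the `certifi` bundle that `requests` uses by default, so the two can disagree.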

Packages

cachetools==5.5.0
certifi==2024.8.30
charset-normalizer==3.3.2
databricks==0.2
databricks-connect==13.0.1
databricks-sdk==0.32.1
databricks-sql-connector==3.4.0
et-xmlfile==1.1.0
google-auth==2.34.0
googleapis-common-protos==1.65.0
grpcio==1.66.1
grpcio-status==1.66.1
idna==3.8
lz4==4.3.3
numpy==1.24.4
oauthlib==3.2.2
openpyxl==3.1.5
pandas==1.5.3
pip-system-certs==4.0
protobuf==5.28.1
py4j==0.10.9.7
pyarrow==16.1.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
pytz==2024.2
requests==2.32.3
rsa==4.9
six==1.16.0
thrift==0.20.0
urllib3==2.2.2
wrapt==1.16.0

kravets-levko commented 2 months ago

@benc-db can you please take a look? It seems the error comes from the SDK.

kravets-levko commented 2 months ago

@saurabhdataeng a quick question: if you use an access token instead of OAuth, does it work?

saurabhdataeng commented 2 months ago

@kravets-levko Yes, it works fine when I use access_token.

saurabhdataeng commented 2 months ago

@kravets-levko

I think the earlier error was due to a request quota being exceeded. Now I am getting a different error with the same code.

<Config: host=https://adb-XXXXXXXX.azuredatabricks.net, client_id=XXXXXXXXX, client_secret=***, auth_type=oauth-m2m. Env: DATABRICKS_HOST, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET>
<Config: host=https://adb-XXXXXXXX.azuredatabricks.net, client_id=XXXXXXXXX, client_secret=***, auth_type=oauth-m2m. Env: DATABRICKS_HOST, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET>
/Users/saurabhkumar/PycharmProjects/pythonProject/.venv/lib/python3.8/site-packages/urllib3/connectionpool.py:1099: InsecureRequestWarning: Unverified HTTPS request is being made to host 'adb-XXXXXXXXX.azuredatabricks.net'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
Traceback (most recent call last):
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/test.py", line 30, in <module>
    with sql.connect(server_hostname      = server_hostname,
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/.venv/lib/python3.8/site-packages/databricks/sql/__init__.py", line 90, in connect
    return Connection(server_hostname, http_path, access_token, **kwargs)
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/.venv/lib/python3.8/site-packages/databricks/sql/client.py", line 247, in __init__
    self._open_session_resp = self.thrift_backend.open_session(
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/.venv/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 549, in open_session
    response = self.make_request(self._client.OpenSession, open_session_req)
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/.venv/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 478, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/Users/saurabhkumar/PycharmProjects/pythonProject/.venv/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 308, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server

saurabhdataeng commented 2 months ago

I have found a workaround:

I downgraded databricks-sql-connector to 2.7.1.dev2, as mentioned here: https://github.com/databricks/databricks-sql-python/issues/169

Now it's working as expected.
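After a downgrade like this, it can be worth confirming which versions the interpreter actually resolves, since a virtual environment may still hold the older install. The following is a minimal sketch using only `importlib.metadata`. The helper names are hypothetical, and the distribution names are taken from the package list in the original post.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version: str) -> Tuple[int, int]:
    """Parse the leading 'X.Y' of a version string such as '2.7.1.dev2'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


if __name__ == "__main__":
    # Print the versions that matter for this issue; None means not installed.
    for name in ("databricks-sql-connector", "databricks-sdk"):
        print(name, installed_version(name))
```

Running this in the same virtual environment as the failing script shows whether the 2.x or 3.x connector is the one being imported.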

kravets-levko commented 2 months ago

@saurabhdataeng this warning comes from urllib3 and normally should just be printed without interrupting your script. Do you have anything that might override that behaviour and turn warnings into exceptions? Also, try disabling warnings and see if anything changes:

import warnings
...
warnings.simplefilter("ignore")
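To check the first question directly, one option is to inspect the active warning filters and probe whether a warning is currently raised as an exception. This is a rough sketch with hypothetical helper names, using only the standard `warnings` module. Filters with the `error` action can come from code, from the `-W error` interpreter flag, or from the `PYTHONWARNINGS` environment variable.

```python
import warnings


def error_filters():
    """Return the active warning filters whose action is 'error'."""
    # Each filter is a tuple: (action, message, category, module, lineno).
    return [f for f in warnings.filters if f[0] == "error"]


def warning_is_raised(category=UserWarning) -> bool:
    """Emit a probe warning and report whether the current filters raise it."""
    try:
        warnings.warn("probe", category)
    except category:
        return True
    return False
```

Calling `warning_is_raised()` just before `sql.connect` indicates whether warnings are being escalated at that point. Passing `urllib3.exceptions.InsecureRequestWarning` as the category would test the specific warning seen in the log, assuming `urllib3` is importable in that environment.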

saurabhdataeng commented 2 months ago

@kravets-levko No, I am just using the boilerplate code copied from the Azure Databricks documentation. I downgraded databricks-sql-connector as mentioned above, and now I am not getting any exceptions or warnings.

caldempsey commented 2 months ago

I am also experiencing this issue with the boilerplate code and have opened a new issue.

caldempsey commented 2 months ago

I think I found a solution for this in my issue.

susodapop commented 2 months ago

Just wanted to add some signal that the boilerplate M2M code appears to be non-functional on main. I'm working on an integration now (which we'll hopefully publish in the near future) and had to scrap the boilerplate here. For reference, the U2M example does appear to work. But M2M is broken.