databricks / databricks-sql-python

Databricks SQL Connector for Python
Apache License 2.0

databricks.sql.exc.RequestError: Error during request to server #23

Closed PaulCornellDB closed 1 year ago

PaulCornellDB commented 2 years ago

Repro steps:

  1. pipenv --python 3.8
  2. pipenv shell
  3. pip install databricks-sql-connector
  4. export DATABRICKS_SERVER_HOSTNAME="https://<redacted>.cloud.databricks.com" && export DATABRICKS_HTTP_PATH="sql/protocolv1/o/<redacted>/<redacted>" && export DATABRICKS_TOKEN="dapi<redacted>"
  5. Code (main.py):
from databricks import sql
import os

with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
                 http_path       = os.getenv("DATABRICKS_HTTP_PATH"),
                 access_token    = os.getenv("DATABRICKS_TOKEN")) as connection:

  with connection.cursor() as cursor:
    cursor.execute("SELECT * FROM default.diamonds LIMIT 2")
    result = cursor.fetchall()

    for row in result:
      print(row)
  6. python main.py
  7. Traceback:
Traceback (most recent call last):
  File "main.py", line 10, in <module>
    with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
  File "/Users/paul.cornell/.local/share/virtualenvs/paul.cornell-<redacted>/lib/python3.8/site-packages/databricks/sql/__init__.py", line 48, in connect
    return Connection(server_hostname, http_path, access_token, **kwargs)
  File "/Users/paul.cornell/.local/share/virtualenvs/paul.cornell-<redacted>/lib/python3.8/site-packages/databricks/sql/client.py", line 112, in __init__
    self._session_handle = self.thrift_backend.open_session(session_configuration, catalog,
  File "/Users/paul.cornell/.local/share/virtualenvs/paul.cornell-<redacted>/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 341, in open_session
    response = self.make_request(self._client.OpenSession, open_session_req)
  File "/Users/paul.cornell/.local/share/virtualenvs/paul.cornell-<redacted>/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 287, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/Users/paul.cornell/.local/share/virtualenvs/paul.cornell-<redacted>/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 199, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server
  8. Tried export CERT_PATH=$(python -m certifi) && export SSL_CERT_FILE=${CERT_PATH} && export REQUESTS_CA_BUNDLE=${CERT_PATH} and python main.py again, but same traceback.

Environment:

Package                   Version
certifi                   2022.6.15
databricks-sql-connector  2.0.2
numpy                     1.23.0
pandas                    1.4.3
pip                       21.0.1
pyarrow                   8.0.0
python-dateutil           2.8.2
pytz                      2022.1
setuptools                54.1.2
six                       1.16.0
thrift                    0.16.0
wheel                     0.36.2

zhangzuolin16 commented 1 year ago

Dear all,

is there any solution to fix this issue?

susodapop commented 1 year ago

So far I have not been able to reproduce this behaviour in the connector, even with the specified version of Python. My best recommendation is to try with a fresh installation of Python within a virtual environment.

timon-schmelzer-gcx commented 1 year ago

I have been seeing the same behavior for a few weeks. Before that, everything was fine and a connection could be established. Nothing changed in the code or the SQL endpoint (at least not from my side). I just updated databricks-sql-connector to 2.1.0 but the result is the same.

timon-schmelzer-gcx commented 1 year ago

...actually, updating to 2.1.0 helped solve my issue! The error message changed from:

databricks.sql.thrift_backend: Error during request to server: {...}

to:

databricks.sql.thrift_backend: Error during request to server: : Invalid access token. {...}

So my Databricks access token had expired. After adding a new token to my script everything is running again 👍

susodapop commented 1 year ago

I'm closing this issue out due to its age. Feel free to reopen or make a new issue if a similar issue arises.

chozillla commented 11 months ago

The issue still exists. I had success with an initial access token, but I am getting RequestError: Error during request to server: : Invalid access token. when I used a new token and revoked the old one.

susodapop commented 11 months ago

@chozillla Can you provide reproduction steps?

chozillla commented 11 months ago

@susodapop

[Screenshot, 2023-09-24 at 8:33:55 PM]

Interestingly, it works fine when I explicitly put in the values. This is on a jupyter notebook on vscode via Anaconda.
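
One possible explanation, which the thread does not confirm, is that the notebook kernel had not picked up the environment variables, so os.getenv returned None for one of them. A minimal sketch for checking this before connecting; the variable names are the ones from the original repro, and missing_env_vars is a hypothetical helper, not part of the connector:

```python
import os

# Variable names taken from the original repro steps in this thread.
REQUIRED_VARS = (
    "DATABRICKS_SERVER_HOSTNAME",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_TOKEN",
)


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Not set in this process:", ", ".join(missing))
    else:
        print("All connection variables are set.")
```

Running this in the same kernel or shell that runs the failing code shows whether that process actually sees the values.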

chozillla commented 11 months ago

Uh, so I just exited vscode and restarted and it worked. Strange issue.

[Screenshot, 2023-09-24 at 9:19:51 PM]

NephilimJaeger commented 9 months ago

I'm experiencing the same problem, but in my case, it only occurs when I increase the number of rows that will be written to the Databricks table (50 rows).

vcakdwivedi commented 7 months ago

Hi everyone, is there any fix for the above issue? I tried everything mentioned above.

    main(server_hostname = args.server_hostname, http_path = args.http_path, schema = args.schema, access_token = args.access_token)
  File "extract.py", line 24, in main
    dbx_extractor.execute_sql(sql_query, server_hostname, http_path, access_token)
  File "/mnt/azureml/cr/j/2c91464c9e0743f5860e8f48b407683d/exe/wd/dbextractor.py", line 23, in execute_sql
    with db_sql.connect(
  File "/azureml-envs/azureml_9806bfc7c716eaf0df12bea8512e961b/lib/python3.8/site-packages/databricks/sql/__init__.py", line 50, in connect
    return Connection(server_hostname, http_path, access_token, **kwargs)
  File "/azureml-envs/azureml_9806bfc7c716eaf0df12bea8512e961b/lib/python3.8/site-packages/databricks/sql/client.py", line 189, in __init__
    self._session_handle = self.thrift_backend.open_session(
  File "/azureml-envs/azureml_9806bfc7c716eaf0df12bea8512e961b/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 464, in open_session
    response = self.make_request(self._client.OpenSession, open_session_req)
  File "/azureml-envs/azureml_9806bfc7c716eaf0df12bea8512e961b/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 393, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/azureml-envs/azureml_9806bfc7c716eaf0df12bea8512e961b/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 261, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server

susodapop commented 7 months ago

Hi there, to investigate this we need reproduction steps. So far, every instance of this issue that we've traced has been an environment issue rather than a bug in the code. If you can provide reproduction steps we can try to guide you toward the fix.

At the very least we need to know the exception text, since RequestError can point in many different directions.
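
A minimal sketch of one way to collect that exception text, assuming nothing about the connector beyond standard Python exception chaining. describe_failure is a hypothetical helper; whether RequestError carries a chained cause in a given connector version is not established in this thread:

```python
def describe_failure(exc: BaseException) -> str:
    """Return the exception's type and message, followed by any chained causes."""
    lines = [f"{type(exc).__name__}: {exc}"]
    seen = {id(exc)}  # guard against cycles in the exception chain
    cause = exc.__cause__ or exc.__context__
    while cause is not None and id(cause) not in seen:
        seen.add(id(cause))
        lines.append(f"caused by {type(cause).__name__}: {cause}")
        cause = cause.__cause__ or cause.__context__
    return "\n".join(lines)
```

Wrapping the failing sql.connect(...) call in try/except and printing describe_failure(exc) gives text that can be pasted into a report alongside the traceback.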

Ownmarc commented 7 months ago

Hello,

We are seeing the same error; the relevant exception text is below. We use polars (0.20.6) to make requests to our Databricks endpoint using sql.connect. This works fine for some time until it stops working. Then, if we simply restart our pod (which runs as a Docker container in Kubernetes), it starts working again.

import os

import polars as pl
from databricks import sql


def get_databricks_connection():
    return sql.connect(
        server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
        http_path=os.getenv("DATABRICKS_HTTP_PATH"),
        access_token=os.getenv("DATABRICKS_TOKEN"),
        use_inline_params=True,
    )


databricks_connection = get_databricks_connection()

# QUERY and some_param are defined elsewhere in the application (not shown).
pl.read_database(
    query=QUERY,
    connection=databricks_connection,
    execute_options={
        "parameters": {
            "some_param": some_param,
        }
    },
)

certifi==2024.2.2
polars==0.20.6
databricks-sql-connector==3.0.3
thrift==0.16.0
pyarrow==14.0.2
flask==2.2.5
requests==2.28.2
urllib3==1.26.18

  line 255, in _query_event
    return pl.read_database(
  File "/home/python/.local/lib/python3.10/site-packages/polars/io/database.py", line 582, in read_database
    return cx.execute(
  File "/home/python/.local/lib/python3.10/site-packages/polars/io/database.py", line 345, in execute
    result = cursor_execute(query, **options)
  File "/home/python/.local/lib/python3.10/site-packages/databricks/sql/client.py", line 761, in execute
    execute_response = self.thrift_backend.execute_command(
  File "/home/python/.local/lib/python3.10/site-packages/databricks/sql/thrift_backend.py", line 868, in execute_command
    resp = self.make_request(self._client.ExecuteStatement, req)
  File "/home/python/.local/lib/python3.10/site-packages/databricks/sql/thrift_backend.py", line 507, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/home/python/.local/lib/python3.10/site-packages/databricks/sql/thrift_backend.py", line 337, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server

sfc-gh-smao commented 6 months ago

Hello, I'm seeing the same error.

Code:

from databricks import sql
import os

with sql.connect(
    server_hostname="****************.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/****************",
    access_token="****************",
) as connection:
    cursor = connection.cursor()

    cursor.execute("SELECT * from range(10)")
    print(cursor.fetchall())

    cursor.close()
    connection.close()

Error:

   with sql.connect(
         ^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/databricks/sql/__init__.py", line 84, in connect
    return Connection(server_hostname, http_path, access_token, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/databricks/sql/client.py", line 232, in __init__
    self._open_session_resp = self.thrift_backend.open_session(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 578, in open_session
    response = self.make_request(self._client.OpenSession, open_session_req)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 507, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 337, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server

sfc-gh-smao commented 6 months ago

@susodapop Can you please reopen this issue? There are new reports for this issue since the issue was closed.

susodapop commented 6 months ago

I'm not able to reopen issues any longer. But I would encourage you to turn on DEBUG logging for your connector and share what you find. RequestError is a very generic exception and could point to any number of issues. Without more detail it's impossible to say what the root cause is.
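
A minimal sketch of turning on DEBUG logging with Python's standard logging module. It assumes the connector's loggers live under the "databricks.sql" namespace, which matches the "databricks.sql.thrift_backend" logger name seen in the log lines quoted earlier in this thread; enable_connector_debug_logging is a hypothetical helper name:

```python
import logging


def enable_connector_debug_logging(logger_name: str = "databricks.sql") -> logging.Logger:
    """Attach a root handler (if none exists) and set the connector loggers to DEBUG."""
    # basicConfig is a no-op if the root logger already has handlers.
    logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s: %(message)s")
    logger = logging.getLogger(logger_name)
    # Child loggers such as databricks.sql.thrift_backend inherit this level.
    logger.setLevel(logging.DEBUG)
    return logger
```

Call it once before sql.connect(...), then rerun the failing code and share the output after redacting tokens and hostnames.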

Jasonnicholas-ZEN commented 2 weeks ago

Hi, are there any updates on this? I am facing a similar issue. Could it be because the Databricks cluster it needs to use is inactive? Could that be the cause of databricks.sql.exc.RequestError: Error during request to server?

sebastianfastert commented 5 days ago

@Ownmarc did you find a solution?

I have a similar situation, where the deployment stops working after a while.

Ownmarc commented 4 days ago

Open a connection when you need it and close it afterwards; not keeping connections open for long periods solves it. Opening one takes almost no time. What does take time is a cold start on the Databricks cluster side, but once the cluster is up, creating a new connection is nearly instant.
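
A minimal sketch of that pattern, assuming the same environment variables as the earlier examples in this thread. run_query and default_connect are hypothetical names; the connect argument exists only so the function can be exercised without a live warehouse:

```python
import os
from contextlib import closing


def default_connect():
    """Open a new connection using the environment variables from this thread."""
    # Imported lazily so this module can be loaded without the connector installed.
    from databricks import sql

    return sql.connect(
        server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
        http_path=os.getenv("DATABRICKS_HTTP_PATH"),
        access_token=os.getenv("DATABRICKS_TOKEN"),
    )


def run_query(query, connect=default_connect):
    """Open a connection, run one query, and close everything before returning."""
    with closing(connect()) as connection:
        with closing(connection.cursor()) as cursor:
            cursor.execute(query)
            return cursor.fetchall()
```

Each call pays the session-open cost, which per the comment above is small once the cluster is running, in exchange for never reusing a session that may have been closed for inactivity.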

susodapop commented 4 days ago

That sounds accurate to me @Ownmarc. As database users, I think we are all inclined to keep connections open for long periods of time because connections / sessions are cheap on most databases. But on Databricks, sessions are expensive and are periodically closed down for inactivity as a cost-savings measure.

To get around this, you can either implement a heartbeat to keep the session alive (incurring cost from the cluster remaining online), or implement some logic that catches the exception when you attempt an operation on a closed session and opens a new one.

The downside of this latter approach is that if you have set a bunch of session variables on your session, you'll need to set them again on the new session if the first one expires.
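
A minimal sketch of the catch-and-reopen approach, including re-applying session settings on the new session. ReconnectingSession, its parameters, and the idea of expressing session settings as SQL statements are all assumptions for illustration; the connector does not ship such a class. The caller chooses which exception types trigger a reconnect:

```python
from contextlib import closing


class ReconnectingSession:
    """Run queries on a lazily opened connection, reopening it once on failure."""

    def __init__(self, connect, retry_on, setup_statements=()):
        self._connect = connect            # zero-argument callable returning a connection
        self._retry_on = tuple(retry_on)   # exception types that trigger a reconnect
        self._setup_statements = tuple(setup_statements)
        self._connection = None

    def _open(self):
        self._connection = self._connect()
        # Session state does not carry over, so apply it again on every new session.
        for statement in self._setup_statements:
            with closing(self._connection.cursor()) as cursor:
                cursor.execute(statement)

    def _discard(self):
        connection, self._connection = self._connection, None
        if connection is not None:
            try:
                connection.close()
            except Exception:
                pass  # the session is already gone; nothing more to clean up

    def _run(self, query):
        with closing(self._connection.cursor()) as cursor:
            cursor.execute(query)
            return cursor.fetchall()

    def query(self, query):
        if self._connection is None:
            self._open()
        try:
            return self._run(query)
        except self._retry_on:
            # Assume the session expired: open a new one and retry exactly once.
            self._discard()
            self._open()
            return self._run(query)
```

With the connector, this might be constructed as ReconnectingSession(lambda: sql.connect(...), retry_on=(RequestError,), setup_statements=["SET TIME ZONE 'UTC'"]), with RequestError imported from databricks.sql.exc. Retrying only once keeps a genuine outage from looping forever, and a statement that is not safe to repeat should not be sent through a wrapper like this.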