exasol / sqlalchemy-exasol

SQLAlchemy dialect for EXASOL
https://exasol.github.io/sqlalchemy-exasol/
Other
34 stars 28 forks source link

problem with Exasol driver's WebSocket connection cleanup process #390

Closed wavewater closed 1 year ago

wavewater commented 1 year ago

Summary

When invoking the sql-alchemy engine, it returns the correct result (42) but throws an error during the the WebSocket Connection cleanup process.

Reproducing the Issue

Reproducibility: always

Steps to reproduce the behavior:

step 1: create file test_sql_alchemy_websocket.py

# test_sql_alchemy_websocket.py 
from sqlalchemy import create_engine
url = "exa+websocket://user:pw@host:8563/meta?SSLCertificate=SSL_VERIFY_NONE"
engine = create_engine(url)
with engine.connect() as con:
    result = con.execute("select 42").fetchall()
    print(f"{result}")

step 2: execute the script: python ./resources/scripts/test_sql_alchemy_websocket.py

Expected Behaviour

output should be:

[(42,)]

without any further errors

Actual Behaviour

[(42,)]
Exception ignored in: <function Connection.__del__ at 0x7fe1304df670>
Traceback (most recent call last):
  File "/home/xxxxx/projects/airflow_sbk/.venv/lib/python3.9/site-packages/exasol/driver/websocket/_connection.py", line 151, in __del__
  File "/home/xxxxx/projects/airflow_sbk/.venv/lib/python3.9/site-packages/exasol/driver/websocket/_connection.py", line 127, in close
exasol.driver.websocket._errors.Error: 

Root Cause (optional)

Note that the 42 was returned but there seems a problem with Exasol driver's WebSocket connection cleanup process. Without more descriptive error messages, it's difficult to diagnose the exact issue. This is why I am reaching out to the community for support. Any ideas?

Context

I am simply following the tutorial on sqlalchemy-exasol and wanted to give the new websocket dialect a try because the odbc dialect is a burden. In particular, many open source tooling require a working sqlalchemy connection.

System

CentOS Linux release 7.9.2009 (Core)

python --version
Python 3.9.18
pip list 
Package           Version
----------------- -------
cffi              1.16.0
cryptography      41.0.5
greenlet          3.0.1
packaging         23.2
pip               23.0.1
pyasn1            0.5.0
pycparser         2.21
pyexasol          0.25.2
pyodbc            4.0.39
pyOpenSSL         23.3.0
rsa               4.9
setuptools        58.1.0
SQLAlchemy        1.4.44
sqlalchemy-exasol 4.6.0
websocket-client  1.6.4
Nicoretti commented 1 year ago

Hi @wavewater,

thanks for reporting this issue. I will pick up on it as my next task.

best Nico

wavewater commented 1 year ago

Hi @wavewater,

thanks for reporting this issue. I will pick up on it as my next task.

best Nico

Thanks! - Joachim

Nicoretti commented 1 year ago

Hi @wavewater (Joachim),

I have analyzed the issue, this is a clear bug. I'll provide a fix during this week.

~~:spiral_notepad: Side Note: The error is an "additional close" (double close) on the already closed connection when the GC is run. This is not ideal, but on the other hand, all executed and committed queries should not be affected. Additionally, the connections have been closed in an orderly fashion before this error occurs.~~

best Nico

Nicoretti commented 1 year ago

Hi @wavewater (Joachim),

I have been digging and it appears that the issue may not be isolated to the SQLAlchemy dialect. My current investigations indicate that it is a problem existing within pyexasol and maybe even within the websocket library pyexasol is using.

What do we know so far?

  1. The problem only occurs on process termination when the __del__ methods of pyexasol, websocket-client etc. are invoked.
  2. This is also a problem when a user expects pyexasol to clean up the resources on shutdown The pyexasol example below, has the exact same problem, except that the failure is hidden from the user.

    import pyexasol
    c = pyexasol.connect(
        dsn="127.0.0.1:8888", user="sys", password="exasol", socket_timeout=None,
        # don't fall back to no encryption for unknown certificates
        websocket_sslopt= {"cert_reqs": ssl.CERT_NONE}
    )
    result = c.execute("SELECT 1;")
    print(f"{result.fetchall()}")
  3. Checking the underlying resource(s) before calling close does not seem to be sufficient in case of program termination. This may is due to
    • Invalid usage of the underlying API
    • An bug in the underlying implementation(s)/layers
    • Destruction order and/or guarantees/assumptions regarding process cleanup

What are partial mitigation(s) from a SQLA Dialect users perspective?

Redirect the error message

python ./resources/scripts/test_sql_alchemy_websocket.py 2> /dev/null

Should yield:

[(42,)]

:warning: Attention :warning: : This only should be fine as long as the connection has been closed properly, during program execution. This can be achieved by explicitly calling .close() on the connection or by using it in the context of a with statement e.g. with engine.connect() as con:.

How we intend to follow up on this?

  1. We will improve the checks before closing the underlying connection in the websocket based SQLA Dialect :spiral_notepad: Note: This won't solve the issue at hand, but will handle other edge cases which could be a problem in the future
  2. We will create an issue in the pyexasol project for further investigations. :spiral_notepad: Note: I will link the associated issue here, once it has been created.

Related Issue

tkilias commented 1 year ago

@Nicoretti besides the fix in pyexasol, we should also assign None to self._connection after https://github.com/exasol/sqlalchemy-exasol/blob/f7be0eac87e752ed89b538b01740a48dd2e09bb7/exasol/driver/websocket/_connection.py#L125

Nicoretti commented 1 year ago

@Nicoretti besides the fix in pyexasol, we should also assign None to self._connection after

https://github.com/exasol/sqlalchemy-exasol/blob/f7be0eac87e752ed89b538b01740a48dd2e09bb7/exasol/driver/websocket/_connection.py#L125

Agreed, also I think we should add an additional check for self._connection.is_closed beforehand, to ensure the we only call close on open connections (How we intend to follow up on this [1.]).

Nicoretti commented 1 year ago

@wavewater (Joachim)

We have conducted further investigations and have come to the conclusion that the problem lies within the libraries on which this project is based. In order to enhance the situation for the users of this library, we have modified the code to handle the problem more gracefully, although it still persists. Additionally, it remains unclear how long it will take to address this issue.

That being said, we have also assessed the impact of the issue and have reached the conclusion that its overall impact should be minor (for details, see the 'Context' section here). Shortly after this pull request is merged, we will create a new release of sqlalchemy-exasol. So you can expect a release either today or early next week.

Nicoretti commented 1 year ago

@wavewater we just released a new version of sqlalchemy-exasol

wavewater commented 1 year ago

@Nicoretti Thanks. I upated via pip install --upgrade sqlalchemy-exasol, And now it works without the error. Perfect.