Closed yuvalmarciano closed 3 years ago
Hi,
You mentioned a webserver, does it use multiple threads? Could it be that while one thread is initializing the dialect another one is executing the ilike query?
I think we have a lock for a similar case, but I'm not sure off the top of my head if it's used also for the dialect initialization.
the connection pool should not grant a connection until the first connect has succeeded, within which the initialization takes place.
I would ask if this occurs on 1.3 but in 1.3 we dont have the caching, so it would only occur on one or two initial statements. but as it appears this can be reproduced are you able to reproduce on 1.3 ?
wowee. nevermind, this reproduces immediately on 1.4 and not on 1.3. huh.
import threading
from sqlalchemy import create_engine
def run(engine):
with engine.connect() as conn:
assert conn.dialect.server_version_info > (12,)
engine = create_engine("postgresql://scott:tiger@localhost/test", pool_size=20)
# this fixes
# engine.connect().close()
threads = [threading.Thread(target=run, args=(engine,)) for i in range(20)]
for t in threads:
t.start()
yeah i knew i had made a change here this is #5497 .
I guess we should have a test that does that check using threads
is this only on pg?
well pg has a bad symptom here but all of the dialects should not be used to compile until initialize.
Mike Bayer has proposed a fix for this issue in the master branch:
Add new "sync once" mode for pool.connect https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/2766
thanks for reporting!
we had this exact same issue about 8 years ago in #2964 however unfortunately the tests added for that only tested the connection pool itself when the "first_connect" event is used; it didn't test that the Engine
is actually using the first_connect
event and that's what changed in #5497, and the tests did not catch it. obviously that's fixed now.
@zzzeek @CaselIT Thank you so much for the quick response and fix!
Sorry to post into a closed bug - I have a stack trace for a similar case (except single threaded) that seems to be resolved by the same changes. I thought it would be worth posting this here for anyone trying to track down whether this issue is resolved in SA1.4 / SA2.0 (which it seems to be).
In SA1.3 a deadlock could be triggered in the MSSQL if the database restarted during initialization. This can be replicated by putting a breakpoint in e.g., _setup_supports_nvarchar_max()
, then restarting the database when the breakpoint is hit, then continuing the program.
The change in the associated PR to set _allow_revalidate = False
seems to be the piece that fixes this issue.
HTH anyone who has the same issue - a sample stack-trace is below.
Current thread 0x00007f75ab25f740 (most recent call first):
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 282 in _exec_once_impl
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 314 in exec_once_unless_exception
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 664 in __connect
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 440 in __init__
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 309 in _create_connection
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/impl.py", line 137 in _do_get
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 495 in checkout
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 778 in _checkout
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 304 in unique_connection
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2336 in _wrap_pool_connect
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2369 in raw_connection
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 473 in _revalidate_connection
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 390 in connection
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 401 in check_unicode
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 433 in <setcomp>
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 433 in _check_unicode_returns
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 330 in initialize
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/mssql/base.py", line 2451 in initialize
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/strategies.py", line 199 in first_connect
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1406 in go
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 322 in __call__
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 285 in _exec_once_impl
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 314 in exec_once_unless_exception
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 664 in __connect
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 440 in __init__
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 309 in _create_connection
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/impl.py", line 137 in _do_get
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 495 in checkout
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 778 in _checkout
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 364 in connect
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2336 in _wrap_pool_connect
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2302 in _contextual_connect
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 433 in _connection_for_bind
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1150 in _connection_for_bind
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1142 in connection
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3550 in _connection_from_session
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3571 in _get_bind_args
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3556 in _execute_and_instances
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3535 in __iter__
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3373 in all
Describe the bug Hi, As the title says, we face a possible race condition between dialect initialization and query execution. We're using
SQLAlchemy
of version 1.4.8 withpsycopg2
engine of version 2.8.6, andPostgreSQL
database of version 12.4. We have a web server responsible of querying this database, and to do so it's using a session factory (sessionmaker
) object attached to an engine we create with apool_size=20
limitation.On one of the queries the server performs we use the
ILIKE
operator on a textual column, and set it to escape\
characters:and when we execute this query it raises the following exception:
The compiled query that's attached to the exception contains this condition:
(Obviously the escaped string is supposed to contain only one slash character).
While trying to debug this issue using breakpoints and debug prints - we noticed the
render_literal_value
method of thePGCompiler
class, and it looks related to our issue:A direct suspect from this method is the dialect object that's attached to the compiler and its
_backslash_escapes
attribute specifically.After digging into it - we noticed that this attribute is set to True by default on the
PGDialect
class, and its value is determined again upon thePGDialect
instance initialization (oninitialize
method):By planting debug prints on the
initialize
method ofPGDialect
and therender_literal_value
method of thePGCompiler
, we observed the fact that sometimes the first rendering is called before the initialization takes place. At this state, the SQL Compilation Caching (https://docs.sqlalchemy.org/en/14/core/connections.html#sql-compilation-caching) introduced in SQLAlchemy 1.4 prevents the compiler instance from rendering ILIKE operators differently, and all queries are failing.To support our suspicion, we tried to disable the compilation caching, and from this point the behavior was as "expected":
Expected behavior The query is expected to succeed from the first time it's executed.
To Reproduce We've done everything we can to reproduce this error on ipython, but couldn't (the debug prints say the initialization happens before the first rendering). Here is the session creation mechanism we use:
Versions.
We'd really appreciate your attention on this issue. Thanks!
Have a nice day!