psycopg / psycopg2

PostgreSQL database adapter for the Python programming language
https://www.psycopg.org/
Other
3.36k stars 505 forks source link

Connection Sharing With psycopg2.connect()? #1725

Open fw-aaron opened 2 months ago

fw-aaron commented 2 months ago

Describe the bug

Please let us know:

1: what you did

I start a new thread using thread = threading.Thread(target=thread_runner); thread.start(). Both threads (the new one and the existing one) use the following function to create a connection to the database:

import psycopg2
def connect_db():
  try:
    conn = psycopg2.connect(database=..., user=..., password=..., host=..., port=...)
    conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
    logging.info("psycopg connection on fileno %s" % str(conn.fileno()))
  except Exception as e:
    os.sys.exit(-1)
  return conn

Then the cursor is retrieved (cursor=conn.cursor()) and used to execute queries (db.execute('update ftp_settings set last_submission = %s', [date]); db.connection.commit()).

2: what you expected to happen

Both thread should be able to access the database and run queries without trouble.

3: what happened instead

Logging output from the logging statement above looks like this:

[2024-08-24 15:39:43,824] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:48,851] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:48,879] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:48,886] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:48,889] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:48,891] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,499] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,508] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,511] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,513] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,520] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,528] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,531] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,533] INFO in db_connection: psycopg connection on fileno 14
[2024-08-24 15:39:50,538] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,552] INFO in db_connection: psycopg connection on fileno 16
[2024-08-24 15:39:50,591] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,600] INFO in db_connection: psycopg connection on fileno 11
[2024-08-24 15:39:50,602] INFO in db_connection: psycopg connection on fileno 11
[2024-08-24 15:39:50,605] INFO in db_connection: psycopg connection on fileno 11
[2024-08-24 15:39:50,762] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,769] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,772] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,775] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,778] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,946] INFO in db_connection: psycopg connection on fileno 15
[2024-08-24 15:39:50,947] INFO in db_connection: psycopg connection on fileno 18
[2024-08-24 15:39:50,947] INFO in db_connection: psycopg connection on fileno 19
[2024-08-24 15:39:50,954] INFO in db_connection: psycopg connection on fileno 15

After a while, I get the following error:

error: Second simultaneous read on fileno 14 detected.  Unless you really know what you're doing,
make sure that only one greenthread can read any particular socket.  Consider using a pools.Pool.
If you do know what you're doing and want to disable this error, call
eventlet.debug.hub_prevent_multiple_readers(False) - MY THREAD=<built-in method switch of
GreenThread object at 0x7fbf6aafc048>; THAT THREAD=FdListener('read', 14,
<built-in method switch of greenlet.greenlet object at 0x7fbf6aafc470>,
<built-in method throw of greenlet.greenlet object at 0x7fbf6aafc470>)

Is it possible that psycopg2.connect is returning a reused connection? Based on the error, it sounds like one greenthread used the connection, then later the second greenthread picked up the used connection. My impression is that psycopg2.connect should return a new connection. Here is my python dependency list:

argon2-cffi==16.3.0
asn1crypto==0.24.0
astroid==2.0.4
atomicwrites==1.3.0
bcrypt==3.1.4
cffi==1.11.5
click==6.7
cryptography==2.3
dnspython==1.16.0
eventlet==0.24.1
Flask==1.0.2
Flask-Login==0.4.0
Flask-Session==0.3.1
Flask-SocketIO==3.0.1
future==0.16.0
greenlet==0.4.13
gunicorn==19.9.0
idna==2.7
iniherit==0.3.9
isort==4.3.4
itsdangerous==0.24
Jinja2==2.10
lazy-object-proxy==1.3.1
MarkupSafe==1.1.1
mccabe==0.6.1
monotonic==1.5
more-itertools==6.0.0
paramiko==2.4.1
passlib==1.7.1
Pillow==5.2.0
pluggy==0.8.1
psycopg2==2.7.3.1
py==1.7.0
pyasn1==0.4.3
pycparser==2.18
pylint==2.1.1
PyNaCl==1.2.1
pysftp==0.2.9
pytest==4.2.1
python-engineio==2.2.0
python-slugify==1.2.5
python-socketio==2.0.0
pytz==2016.10
redis==2.10.6
reportlab==3.3.0
rq==0.12.0
scrypt==0.8.0
simplejson==3.16.0
six==1.11.0
Unidecode==1.0.22
Werkzeug==0.14.1
wrapt==1.10.11
yoyo-migrations==5.0.5
google-api-python-client==1.7.11
google-auth==1.11.0
google-auth-httplib2==0.0.3
google-auth-oauthlib==0.4.1
oauth2client==4.1.3

I don't think any of these packages would provide connection pooling, but I'm not familiar with many of these packages.

If possible, provide a script reproducing the issue.

The issue isn't reliably reproducible.

dvarrazzo commented 2 months ago

If you are using greenlets then you should use psycopg in green mode. You should also make sure that you are not sharing connections across processes, which I don't know if it's happening or not in flask.

No, psycopg will not reuse a connection across threads.

Looking better, I see that you are using eventlet. Did you call the appropriate functions to make eventlet compatible with psycopg?

fw-aaron commented 1 month ago

Hey, your name is on that page 🚀!

The first two lines of my main py file are

import eventlet
eventlet.monkey_patch()

Thank you for drawing my attention to this. I wasn't aware that this project uses that facility, nor that I use greenthreads by default because of this.

I start a new thread using thread = threading.Thread(target=thread_runner); thread.start()

Since my project uses monkey_patch, this thread is actually a green thread (right?).

conn = psycopg2.connect(database=..., user=..., password=..., host=..., port=...)

Since my project uses monkey_patch, this is using a green-thread-enabled psycopg connect (right?).

The problem is still unexplained, but I am understanding aspects of the problem much better now! I'll look into "Migrating off of Eventlet" https://eventlet.readthedocs.io/en/latest/asyncio/migration.html#migration-guide, but this might be quite difficult since any system function call could be affected by removing that monkey_patch()...

dvarrazzo commented 1 month ago

Does psycopg2.extensions.get_wait_callback return you None or anything?

fw-aaron commented 1 month ago

logging.info(" wait callback is " + str(psycopg2.extensions.get_wait_callback)) shows

wait callback is <built-in function get_wait_callback>

dvarrazzo commented 1 month ago

It's a function, you have to call it, with psycopg2.extensions.get_wait_callback().

fw-aaron commented 1 month ago

Oh, I thought the presence of get_wait_callback would be enough. When I do logging.info(" wait callback is " + str(psycopg2.extensions.get_wait_callback())), I get

wait callback is <function eventlet_wait_callback at 0x779b36c05840>

which also appears to be a function provided by eventlet. logging.info(" wait callback is " + str(psycopg2.extensions.get_wait_callback()())) yields

ERROR:root:m4tp: connect_db()
Traceback (most recent call last):
  File "app/db_connection.py", line 17, in connect_db
    logging.info(" wait callback is " + str(psycopg2.extensions.get_wait_callback()()))
TypeError: eventlet_wait_callback() missing 1 required positional argument: 'conn'

obviously because I'm calling the wait callback incorrectly.

dvarrazzo commented 1 month ago

Ok, so that means that psycopg is monkeypatched well by eventlet.

I have no idea what is going wrong there: you should hear from the eventlet folks.

fw-aaron commented 1 month ago

Thank you for your feedback! In the above logging just after:

Logging output from the logging statement above looks like this

What do you make of those file numbers? I guess it looks like psycopg opened a connection using file descriptor 14, then the connection was closed, then a new connection was opened which reused file number 14, then it was closed. This happened a dozen times or so, then a new connection was made while file number 14 was open, so file descriptor 15 was used. While that connection was open, a new connection was made resulting in file descriptor 16.

Does that sound like a reasonable interpretation of this logging output?