Open gaspardc-met opened 3 months ago
@kou @lidavidm any ideas on this one? I'm not familiar enough with the postgres driver (or postgres in general) to have an idea where the problem is
Is it possible to share the log?
I think we'll have to set up a pod for 12 hours and see if we can reproduce at all...
Do you set timeout related parameters such as idle_session_timeout
?
See https://www.postgresql.org/docs/current/runtime-config-client.html for other timeout related parameters?
Could you show SELECT * FROM pg_stat_activity
when this problem is happen?
See also: https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW
Do you pass connection
to handle_sql_query()
? Or is connection=None
used?
One other thing, if we want to try and reproduce this, what was the PostgreSQL version?
Hello all, thanks for the quick response.
@lidavidm : Log entries
2024-05-20 14:05:17.660 GMT [521061] LOG: could not receive data from client: Connection reset by peer
2024-05-20 14:13:31.910 GMT [521817] FATAL: unsupported frontend protocol 0.0: server supports 3.0 to 3.0
2024-05-20 14:13:32.214 GMT [521818] FATAL: unsupported frontend protocol 255.255: server supports 3.0 to 3.0
2024-05-20 14:13:32.518 GMT [521819] FATAL: no PostgreSQL user name specified in startup packet
2024-05-20 14:15:07.995 GMT [521500] LOG: could not receive data from client: Connection reset by peer
2024-05-20 15:26:08.819 GMT [528434] LOG: invalid length of startup packet
The could not receive data from client: Connection reset by peer
happened a lot and is standard, I think it's a connection not properly closed by the webapp (now I close every connection) and happen with SQLalchemy too.
The others are "new" and happened when the connection was stuck and not before
PG version:
psql (PostgreSQL) 14.5
docker.io/bitnami/postgresql:14.5.0-debian-11-r35
@kou :
connection
and creating a new connection. I tried both. @kou :
SELECT * FROM pg_stat_activity
thenThanks. Interesting, there's occasional reference to errors like this elsewhere 1, but usually the suggestion is that something is port-scanning Postgres. That doesn't seem likely here. On the other hand, if the client were doing something wrong after a long time, restarting the client completely should presumably reset that. So instead it seems like something borks the server.
Just to clarify, though:
(1) Did you try restarting the Postgres server, too? (2) Did you restart the Postgres server and then try with SQLAlchemy? (3) If not, then it sounds like: using ADBC for a while borks the server, but using SQLAlchemy (without restarting the server) works, and it's unknown whether switching back to ADBC would fail or work?
Thanks. I'll try to find time to set up a container and reproduce a setup like this.
Sorry, it's looking like I won't have time to investigate this anytime soon. This is on my backlog and I do hope to get to it but any help here is welcome
Does it matter at all if you remove pandas and use the connection directly to parse the results? Its possible there is also something with pandas that is causing the problem
What happened?
Context before the bug (working):
pd.read_sql
from Pandas and a SQLalchemy engineSwitching to ADBC:
adbc_driver_postgresql
'sdbapi
connection withpd.read_sql
Problem:
pd.read_sql
(know this through caching) and then wait indefinitely.How can we reproduce the bug?
"postgresql://{user}:{password}@{host}:{port}/{db}"
formatted with the proper valuesThe function to execute the SQL query was:
The SQL queries ranged from
select * from TABLE_NAME
to selecting specific columns on a range of specific datesEnvironment/Setup
python 3.11 pandas == 2.2.2 adbc_driver_postgresql==0.11.0 adbc-driver-manager==0.11.0