Open ryanheise opened 1 week ago
@ryanheise thanks for reporting this, it is worth to investigate what we can improve here. Why postgresql version are you using? Any special extension or data type that could have been part of the result?
Here is the output from within psql:
postgres=# select version();
version
---------------------------------------------------------------------------------------------------------------------
PostgreSQL 15.4 (Debian 15.4-2.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
---------+---------+------------+-------------------------------------------------------------------
amcheck | 1.3 | pg_catalog | functions for verifying relation integrity
pg_trgm | 1.6 | public | text similarity measurement and index searching based on trigrams
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
vector | 0.5.1 | public | vector data type and ivfflat and hnsw access methods
Although I did not log the specific SQL query, there are queries based on the pgvector extension, both to insert/update embeddings and also to query embeddings. In each of these cases, I am passing the vectors in their string/text representation (I wasn't able to get it to work directly as an array).
I actually do have the idle query saved from my psql session. I can't give the entire schema, but mytable has columns with the following types:
...... boolean
myhash bytea not null
...... integer
...... integer not null
...... text
...... text not null
...... text[]
...... text[] not null
...... timestamp(3) with time zone
...... timestamp(3) with time zone not null
...... vector(512)
And the idle query was long these lines:
SELECT m.*, encode(m.myhash, 'base64') AS myhashBase64
FROM mytable m
WHERE ...
@ryanheise these were helpful, thank you! I did not explore the actual parsing error yet, but I have found a missing part where the protocol-level bug was not bubbling up, and I wrote a test for it #388. I've published a new version, hopefully it will expose the actual protocol-level bug with better details.
I just noticed this in my logs today:
After this happens, the connection is essentially hung on the query forever. If I go into psql from the console and select from
pg_stat_activity
I can see all the queries are in theidle
state and nothing is blocking them. This is not recoverable unless I manually kill the idle processes withpg_terminate_backend
.I can only see this exception by listening for zone errors so I can't directly catch it at the point in my code where the exception was triggered. I guess the exception should not happen in the first place, but if it does happen, it would be nice if server_messages.dart handled the exception and forwarded it back to the original call site.
I believe there might also be some other scenarios where the connection can hang, and the only way is to kill the idle processes manually. However, this is the only time when I've been given an actual exception to report. Since those other cases are still unknown, I am hoping there is an explanation in general for why this package might be hanging on idle processes where nothing is blocking them (I'm assuming it's something like the processes are remaining in the idle state because this package hasn't yet closed the cursors or something). As yet, I haven't worked out a way to reproduce it other than letting the application run for long enough until by chance the DB connection hangs. The stacktrace above, however, has only ever happened once, so I'm not sure if I'm likely to ever see it again.
This exception occurred with release 3.4.2.