tlocke / pg8000

A Pure-Python PostgreSQL Driver
BSD 3-Clause "New" or "Revised" License

Struct unpack error on `INSERT FROM ... SELECT` #98

Closed luciengaitskell closed 9 months ago

luciengaitskell commented 2 years ago

I'm running an insert into a temporary table with pg8000, and a struct unpack error is raised on the response from the Postgres 13 database.

I was able to successfully run this identical query directly from psql.

INFO  [sqlalchemy.engine.Engine] INSERT INTO table_tmp (a, b, c, d, e, f, g, h, i, j, k, l) SELECT table1.a, lower(table1.b) AS b, table1.c, table1.d, table1.e, table1.f, table1.g, table1.h, table1.i, lower(table1.j) AS j, lower(table1.k) AS k, table1.l
FROM table1
INFO  [sqlalchemy.engine.Engine] [generated in 0.00026s] ()
Traceback (most recent call last):
  File "./lib/python3.9/site-packages/pg8000/core.py", line 760, in handle_messages
    code, data_len = ci_unpack(self._read(5))
struct.error: unpack_from requires a buffer of at least 5 bytes for unpacking 5 bytes at offset 0 (actual buffer size is 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "./lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "./lib/python3.9/site-packages/pg8000/dbapi.py", line 452, in execute
    self._context = self._c.execute_simple(operation)
  File "./lib/python3.9/site-packages/pg8000/core.py", line 620, in execute_simple
    self.handle_messages(context)
  File "./lib/python3.9/site-packages/pg8000/core.py", line 762, in handle_messages
    raise InterfaceError("network error on read") from e
pg8000.exceptions.InterfaceError: network error on read

I also tried adding `RETURNING 1` and `RETURNING table_tmp.id`, but neither solved the problem.
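For context, the struct error at the top of the traceback just means the 5-byte message-header read came back empty. Assuming pg8000's `ci_unpack` is `struct.Struct("!ci").unpack_from` (a one-byte message code plus a four-byte length, as the "5 bytes" in the message suggests), it can be reproduced in isolation:

```python
import struct

# pg8000 reads each protocol message header as a 1-byte code + 4-byte length.
# (Assumption: core.py defines ci_unpack = Struct("!ci").unpack_from.)
ci_unpack = struct.Struct("!ci").unpack_from

# An empty buffer stands in for a socket read that returned no data.
try:
    code, data_len = ci_unpack(b"")
except struct.error as e:
    print(e)  # requires a buffer of at least 5 bytes ... actual buffer size is 0
```

So the struct error is a symptom: the real question is why the socket read returned nothing.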

tlocke commented 2 years ago

Hi @luciengaitskell, thanks for the bug report. As it says, this type of error is usually caused by a network error: some problem with the underlying connection between the client's socket and the server's socket. You mention that it works with psql, but have you tried running psql with the host and port explicitly specified? I think psql uses unix sockets by default, whereas you may be using pg8000 over TCP/IP. Also, is it just this query that errors, or all queries? Finally, I notice you're using SQLAlchemy. To pin down the error it might be worth connecting with pg8000 on its own, which would rule out SQLAlchemy as the cause.

luciengaitskell commented 2 years ago

@tlocke Thanks for the suggestions. I'm using Google Cloud SQL, and therefore am using psql over the Cloud SQL Proxy through TCP/IP. One important piece of information I left out above is that I use the Google Cloud SQL Connector to connect with pg8000/SQLAlchemy.

I can try executing this command both directly from pg8000 and through SQLAlchemy without the connector.

luciengaitskell commented 2 years ago

@tlocke I ran the command directly as text with pg8000:

cursor.execute(
    """INSERT INTO table_tmp (a, b, c, d, e, f, g, h, i, j, k, l) SELECT table1.a, lower(table1.b) AS b, table1.c, table1.d, table1.e, table1.f, table1.g, table1.h, table1.i, lower(table1.j) AS j, lower(table1.k) AS k, table1.l
FROM table1"""
)

and encountered the same error:

Traceback (most recent call last):
  File "./lib/python3.9/site-packages/pg8000/core.py", line 760, in handle_messages
    code, data_len = ci_unpack(self._read(5))
struct.error: unpack_from requires a buffer of at least 5 bytes for unpacking 5 bytes at offset 0 (actual buffer size is 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "pgtest.py", line 32, in <module>
    cursor.execute(
  File "./lib/python3.9/site-packages/pg8000/dbapi.py", line 452, in execute
    self._context = self._c.execute_simple(operation)
  File "./lib/python3.9/site-packages/pg8000/core.py", line 620, in execute_simple
    self.handle_messages(context)
  File "./lib/python3.9/site-packages/pg8000/core.py", line 762, in handle_messages
    raise InterfaceError("network error on read") from e
pg8000.exceptions.InterfaceError: network error on read

tlocke commented 2 years ago

That's interesting. Does every SQL statement give an error when it's run, or just the one you've shown? Also, it might be worth raising the problem with the Google Cloud SQL Connector team to see if they can shed any light on it.

luciengaitskell commented 2 years ago

@tlocke

does every SQL statement give an error when it's run, or just the one you've shown?

That error only comes up on that single statement. I am able to successfully create a temporary table beforehand, and other queries I have tried also work properly.

Also it might be worth raising the problem with Google Cloud SQL connector and see if they can shed any light on the problem.

In my second test I was using the Google Cloud Auth Proxy, which is a TCP tunnel. No Cloud SQL Connector was used in that run.

tlocke commented 2 years ago

My guess is that the query that gives the problem is a long-running one, is that right? So I suspect that something is closing the connection when there hasn't been a response from the server for a timeout period. My suggestions are to either break the query into shorter ones, or to see if you can work out where the offending timeout is set. There's a timeout parameter when you create a pg8000 connection, and if it's omitted the timeout defaults to 'never time out'. There might be a timeout setting for the Google Cloud Auth Proxy, and maybe elsewhere in Google's setup.
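To illustrate what pg8000 sees when an intermediary drops the connection: a TCP read on a peer that has closed returns zero bytes, which is exactly the empty buffer in the traceback. A minimal stdlib sketch (the sockets here are throwaway local ones, not real database settings):

```python
import socket

# A throwaway local "server" standing in for the proxy/database side.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.create_connection(server.getsockname())
conn, _ = server.accept()
conn.close()  # the peer (e.g. a proxy hitting its idle timeout) drops the connection

data = client.recv(5)  # pg8000 expects a 5-byte message header here
print(repr(data))  # b'' -- the empty read that becomes "network error on read"

client.close()
server.close()
```

That's why the struct unpack complains about a zero-length buffer: the connection was gone before the server's response arrived.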

luciengaitskell commented 2 years ago

@tlocke I've run multiple long-running ALTER queries through SQLAlchemy and pg8000 successfully. I don't understand why this should be different.

The same exact operation works properly through psql, using the same proxy and database. If it doesn't work through pg8000, then that indicates it's a problem with pg8000.

Do you have any more suggestions for debugging? The table I'm reading from is around 500 million rows, but this shouldn't be a problem for Postgres.

tlocke commented 2 years ago

The SQL you're executing doesn't have any parameters, is that right? In that case, as long as you're not connecting over SSL, the communication over the network should be bit-for-bit identical. So you could use Wireshark (https://www.wireshark.org/) to record the network traffic with psql and then with pg8000. It's doable, but not easy, so hats off to you if you can get something useful out of it! I can help interpret the data, using the PostgreSQL frontend/backend protocol reference: https://www.postgresql.org/docs/current/protocol.html. psql always uses the Simple Query form, and pg8000 also uses Simple Query if there aren't any parameters. If you decide to give it a go, good luck and hope it goes well!

tlocke commented 2 years ago

Another thought: if you are able to write a test case that reliably reproduces the problem, then I can reproduce the bug as well and should be able to fix it.

tlocke commented 9 months ago

I think that this was fixed in commit https://github.com/tlocke/pg8000/commit/a84bf8c7b68abd0966b12d24bf481c0018e0cd87 but please do re-open if the problem still occurs.