cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

Error managing failed copy #119113

Open dvarrazzo opened 9 months ago

dvarrazzo commented 9 months ago

In CRDB versions more recent than 21.2 (after the fix for #81559), PQputCopyEnd() works well enough, but the resulting error message appears to be malformed. The error raised after a failed copy appears as:

COPY from stdin failed: error from Python: ZeroDivisionError - division by zero
message contents do not agree with length in message type "E"

To Reproduce

Using psycopg 3.1.x. With PostgreSQL:

>>> import psycopg.crdb
>>> conn = psycopg.crdb.connect("dbname=psycopg3_test")
>>> cur = conn.cursor()
>>> cur.execute("create table whatever (id integer primary key)")
>>> with cur.copy("copy whatever from stdin") as copy:
...     1 / 0
ZeroDivisionError: division by zero

With CRDB 23.1:

# docker run -p 26257:26257 --name crdb --rm cockroachdb/cockroach:latest-v23.1 start-single-node --insecure

>>> import psycopg.crdb
>>> conn = psycopg.crdb.connect("host=localhost port=26257 user=root dbname=defaultdb")
>>> cur = conn.cursor()
>>> cur.execute("create table whatever (id integer primary key)")
>>> with cur.copy("copy whatever from stdin") as copy:
...     1 / 0
psycopg.DatabaseError: COPY from stdin failed: error from Python: ZeroDivisionError - division by zero
message contents do not agree with length in message type "E"

Note: in this code, psycopg recognises the error that Python raised on the other side of libpq. In PostgreSQL, after a PQputCopyEnd(), the server returns a query_canceled error. I understand that, after the fix for #81559, CRDB tries to do the same. However, because the length declared in the error message is wrong, libpq shadows this error with a client-side error, which has no SQLSTATE. As a result, the Python code fails to re-raise the original Python exception and falls back to raising the database error instead.
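
To make the fallback described above concrete, here is a minimal sketch of the decision a client could make once the COPY has failed. It is not psycopg's actual implementation: `DatabaseError`, `choose_exception`, and `QUERY_CANCELED` are hypothetical names used only for illustration. The only facts taken from this issue are that the server-side error carries SQLSTATE 57014 (query_canceled) and that the client-side libpq error has no SQLSTATE.

```python
from typing import Optional

# SQLSTATE reported by the server for a COPY aborted by the client (query_canceled).
QUERY_CANCELED = "57014"


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver's database error (not psycopg's class)."""

    def __init__(self, message: str, sqlstate: Optional[str] = None):
        super().__init__(message)
        self.sqlstate = sqlstate


def choose_exception(original: BaseException, db_error: DatabaseError) -> BaseException:
    """Pick which exception the caller should see after a failed COPY.

    If the server confirmed the cancellation with query_canceled, the
    original Python exception is the interesting one. Any other outcome,
    including an error without a SQLSTATE, surfaces the database error.
    """
    if db_error.sqlstate == QUERY_CANCELED:
        return original
    return db_error


original = ZeroDivisionError("division by zero")

# Server-side error with a SQLSTATE: the Python exception wins.
server_side = DatabaseError("COPY from stdin failed", sqlstate="57014")
print(type(choose_exception(original, server_side)).__name__)

# Client-side libpq error without a SQLSTATE: the database error wins.
client_side = DatabaseError("message contents do not agree with length")
print(type(choose_exception(original, client_side)).__name__)
```

The second case is the one reported here: without a SQLSTATE there is nothing to match against, so the original exception cannot be restored.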

Environment:

Jira issue: CRDB-36036

rafiss commented 8 months ago

I captured the pgwire messages from both CRDB and Postgres.

PostgreSQL (server version v14.8). I see Len: 166 in the TCP header, and Length: 165 in the pgwire message.

Frame 31: 242 bytes on wire (1936 bits), 242 bytes captured (1936 bits) on interface lo0, id 0
Null/Loopback
Internet Protocol Version 6, Src: ::1, Dst: ::1
Transmission Control Protocol, Src Port: 5432, Dst Port: 64666, Seq: 470, Ack: 254, Len: 166
PostgreSQL
    Type: Error
    Length: 165
    Severity: ERROR
    Text: ERROR
    Code: 57014
    Message: COPY from stdin failed: error from Python: ZeroDivisionError - division by zero
    Context: COPY whatever, line 1
    File: copyfromparse.c
    Line: 317
    Routine: CopyGetData
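
As a sanity check on the numbers above, the following sketch rebuilds an ErrorResponse from the fields shown in the PostgreSQL capture and computes its length. It assumes the standard pgwire layout (a type byte, a 4-byte big-endian length that counts itself, then one code byte plus a NUL-terminated string per field, then a final NUL), and it assumes the dissector's labels map to the field codes S, V, C, M, W, F, L, R. `encode_error_response` is a hypothetical helper, not part of any library.

```python
import struct


def encode_error_response(fields):
    """Encode (code, value) pairs as a pgwire ErrorResponse message."""
    body = b"".join(
        code.encode("ascii") + value.encode("utf-8") + b"\x00"
        for code, value in fields
    )
    body += b"\x00"  # terminator after the last field
    length = 4 + len(body)  # the length field counts itself, not the type byte
    return b"E" + struct.pack("!I", length) + body


# Field values transcribed from the PostgreSQL capture above.
PG_FIELDS = [
    ("S", "ERROR"),
    ("V", "ERROR"),
    ("C", "57014"),
    ("M", "COPY from stdin failed: error from Python: "
          "ZeroDivisionError - division by zero"),
    ("W", "COPY whatever, line 1"),
    ("F", "copyfromparse.c"),
    ("L", "317"),
    ("R", "CopyGetData"),
]

pg_msg = encode_error_response(PG_FIELDS)
(pg_declared,) = struct.unpack("!I", pg_msg[1:5])
print(pg_declared, len(pg_msg))
```

Under these assumptions the sketch reproduces Length: 165 and a 166-byte payload, which agrees with the capture and suggests the layout assumption is right for the PostgreSQL side.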

CRDB. I see Len: 133 in the TCP header, and Length: 132 in the pgwire message.

Frame 111: 189 bytes on wire (1512 bits), 189 bytes captured (1512 bits) on interface lo0, id 0
Null/Loopback
Internet Protocol Version 4, Src: 127.0.0.1, Dst: 127.0.0.1
Transmission Control Protocol, Src Port: 26257, Dst Port: 65480, Seq: 499, Ack: 255, Len: 133
PostgreSQL
    Type: Error
    Length: 132
    Severity: ERROR
    Text: ERROR
    Code: 57014
    File: copy_from.go
    Line: 607
    Routine: run
    Message: COPY from stdin failed: error from Python: ZeroDivisionError - division by zero
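
To illustrate the kind of check that produces "message contents do not agree with length", here is a rough sketch that parses an ErrorResponse field by field and compares where parsing stops with where the declared length says the message ends. It is only an approximation of what a client such as libpq does; `build` and `check_error_response` are hypothetical helpers. The stray trailing byte in the second message is purely an assumption made to force a mismatch: the capture does not show what the real bytes on the wire are.

```python
import struct


def build(fields, pad=b""):
    """Encode fields as an ErrorResponse; `pad` appends extra bytes to the body."""
    body = b"".join(
        code.encode("ascii") + value.encode("utf-8") + b"\x00"
        for code, value in fields
    )
    body += b"\x00" + pad  # field terminator, then any (hypothetical) padding
    return b"E" + struct.pack("!I", 4 + len(body)) + body


def check_error_response(msg):
    """Return (fields, declared_end, parsed_end) for an ErrorResponse."""
    if msg[:1] != b"E":
        raise ValueError("not an ErrorResponse")
    (length,) = struct.unpack("!I", msg[1:5])
    declared_end = 1 + length  # type byte plus declared length
    pos, fields = 5, {}
    while msg[pos] != 0:  # a zero code byte ends the field list
        end = msg.index(b"\x00", pos + 1)
        fields[chr(msg[pos])] = msg[pos + 1:end].decode("utf-8")
        pos = end + 1
    return fields, declared_end, pos + 1  # +1 consumes the terminator


# Field values transcribed from the CRDB capture above.
CRDB_FIELDS = [
    ("S", "ERROR"), ("V", "ERROR"), ("C", "57014"),
    ("F", "copy_from.go"), ("L", "607"), ("R", "run"),
    ("M", "COPY from stdin failed: error from Python: "
          "ZeroDivisionError - division by zero"),
]

_, ok_declared, ok_parsed = check_error_response(build(CRDB_FIELDS))
_, bad_declared, bad_parsed = check_error_response(build(CRDB_FIELDS, pad=b"\x00"))
print(ok_declared == ok_parsed, bad_declared == bad_parsed)
```

For the fields as listed, a well-formed message would declare a length of 131, one less than the 132 in the capture. That is only a hint, since the dissector output may not show every byte, but it is consistent with the client finding more bytes than the fields account for.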