cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.88k stars 3.77k forks source link

logging: improve on PGSQL protocol level messaging #99135

Open fabiog1901 opened 1 year ago

fabiog1901 commented 1 year ago

I successfully pulled the logs by calling http://localhost:8080/debug/logspy?count=10000&duration=10s&grep=.&flatten=1&vmodule=*=2 .

The stmt is executed by this code; it uses psycopg 3.

import psycopg
import time

dburl = "postgres://root@crlMBP-C02D64WKMD6TMjg5.local:26257/defaultdb?sslmode=disable"

with psycopg.connect(dburl, autocommit=True) as conn:
    time.sleep(2)
    with conn.cursor() as cur0:
        cur0.execute("""delete from t where id = any (%s, %s)""", (155, 233), prepare=True)
        time.sleep(2)

I purposely sleep between conn creation, stmt exec and termination to identify them later in the log, nicely spaced 2s apart. Note: it's a prepared stmt; it's auto-committed.

In wireshark, the message flow is very clear once I filter for PGSQL (omitted the conn creation messages):

No. Time     Source    Destination       SrcPort Protocol Len  Info
131 7.416057 192.168.0.115 192.168.0.115 51145   PGSQL    121  >P/S
133 7.417079 192.168.0.115 192.168.0.115 26257   PGSQL    67   <1/Z
135 7.417286 192.168.0.115 192.168.0.115 51145   PGSQL    115  >B/D/E/S
137 7.418998 192.168.0.115 192.168.0.115 26257   PGSQL    86   <2/n/C/Z
151 9.422345 192.168.0.115 192.168.0.115 51145   PGSQL    61   >X

In the logs, I see when the connection is created

I230321 14:57:34.399684 115192 sql/pgwire/conn.go:164 ⋮ [n1,client=192.168.0.115:51023] 20575 new connection with options: {User:‹root› IsSuperuser:true SessionDefaults:map[‹database›:‹defaultdb›] CustomOptionSessionDefaults:map[] RemoteAddr:‹192.168.0.115:51023› ConnResultsBufferSize:16384 SessionRevivalToken:[] JWTAuthEnabled:false}

I see the inbound Parse/Sync

I230321 14:57:36.401754 115192 sql/pgwire/conn.go:440 ⋮ [n1,client=192.168.0.115:51023,user=root] 20660 pgwire: processing ‹ClientMsgParse›
I230321 14:57:36.402030 115192 sql/pgwire/conn.go:440 ⋮ [n1,client=192.168.0.115:51023,user=root] 20661 pgwire: processing ‹ClientMsgSync›

But don't see the outbound 1/Z that is, the ParseCompletion and ReadyForQuery.

Similarly, I see the B/D/E/S

I230321 14:57:36.403292 115192 sql/pgwire/conn.go:440 ⋮ [n1,client=192.168.0.115:51023,user=root] 20687 pgwire: processing ‹ClientMsgBind›
I230321 14:57:36.403352 115192 sql/pgwire/conn.go:440 ⋮ [n1,client=192.168.0.115:51023,user=root] 20688 pgwire: processing ‹ClientMsgDescribe›
I230321 14:57:36.403372 115192 sql/pgwire/conn.go:440 ⋮ [n1,client=192.168.0.115:51023,user=root] 20689 pgwire: processing ‹ClientMsgExecute›
I230321 14:57:36.403388 115192 sql/pgwire/conn.go:440 ⋮ [n1,client=192.168.0.115:51023,user=root] 20690 pgwire: processing ‹ClientMsgSync›

but don't see the outbound 2/n/C/Z i.e. BindCompletion, No-data, CommandCompletion, ReadyForQuery.

Upon inspecting the code - within my limited understanding of Go - I see that we print out incoming messages https://github.com/cockroachdb/cockroach/blob/v22.2.2/pkg/sql/pgwire/conn.go#L440 (i'm using v22.2.2 locally..) but for example don't see the same line of code for the ReadyForQuery or CommandComplete .

slack

Jira issue: CRDB-25718

knz commented 1 year ago

cc @rimadeodhar @rafiss for triage and integration in planning convos

knz commented 1 year ago

(there's a customer context)