Open programmerq opened 1 year ago
I just tried dbeaver + Redshift (non-serverless) and I can reproduce the `Caused by: java.io.IOException: Unexpected packet type: 0` error.
There is a discrepancy between the JDBC driver and the database agent when parsing `RowDescription`.
I am able to "bypass" this error by setting the driver property `client_protocol_version` to `0`, after which things start working. (Note that this is a "hidden" property, so it has to be added under "User Properties"; see reference.)
I will try datagrip next to see if it has the same problem, and will look deeper into `client_protocol_version`.
--- update on datagrip
Setting `client_protocol_version` to `0` also bypasses the connection errors.
I had originally set out to find out if there was a JDBC Driver option that could be adjusted to make the Redshift driver work. I have seen folks bring up how their DBAs would very much rather use the redshift driver when working with redshift via Teleport.
During the course of this, I ran into a fairly low-level bug. The Teleport proxy seems to be sending some PostgreSQL protocol messages in separate TCP packets compared to connecting directly to Redshift. This is triggering an error in the Redshift JDBC driver during connection initialization.
Steps to Reproduce
1. Launch the Teleport proxy listening on port 53189 with `tsh proxy db --db-user teleport-redshift-serverless-access --port 53189 --tunnel work2`.
2. Confirm that the tunnel is configured correctly by using a non-redshift client such as the postgres jdbc driver or the `psql` command line client.
3. Connect to the proxy on 53189 from JDBC using: `clusterid=work2,dbuser=teleport-redshift-serverless-access`, `loglevel=TRACE`, and `logpath=/path/to/some/output/directory/`
I had similar results with both datagrip and dbeaver. I did use a serverless redshift, and I followed this guide to set it up.
Test
Root Cause Analysis
JDBC Driver logs
The jdbc TRACE level logs showed a large amount of data, including stacktraces. The driver is available on github too: https://github.com/aws/amazon-redshift-jdbc-driver
I was able to enable logging in the drivers. When connecting from datagrip, the log showed that the driver raised an I/O error while running the data through its UTF-8 decoder:
Illegal UTF-8 sequence: initial byte is 10xxxxxx: 133
jdbc-utf8.txt
Packet Capture
I ran wireshark to capture the packets and look for the illegal byte `133`, or `0x85`. The hex decoded TCP payload had the following line:
000002BA 44 00 00 00 85 00 01 00 00 00 7b 50 6f 73 74 67 D....... ..{Postg
This was part of the startup message, and the `85` was the integer indicating the message length. It seemed like something about Teleport's postgres server implementation didn't sit well with the redshift driver.
jdbc-utf8.pcapng.txt (remove the .txt extension to use in wireshark). `tcp.sequence eq 0` is the relevant connection. Use `-d tcp.port==53189,pgsql` to get the pgsql to populate.
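The bytes in the hex dump line above can be decoded by hand. In the PostgreSQL wire format, a backend message starts with a one-byte type code followed by a 4-byte big-endian length that counts itself. The sketch below is only an illustration (the helper names `parse_header`, `parse_first_column`, and `is_utf8_continuation` are made up here), and it assumes the `D` type code means a DataRow message, which is what the PostgreSQL protocol documentation assigns to it. It also shows why a decoder that lands on `0x85` as the first byte of a string would complain about `10xxxxxx`.

```python
import struct

# Bytes copied from the hex dump line at offset 0x2BA.
payload = bytes.fromhex("44 00 00 00 85 00 01 00 00 00 7b 50 6f 73 74 67")


def parse_header(buf):
    """Return (type_char, length) for a message that starts at buf[0]."""
    msg_type, length = struct.unpack("!cI", buf[:5])
    return msg_type.decode("ascii"), length


def parse_first_column(buf):
    """Assuming a DataRow: return (column_count, first_column_length)."""
    return struct.unpack("!HI", buf[5:11])


def is_utf8_continuation(byte):
    """True if the byte has the bit pattern 10xxxxxx."""
    return (byte & 0b1100_0000) == 0b1000_0000


print(parse_header(payload))        # ('D', 133)
print(parse_first_column(payload))  # (1, 123)
print(is_utf8_continuation(0x85))   # True
```

Under the DataRow assumption the numbers are self-consistent: 4 (length field) + 2 (column count) + 4 (column length) + 123 (column bytes) = 133. That suggests the bytes on the wire are well formed and the length byte is being read at the wrong position by the client.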
I compared a packet capture to the same Teleport database, and saw that the postgres jdbc and psql connections to this authenticated proxy were nearly identical. The only difference is that immediately after the startup message is sent, the jdbc client would disconnect and display the error to the user.
Connecting to Redshift Directly
I decided to compare this to the startup message when the redshift driver is directly connecting to redshift itself. I used https://github.com/neykov/extract-tls-secrets to write the tls keylog file so I could decrypt the packets in wireshark/tshark, and it worked.
It is essentially identical, but it does have a subtle difference. Looking at the entire startup message, it looks like the startup message sent by Teleport is split into multiple packets. Here is the startup message when connecting directly to redshift:
redshift-direct-decrypted.pcapng.txt. Use `-d tcp.port==5439,pgsql` to get the pgsql to populate. The TLS keylog has been incorporated into the capture already.
Here is the same startup message sent via Teleport:
This different packet segmentation seems to trigger a bug in the Redshift JDBC driver's parsing of the startup message.
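For context on why segmentation should not matter: TCP delivers a byte stream, so a single read may return fewer bytes than requested, and a reader has to loop until it holds the number of bytes the length field announces. The sketch below is not the Redshift driver's code. `SegmentedStream`, `read_exact`, `read_message`, and `build_message` are hypothetical names, and the stream class is a stand-in for a socket that hands back at most one segment per read.

```python
import struct


class SegmentedStream:
    """Hypothetical socket stand-in: read() returns at most one segment."""

    def __init__(self, segments):
        # Drop empty segments so b"" always means end of stream.
        self._segments = [bytes(s) for s in segments if s]

    def read(self, n):
        if not self._segments:
            return b""
        head = self._segments[0]
        chunk, rest = head[:n], head[n:]
        if rest:
            self._segments[0] = rest
        else:
            self._segments.pop(0)
        return chunk


def read_exact(stream, n):
    """Keep reading until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream ended before the message was complete")
        buf += chunk
    return bytes(buf)


def read_message(stream):
    """Read one type-prefixed message: 1-byte type, 4-byte length, body."""
    msg_type = read_exact(stream, 1)
    (length,) = struct.unpack("!I", read_exact(stream, 4))
    body = read_exact(stream, length - 4)  # the length field counts itself
    return msg_type, body


def build_message(msg_type, body):
    """Frame a body the same way, for the demonstration only."""
    return msg_type + struct.pack("!I", len(body) + 4) + body


wire = build_message(b"D", b"hello")
whole = read_message(SegmentedStream([wire]))
split = read_message(SegmentedStream([wire[:2], wire[2:6], wire[6:]]))
print(whole == split)  # True
```

A reader written this way returns the same message whether it arrives in one segment or three. A single `read(5)` against the split stream, by contrast, returns only the first two bytes, which is the kind of short read that could leave a parser misaligned.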
Slight variant of the error.
I tried to confirm whether the same error was happening with both dbeaver and datagrip. They have slightly different configuration dialogs, and they both had different patch versions of the driver. The driver in DBeaver didn't throw the UTF-8 error, but it did fail at the same point in the packet exchange.
`Caused by: java.io.IOException: Unexpected packet type: 0` is the error message in the log. Here's the accompanying log and packet capture for this test (via the authenticated proxy):
dbeaver.pcapng.txt dbeaver.jdbc.txt
Workarounds
The current recommendation in our documentation is to use the postgres jdbc driver, but that leaves a lot to be desired for end users. Things like external schemas do not populate, and it is not clear to end-users why.
Request
If Teleport could activate a redshift "dialect" that would help it fall more in line with the redshift implementation that the JDBC Redshift driver is tested against, that would be helpful. I don't know if the postgres protocol itself actually requires that messages not be broken up into separate packets, or if it's simply a convention that has always been around.
Versions
Other debug logs are scattered throughout the ticket, but here are the teleport logs for tsh and the agent:
When clicking "test connection", `tsh -d proxy ...` shows:
database agent logs show: