readysettech / readyset

Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, ReadySet caches the results of cached select statements and incrementally updates these results over time as the underlying data changes.
https://readyset.io
Other
4.54k stars 125 forks source link

Wrong timestamp returned by JDBC PreparedStatement for PostgreSQL v15 #1365

Closed ltbgogo closed 2 months ago

ltbgogo commented 2 months ago

Describe the problem I got a wrong value which is "2000-01-01 00:00:00.0" after querying timestamp field for five times. 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2000-01-01 00:00:00.0 2000-01-01 00:00:00.0 2000-01-01 00:00:00.0 2000-01-01 00:00:00.0 2000-01-01 00:00:00.0

To Reproduce

public static void main(String[] args) throws SQLException {
    try (Connection conn = DriverManager.getConnection(url, username, password)) {
        String sql = "select ctime from t_user limit 1";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (int i = 0; i < 10; i++) {
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getTimestamp("ctime"));
                    }
                }
            }
        }
    }
}

Expected behavior 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601 2024-08-30 10:40:38.622601

Additional data / screenshots

create table if not exists t_user
(
    id   varchar(32),
    name varchar(255),
    ctime timestamp default CURRENT_TIMESTAMP
);
insert into t_user(id, name) values('11', 'tom');
insert into t_user(id, name) values('22', 'bobby');
insert into t_user(id, name) values('33', 'ana');

Environment

jasobrown-rs commented 2 months ago

ACK, thanks for reporting. I'll take a look today.

jasobrown-rs commented 2 months ago

Ok, I figured out what was happening, and have a patch internally for review.

A long time ago, Postgres internally represented timestamp and interval types as doubles. Around version 8.4 (July 2009), this was changed to an integer for various precision correctness issues.

As a result, Postgres would, and still to this day, return a ParameterStatus to clients on a new connection to declare if the server is treating timestamp and interval types as doubles. This status is named integer_datetimes and, since Postgres v10, always returns on (meaning it's always returning 8-byte integers).

This distinction primarily matters then the server is asked to return values in the binary format; text format just returns a string representation of the value.

Some Postgres drivers still expect to see that integer_datetimes server parameter, and if they do not see it, will assume the server will represent timestamp and interval types as doubles. This is what happens with the PGJDBC driver. When the client asks for a timestamp column, to be returned in binary format - which is what pgjdbc does for named prepared statements - the connection will still check it's state to see if the timestamp's 8-bytes should be interpreted as a double or integer value (a long).

Readyset is not currently sending that integer_datetimes server parameter back to clients, so pgjdbc assumes doubles, and fails to interpret the datetime correctly.

So, that explains why you saw the 2000-01-01 00:00:00.0 value at all. But why did it only happen after the fifth invocation? It's because the driver has a config parameter preparedThreshold. When used with the defaults, as your example did, it converts the PreparedStatement java object into a "server-side prepared statement" (also called a named prepared statement). Additionally, it will switch to requesting that all returned columns be transferred in the binary format (versus the default text format). It's that binary format that got tripped up against integer_datetimes not being set.

Until the fix is landed and released in a stable docker container, you can add ?binaryTranser=false to the end of your connection URL and it should work (it did for me). Note that binaryTransfer is another pgjdbc config param (in the same docs).