fieldenms / tg

Trident Genesis
MIT License
14 stars 7 forks source link

PostgreSQL: data type of Date parameters can't always be resolved #2235

Closed homedirectory closed 1 month ago

homedirectory commented 5 months ago

Description

PostgreSQL driver for JDBC treats java.util.Date (simply Date henceforth) parameters as being of "unspecified" SQL type because of a timestamp vs timestampz conundrum. This behaviour is intended to prevent unexpected rotation by the server's timezone which happens when a Date is used where a timestampz is expected.

By default, properties of temporal types are mapped to an SQL timestamp type without time zone. For PostgreSQL this would be timestamp without time zone. Therefore, Date instances should always be resolved as timestamp without time zone (simply timestamp henceforth).

There are 2 solutions to this problem:

  1. Modify EQL-to-SQL transformation so that values of type Date are always cast as timestamp.
  2. Modify Hibernate query parameterisation logic so that values of type Date are forced to be resolved as timestamp.

    This approach relies on knowing the internals of PostgreSQL JDBC driver's implementation of java.sql.PreparedStatement, the analysis of which uncovered a solution for the problem: if method setObject is used with java.time.LocalDateTime as the parameter value, then its type will be successfully resolved as timestamp.

    Unfortunately, the API, PreparedStatement does not expose a method to set LocalDateTime, thus this solution is implementation-dependent.

    Moreover, this approach requires introducing an additional PostgreSQL-specific Hibernate type mapping for Date.

The first approach is preferred due to being significantly simpler.

Example

Consider the following SQL query:

SELECT CASE WHEN :mydate IS NULL THEN 50 ELSE 200 END;

If parameter mydate is bound to an instance of Date, the query fails with:

Error: "could not determine data type of parameter $1"

However, this works:

SELECT CASE WHEN CAST (:mydate AS timestamp) IS NULL THEN 50 ELSE 200 END;

References:

Expected outcome

Parameter values of type Date are always resolved as timestamp by PostgreSQL.

homedirectory commented 5 months ago

A challenge came up while following approach 1: Date values cannot be always cast as timestamp as that would not respect a special class of temporal properties -- UTC datetime properties.

Yielding literal values into UTC datetime properties requires timezone information to be present in a literal to interpret the fetched value correctly. That's because during parameter binding literal values are formatted using the local timezone. Therefore, discarding the timezone by casting will lead to local time being interpreted as UTC time.

Consider the following:

// Date uses system's default timezone; assume 2024-01-01 12:00:00+03
Date myDate = new Date();

// expected that dateUtc will be semantically equal to myDate (2024-01-01 09:00:00+00)
yield().val(myDate).as("dateUtc")

// in reality, dateUtc becomes 2024-01-01 12:00:00+00, 3 hours later

What happens during fetching is this:

SELECT CAST ('2024-01-01 12:00:00+03' AS timestamp) AS dateUtc
-- equivalent to
SELECT CAST '2024-01-01 12:00:00' AS dateUtc

As a result of discarding timezone information, '2024-01-01 12:00:00' is interpreted as UTC time.

Were there no casting, '2024-01-01 12:00:00+03' would be interpreted correctly by taking the +03 offset into account.

Curiously enough, this problem also manifests itself with SQL Server even though no explicit casting is performed.

The reverse approach, always casting as timestamp with time zone, can't be adopted either, as that would change the semantics of comparison with table columns of type timestamp -- they will be always converted to timestamp with time zone, leading to different results.

Alternative approach

Tackling the described problem in full generality is a complex task as it requires us to distinguish local and UTC datetime values. This could be accomplished by controlling the stage of binding query parameters (through Hibernate). However, this would rely on EQL's ability of identifying UTC datetime values, which would require additional analysis of the surrounding context (e.g., when a value is yielded as a property of UTC datetime type).

Nevertheless, this problem might not be worth solving at the moment because:

homedirectory commented 5 months ago

Registering a custom Hibernate type mapping to enforce SQL TIMESTAMP type for Date

Through experimentation it has been discovered that it's possible to control how Date values are bound to query parameters through Hibernate, effectively enforcing their SQL type to always be TIMESTAMP.

This can be achieved by registering a custom type mapping with Hibernate, which would then be used when binding query parameters instead of the standard one for Date.

Internally this type mapping would need to transform the supplied Date into a java.time.LocalDateTime and bind the latter. This is required because the PostgreSQL JDBC driver does not fully conform to the JDBC specification.

It would be nice if this type mapping could be registered only when PostgreSQL is used. Although this is not a strict requirement, as this type mapping shouldn't have any effect for SQL Server (and it has been observed not to).

What does the specification say?

JPA (aka Jakarta Persistence) 2.2 introduced support for java.time.{LocalDate,LocalTime,LocalDateTime,OffsetTime,OffsetDateTime}.

SR-338 Mainenance Release, 7/17/17, Footnote 102 on page 431 says:

Mapping of java.time.LocalDate, java.time.LocalTime, java.time.LocalDateTime, java.time.OffsetTime, and java.time.OffsetDateTime types to columns other than those supported by the mappings defined by Appendix B of the JDBC 4.2 specification is not required to be supported by the persistence provider beyond the support required for other serializable types.

It is not understood whether this applies only to table columns or to query parameters too. Let's assume it applies to both (1).

JDBC 4.2, Appendix B, Section B.4, Table B-4

PreparedStatement.setObject, PreparedStatement.setNull, RowSet.setNull and RowSet.setObject use the mapping shown TABLE B-4 when no parameter specifying a target JDBC type is provided.

This section describes query parameters exclusively, which strengthens assumption (1).

The said table contains the following entry:

java.time.LocalDateTime | TIMESTAMP

Which means that PreparedStatement.setObject(index, dateTime) where dateTime : LocalDateTime can be used to ensure that the bound parameter will be of SQL type TIMESTAMP.

Although, the said table also contains java.util.Date | TIMESTAMP, and the PostgreSQL JDBC driver violates it and uses SQL type UNSPECIFIED instead. So in using LocalDateTime we would be relying on the assumption that PostgreSQL JDBC driver won't suddenly start violating the java.time.LocalDateTime | TIMESTAMP rule...