jakartaee / persistence

https://jakartaee.github.io/persistence/
Other
197 stars 58 forks source link

Support "pure" UTC data flow for date/time values #156

Open lukasj opened 7 years ago

lukasj commented 7 years ago

As per note at https://github.com/javaee/jpa-spec/issues/63#issuecomment-316994047 "Feel free to open new issues for missing parts", adding a new issue specifically for the comments I made at #63.

lukasj commented 6 years ago
lukasj commented 7 years ago

@rwalkerands Commented Repeating/combining the relevant messages:

Please consider including support for a "pure UTC" data flow at least in the following specific sense:

I don't see how to achieve this with JPA as is.

Until yesterday I have been using an AttributeConverter like this, based on code that can be found on several blogs:

@Converter(autoApply = true)
public class LocalDateTimeConverter implements
    AttributeConverter<LocalDateTime, Timestamp> {

    @Override
    public Timestamp convertToDatabaseColumn(
            final LocalDateTime attribute) {
        if (attribute == null) {
            return null;
        }
        return Timestamp.valueOf(attribute);
    }

    @Override
    public LocalDateTime convertToEntityAttribute(
            final Timestamp dbData) {
        if (dbData == null) {
            return null;
        }
        return dbData.toLocalDateTime();
    }
}

I now realise that although this seems to work, it is wrong, because both Timestamp.valueOf() and Timestamp.toLocalDateTime() interpret the value as though it were in the default time zone, so the code breaks during daylight saving changeover when there's an hour of time that doesn't "exist" in the default time zone.

For example, consider the test value 2016-10-02 02:02:01, which is a perfectly valid time when interpreted as UTC, but which does not exist in my local time zone (Australia/Sydney). If I (with default time zone Australia/Sydney) use the above converter to persist the LocalDateTime value LocalDateTime.of(2016, 10, 2, 2, 1, 0), the value is sent to the database, and stored, as '2016-10-02 03:01:00.0', and when it is read back into a LocalDateTime, its value is now one hour out! (I've done exactly this test to confirm.)

My converter is now like this:

@Converter(autoApply = true)
public class LocalDateTimeConverter implements
    AttributeConverter<LocalDateTime, Timestamp> {

    @Override
    public Timestamp convertToDatabaseColumn(
            final LocalDateTime attribute) {
        if (attribute == null) {
            return null;
        }
        return Timestamp.from(attribute.toInstant(ZoneOffset.UTC));
    }

    @Override
    public LocalDateTime convertToEntityAttribute(
            final Timestamp dbData) {
        if (dbData == null) {
            return null;
        }
        return LocalDateTime.ofInstant(dbData.toInstant(), ZoneOffset.UTC);
    }
}

but this doesn't work as is; it requires the recently-added Hibernate-specific configuration (explained at http://in.relation.to/2016/09/12/jdbc-time-zone-configuration-property/) to specify using UTC when sending Timestamp values through JDBC:

hibernate.jdbc.time_zone=UTC

So, although LocalDateTime values are "in theory" separate from concerns about time zones, it seems that when it comes to persisting them as Timestamp values, the values must be interpreted as being in some time zone.

So, please consider providing some way of specifying that LocalDateTime values are persisted in a "transparent" way (avoiding all internal time zone conversions along the way), or, if that is not possible, specifying the time zone in which LocalDateTime values are to be interpreted.

For anyone reading this who wants to use my code, note that for MySQL you'll need to add these parameters to the JDBC URL:

&serverTimezone=UTC&useLegacyDatetimeCode=false

otherwise you "fall at the last hurdle" (i.e., the JDBC driver ruins everything when it converts Timestamp values to strings).

Other databases may require something similar.

It seems that the Hibernate-specific property hibernate.jdbc.time_zone=UTC does actually solve a problem; it is not "syntactic sugar" for something you can do another way in "pure" JPA.

So maybe it boils down to: please add a JPA setting that achieves what hibernate.jdbc.time_zone=UTC does. And maybe consider making the application of the setting scope-able (e.g., to individual fields).

lukasj commented 6 years ago

@ghost Commented I like the idea very much. Without the hibernate.jdbc.time_zone property I cannot think of any way to get this "pure UTC flow" other than setting the JVM time zone.

In particular java.time.Instant should be supported, it's basically the new java.util.Date.

One thing I disagree with though is that LocalDateTime should be treated as UTC. Actually, this is in contradiction to the purpose of this class. From the JavaDoc:

This class does not store or represent a time-zone. Instead, it is a description of the date, as used for birthdays, combined with the local time as seen on a wall clock. It cannot represent an instant on the time-line without additional information such as an offset or time-zone.

Such additional information is contained in OffsetDateTime or ZonedDateTime.

If you want a point in time without a specific offset/timezone, you should use Instant .

lukasj commented 6 years ago

@rwalkerands Commented Please see https://github.com/javaee/jpa-spec/issues/63#issuecomment-299389738, where, in response to a similar comment, I presented the code examples using ZonedDateTime.

The point remains exactly the same.

lukasj commented 6 years ago

@ghost Commented I think there are two things being mixed here.

One is persisting and retrieving instances of java.sql.Timestamp by JDBC. This is where the Hibernate property kicks in (and what the new JPA feature should adopt). So if you have a Timestamp instance representing 14:00:00 in UTC (leaving out dates for brevity), JDBC sends 14:00:00 to the database (if hibernate.jdbc.time_zone = UTC is set), rather than 15:00:00 (assuming your JVM timezone is UTC+1). Something similar should happen on retrieving the value. So there is the "UTC flow".

The other is the conversion of an unsupported type like java.time.ZonedDateTime from and to java.sql.Timestamp. This is not an automatism of JPA (if it was, that type wouldn't be unsupported). When reading ZonedDateTime from a Timestamp, it is not up to JPA to guess the time zone - it wasn't persisted in the database. If one only wants to use UTC in ZonedDateTime instances anyway (e.g. with your code return ZonedDateTime.ofInstant(dbData.toInstant(), ZoneOffset.UTC);), I recommend using java.time.Instant instead.

About LocalDateTime: This type is supported with JPA 2.2, so no AttributeConverter is necessary there. I consider it somewhat unsuitable that LocalDateTime types are going to be mapped to timestamp values, and it's not clear to me how this is going to be done, but this is a separate issue.

lukasj commented 6 years ago

@westonpace Commented I agree that Instant or ZonedDateTime is preferable to LocalDateTime. LocalDateTime is fine for storing into the database but not for pulling from the database. The problem is, in order to do anything with LocalDateTime, I have to specify a timezone, which defeats the purpose of the workflow as I understand it.

For example, consider this use case. I want to pull the next row from the "events" table and print how many seconds are between the current time and the event's "start" column (which is stored as a TIMESTAMP without TZ info, implicitly assumed to be UTC).

If the ORM gives me a LocalDateTime instance then I can't compute the number of seconds between that and the current time without populating a timezone so I would have to specify UTC somewhere in my code which defeats the purpose.

On the other hand, if the ORM gives me an Instant instance then I can compute the difference very easily.

This is the entire reason that LocalDateTime is currently used for "TIMESTAMP WITHOUT TZ" is that it forces the user to specify the timezone (instead of the ORM guessing the timezone). If we are doing a "pure UTC" workflow then the user should not have to specify the timezone and the ORM should be the one guessing.

Consider grabbing and storing the current time. With Instant I simply say Instant.now() and store that. The ORM calculates the proper "TIMESTAMP WITHOUT TZ" value for UTC and stores that.

When using LocalDateTime the ORM could guess the instance was created using LocalDateTime.now() and so is in the local time but if it was created using some other mechanism and thus not in the local time zone then the ORM would store incorrectly.

lukasj commented 6 years ago

@westonpace Commented It's possible I misunderstand the goal here too. My thought would be the goals are:

1) There is a column that is "TIMESTAMP WITHOUT TZ" because that best matches my workflow. 2) The ORM should ensure that all values stored in this column are stored in UTC and should take care of returning a TZ-aware object with the TZ set to UTC so that I can't make a mistake later when using it.

lukasj commented 6 years ago

@ghost Commented edited

Yes there are basically two goals:

I'd prefere there to be different properties for these two goals.