yesodweb / persistent

Persistence interface for Haskell allowing multiple storage methods.
MIT License
467 stars 296 forks source link

ZonedTime / ZT restore type #577

Open ghost opened 8 years ago

ghost commented 8 years ago

Any chance we can have the ZonedTime/ZT type restored to Persistent's supported types?

mlitchard commented 8 years ago

I'll take a stab at this if we want to put it back in.

erikd commented 8 years ago

I certainly missed it when it was removed. I for one would be happy if correct support for this was restored. My memory is that the original one was fundamentally incorrect in some way.

mlitchard commented 8 years ago

Knowing why it was removed would be helpful. Calling Dr. Snoyman.

snoyberg commented 8 years ago

This has been discussed at length multiple times in the past. Most recent discussion (which links to previous discussions) is: https://github.com/yesodweb/persistent/issues/290

erikd commented 8 years ago

Ah yes, that seems familiar!

ghost commented 8 years ago

Persistent supported the ZonedTime type, which is supported by postgresql-simple. The reason stated for its removal from persistent was the false expectation that Postgres would record the timezone along with the timestamp. The Postgres timezone with time stamp does not record a timezone, but it does something very important, it converts correctly between timezones. One could visualize this as an underlying canonical timestamp type with a function interface that converts the underlying canonical timestamp to a timestamp in a timezone of your choosing, and vice versa. The conversion function in Postgres accounts for all of the nasty details of daylight savings times, historical changes in daylight savings times, etc.

ghost commented 8 years ago

And mysql has the same functionality, so it's not just postgres:

https://dev.mysql.com/doc/refman/5.5/en/time-zone-support.html

ghost commented 8 years ago

To illustrate this simple feature one only need to run this SQL on Postgres: -- Do this first. SELECT NOW(); -- which by default should report time in UTC

-- Then do this. SET SESSION TIME ZONE 'America/Chicago'; SELECT NOW(); -- which will correctly report in US central time.

Inserts and updates also perform timezone corrections from a timezone of your choice to UTC (which is naturally a great canonical choice).

UTCTime does not support this. ZonedTime did correctly support this important feature. Unfortunately I need to move a working application to the latest Persistent to take advantage of some nice new features, but UTCTime is a deal breaker.

ghost commented 8 years ago

Unfortunately those discussions were based on false expectations and utterly missed the point of the ZT type. It is a vital zone conversion feature supported by Postgres and MySql and probably others. UTCTime is completely inadequate to provide seamless and correct timezone conversions. On Tue, 2016-06-14 at 22:59 -0700, Michael Snoyman wrote:

This has been discussed at length multiple times in the past. Most recent discussion (which links to previous discussions) is: #290 — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

snoyberg commented 8 years ago

UTCTime is completely inadequate to provide seamless and correct timezone conversions.

In all these conversations, I've never seen an explanation of this claim. On the mailing list, someone asked that you demonstrate what you're trying to do.

Also, it's not an issue of false expectations. The ZonedTime data type was lossy: marshalling to and from the database would throw away information.

ghost commented 8 years ago

The false expectation was/is that ZonedTime (timezone with timestamp) would store timezone information in the database, which is in fact not the case and no claim was ever made by Postgres to that effect. The data type's lossy behavior is not salient to its intended purpose in Postgres which is to provide seamless and correct timezone conversions, which in my opinion is far more important than storing a timezone. I don't even understand why storing a timezone in a database has any meaning. Events typically take place in a geographic location, so that is far more important to store than a timezone. If you want use case, there are probably more than I can name, but a simple one is the ability to present dates and times to users in any timezone they need. Scheduling programs, calendaring, event recording, etc. are all applications that ZonedTime provides. I have a medical surgical scheduling application using ZonedTime under the prior Persistent that schedules surgical procedures across multiple timezones with doctors from all over various regions (tele-medicine, etc.). Postgres provides all the tools to correctly handle all of that and more. UTCTime is useless in user interfaces. UTC is not a time zone nor was it ever intended to be used as such by its creators. No user I know of wants time and dates presented in UTC. Quoted from an email to the Yesod mailing list on 04/18/2016: From one of my Foundation.hs files that worked with older persistent, but is broken by UTCTime: let searchPath = T.append "SET SEARCH_PATH = " sp      timezone = T.concat ["SET SESSION TIME ZONE \"", tz, "\""]    touch = "SELECT basal_view.session_touch()" :: T.Text    runIt = Import.NoFoundation.runPool (appDatabaseConf $ appSettings master)              (rawExecute (intercalate ";" [searchPath, touch, timezone]) [] *> query_pp)              pool Any 'timestamp with timezone' returned by Postgres would have the timestamp converted to the timezone listed in tz.On Thu, 2016-06-16 at 22:04 -0700, Michael Snoyman wrote:

UTCTime is completely inadequate to provide seamless and correct timezone conversions. In all these conversations, I've never seen an explanation of this claim. On the mailing list, someone asked that you demonstrate what you're trying to do. Also, it's not an issue of false expectations. The ZonedTime data type was lossy: marshalling to and from the database would throw away information. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

snoyberg commented 8 years ago

And why is it that converting a UTCTime value to a ZonedTime value doesn't do exactly the same thing?

ghost commented 8 years ago

For some reason UTCTime (at the SQL level?) is somehow forcing UTC (overriding the time zone session variable) in Postgres.

If it had not done so I would have been left with both the semantic weirdness of UTC not really being UTC, and the need to add the timezone offset (good luck correctly determining the offset) to show strings (or convert UTCTime to ZonedTime) so the browser can present data to javascript with an offset for calendaring libs. Also json/xml typed content handlers being queried by third party applications would not get much needed timezone offset information.

snoyberg commented 8 years ago

I don't know what this means, but it's starting to sound like you're storing local time in UTCTime, since that type most certainly stores UTCTime values. I'd need to see am example of the failure you're implying to really know what's going on.

On Tue, Jun 21, 2016, 3:59 AM Robert notifications@github.com wrote:

For some reason UTCTime (at the SQL level?) is somehow forcing UTC (overriding the time zone session variable) in Postgres.

If it had not done so I would have little to complain about beyond both the semantic weirdness of UTC not really being UTC, and the need to add the timezone offset to show strings (or convert UTCTime to ZonedTime) so the browser can present data to javascript with an offset for calendaring libs.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/yesodweb/persistent/issues/577#issuecomment-227312984, or mute the thread https://github.com/notifications/unsubscribe/AADBB7XN2azpuPuOFSpMEEg2wMUzh9x8ks5qNzdzgaJpZM4I02D_ .

ghost commented 8 years ago

This is most likely the culprit: Database.PostgreSQL.Simple.Time.Internal.Parser

-- | Behaves as 'zonedTime', but converts any time zone offset into a
-- UTC time.
utcTime :: Parser UTCTime
utcTime = do
  (Local.LocalTime d t) <- localTime
  mtz <- timeZoneHMS
  case mtz of
    Nothing -> let !tt = Local.timeOfDayToTime t
               in return (UTCTime d tt)
    Just tz -> let !(dd,t') = localToUTCTimeOfDayHMS tz t
                   !d' = addDays dd d
                   !tt = Local.timeOfDayToTime t'
                in return (UTCTime d' tt)

postgresql simple takes the offset reported by Postgres and changes the time/date (by adding or subtracting minutes, altering the date, etc.) back to UTC, which renders the result useless for my purposes. This function (utcTime) is lossy, but semantically correct -- as far as it is concerned UTCTime will be correctly presented in UTC regardless of what Postgres reports.

ghost commented 8 years ago

Also, just to be clear, the way one instructs Postgres to report timestamps with timezone in a particular timezone is to perform a SET operation before SELECT. Without a SET TIME ZONE operation the default is to report times/dates/offsets in UTC.

SET SESSION TIME ZONE 'America/Chicago';
SELECT timestampwithtimezonecolumnX FROM tableX;

This will result in a time/date/offset correctly adjusted to Chicago time to be presented to whatever client has retrieved it. I write 'adjusted' as Postgres stores timestamps internally in UTC.

The postgresql-simple module UTCTime interface converts timestamps reported by Postgres to UTC by undoing the good work done by Postgres. This is not helpful at all.

The postgresql-simple module ZonedTime type does not undo the work done by Postgres, which is a very good thing.

To instruct Postgres to assume that timestamps in insert/update commands are in a particular timezone one must perform the same command prior to performing the insert/update.

SET SESSION TIME ZONE 'America/Chicago';
INSERT INTO tableX (timestampwithtimezonecolumnX)
            VALUES ('2016-06-22 14:05:25-00');

Postgres will ignore the '-00' or really any offset and assume that the date and time are in Chicago time. Postgres adjusts the date and time to UTC for internal storage.

Inserts and updates using SET TIME ZONE work correctly with the current persistent because Postgres ignores any offsets and assumes the timezone in the SET command is what the user intends to insert/update.

ghost commented 8 years ago

Michael, to address your concern about storing local time in UTCTime, many of the records I am retrieving have timestamps created by default in Postgres itself using the NOW() function. When I retrieve the timestamp, and having correctly SET TIME ZONE 'America/Chicago' in runDB, the resulting UTCTime from persistent reports the UTC time/date and not 'America/Chicago' time/date. This is a result of postgresql-simple module's UTCTime parser messing with the date and time.

Beyond all that, the prior version of persistent that had ZonedTime works perfectly over uncounted thousands of transactions using SET TIME ZONE in runDB with otherwise identical Haskell code.

snoyberg commented 8 years ago

For the record, despite the fact that I'm the one defending the status quo here, I don't in any way feel strong about this decision, it's simply a decision that seems to have no way to make everyone happy (evidenced by the fact that I got pressured in the first place to remove this instance). I say this because if someone else on the Persistent team wants to decide that this decision was wrong, I don't really have a problem with it. But whatever we do, I'd like to make that decision large and public, because rehashing this discussion every few months is not productive.

ghost commented 8 years ago

I agree, large and public. If I can garner some time I'll work up some (hopefully) simple persistent examples. I wish I had been following the discussion that led to ZT's removal before a decision was made as Postgres makes matters a bit tricky to get it right. Anything related to times and dates and timezones in the real world can lead to crazy headaches for anyone, including the cleverest of folks. On Thu, 2016-07-14 at 09:09 -0700, Michael Snoyman wrote:

For the record, despite the fact that I'm the one defending the status quo here, I don't in any way feel strong about this decision, it's simply a decision that seems to have no way to make everyone happy (evidenced by the fact that I got pressured in the first place to remove this instance). I say this because if someone else on the Persistent team wants to decide that this decision was wrong, I don't really have a problem with it. But whatever we do, I'd like to make that decision large and public, because rehashing this discussion every few months is not productive. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

ghost commented 8 years ago

I've included a pair of code blocks in this message. The first is Haskell. The second is Postgres SQL.

The SQL demonstrates how to create a table and a view over that table. The view presents the column 'cronos' both in its original form as Timestamp with time zone, and as plain Text.

The haskell code will query the DB and print both cronos and cronosAsText. Both should be identical, but they will not be. The reason is that cronos UTCTime is reverted back to UTC by postgresql-simple, but cronosAsText is left unmolested. If ZonedTime were available and cronos were defined with ZonedTime, then both columns would be identical.

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}

module Main where

import Prelude ()    
import ClassyPrelude    
import Database.Persist
import Database.Persist.Postgresql
import Database.Persist.TH
import Control.Monad.Logger    (LoggingT, runStderrLoggingT)

connectionString :: ConnectionString
connectionString = "use your imagination"

share [mkPersist sqlSettings] [persistLowerCase|
ZoneTest
    cronos UTCTime
    cronosAsText Text
    deriving Show
|]

main :: IO ()
main = runStderrLoggingT $ withPostgresqlPool connectionString 3 rundb

rundb :: ConnectionPool -> LoggingT IO ()
rundb pool = liftIO $ runSqlPersistMPool queryStuff pool

queryStuff :: SqlPersistM ()
queryStuff = do
  void $ rawExecute "SET LOCAL TIME ZONE 'America/Chicago';" []
  test <- selectList [] [Asc ZoneTestCronos]
  liftIO $ mapM_ (print . entityVal) test
CREATE TABLE zonetestbase
(
  id serial NOT NULL,
  cronos timestamp with time zone NOT NULL DEFAULT now(),
  CONSTRAINT zonetestbase_pkey PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE zonetestbase
  OWNER TO postgres;

CREATE OR REPLACE VIEW zone_test AS 
 SELECT zonetestbase.id,
    zonetestbase.cronos,
    zonetestbase.cronos::text AS cronos_as_text
   FROM zonetestbase;

ALTER TABLE zone_test
  OWNER TO postgres;
GRANT ALL ON TABLE zone_test TO postgres;
GRANT SELECT ON TABLE zone_test TO public;

INSERT INTO zonetestbase(id, cronos) VALUES (1, NOW());
ghost commented 8 years ago

Just to add to my comment above. I jiggered a copy of persistent 2.5 and made a newtype ZT ZonedTime and added a new SqlType constructor SqlOffsetTime. The new version of Persistent works perfectly and supports both UTCTime as SqlDayTime and ZT as SqlOffsetTime.

I made it work with persistent-postgresql and made no attempt to try another database module.

winterland1989 commented 8 years ago

I think maybe LocalTime maybe a middle ground because it's:

A simple day and time aggregate, where the day is of the specified parameter, and the time is a TimeOfDay.

As long as you put a utc time into the database in timezone A, and you try read it in the same timezone, it's OK because the conversion is invisble to you. But the problem exists when:

  1. people read the database using a different timezone, since sql server may want to do something smart.
  2. In mysql it's impossible to set default timestamp column use UTC_TIMESTAMP, see here. If we use CURRENT_TIMESTAMP, we can't read it back as a utc time anymore.

If we use LocalTime, then people have to do whatever conversion they should do, and hopefully haskell's type system can stop them writing utc time directly into database without thinking.

How do you think @robertLeeGDM @snoyberg ?

ghost commented 8 years ago

Naming to avoid confusion. DTNAKED ~ MySQL DATETIME ~ Postgres TIMESTAMP DTWOFFSET ~ MySQL TIMESTAMP ~ Postgres TIMESTAMP WITH TIME ZONE

SqlDayTime in the current 2.5 release is tied to UTCTime. When I jiggered 2.5 and added SqlOffsetTime/ZT as DTWOFFSET, and left SqlDayTime/UTCTime as a DTNAKED, my goal was to avoid breaking more than I had to.

I did have a similar thought that UTCTime as plain DTNAKED is misleading and indeed should be LocalTime. I've thought that UTCTime is a poor substitute for either DTWOFFSET or DTNAKED.

So along those lines for completeness I think that SqlDayTime should be LocalTime when the column is DTNAKED. SqlOffsetTime should be added as ZT for DTWOFFSET.

For backward compatibility I wonder if a method can be contrived so that UTCTime can be substituted for either. Certainly ZonedTime can be losslessly converted to UTCTime, just not the other way around. That way existing code using UTCTime can continue to work and the newer types can be factored in for fine grain type happiness.

N.B. Offsets are not time zones. UTC is not a time zone. ZonedTime as defined is a mix of date and time with an offset (so far useful), a daylight savings time flag (probably useless here), and a field for an arbitrarily defined string that may refer to an actual governmental time zone or the name of your favourite dog food (not useful here). UTCTime is useful when libraries that process 8601 formatted strings can alter date and time with non zero offsets to UTC, effectively blinding the offset (e.g. postgresql-simple).

It would be great if the time package had a type named OffsetTime that was defined simply as a date and time with an offset. To properly deal with actual governmental time zones ZonedTime would need to be defined more like an object with functions for setting and correctly getting date/time/governmental time zone name/Olson key/etc.