elixir-ecto / myxql

MySQL 5.5+ driver for Elixir
Apache License 2.0
273 stars 67 forks source link

datetime vs timestamp #188

Closed gregors closed 4 months ago

gregors commented 4 months ago

Hello, I was seeing some weird (to my eye) things when using this library in relation to timestamps vs datetime. Please bear with me as it's been awhile since I've wandered into the murky swamps of timezones and MySql and I'm just trying to understand what I'm seeing fully.

Looking through the code it seems that we say we support timestamps in the documentation but what is actually happening is that we throw away the timezone information and send it using datetime instead.

Now, looking through the history I saw this

https://github.com/elixir-ecto/myxql/issues/30

Now normally what happens with a timestamp (since it has timezone information) is that the server receives it and stores it at UTC. Upon retrieval the SHOW VARIABLES like 'time_zone'; is consulted and formatted - for example, if it's UTC it would stay UTC or if the local connection was "America/New_York", then the time changes to reflect that based on the current setting of "time_zone". And in the code I even see reference to this behavior

in lib/myxql/protocol/values.ex

  # TIMESTAMP vs DATETIME
  #
  # https://dev.mysql.com/doc/refman/8.0/en/datetime.html
  # MySQL converts TIMESTAMP values from the current time zone to UTC for
  # storage, and back from UTC to the current time zone for retrieval. (This
  # does not occur for other types such as DATETIME.)
  #
  # Comparing to Postgres we have:
  # MySQL TIMESTAMP is equal to Postgres TIMESTAMP WITH TIME ZONE
  # MySQL DATETIME  is equal to Postgres TIMESTAMP [WITHOUT TIME ZONE]

But what we actually do is use Datetime and throw away the tz information such that the time that comes back from MySql will always be UTC, i.e. setting the time_zone variable on the sever will have no effect. Notice the mysql type :mysql_type_datetime instead of :mysql_type_timestamp

  defp encode_binary_datetime(%DateTime{
         year: year,
         month: month,
         day: day,
         hour: hour,
         minute: minute,
         second: second,
         microsecond: {microsecond, _}, 
         time_zone: "Etc/UTC"
       }) do
    {:mysql_type_datetime,
     <<11, year::uint2(), month::uint1(), day::uint1(), hour::uint1(), minute::uint1(),
       second::uint1(), microsecond::uint4()>>}
  end 

  defp encode_binary_datetime(%DateTime{} = datetime) do
    raise ArgumentError, "#{inspect(datetime)} is not in UTC"
  end 

Is this what the following is getting at?

Datetime fields are represented as NaiveDateTime, however a UTC DateTime can be used for encoding as well

Is this what is intended or is the timestamp a work in progress? If it's intended I might add some additional documentation, if it's still in progress I'd be happy to help. Also if I'm confused, I'd be very happy to be enlightened.

Cheers!

gregors commented 4 months ago

As an illustration, if I insert a timestamp as UTC i.e. ~U[2024-07-06 02:06:58Z] when I read it out of the database it's the same.

converting this datetime we see

DateTime.shift_zone(m.updated_at, "America/New_York")
{:ok, #DateTime<2024-07-05 22:06:58-04:00 EDT America/New_York>}

if however, the time_zone of the Mysql changes, for example

SET GLOBAL time_zone = 'America/New_York';

And we do a select we get ~U[2024-07-05 22:06:58Z] that seems very similar to our shifted timezone doesn't it? Because Mysql has done the conversion to America/New_York for us and it is no longer in UTC. However, we think got UTC back (but we didn't), and thus if we try to convert it again we now get the incorrect time.

DateTime.shift_zone(m.updated_at, "America/New_York")
{:ok, #DateTime<2024-07-05 18:06:58-04:00 EDT America/New_York>}

It's worth pointing out, that it seems setting the database time to anything other than UTC (on Mysql) will end up with an incorrect UTC time since we assume we're always getting UTC from mysql which in this case, it no longer is. I suspect if the time_zone for mysql doesn't change between insertion and selection we probably get the right value in Elixir regardless of what the Mysql time_zone is set to, however other clients that might access the database will get the wrong value, for example if the time is always set to America/New_York

It's also worth noting that I'm using an older version of MariaDb in these tests. I can test this out on the latest version as well.

** update - yeah if the mysql time_zone isn't UTC the time that returns is incorrect.

josevalim commented 4 months ago

For completeness, Postgrex always converts the timezone to UTC when we read it and it is only accepts UTC datetimes when writing. I am not sure if MySQL provides similar affordances.

gregors commented 4 months ago

looking through the examples, I suspect the fix is to make the mysql connection always use UTC, as per this example.

  Run a query after connection has been established:

      iex> {:ok, pid} = MyXQL.start_link(after_connect: &MyXQL.query!(&1, "SET time_zone = '+00:00'"))
      {:ok, #PID<0.69.0>}
josevalim commented 4 months ago

We can enforce it (perhaps also allow to turn it off), but maybe documenting that we expect UTC is enough.

gregors commented 4 months ago

confirmed that adding the following to my Ecto.Repo configuration solves my issues

 after_connect: {MyXQL, :query!, ["SET time_zone = '+00:00'", []]}
wojtekmach commented 4 months ago

Agreed about docs for now. Closing in favour of #189.