googleapis / python-spanner-django

Cloud Spanner database backend for Django
BSD 3-Clause "New" or "Revised" License
90 stars 28 forks source link

django: casting DateField to DateTimeField adds an unexpected hour #260

Open timgraham opened 4 years ago

timgraham commented 4 years ago
======================================================================
FAIL: test_cast_from_db_date_to_datetime (db_functions.comparison.test_cast.CastTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/tests/db_functions/comparison/test_cast.py", line 89, in test_cast_from_db_date_to_datetime
    self.assertEqual(dtm.start_as_datetime, datetime.datetime(2018, 9, 28, 0, 0, 0, 0))
AssertionError: datetime.datetime(2018, 9, 28, 7, 0) != datetime.datetime(2018, 9, 28, 0, 0)

SQL:

SELECT ... CAST(db_functions_dtmodel.start_date AS TIMESTAMP) AS start_as_datetime
FROM db_functions_dtmodel
odeke-em commented 4 years ago

Indeed I can see Spanner doing this, and despite storing data in instances in different regions: Los Angeles and HongKong, the 07:00 is always added irregardless, per:

DDL

CREATE TABLE exp (
    id INT64 NOT NULL,
    pub_date DATE,
) PRIMARY KEY (id)

Data

id pub_date
1 2018-09-28

Query SQL

SELECT
    (SELECT pub_date from exp) as orig,
    (SELECT CAST(pub_date AS TIMESTAMP) from exp) as ecast

Results

orig ecast
2018-09-28 2018-09-28T07:00:00Z

@bvandiver @skuruppu might you have some insights or please help us document this change on Cloud Spanner since there is no indication on the Cloud Spanner docs.

odeke-em commented 4 years ago

Seems like the only way to avoid that implicit 07:00 (zonal dependent) addition is to FORMAT_DATE the date to a string, explicitly setting that time as T00:00:00 and then cast from STRING to TIMESTAMP

SELECT TIMESTAMP(FORMAT_DATE('%Y-%m-%dT00:00:00Z', pub_date)) from exp

which will produce: 2018-09-28T00:00:00Z

odeke-em commented 4 years ago

Andrew from the Cloud Spanner team showed me that instead we could just do

SELECT TIMESTAMP(pub_date, "GMT") from exp

with the distinction that "GMT" requests that the cast timestamp not have the implicit padding.

timgraham commented 4 years ago

Either expressions works for for this particular test, however, they create an issue for the db_functions.comparison.test_cast.CastTests.test_cast_from_python_to_datetime where a Python datetime is cast to timestamp. For the format_date() expression: "No matching signature for function FORMAT_DATE for argument types: STRING, TIMESTAMP." and for the GMT expression, "No matching signature for function TIMESTAMP for argument types: TIMESTAMP, STRING."

We could put in a hacky type check to fix that particular test, however, the solution would crash if casting a TIMESTAMP column to TIMESTAMP. I'm not sure if this case comes up in other tests but the suggested solutions don't seem robust. Similar to https://github.com/orijtech/django-spanner/issues/253#issuecomment-594813097, I don't think we can "type check" the expression if it's a column.

skuruppu commented 4 years ago

From the docs:

Casting from a date to a timestamp interprets date_expression as of midnight (start of the day) in the default time zone, America/Los_Angeles.

So '2018-09-28' is really '2018-09-28T00:00:00.000000-07:00'. The result is displayed in UTC so it's '2018-09-28T07:00:00.000000Z'.

c24t commented 4 years ago

See https://github.com/googleapis/python-spanner-django/issues/191.