dolthub / doltpy

A Python API for Dolt
Apache License 2.0
55 stars 13 forks source link

fix datetime precision loss in clean_types #137

Closed jzcruiser closed 3 years ago

jzcruiser commented 3 years ago
>>> import datetime
>>> isinstance(datetime.datetime.now(), datetime.date)
True

https://til.codeinthehole.com/posts/datetimedatetime-is-a-subclass-of-datetimedate/

oscarbatori commented 3 years ago

Thanks for this @jzcruiser, we will take a look presently.

max-hoffman commented 3 years ago

the change is fairly simple and seems to work locally, but i don't know how to debug the CI failure immediately:

E           sqlalchemy.exc.ProgrammingError: (mysql.connector.errors.ProgrammingError) Failed processing pyformat-parameters; Python 'timestamp' cannot be converted to a MySQL type
E           [SQL: INSERT INTO characters (name, adjective, id, date_of_death) VALUES (%(name_m0)s, %(adjective_m0)s, %(id_m0)s, %(date_of_death_m0)s), (%(name_m1)s, %(adjective_m1)s, %(id_m1)s, %(date_of_death_m1)s), (%(name_m2)s, %(adjective_m2)s, %(id_m2)s, %(date_of_death_m2)s) ON DUPLICATE KEY UPDATE name = VALUES(name), adjective = VALUES(adjective), date_of_death = VALUES(date_of_death)]
E           [parameters: {'name_m0': 'Anna', 'adjective_m0': 'tragic', 'id_m0': 1, 'date_of_death_m0': Timestamp('1877-01-01 00:00:00'), 'name_m1': 'Vronksy', 'adjective_m1': 'honorable', 'id_m1': 2, 'date_of_death_m1': None, 'name_m2': 'Oblonksy', 'adjective_m2': 'buffoon', 'id_m2': 3, 'date_of_death_m2': None}]

I have split concerns but will come back to this later @jzcruiser . If you figure out why this is happening before me it would expedite merging. The failure on CI suggests that it might be a linux thing in either sqlalchemy, python-mysql-connector, or datetime. Just need to either avoid that timestamp auto-conversion, or convert timestamp to something that mysql likes better. If this command succeeds on mysql but not dolt's mysql interface we'll have to dig a bit deeper to get a dolt database fix.

jzcruiser commented 3 years ago

I had no idea why CI failed before you guys pointed it out, thanks! I will find some time to dig deeper.

jzcruiser commented 3 years ago

Hi @max-hoffman, the failure is caused by pandas' Timestamp type. Surprisingly, a pd.Timestamp val is an instance of datetime.datetime:

import datetime import pandas as pd isinstance(pd.Timestamp('today'), datetime.datetime) True

I just updated the PR, please take a look. Thanks!