singer-io / singer-python

Writes the Singer format from Python
https://singer.io
Apache License 2.0
537 stars 129 forks source link

`utils.strptime` and `utils.strftime` are asymmetrical #81

Closed prakashpp closed 5 years ago

prakashpp commented 5 years ago

EDIT: Current summary of this issue is here.


I am using python 3.7 and it looks like https://github.com/singer-io/singer-python/blob/f6c522731b430fac7005bb9b4c6e046224b8e31c/singer/utils.py#L14 is not valid format for strptime. Am I missing something here?

KAllan357 commented 5 years ago

There's a bit of history with that, I guess this format is not supported on python 3.7.

https://github.com/singer-io/singer-python/pull/52 https://github.com/singer-io/singer-python/pull/63

For now its best to stay on 3.5.x unless there's some overriding reason not to be on that version of python.

prakashpp commented 5 years ago

Ah ok, I was not aware about 3.7 support. So this is what I am facing

>>> from datetime import datetime
>>> from singer import utils
>>> import pytz
>>> utc = pytz.UTC
>>> now = datetime.utcnow().replace(tzinfo=utc)
>>> dtime = utils.strftime(now)
>>> utils.strptime(dtime)
ValueError: time data '2018-10-19T20:53:38.514627Z' does not match format '%Y-%m-%dT%H:%M:%SZ'

In short I can not get the datetime back with formatted string. The problem is because of https://github.com/singer-io/singer-python/blob/f6c522731b430fac7005bb9b4c6e046224b8e31c/singer/utils.py#L27 is not respecting DATETIME_FMT_MAC. I am not sure if this is python 3.7 specific (frankly I am python2 guy). I can do a workaround in my code to handle it but you think its an issue?

prakashpp commented 5 years ago

Also locally I am running python 3.7 but this error is coming on stitchdata running on python 3.5. Let me know, happy to send a fix.

timvisher commented 5 years ago

I need to know some more details about the error your seeing.

  1. Can you provide logs for python 3.5 running into this issue?
  2. Can you tell me the platform details that you're seeing this on? Docker? Virtual Machine? AWS? What distribution are you using, how did you install python 3.5, etc.

I'm going to do a bit of digging myself but at the moment I suspect that the underlying C library is actually what's throwing this error.

timvisher commented 5 years ago

So it appears that you've uncovered an asymmetry between utils.strptime and utils.strftime. I'm unsure how long this asymmetry has existed but it's not great and unfortunately because that code hasn't changed for so long it's also not something that's going to be easy to fix.

That said, it appears if you use utils.striptime_to_utc it will DTRT.

Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime
>>> from singer import utils
>>>
>>> utils.strptime_to_utc(utils.strftime(utils.now()))
datetime.datetime(2018, 10, 22, 16, 26, 15, 917887, tzinfo=<UTC>)
# => datetime.datetime(2018, 10, 22, 16, 26, 15, 917887, tzinfo=<UTC>)

We (or you via a PR) may take the step to deprecate utils.strptime entirely at some point. The presence of these 4 functions is certainly confusing. I believe based on an internal conversation that it's possible utils.strptime was informally deprecated.

prakashpp commented 5 years ago

@timvisher thank you for the explanation, I will send a PR as soon I get time.

timvisher commented 5 years ago

https://github.com/singer-io/singer-python/pull/82 closes this issue.