Closed jpivarski closed 2 years ago
@jpivarski that's a good point actually; microseconds are the finest granularity that we can reason about with date-time objects. As this is all Python-iteration anyway, users can always write a loop to transform these things if they need to for some reason.
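A minimal sketch of the kind of user-side loop meant here, assuming naive `datetime` objects that represent UTC. The helper name `ticks_since_epoch` and the `STEP` table are hypothetical and are not part of Awkward Array; the sketch uses only the standard library.

```python
from datetime import datetime, timedelta

# Assumption: naive datetimes are interpreted as UTC.
EPOCH = datetime(1970, 1, 1)

# Size of one tick for the units that datetime can express directly.
STEP = {
    "s": timedelta(seconds=1),
    "ms": timedelta(milliseconds=1),
    "us": timedelta(microseconds=1),
}


def ticks_since_epoch(dt, unit):
    """Hypothetical helper: integer count of `unit` ticks since 1970."""
    if unit == "ns":
        # datetime has no sub-microsecond field, so scale exact microseconds.
        return ((dt - EPOCH) // STEP["us"]) * 1000
    # timedelta // timedelta floors to an int, so coarser units drop precision.
    return (dt - EPOCH) // STEP[unit]


stamps = [
    datetime(2020, 1, 1, 12, 0, 0, 250000),
    datetime(1969, 12, 31, 23, 59, 59),
]

# Plain Python iteration, one value at a time.
in_ms = [ticks_since_epoch(dt, "ms") for dt in stamps]
in_ns = [ticks_since_epoch(dt, "ns") for dt in stamps]
```

The resulting integers could then be paired with the matching unit string, or wrapped in NumPy datetimes, if a resolution other than microseconds is wanted.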
I just noticed the issue number:
Closed by #1721
Version of Awkward Array
HEAD
Description and code to reproduce
https://github.com/scikit-hep/awkward/blob/77b06b3575737ec3cebc4c9b76ddceaf584d57c3/src/python/content.cpp#L952-L957
and
https://github.com/scikit-hep/awkward/blob/77b06b3575737ec3cebc4c9b76ddceaf584d57c3/src/python/content.cpp#L846-L886
handle NumPy datetime objects, but Python datetime objects are not explicitly recognized. (What happens? Does it raise a "cannot convert..." ValueError?)
Given a `datetime.datetime`, one can call `.timestamp()` to get the number of seconds since 1970 as a floating-point number. To fill `ArrayBuilder::datetime` or `ArrayBuilder::timedelta`, we have to provide a 64-bit integer and a type string. We could choose that type string to be `"s"` and cast the floating-point number to an integer as-is, or choose the type string to be `"ms"` and multiply the number by 1000, etc.

Either we hard-code a choice or pass it down as an argument (but that means passing it down through C++, like the options in `from_json`). Note:

- `"ns"` (nanosecond) resolution gives us a range from 1677 through 2262.
- `"us"` (microsecond) resolution gives us a range from 290307 B.C.E. through 294247 C.E., though if we were constructing Python `datetime` objects, these years are already out of range.
- There is no point in considering coarser units (`"ms"`, `"s"`) if Python `datetime` can't even represent them.

So the only units that make any sense to assume are microseconds and nanoseconds. Microseconds cover the entire range that Python `datetime` is capable of (and more), and nanosecond resolution sounds rather special-purpose. (Users can make NumPy datetimes if they need that.) So I would vote to hard-code it as microseconds.
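The range argument can be checked with a short standard-library sketch. It is only an illustration under stated assumptions: the helper names `to_microseconds` and `fits_in_int64` are hypothetical, naive datetimes are treated as UTC, and integer `timedelta` arithmetic is used instead of `.timestamp()` so that no float rounding is involved. It does not show how the C++ side fills the builder.

```python
from datetime import datetime, timedelta, timezone

# Assumption: naive datetimes are interpreted as UTC.
EPOCH = datetime(1970, 1, 1)
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def to_microseconds(dt):
    """Hypothetical helper: exact integer microseconds since 1970."""
    if dt.tzinfo is not None:
        # Normalize aware datetimes to UTC, then drop the tzinfo.
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return (dt - EPOCH) // timedelta(microseconds=1)


def fits_in_int64(dt, ticks_per_microsecond=1):
    """True if dt fits in a signed 64-bit integer at the given resolution.

    ticks_per_microsecond=1 means "us"; 1000 means "ns".
    """
    value = to_microseconds(dt) * ticks_per_microsecond
    return INT64_MIN <= value <= INT64_MAX


# Microseconds cover everything Python datetime can represent.
print(fits_in_int64(datetime.min), fits_in_int64(datetime.max))

# Nanoseconds only cover roughly 1677 through 2262.
print(fits_in_int64(datetime(2262, 1, 1), 1000))
print(fits_in_int64(datetime(2263, 1, 1), 1000))
```

With microseconds, both `datetime.min` and `datetime.max` fit; with nanoseconds, dates outside roughly 1677 through 2262 do not, which matches the ranges listed above.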