Open jreback opened 7 years ago
xref #11022, https://github.com/pandas-dev/pandas/issues/6741
This also works, but it exposes internal implementation details, is verbose, and is not user-friendly:
In [36]: Series(s.values.astype('datetime64[s]').astype('i8'), index=s.index)
Out[36]:
0 1451606400
1 1451692800
2 1451779200
dtype: int64
If we add user-facing functionality, I think I would like to_epoch() most (certainly not something like int64[s], IMO).
I changed this to be an enhancement request for a simple .to_epoch() method on Timestamp/DTI.
Since Timestamp now has a timestamp method, should we use the same name for DTI?
Yes, this would be reasonable (though to be honest the .timestamp() name is not very informative).
I also don't really like the name. It is rather confusing given that we already have a Timestamp class (for Timestamp itself it is OK, to keep subclass consistency). So when adding such a method to DatetimeIndex / the dt accessor, I would consider not using the same name.
I am partial to to_epoch; we use this term elsewhere.
One question: what to do with NaTs?
In [5]: pd.DatetimeIndex(['2017', '2018', None]).values.astype('datetime64[s]').astype("i8")
Out[5]: array([ 1483228800, 1514764800, -9223372036854775808])
Do we value having integer dtype more? I think so in this case.
Second question: timezones. Unix time is defined in UTC, so should we
This will necessitate an ambiguous parameter.
Third question: how to handle higher-precision components?
In [6]: pd.DatetimeIndex(['2017-01-01T00:00:00.01', '2017-01-01T00:00:00.02']).to_epoch()
Out[6]: array([1483228800, 1483228800])
I don't think we should use floats and fractional components. So that leaves truncating or rounding to the nearest unit.
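To make the truncate-versus-round choice concrete, here is a minimal sketch using existing DatetimeIndex methods (floor and round) followed by integer division against the epoch. This is only an illustration of the two options, not a proposal for how to_epoch would be implemented; the second input value is changed to .60 so the two results differ.

```python
import pandas as pd

dti = pd.DatetimeIndex(["2017-01-01T00:00:00.01", "2017-01-01T00:00:00.60"])

epoch = pd.Timestamp("1970-01-01")
one_second = pd.Timedelta("1s")

# Option 1: truncate sub-second components before converting.
trunc_secs = (dti.floor("s") - epoch) // one_second

# Option 2: round to the nearest second before converting.
round_secs = (dti.round("s") - epoch) // one_second

print(list(trunc_secs))  # [1483228800, 1483228800]
print(list(round_secs))  # [1483228800, 1483228801]
```

Either way the result stays an integer Index, which matches the preference above for avoiding floats.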
One question: what to do with NaTs?
In [5]: pd.DatetimeIndex(['2017', '2018', None]).values.astype('datetime64[s]').astype("i8")
Out[5]: array([ 1483228800, 1514764800, -9223372036854775808])
Do we value having integer dtype more? I think so in this case.
Could an IntegerArray be returned in this case? It would need some casting, since currently
pd.array(pd.DatetimeIndex(['2017', '2018', None]).values.astype('datetime64[s]'), dtype='Int64')
raises
TypeError: datetime64[s] cannot be converted to an IntegerDtype
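For reference, one way to get a nullable integer result today is to build the mask by hand. The sketch below goes through the int64 view and then constructs a pd.arrays.IntegerArray with the NaT positions masked; it is a workaround under the assumption that masking NaT as pd.NA is the desired behaviour, not an existing pandas conversion path.

```python
import numpy as np
import pandas as pd

dti = pd.DatetimeIndex(["2017", "2018", None])

# Seconds since the epoch as plain int64; NaT shows up as the int64 minimum.
secs = dti.values.astype("datetime64[s]").astype("i8")

# Mask the NaT positions explicitly so they surface as pd.NA.
mask = np.asarray(dti.isna(), dtype=bool)
result = pd.arrays.IntegerArray(secs, mask)

print(result.dtype)  # Int64
print(result[2] is pd.NA)  # True
```

The sentinel value is still stored underneath the mask, but it is never exposed to the user.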
Yes, there are various use cases where we could do things like this.
Add a .to_epoch(unit='s') method to Timestamp and DatetimeIndex that returns the epoch for that unit. I think I would default this to s, as that seems pretty common, but allow any of our units.
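A rough sketch of what this could look like, written as a free function since to_epoch does not exist in pandas. The function name and unit parameter follow the proposal above; the timezone handling (convert to UTC first) and the truncating integer division are assumptions, and NaT is deliberately not handled here.

```python
import pandas as pd


def to_epoch(dti, unit="s"):
    """Hypothetical helper: integer count of `unit` since 1970-01-01 UTC."""
    if dti.tz is not None:
        # Assumption: tz-aware values are interpreted via their UTC instant.
        dti = dti.tz_convert("UTC").tz_localize(None)
    delta = dti - pd.Timestamp("1970-01-01")
    # Integer division truncates any components finer than `unit`.
    # NaT entries are not handled in this sketch.
    return delta // pd.Timedelta(1, unit=unit)


dti = pd.DatetimeIndex(["2016-01-01", "2016-01-02", "2016-01-03"])
print(list(to_epoch(dti)))  # [1451606400, 1451692800, 1451779200]
print(list(to_epoch(dti, unit="ms"))[:1])  # [1451606400000]
```

The default of seconds reproduces the values from the astype workaround at the top of the issue.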