pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

to_datetime loses freq on DatetimeIndex with offset #6562

Closed cancan101 closed 1 year ago

cancan101 commented 10 years ago

It would be cool if the offset were kept. Perhaps even an error should be raised if not all of the Timestamps have the same offset:

In [36]:
dti = pd.to_datetime([pd.Timestamp("2014-1-1", offset="M"),])
dti

Out[36]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2014-01-01]
Length: 1, Freq: None, Timezone: None

See #6560

mroeschke commented 6 years ago

Timestamp no longer accepts the offset keyword (deprecated and removed). Closing.

jbrockmendel commented 6 years ago

@mroeschke I think this is still an issue with the freq keyword:

>>> pd.Timestamp('now', freq='D')
Timestamp('2018-07-07 08:17:00.395067', freq='D')

>>> pd.to_datetime(pd.Timestamp('now', freq='D'))
Timestamp('2018-07-07 08:17:08.939068', freq='D')

>>> pd.to_datetime([pd.Timestamp('now', freq='D')])
DatetimeIndex(['2018-07-07 08:17:14.227013'], dtype='datetime64[ns]', freq=None)

mroeschke commented 6 years ago

Oh I see. I had also rationalized closing this in light of #15146, but I see you mentioned that removing freq from Timestamp is non-trivial. Do you believe that's still the case?

jbrockmendel commented 6 years ago

> Do you believe that's still the case?

Backwards-compat would be annoying, not sure if anyone would really complain though.

Getting rid of it would break a behavior that is nice for testing that (dti + other == [x + other for x in dti]).all(). Probably not a good enough reason to keep the Timestamp.freq attribute if it isn't otherwise needed.

But as long as the attribute does exist, I think to_datetime should preserve it.
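For readers who want to see the testing invariant mentioned above written out, here is a minimal sketch. It assumes `other` is a `pd.Timedelta`; the thread does not say which values of `other` were meant, and the variable names are illustrative only.

```python
import pandas as pd

# A DatetimeIndex with an explicit daily frequency.
dti = pd.date_range("2014-01-01", periods=3, freq="D")

# Assumption: `other` is a fixed Timedelta (not specified in the thread).
other = pd.Timedelta(hours=6)

# Vectorized addition on the whole index.
vectorized = dti + other

# The same addition applied to each Timestamp individually.
elementwise = pd.DatetimeIndex([x + other for x in dti])

# The invariant: both approaches should agree element by element.
invariant_holds = bool((vectorized == elementwise).all())
print(invariant_holds)
```

With a plain `Timedelta` the result does not depend on any per-Timestamp freq, so this only illustrates the shape of the check, not the cases where `Timestamp.freq` mattered.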

mroeschke commented 6 years ago

Sounds good. Reopening.

jbrockmendel commented 5 years ago

@mroeschke didn't this get solved recently?

mroeschke commented 5 years ago

This now works for the scalar case but still not the array case. But should the DatetimeIndex have a freq in this case with just 1 element?

In [5]: pd.__version__
Out[5]: '0.24.0.dev0+1010.ge413c491e'

In [6]: pd.to_datetime(pd.Timestamp("2014-1-1", freq="M"))
Out[6]: Timestamp('2014-01-01 00:00:00', freq='M')

In [7]: pd.to_datetime([pd.Timestamp("2014-1-1", freq="M")])
Out[7]: DatetimeIndex(['2014-01-01'], dtype='datetime64[ns]', freq=None)

lukemanley commented 1 year ago

This was closed once and then reopened because of Timestamp.freq, but it can now be closed again since freq has been deprecated and removed. Closing.
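For anyone landing here on a pandas version without `Timestamp.freq`, the sketch below shows one way to end up with a freq on the resulting index. It is not something proposed in this thread. It assumes the input dates are regularly spaced, and the names `stamps`, `with_freq` and `single` are illustrative.

```python
import pandas as pd

# Three regularly spaced daily Timestamps (illustrative input).
stamps = [
    pd.Timestamp("2014-01-01"),
    pd.Timestamp("2014-01-02"),
    pd.Timestamp("2014-01-03"),
]

# to_datetime on a list does not attach a freq to the result.
dti = pd.to_datetime(stamps)
print(dti.freq)

# The spacing can still be inferred when there are at least three points.
print(dti.inferred_freq)

# Rebuilding the index with an explicit freq attaches (and validates) it.
with_freq = pd.DatetimeIndex(dti, freq="D")
print(with_freq.freqstr)

# A single-element index does not carry enough information to infer a freq.
single = pd.to_datetime([pd.Timestamp("2014-01-01")])
print(single.inferred_freq)
```

Passing `freq` explicitly raises a `ValueError` if the values do not conform to it, which is close in spirit to the error suggested in the original report.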