pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

ENH: `.to_timedelta` method for `PeriodIndex` and `Period` objects #35346

Open rwijtvliet opened 4 years ago

rwijtvliet commented 4 years ago

Is your feature request related to a problem?

I often need the duration of a Period, or of each element of a PeriodIndex, which I currently calculate with

timedelta = (p+1).start_time - p.start_time

It seems useful and commonly used enough to be included as a standard method.
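
For reference, the workaround above can be wrapped in a small helper. This is only a sketch: `period_duration` is a hypothetical name, not a pandas function, and it assumes the period's frequency supports adding 1 to step to the next period.

```python
import pandas as pd


def period_duration(p):
    """Hypothetical helper (not part of pandas): length of a Period,
    or of each element of a PeriodIndex, as timedelta(s)."""
    # The next period starts exactly where this one ends.
    return (p + 1).start_time - p.start_time


# Scalar Period: February 2020 falls in a leap year.
feb = pd.Period("2020-02", freq="M")
feb_len = period_duration(feb)

# PeriodIndex: one duration per element.
pi = pd.period_range("2019", periods=3, freq="Y")
year_lens = period_duration(pi)

print(feb_len)
print(year_lens)
```

Note that `p.end_time - p.start_time` is not equivalent: `end_time` is the last nanosecond inside the period, so that difference comes out one nanosecond short.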

Describe the solution you'd like

An additional .to_timedelta method, analogous to the existing .to_timestamp method on these objects.

API breaking implications

None afaics.

Describe alternatives you've considered

Add a .duration attribute instead, which would return the length in seconds, i.e. ((p+1).start_time - p.start_time).total_seconds(), but that makes little sense.

Additional context

Sample:

pi = pd.period_range('2019', periods=3, freq='Y')
pi
PeriodIndex(['2019', '2020', '2021'], dtype='period[A-DEC]', freq='A-DEC')

tdi = (pi + 1).start_time - pi.start_time  # now
tdi = pi.to_timedelta()  # as suggested
tdi
TimedeltaIndex(['365 days', '366 days', '365 days'], dtype='timedelta64[ns]', freq=None)

# finding the middle of a period
mid_ts = pi.start_time + 0.5 * ((pi + 1).start_time - pi.start_time)  # now
mid_ts = pi.start_time + 0.5 * pi.to_timedelta()  # as suggested
mid_ts
DatetimeIndex(['2019-07-02 12:00:00', '2020-07-02 00:00:00', '2021-07-02 12:00:00'],
              dtype='datetime64[ns]', freq='8772H')

rwijtvliet commented 4 years ago

PS - I'd gladly make a pull request if this is considered useful. I just thought I'd ask here first.

jbrockmendel commented 4 years ago

I don't think we'd want to_timedelta specifically, but .diff could be nice. The main sticking point is whether it should return timedelta64 values or an array of DateOffset objects.

There's another issue about implementing .diff for numeric Index subclasses; that could be addressed at the same time.
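
To illustrate the sticking point, here is a rough sketch of the two candidate results for a diff over a PeriodIndex, built from existing operations only. The variable names are made up for illustration, and nothing here is meant as the eventual implementation of .diff.

```python
import pandas as pd

pi = pd.period_range("2019", periods=3, freq="Y")

# Option 1: subtract the periods themselves. Period - Period yields a
# multiple of the frequency's DateOffset, not a fixed span of time.
offset_diff = pi[1:] - pi[:-1]

# Option 2: subtract the start timestamps, giving timedelta64 values
# whose length depends on the calendar (365 vs. 366 days here).
timedelta_diff = pi.start_time[1:] - pi.start_time[:-1]

print(offset_diff)
print(timedelta_diff)
```

The first form keeps the result in units of the period's own frequency; the second gives wall-clock lengths, which is what the original request is after.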

TomAugspurger commented 4 years ago

Agreed that diff would be better than to_timedelta.