pandas-dev / pandas


Quarter.onOffset looks fishy #18235

Open jbrockmendel opened 7 years ago

jbrockmendel commented 7 years ago
import pandas as pd

qe = pd.offsets.QuarterEnd(startingMonth=2)
qs = pd.offsets.QuarterBegin(startingMonth=2)
bqe = pd.offsets.BQuarterEnd(startingMonth=2)
bqs = pd.offsets.BQuarterBegin(startingMonth=2)

feb1 = pd.Timestamp('2017-02-01')
feb28 = pd.Timestamp('2017-02-28')
apr30 = pd.Timestamp('2017-04-30')

>>> qs.onOffset(feb1)
True
>>> qe.onOffset(feb28)
True
>>> qe.onOffset(apr30)
False

>>> bqs.onOffset(feb1)
True
>>> bqe.onOffset(feb28)
True
>>> bqe.onOffset(apr30)
False

The QuarterBegin behavior makes sense to me; the QuarterEnd behavior does not. If Feb 1 is the start of a quarter, shouldn't the end of that same quarter be Apr 30?

jbrockmendel commented 6 years ago

On a closer look, it appears that the behavior is correct but misleading. QuarterBegin(startingMonth=2) is an offset for quarters that start on Feb 1, May 1, Aug 1, Nov 1. I assumed that QuarterEnd(startingMonth=2) would be an offset corresponding to the ends of those same quarters (Apr 30, Jul 31, Oct 31, Jan 31). That is incorrect. QuarterEnd(startingMonth=2) is an offset for quarters that end on Feb 28 (Feb 29 in leap years), May 31, Aug 31, Nov 30.
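To make the anchoring concrete, here is a minimal pure-Python sketch of how I read the rule: startingMonth picks out every third month, and the offset then matches the first or last day of those months. This is a model of the observed behavior, not pandas' actual implementation. The names is_quarter_begin, is_quarter_end and starting_month are hypothetical, and the business-day variants (BQuarterBegin, BQuarterEnd) are not modeled.

```python
import calendar
from datetime import date


def is_quarter_begin(d: date, starting_month: int) -> bool:
    """Hypothetical model: first day of a month that is a multiple of
    3 months away from starting_month."""
    return d.day == 1 and (d.month - starting_month) % 3 == 0


def is_quarter_end(d: date, starting_month: int) -> bool:
    """Hypothetical model: last day of a month that is a multiple of
    3 months away from starting_month."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.day == last_day and (d.month - starting_month) % 3 == 0


feb1 = date(2017, 2, 1)
feb28 = date(2017, 2, 28)
apr30 = date(2017, 4, 30)

print(is_quarter_begin(feb1, 2))  # True
print(is_quarter_end(feb28, 2))   # True
print(is_quarter_end(apr30, 2))   # False: April is not anchored on month 2
print(is_quarter_end(apr30, 1))   # True: April is anchored on month 1
```

Under this model, the ends of the quarters that begin on Feb 1, May 1, Aug 1, Nov 1 are matched with starting_month=1, not starting_month=2, which is exactly the source of the confusion.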

I'm not the first to make this mistake or otherwise end up confused:

I suggest the following:

Thoughts? I'm happy to implement+test, but would like to get buy-in first.