BUG: Calculate mean returns by date before mean return by quantile

MichaelJMath commented 5 years ago

Compute mean return by date. If by_date flag is false, then compute and return the average of the daily mean returns (and also the standard error of these daily mean returns).

Resolves: Issue #309

twiecki commented 5 years ago

There seem to be some failing unittests: https://travis-ci.org/quantopian/alphalens/jobs/412480773

MichaelJMath commented 5 years ago

It looks like one of the issues is that in older versions of pandas, you can't groupby an index level by specifying the name of the level. I'll look into it a bit later.

MichaelJMath commented 5 years ago

The remaining failed unittests don't seem to come from any of the code I edited. They seem to result from a problem in commit e444e8da74a524ce42b172f9931c62597508a749. My logic is as follows:

Below is one of the failed test cases:

ERROR: test_average_cumulative_return_by_quantile_0 (alphalens.tests.test_performance.PerformanceTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.7/site-packages/parameterized/parameterized.py", line 392, in standalone_func
    return func(*(a + p.args), **p.kwargs)
  File "/home/travis/build/quantopian/alphalens/alphalens/tests/test_performance.py", line 920, in test_average_cumulative_return_by_quantile
    0, after + 1), filter_zscore=False)
  File "/home/travis/build/quantopian/alphalens/alphalens/utils.py", line 754, in get_clean_factor_and_forward_returns
    filter_zscore)
  File "/home/travis/build/quantopian/alphalens/alphalens/utils.py", line 304, in compute_forward_returns
    period_len = diff_custom_calendar_timedeltas(start, end, freq)
  File "/home/travis/build/quantopian/alphalens/alphalens/utils.py", line 912, in diff_custom_calendar_timedeltas
    actual_days = np.busday_count(start, end, weekmask, holidays)
TypeError: Iterator operand 0 dtype could not be cast from dtype('<M8[us]') to dtype('<M8[D]') according to the rule 'safe'

It seems there is an issue in the way pandas stores dates. (https://stackoverflow.com/questions/31917964/python-numpy-cannot-convert-datetime64ns-to-datetime64d-to-use-with-numba)

I tested this using the following virtual environment configs (matching what was in the unittest):

conda create -q -n testenv --yes python=2.7 ipython flake8 numpy scipy nose matplotlib pandas=0.20.3 statsmodels seaborn

I tried checking out prior commits and got the error when testing on every commit going back to the e444e8da74a524ce42b172f9931c62597508a749 commit. The commit before that did not generate the error.

This might need to be listed as a separate issue, but I'm not sure why it would just be showing up now on the unittests, and not on prior commits.

Please let me know if I seem to be off track here.

EDIT: Some additional info: When I change line 912 in utils to:

actual_days  = np.busday_count(np.array(start).astype('datetime64[D]'), np.array(end).astype('datetime64[D]'),  weekmask, holidays)

The code ran without error.

twiecki commented 5 years ago

Thanks for tracking this done @MichaelJMath, I bet you are right. Want to do another PR with your fix?