Closed jmccorriston closed 4 years ago
Here's a profile of the version in this branch:
Wed Feb 19 10:41:50 2020 returns_tearsheet_profile.stats
7705970 function calls (7550159 primitive calls) in 17.993 seconds
Ordered by: cumulative time
List reduced from 3252 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 17.993 17.993 <ipython-input-3-ca392862d9db>:11(run_returns_tear_sheet)
1 0.001 0.001 17.992 17.992 /Users/jmccorriston/quant-repos/alphalens/alphalens/plotting.py:38(call_w_context)
1 0.028 0.028 17.964 17.964 /Users/jmccorriston/quant-repos/alphalens/alphalens/tears.py:165(create_returns_tear_sheet)
2 0.058 0.029 8.409 4.205 /Users/jmccorriston/quant-repos/alphalens/alphalens/performance.py:454(mean_return_by_quantile)
2 0.029 0.015 7.395 3.698 /Users/jmccorriston/quant-repos/alphalens/alphalens/utils.py:382(demean_forward_returns)
2 0.047 0.024 7.022 3.511 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/groupby/generic.py:570(transform)
2 0.024 0.012 6.974 3.487 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/groupby/generic.py:516(_transform_general)
1 0.009 0.009 4.829 4.829 /Users/jmccorriston/quant-repos/alphalens/alphalens/performance.py:208(factor_returns)
1 0.005 0.005 4.210 4.210 /Users/jmccorriston/quant-repos/alphalens/alphalens/performance.py:129(factor_weights)
1 0.000 0.000 4.084 4.084 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/groupby/generic.py:809(apply)
1 0.007 0.007 4.084 4.084 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/groupby/groupby.py:695(apply)
1 0.009 0.009 4.077 4.077 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/groupby/groupby.py:741(_python_apply_general)
3 0.001 0.000 3.756 1.252 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/IPython/core/display.py:131(display)
3 0.001 0.000 3.734 1.245 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/IPython/core/formatters.py:89(format)
36 0.000 0.000 3.733 0.104 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/IPython/core/formatters.py:220(catch_format_error)
1 0.000 0.000 3.726 3.726 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/matplotlib/pyplot.py:251(show)
1 0.000 0.000 3.726 3.726 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/ipykernel/pylab/backend_inline.py:23(show)
27 0.000 0.000 3.718 0.138 </Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/decorator.py:decorator-gen-9>:1(__call__)
27 0.000 0.000 3.718 0.138 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/IPython/core/formatters.py:331(__call__)
2 0.000 0.000 3.711 1.855 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/IPython/core/pylabtools.py:244(<lambda>)
Here was the profile from the original issue (https://github.com/quantopian/alphalens/issues/357):
Fri Jan 31 10:09:55 2020 returns_tearsheet_profile.stats
62029996 function calls (61402118 primitive calls) in 130.569 seconds
Ordered by: cumulative time
List reduced from 3735 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 130.587 130.587 <ipython-input-1-eb0a21d53746>:10(run_returns_tear_sheet)
1 0.001 0.001 130.587 130.587 /Users/jmccorriston/quant-repos/alphalens/alphalens/plotting.py:38(call_w_context)
1 0.038 0.038 130.566 130.566 /Users/jmccorriston/quant-repos/alphalens/alphalens/tears.py:165(create_returns_tear_sheet)
6 0.863 0.144 98.381 16.397 /Users/jmccorriston/quant-repos/alphalens/alphalens/performance.py:332(cumulative_returns)
1 0.000 0.000 80.395 80.395 /Users/jmccorriston/quant-repos/alphalens/alphalens/plotting.py:757(plot_cumulative_returns_by_quantile)
7 0.000 0.000 80.307 11.472 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/frame.py:6737(apply)
7 0.000 0.000 80.298 11.471 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/apply.py:144(get_result)
7 0.001 0.000 80.297 11.471 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/apply.py:261(apply_standard)
11 0.000 0.000 80.247 7.295 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/apply.py:111(f)
7 0.000 0.000 56.973 8.139 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/apply.py:297(apply_series_generator)
13608 0.093 0.000 47.133 0.003 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/series.py:1188(__setitem__)
13608 0.069 0.000 46.881 0.003 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/series.py:1191(setitem)
4536 0.136 0.000 46.233 0.010 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/series.py:1261(_set_with)
4536 0.469 0.000 45.371 0.010 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/series.py:1303(_set_labels)
22715/18179 0.485 0.000 43.101 0.002 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/indexes/base.py:2957(get_indexer)
4541 0.082 0.000 31.554 0.007 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/indexes/datetimelike.py:686(astype)
4536 0.035 0.000 30.160 0.007 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py:706(astype)
4541 0.054 0.000 29.979 0.007 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/arrays/datetimelike.py:516(astype)
4541 0.023 0.000 29.849 0.007 /Users/jmccorriston/.virtualenvs/alphalens_env/lib/python3.7/site-packages/pandas/core/arrays/datetimelike.py:346(_box_values)
4548 3.760 0.001 29.825 0.007 {pandas._libs.lib.map_infer}
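For anyone who wants to reproduce a comparison like the two listings above, here is a minimal sketch of how such a report can be generated with the standard-library `cProfile` and `pstats` modules. The helper name `profile_call` and its parameters are hypothetical and not part of alphalens. The exact invocation used for the profiles in this thread is not shown, so treat this as one plausible way to get the same "Ordered by: cumulative time" output restricted to the top 20 entries.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, stats_path="returns_tearsheet_profile.stats",
                 top=20, **kwargs):
    """Profile a single call of `func` and return the pstats report as text.

    Hypothetical helper for illustration only; it is not part of alphalens.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args, **kwargs)
    finally:
        # Always stop profiling, even if the profiled call raises.
        profiler.disable()

    # Persist the raw stats so the run can be inspected again later.
    profiler.dump_stats(stats_path)

    # Sort by cumulative time and keep only the `top` most expensive entries.
    buffer = io.StringIO()
    stats = pstats.Stats(stats_path, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()
```

In a notebook, one would wrap the tear sheet call in a small function (as `run_returns_tear_sheet` does in the listings above), pass it to `profile_call`, and print the returned string.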
@jmccorriston good job! I will have a look at this PR in the following days.
The build is now passing for Python 2.7 and 3.5, which matches the current state of master. I think the last question is about https://github.com/quantopian/alphalens/pull/361/files#r415894312, and if we believe that's right, then I think this PR is good.
The new call signature of cumulative_returns() will not be compatible with this: https://github.com/quantopian/alphalens/blob/master/alphalens/performance.py#L933
This PR includes changes that address the performance issues highlighted in https://github.com/quantopian/alphalens/issues/357. This change simplifies the cumulative_returns computation, which significantly speeds up the create_returns_tear_sheet function (and subsequently, create_full_tear_sheet).

The changes definitely need input from other folks, including @luca-s and someone from engineering at Quantopian (I'll get someone to take a look), so I added a "do not merge" label. I expect I'll need to make pretty significant changes before we can merge.

The branch also sprawled a bit and made a number of other changes, including a few additions and tweaks to functions in alphalens.utils, as well as some stylistic changes and minor functional changes. The description below gives a summary of the changes made in this PR.

Changes
- [...] 1D in all cumulative returns computations.
- [...] alphalens depend on empyrical.
- [...] alphalens.performance. At least for now, it has been moved into its own function, subportfolio_cumulative_returns, in alphalens.utils.
- [...] 3D12h or 1h) from all turnover metrics.
- [...] inf and nan handling to the event study tearsheet.
- [...] inf handling to get_clean_factor_and_forward_returns.
- [...] get_clean_factor_and_forward_returns such that it can take returns as input (it could previously only take daily prices).
- [...] backshift_returns_series to alphalens.utils. This can be used to convert backward-looking returns into a forward-returns series, which is helpful when you only have access to returns data, instead of adjusted daily pricing.
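
To make the last item concrete, here is a rough sketch of turning backward-looking returns into forward returns with plain pandas. It is not the implementation of backshift_returns_series in this PR. The function name `forward_returns_from_backward` is hypothetical, and the sketch assumes a Series indexed by a (date, asset) MultiIndex with levels named "date" and "asset", sorted by date, with one row per asset on every date. Treating infinite values as missing is also an assumption of this sketch.

```python
import numpy as np
import pandas as pd


def forward_returns_from_backward(returns, periods=1):
    """Shift backward-looking returns so each date carries its forward return.

    Hypothetical illustration, not the alphalens implementation. Assumes
    `returns` has a (date, asset) MultiIndex sorted by date and that every
    asset has a row on every date.
    """
    # The return realized over the `periods` bars ending at date t + periods
    # is the forward return as seen from date t, so shift each asset's
    # series backwards in time by `periods` rows.
    shifted = returns.groupby(level="asset").shift(-periods)

    # Assumption of this sketch: treat +/-inf as missing, then drop the rows
    # (including the last `periods` dates) that have no forward return.
    return shifted.replace([np.inf, -np.inf], np.nan).dropna()
```

For example, with three dates and `periods=1`, the last date is dropped for every asset because its forward return is not yet known.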