quantopian / empyrical

Common financial risk and performance metrics. Used by zipline and pyfolio.
https://quantopian.github.io/empyrical
Apache License 2.0
1.31k stars 406 forks source link

PERF/BUG: Improve performance; prevent return-stream mutation #27

Closed jdricklefs closed 8 years ago

jdricklefs commented 8 years ago

@twiecki @ahgnaw Let me know what you think.

The principal goal with this is to improve performance for consumers who generally call all of the metrics (... such as zipline). The two meaningful changes for that aim:

  1. When one calculation (e.g. alpha) internally depends on another calculation (e.g. beta), I added a kwarg (named with a leading _ to avoid shadowing the namespace) to allow the consumer to pass that value in. This is optional in all cases.
  2. If the risk_free rate (or the required_return) were 0, then we're wasting time doing calculations in service of returns - risk_free. I initially simply did a "if == 0" type check for this, but discovered that some empyrical functions depended on the behavior of creating a new Series internally as they mutated that value! For this reason, my helper method _adjust_returns will return a .copy() of the given returns (which is faster than doing the difference).

The always-create-a-new-Series behavior from point 2 exposed that a number of the expected values in the unit tests were dependent on other empyrical methods or tests mutating them. The biggest problem was in mixed_returns which leads with a nan (presumably to evaluate the nan-tolerance of each method). Because some tests were replacing leading nan's in the input returns with 0.01, the expected values in the tests were simply wrong. Hence, changing those calculations.

Finally, while working through that, it was discovered that sharpe_ratio, stability_of_timeseries and tail_ratio would always return nan if there was a nan in the return streams. I made them nan-tolerant and adjusted the tests accordingly.

twiecki commented 8 years ago

Awesome, do you have some numbers?

jdricklefs commented 8 years ago

Awesome, do you have some numbers?

I'd mostly been using zipline unit tests under cProfile to evaluate this. In some of those (e.g. tests/test_algorithm.py:TestAlgoScript.test_algo_record_nan) these changes bring runtime from ~1.9s to ~1.1s.

analicia commented 8 years ago

This is fantastic, LGTM