pmorissette / bt

bt - flexible backtesting for Python
http://pmorissette.github.io/bt
MIT License
2.13k stars · 411 forks

More performance improvements #26

Closed vfilimonov closed 2 months ago

vfilimonov commented 9 years ago

OK, suppose that #24 is closed and #25 is merged.

Now we can see some other performance bottlenecks. Checking the simple strategy:

import numpy as np
import pandas as pd
import bt

x = np.random.randn(10000, 100) * 0.01
idx = pd.date_range('1975-01-01', freq='B', periods=x.shape[0])
df = np.exp(pd.DataFrame(x, index=idx).cumsum())

s0 = bt.Strategy('s0', [bt.algos.RunMonthly(),
                        bt.algos.SelectAll(),
                        bt.algos.SelectRandomly(len(df.columns) // 2),  # integer count for Python 3
                        bt.algos.WeighRandomly(),
                        bt.algos.Rebalance()])
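As a self-contained sanity check on the synthetic data (a seed is added here for reproducibility; the original snippet is unseeded), the generated prices should be strictly positive with one column per series and one business-day row per observation:

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # added for reproducibility; not in the original snippet
x = np.random.randn(10000, 100) * 0.01
idx = pd.date_range('1975-01-01', freq='B', periods=x.shape[0])
df = np.exp(pd.DataFrame(x, index=idx).cumsum())

assert df.shape == (10000, 100)
assert bool((df > 0).all().all())  # exp() guarantees positive prices
```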

Here's the call-graph for it (before #25, the cost of the other nodes was masked by the overhead in .update()):

[call-graph image: perf-3]

What do we see:

  1. StrategyBase.update() calls SecurityBase.value, which in turn sometimes calls StrategyBase.update().
  2. Strategy.close() calls SecurityBase.value, which sometimes calls StrategyBase.update().
  3. SecurityBase.weight calls StrategyBase.update().
  4. Various other members also call one of the .update() methods.

According to my tests:

I'm still looking into call dependencies and trying to understand how we can optimize it. But, Phil, I'd appreciate your feedback on the two bullets above and your thoughts on this issue. It seems that .stale can be ignored in some cases (like above), and the call to a property could be replaced by direct access to a protected field - which could save us quite some time.
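The trade-off being discussed can be illustrated with a minimal sketch (a toy class, not bt's actual implementation): a `value` property that re-runs `update()` whenever the object is marked stale, versus reading the protected `_value` field directly when the caller knows the state is fresh:

```python
class Node:
    """Toy stand-in for bt's stale/update cycle (illustrative only)."""
    def __init__(self):
        self._value = 0.0
        self._stale = True
        self.update_calls = 0  # instrument how often update() actually runs

    def update(self):
        self.update_calls += 1
        self._value = 42.0  # placeholder for the real revaluation logic
        self._stale = False

    @property
    def value(self):
        # Safe but potentially slow: every access pays the staleness check.
        if self._stale:
            self.update()
        return self._value

n = Node()
for _ in range(1000):
    _ = n.value      # property path: checks staleness on every access
fast = n._value      # direct field read: skips the check entirely
assert n.update_calls == 1 and fast == 42.0
```

The caching means `update()` runs only once here, but the per-access property and staleness check still cost attribute lookups in a hot loop, which is the overhead a direct `_value` read would avoid.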

pmorissette commented 9 years ago

Hey Vlad,

Thanks for this in-depth analysis and code improvements. Your contributions have greatly improved the library so far (ffn as well)!

First things first, I want to make sure the nose tests are all passing before diving into optimizations. As it stands, we have some failures due to some of the recent changes. I will try to fix these asap and then I will be able to start looking into the optimizations.

So far, my approach to optimization has been conservative: I wanted to maintain correctness and flexibility, even to the detriment of speed. That being said, if we can make sure correctness and flexibility are maintained, then I am all for optimizations. Speed is always a critical factor for backtests, especially in Python, so any speed improvements will be greatly welcomed.

At first glance, your proposed changes seem to be good. I have always had the feeling that there was some juice to squeeze out of the value/update/stale cycle. We just have to make sure we aren't introducing any issues - both in simple strategies but also in strategies of strategies. Same applies for fixes on #25.

I'll try to get back to you shortly.

Thanks again! Phil

vfilimonov commented 9 years ago

Sure, Phil, thank you! I agree that we should not sacrifice flexibility for speed (though in my own development branch I might do so), and correctness is definitely the priority.

I look forward to your feedback!

vfilimonov commented 9 years ago

btw, how do you run your tests? nosetests test_core.py?

pmorissette commented 9 years ago

Hey Vlad,

Glad we are on the same page.

You may use the command you provided above, but to run all the project's tests, I use the following command from the project's root directory:

nosetests -d

Same goes for ffn by the way.

Cheers, Phil

pmorissette commented 9 years ago

Ok fixed the failing tests. Let me look into the speed optimizations now.

Cheers, Phil

francol commented 8 years ago

https://github.com/pmorissette/bt/pull/41 introduced a huge performance bottleneck. Even with just a few symbols, creating a temporary context and then checking for errors takes up 20% of the time during a backtest; when I try to run a backtest using all symbols in the S&P 500, it ends up taking up to 90% of the time.

[profiling screenshots: small backtest, large backtest]
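This kind of per-function time breakdown can be reproduced with Python's built-in cProfile (a generic sketch with a stub workload standing in for the real backtest call, not the exact profile run here):

```python
import cProfile
import io
import pstats

def backtest_stub():
    # Stand-in for the real work; swap in the actual backtest run.
    total = 0.0
    for i in range(100000):
        total += i * 0.5
    return total

pr = cProfile.Profile()
pr.enable()
backtest_stub()
pr.disable()

# Sort by cumulative time to surface the most expensive call paths.
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats('cumulative').print_stats(5)
summary = next(line for line in buf.getvalue().splitlines()
               if 'function calls' in line)
print(summary.strip())
```

Sorting by `cumulative` is what makes nested hot paths (like the temporary-context creation and error checking described above) stand out as a percentage of total runtime.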

pmorissette commented 8 years ago

Hey @francol,

Great find! I should probably take the time to write some benchmarks to make sure new commits don't slow things down.

Let me look into it and get back to you. If you have any suggestions, I'm all ears!

Thanks again, Phil

pmorissette commented 8 years ago

Hey @francol,

Just pushed up some code that should fix this slowdown (8ac2b4e4bcd6db0aab5fc396eee5818aea3f3f19). If you get a chance, please let me know what you think. On my machine, the performance is back to pre-#41.
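A generic way to verify that performance is back to a previous baseline (a sketch with a stub workload; substitute the actual backtest call) is to time repeated runs with the standard-library timeit module and compare the best result before and after the commit:

```python
import timeit

def run_once():
    # Stand-in for the backtest under test; replace with the real run.
    sum(i * i for i in range(10000))

# Take the best of several repeats; it is the least noisy estimate.
times = timeit.repeat(run_once, number=10, repeat=5)
best = min(times)
print(f"best of 5 repeats: {best:.4f}s for 10 runs")
```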

Cheers, Phil

BradKML commented 3 months ago

Is this issue still considered open? As of 2023, certain articles still consider bt slower than backtesting.py; I hope more performance improvements can be made in the near future.