Suggestion for passing data between Algos

malyvsen commented 4 years ago

Hi! I come from a machine learning background and I see a lot of similarities in the architectures of bt and some ML frameworks. I wanted to offer a way of passing data between Algos whose analogue in ML is what makes some frameworks particularly elegant and flexible.

Instead of communicating through a dict, which forces Algos to be organized sequentially, an Algo could take all the Algos whose outputs it is to operate on as arguments to its constructor. __call__ would be quite similar to what it is now - except that, instead of getting data from target.temp, Algos would just call the Algos they were given at construction.

For example, an algo for mean-variance optimization might look like this:

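import bt


class MVO(bt.Algo):
    def __init__(self, expected_returns_algo, volatilities_algo):
        super().__init__()
        # Upstream Algos whose outputs this Algo consumes
        self.expected_returns_algo = expected_returns_algo
        self.volatilities_algo = volatilities_algo

    def __call__(self, historical_prices):
        # Call the upstream Algos directly instead of reading target.temp
        expected_returns = self.expected_returns_algo(historical_prices)
        volatilities = self.volatilities_algo(historical_prices)
        # some_mvo_function is a placeholder for the actual optimizer
        return some_mvo_function(expected_returns, volatilities)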

The benefits:

- Ability to represent any (acyclic) graph structure of Algos
- No strange bugs from accidentally overwriting another Algo's data
- Currently, Algos need to know their purpose (because they set a named field in a dict). This way of passing data would make it sensible to write purpose-agnostic Algos, e.g. a Log algo which turns returns into log returns, but also volatility into log volatility
- I think this would make it easier to experiment with your Algos, because purpose-agnostic Algos can be much more atomic. In the example above, expected_returns_algo could just as easily be historical returns as it could be log historical returns
- Strategies using other strategies as their assets could maybe also be done this way, leading to an even more lightweight framework?
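
To make the graph idea concrete, here is a minimal sketch that runs without bt. Everything in it (the Node base class, GrossReturn, Volatility, Log, Ratio, and the toy price data) is a hypothetical stand-in invented for illustration, not part of bt's API. It shows one purpose-agnostic Log node reused on two different upstream nodes.

```python
import math
import statistics


class Node:
    """Hypothetical stand-in for the proposed Algo: callable on a price history."""

    def __call__(self, prices):
        raise NotImplementedError


class GrossReturn(Node):
    """Last price divided by first price, per asset."""

    def __call__(self, prices):
        return {a: s[-1] / s[0] for a, s in prices.items()}


class Volatility(Node):
    """Sample standard deviation of period-over-period simple returns, per asset."""

    def __call__(self, prices):
        return {
            a: statistics.stdev([y / x - 1.0 for x, y in zip(s, s[1:])])
            for a, s in prices.items()
        }


class Log(Node):
    """Purpose-agnostic node: natural log of whatever its upstream node yields."""

    def __init__(self, upstream):
        self.upstream = upstream

    def __call__(self, prices):
        return {a: math.log(v) for a, v in self.upstream(prices).items()}


class Ratio(Node):
    """Combines two upstream nodes; a toy placeholder for a real optimizer."""

    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __call__(self, prices):
        num = self.numerator(prices)
        den = self.denominator(prices)
        return {a: num[a] / den[a] for a in num}


# Toy data, assumed purely for illustration.
prices = {"A": [100.0, 110.0, 115.5], "B": [50.0, 55.0, 44.0]}

log_returns = Log(GrossReturn())      # Log used on returns
log_volatility = Log(Volatility())    # the same Log used on volatility
score = Ratio(log_returns, Volatility())
```

Calling score(prices) walks the graph from the leaves up, and swapping log_returns for a plain GrossReturn() changes the input without touching Ratio.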

The downside is, of course, that this breaks backwards compatibility.

What do you think? I'd be willing to call about this if you want to :)

ptomecek commented 3 years ago

Originally I was not a fan of the way that algos communicate with each other either (especially the fact that you need knowledge of the magic strings that correspond to dictionary entries). However, as I've used the framework more, I have come to terms with it, though I think there is still some room for improvement. With regard to your question, my view is that the Algos are not for data processing, but rather for modeling the control flow of the actions being taken (select instruments, select weights, rebalance, hedge, lifecycle instruments, etc.). This means that within any of those steps, you can use the package of your choice for representing acyclic graphs of data computations that will produce an "action", which keeps the data that needs to be transferred between algos to a minimum. For traditional signal analysis, I particularly like the Modular Toolkit for Data Processing (http://mdp-toolkit.sourceforge.net/).
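
A rough sketch of that division of labor, written without bt so it runs on its own. Target, momentum_scores, SelectPositiveMomentum, EqualWeights, and run_stack are all hypothetical stand-ins. The "selected" and "weights" keys are meant to mirror the dict-entry convention discussed above, which is an assumption here rather than a statement about bt's internals. The data computation lives in a plain function, and only the resulting "action" is passed between steps.

```python
class Target:
    """Hypothetical stand-in for a strategy: prices plus the shared temp dict."""

    def __init__(self, prices):
        self.prices = prices
        self.temp = {}


def momentum_scores(prices):
    """Plain data computation, kept outside the control-flow steps."""
    return {a: s[-1] / s[0] - 1.0 for a, s in prices.items()}


class SelectPositiveMomentum:
    """Control-flow step: decide which instruments to hold."""

    def __call__(self, target):
        scores = momentum_scores(target.prices)
        target.temp["selected"] = sorted(a for a, v in scores.items() if v > 0)
        # Returning False signals that later steps should not run.
        return bool(target.temp["selected"])


class EqualWeights:
    """Control-flow step: assign a weight to each selected instrument."""

    def __call__(self, target):
        selected = target.temp["selected"]
        target.temp["weights"] = {a: 1.0 / len(selected) for a in selected}
        return True


def run_stack(algos, target):
    """Run steps in order and stop at the first one that returns False."""
    return all(algo(target) for algo in algos)


# Toy data, assumed purely for illustration.
target = Target({"A": [10.0, 12.0], "B": [10.0, 9.0], "C": [5.0, 6.0]})
ok = run_stack([SelectPositiveMomentum(), EqualWeights()], target)
```

Only two small entries ever cross step boundaries here, which is the "keep transferred data to a minimum" point; the scoring function could be replaced by any external graph or pipeline library.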

malyvsen commented 3 years ago

Sure, using algos your way does take the load off them :) So then representing an acyclic graph becomes less important, and each Algo does know its purpose anyway. Some benefits of the change still remain though:

1. No strange bugs from accidentally overriding another Algo's stuff
2. Strategies using other strategies as their assets could maybe also be done this way, leading to an even more lightweight framework?
3. What you mentioned about having to remember magic strings
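
A tiny illustration of the overwriting and magic-string points, using plain functions and a plain dict rather than bt. All names below are made up for the example.

```python
def select_liquid(temp):
    temp["selected"] = ["A", "B", "C"]


def select_cheap(temp):
    # Reuses the same key, silently discarding the earlier selection.
    temp["selected"] = ["B", "D"]


# Shared-dict style: the second step overwrites the first without any error.
temp = {}
select_liquid(temp)
select_cheap(temp)
shared_result = temp["selected"]


def liquid():
    return ["A", "B", "C"]


def cheap():
    return ["B", "D"]


def intersect(first, second):
    """Both inputs are explicit arguments, so neither can clobber the other."""
    return [a for a in first if a in second]


# Explicit-input style: no string keys to remember, no shared state.
explicit_result = intersect(liquid(), cheap())
```

In the shared-dict version shared_result ends up as the second selection only, while the explicit version keeps both inputs visible at the call site.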

I might make an experimental version sometime this month to see about #2, I think that's my favorite baby ^^

pmorissette / bt

Suggestion for passing data between Algos #231