robertmartin8 / PyPortfolioOpt

Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
https://pyportfolioopt.readthedocs.io/
MIT License
4.39k stars, 940 forks

Returns data #325

Closed: fletch-man closed this issue 3 years ago

fletch-man commented 3 years ago

Couldn't find a solid answer after a lot of searching so I figured that I'd post here.

The only dataset I have access to is a set of $PnL timeseries that I am trying to optimize as a portfolio. Is it acceptable to pass these $PnL series to the mean-variance (MV) optimizer, or do the inputs strictly have to be percentage returns?

The $PnL timeseries come from a basket of daily trading strategies. Margin requirements are often unclear because the strategies trade futures markets, so I can't calculate return on capital, which would obviously be the correct thing to do in this case. If the optimizer strictly accepts percentage returns, then any ideas on a workaround would be appreciated.

Thanks for your help

robertmartin8 commented 3 years ago

Hi @fletch-man,

It's a tough question. The reason why PnL is harder to deal with is that it makes your return streams much harder to compare. A stream of $200 +/- 20 PnL per day sounds far better than $100 +/- 30, but clearly not if the former requires 10x the capital.
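To make that comparison concrete, here is a small sketch. The capital figures are hypothetical (the only thing stated above is the 10x ratio); dividing each stream's mean and spread by its capital puts the two streams in comparable units.

```python
# Hypothetical capital figures: only the 10x ratio comes from the discussion.
streams = {
    "A": {"mean_pnl": 200.0, "std_pnl": 20.0, "capital": 1_000_000.0},
    "B": {"mean_pnl": 100.0, "std_pnl": 30.0, "capital": 100_000.0},
}


def per_capital(stream):
    """Return (mean, std) of daily PnL as a fraction of allocated capital."""
    mean = stream["mean_pnl"] / stream["capital"]
    std = stream["std_pnl"] / stream["capital"]
    return mean, std


ret_a, vol_a = per_capital(streams["A"])
ret_b, vol_b = per_capital(streams["B"])

# In dollars A looks better, but per unit of capital B earns more per day.
print(f"A: {ret_a:.4%} +/- {vol_a:.4%} per day")
print(f"B: {ret_b:.4%} +/- {vol_b:.4%} per day")
```

With these assumed numbers, stream B earns five times as much per dollar of capital as stream A, which is the opposite of what the raw dollar figures suggest.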

Essentially, whatever you pass as returns needs to be something for which it's obvious that bigger is better, such that a value of 8 for asset A means that it's preferable to asset B with value 6 (whatever units you're using for 8 and 6). This is not necessarily true in the case of PnLs. Your scale also needs to be linear.

Assuming your dollar PnL streams are operating on similar capital allocations, the above two conditions are indeed met, so in principle you can use PyPortfolioOpt.

However, any method that makes reference to percentage returns (like max_sharpe, efficient_return, efficient_risk) will not work unless you fiddle with the parameters. For example, ef.efficient_risk(0.2) gives the portfolio that maximises return for a target volatility of 0.20. But this 0.20 is a figure quoted in units of return, so you'd have to adapt it to your units.
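As a rough illustration of the units issue, the sketch below uses synthetic dollar PnL (not real data) and plain NumPy to show the scale that an annualised volatility takes when the inputs are dollars. The 252-day annualisation factor and the strategy parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic daily dollar PnL for three hypothetical strategies (500 days x 3).
daily_pnl = rng.normal(
    loc=[100.0, 200.0, 50.0], scale=[30.0, 20.0, 80.0], size=(500, 3)
)

TRADING_DAYS = 252  # assumed annualisation factor

# Annualised covariance and per-strategy volatility, both in dollar units.
cov_annual = np.cov(daily_pnl, rowvar=False) * TRADING_DAYS
vol_annual = np.sqrt(np.diag(cov_annual))

# Volatility of an equal-weighted mix, for a sense of scale.
w = np.full(3, 1 / 3)
vol_equal_weight = float(np.sqrt(w @ cov_annual @ w))

print("Annualised dollar volatility per strategy:", vol_annual.round(0))
print("Equal-weight portfolio volatility ($):", round(vol_equal_weight))
```

Here every volatility is in the hundreds of dollars, so a target of 0.2 would be far below anything attainable. A workable target would have to be a dollar amount on the same scale as these numbers.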

You will also run into issues with annualising, so you will need to pass compounding=False to the expected_returns methods.

These caveats aside, you should be able to use simple methods like ef.min_volatility() without too much trouble.

Would love to hear back from you – this issue was also raised in #280 so I'd like to figure out how to make it more usable.

Best, Robert

fletch-man commented 3 years ago

Thanks for the feedback @robertmartin8.

What do you specifically mean by "Your scale also needs to be linear."?

I believe I found a way to implement the optimization. One thing I was able to leverage was that each PnL stream is generated from one position in the given strategy, i.e. the PnL on a given day was generated from trading a single futures contract. This isn't as ideal as having a return on capital; at the same time, capital requirements on futures are so minimal that a return on capital would be extremely volatile as the underlying price moves up and down.

I ended up just running with min_volatility(), which worked well, since max_sharpe() kept running into infeasible solutions. One thing that did improve results for me was to normalize the PnL data between -1 and 1 before generating the covariance matrix. Doing this same normalization and then using methods that require a mu input, like max_sharpe, may also help with the infeasibility issues I had.

I haven't looked into how 'appropriate' it is to do this normalization, but it definitely improved results compared with both un-normalized PnL and an equal-weighted portfolio. Obviously, once this normalization is performed you can't use certain functions like ef.portfolio_performance(verbose=True), because the output will be unrelated to the actual PnL series. You just have to take the returned weights and apply them to the PnL series in a custom performance function.
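A rough sketch of those two steps. The exact normalization is not spelled out above, so this assumes each stream is divided by its largest absolute value, which keeps zero at zero and maps values into [-1, 1]. The function names, the weights, and the data are hypothetical placeholders.

```python
import numpy as np


def scale_to_unit_range(pnl):
    """Divide each column by its largest absolute value so entries lie in [-1, 1].

    Assumes no column is identically zero.
    """
    max_abs = np.abs(pnl).max(axis=0)
    return pnl / max_abs


def portfolio_pnl_stats(pnl, weights, periods_per_year=252):
    """Apply weights to the raw dollar PnL and summarise the combined stream."""
    combined = pnl @ weights
    mean_annual = combined.mean() * periods_per_year
    vol_annual = combined.std(ddof=1) * np.sqrt(periods_per_year)
    return {
        "mean_annual": mean_annual,
        "vol_annual": vol_annual,
        "ratio": mean_annual / vol_annual,
    }


rng = np.random.default_rng(2)
# Synthetic daily dollar PnL for three hypothetical strategies.
raw_pnl = rng.normal(
    loc=[100.0, 200.0, 50.0], scale=[30.0, 20.0, 80.0], size=(500, 3)
)

scaled = scale_to_unit_range(raw_pnl)
cov_scaled = np.cov(scaled, rowvar=False)  # this is what would go to the optimiser

weights = np.array([0.3, 0.5, 0.2])  # placeholder for the optimiser's output
stats = portfolio_pnl_stats(raw_pnl, weights)
print(stats)
```

The statistics are computed on the raw dollar series, so they stay interpretable even though the optimiser only ever saw the scaled data.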

Please let me know if you see any flaws in my logic here. Thanks again for your response.

Best, Fletch

robertmartin8 commented 3 years ago

@fletch-man

> What do you specifically mean by "Your scale also needs to be linear."?

Let's say instead of returns you have some numbers from -5 to +5 (+5 is the max gain, -5 is the max loss). It must be the case that +4 is "twice as good" as +2, e.g. you'd be indifferent between 1 lot of the +4 asset and 2 lots of the +2 asset (ignoring volatility). Similarly, a value of -3 should be as bad as +3 is good, such that you'd be indifferent between shorting the -3 asset and going long the +3 asset. This is important because mean-variance optimisation is essentially maximising the dot product of weights and returns (plus constraints).
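The indifference conditions can be checked directly with a dot product. This is only an illustration with made-up per-lot values; the variable names are hypothetical.

```python
import numpy as np

# Hypothetical per-lot values for four assets on the -5..+5 scale.
scores = np.array([4.0, 2.0, -3.0, 3.0])

one_lot_of_plus_4 = np.array([1.0, 0.0, 0.0, 0.0])
two_lots_of_plus_2 = np.array([0.0, 2.0, 0.0, 0.0])
short_the_minus_3 = np.array([0.0, 0.0, -1.0, 0.0])
long_the_plus_3 = np.array([0.0, 0.0, 0.0, 1.0])


def objective(weights):
    """Return-like part of the mean-variance objective: weights dot scores."""
    return float(weights @ scores)


# On a linear scale each pair of positions scores the same.
print(objective(one_lot_of_plus_4), objective(two_lots_of_plus_2))  # 4.0 4.0
print(objective(short_the_minus_3), objective(long_the_plus_3))     # 3.0 3.0
```

If the scale were not linear (say, a ranking or a log-transformed score), these equalities would no longer reflect real indifference, and the optimiser's trade-offs would be distorted.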

> This isn't as ideal as having a return on capital; at the same time, capital requirements on futures are so minimal that a return on capital would be extremely volatile as the underlying price moves up and down.

Fair point. I think the best way of thinking about this is pretending you were the boss of the trader with that PnL stream. The "capital" is however much money/risk/max-drawdown you allocate to them.

I think what you're doing makes sense, but as you've pointed out, many methods might not be supported. If I had this problem, I'd consider converting all dollar returns to percentage returns based on some nominal amount of capital.
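A minimal sketch of that conversion, assuming a DataFrame of dollar PnL and a nominal capital figure per strategy. The helper name, column names, and capital amounts are all hypothetical.

```python
import pandas as pd


def pnl_to_returns(pnl, nominal_capital):
    """Divide each strategy's dollar PnL by its nominal capital allocation."""
    capital = pd.Series(nominal_capital, dtype=float)
    missing = set(pnl.columns) - set(capital.index)
    if missing:
        raise ValueError(f"No nominal capital given for: {sorted(missing)}")
    # Align capital with the PnL columns, then divide column by column.
    return pnl.div(capital[pnl.columns], axis="columns")


pnl = pd.DataFrame(
    {
        "strat_a": [200.0, -50.0, 120.0],
        "strat_b": [100.0, 30.0, -20.0],
    }
)
# Hypothetical allocations: whatever capital or risk budget each strategy is given.
nominal_capital = {"strat_a": 100_000.0, "strat_b": 10_000.0}

returns = pnl_to_returns(pnl, nominal_capital)
print(returns)
```

The output is a table of simple (non-compounded) returns relative to a fixed capital base, which is closer to what methods quoted in percentage terms expect.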

Best, Robert

fletch-man commented 3 years ago

@robertmartin8

> Fair point. I think the best way of thinking about this is pretending you were the boss of the trader with that PnL stream. The "capital" is however much money/risk/max-drawdown you allocate to them.

Agreed that this is the best way forward. I will see if I can come up with some sort of baseline to compute a return on.

Thanks again for all of the help. Also, PyPortfolioOpt is an awesome tool, and I want to say thanks to you and the other contributors for all of your effort developing it.