robertmartin8 / PyPortfolioOpt

Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
https://pyportfolioopt.readthedocs.io/
MIT License
4.24k stars 927 forks

Practical application #578

Open ErfolgreichCharismatisch opened 5 months ago

ErfolgreichCharismatisch commented 5 months ago

What are you trying to do? I am trying to make sense of the output.

What data are you using? I generate my own stock_prices.csv from my current portfolio and use the example:

import pandas as pd
from pypfopt import EfficientFrontier
from pypfopt import risk_models
from pypfopt import expected_returns

# Read in price data
df = pd.read_csv("tests/resources/stock_prices.csv", parse_dates=True, index_col="date")

# Calculate expected returns and sample covariance
mu = expected_returns.mean_historical_return(df)
S = risk_models.sample_cov(df)

# Optimize for maximal Sharpe ratio
ef = EfficientFrontier(mu, S)
raw_weights = ef.max_sharpe()
cleaned_weights = ef.clean_weights()
ef.save_weights_to_file("weights.csv")  # saves to file
print(cleaned_weights)
ef.portfolio_performance(verbose=True)

from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices

latest_prices = get_latest_prices(df)

da = DiscreteAllocation(cleaned_weights, latest_prices, total_portfolio_value=10000)
allocation, leftover = da.greedy_portfolio()
print("Discrete allocation:", allocation)
print("Funds remaining: ${:.2f}".format(leftover))

My question is how to interpret this output.

It seems simple: the discrete allocation shows the number of shares of each stock you should hold, so buying and selling until your portfolio matches it should yield the highest annual return.

So it seems that we buy and hold, and on New Year's Eve we sell and buy to adapt to the new allocation.
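The rebalancing step described here, moving from the shares currently held to the shares in the discrete allocation, can be sketched as a simple difference of two dictionaries. This is only an illustration: `rebalancing_trades` is a hypothetical helper, not part of PyPortfolioOpt, and the tickers and share counts are made up. It assumes the target has the same shape as the `allocation` dict returned by `greedy_portfolio()` (ticker mapped to a whole number of shares).

```python
def rebalancing_trades(current, target):
    """Return the share count to buy (positive) or sell (negative) per ticker."""
    trades = {}
    for ticker in sorted(set(current) | set(target)):
        # A ticker missing from one side counts as zero shares there.
        diff = target.get(ticker, 0) - current.get(ticker, 0)
        if diff != 0:
            trades[ticker] = diff
    return trades


# Made-up holdings and target allocation, for illustration only.
current_holdings = {"AAPL": 10, "MSFT": 5, "XOM": 8}
target_allocation = {"AAPL": 12, "MSFT": 5, "GOOG": 3}

trades = rebalancing_trades(current_holdings, target_allocation)
print(trades)  # {'AAPL': 2, 'GOOG': 3, 'XOM': -8}
```

Positions that already match the target (MSFT here) produce no trade, and a stock that drops out of the target is sold completely.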

That raises the question of which stock data to use:

  1. All available data, updated annually, each time starting from the IPO?
  2. Should all stocks have data for the whole period, i.e., how do you deal with stocks that weren't on the market yet?
  3. Should the data be updated monthly instead of annually? Should it cover only the last year, since you want to predict the next year?

Please shed some practical light.

baobach commented 5 months ago

For the model to work properly, you need a well-diversified portfolio. One way to select asset classes is to build a factor portfolio, i.e., to select asset classes that represent the risk factors for your portfolio. You can read more about it here (Lin, 2020). Starting from a set of about 30 assets, your next task is to filter out the highly correlated ones and work with the remaining asset classes. I don't recommend a portfolio with more than 30 asset classes, because the optimization becomes computationally heavy and the sample covariance matrix can become singular, so its inverse cannot be obtained (for example, when there are more assets than return observations).
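One possible way to filter highly correlated assets is a greedy pass over the correlation matrix of returns: keep an asset only if its absolute correlation with every asset already kept stays below a threshold. This is a minimal sketch, not a method from PyPortfolioOpt or from the comment above. The helper name `drop_highly_correlated`, the 0.9 threshold, and the synthetic returns are all assumptions for illustration.

```python
import numpy as np
import pandas as pd


def drop_highly_correlated(returns, threshold=0.9):
    """Greedily keep columns whose |correlation| with all kept columns is below threshold."""
    corr = returns.corr().abs()
    kept = []
    for col in corr.columns:
        if all(corr.loc[col, other] < threshold for other in kept):
            kept.append(col)
    return returns[kept]


# Synthetic daily returns: A_CLONE is A plus a little noise, B is independent.
rng = np.random.default_rng(1)
base = rng.normal(0.0, 0.01, size=(500, 2))
returns = pd.DataFrame(
    {
        "A": base[:, 0],
        "A_CLONE": base[:, 0] + rng.normal(0.0, 0.001, size=500),
        "B": base[:, 1],
    }
)

filtered = drop_highly_correlated(returns, threshold=0.9)
print(list(filtered.columns))  # ['A', 'B']
```

The result depends on column order, because the first asset of a correlated group is the one that survives. Ordering the columns by preference (for example liquidity or cost) before filtering is one way to control that.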

To answer your question:

  1. Use all available data; monthly returns over a 3-year period are desirable. Don't pick a timeframe close to the IPO. Ideally, use the window between the latest regime shift and the present. You don't want to include 2008 in a portfolio decision made in 2024.

  2. Returns are required to compute the covariance matrix. Missing values can lead to unstable returns and corner solutions for the optimizer, i.e., allocating 100% of your wealth into one asset class.

  3. The portfolio optimization process is not trying to predict future returns but to minimize the risk of the portfolio holdings given the information in their historical returns. You can use a different approach, like the Black-Litterman model, to incorporate your views on the assets' future returns and adjust the weights accordingly.
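Points 1 and 2 can be sketched with pandas alone: keep a trailing 3-year window, drop any asset with missing prices in that window (for example a stock listed later), and reduce daily prices to month-end prices. This is a rough sketch under stated assumptions. `prepare_prices` is a hypothetical helper, the tickers and prices are synthetic, and dropping incomplete assets is only one of several ways to handle missing data.

```python
import numpy as np
import pandas as pd


def prepare_prices(prices, years=3):
    """Trailing window of month-end prices for assets with a complete history."""
    cutoff = prices.index.max() - pd.DateOffset(years=years)
    window = prices.loc[prices.index > cutoff]
    # Drop assets with any missing price in the window.
    window = window.dropna(axis=1, how="any")
    # Keep the last available price of each calendar month.
    return window.groupby(window.index.to_period("M")).last()


# Synthetic daily prices; "NEW" only starts trading in July 2022.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2019-01-01", "2023-12-29")
steps = rng.normal(0.0003, 0.01, size=(len(dates), 3))
prices = pd.DataFrame(
    100 * np.exp(steps.cumsum(axis=0)), index=dates, columns=["AAA", "BBB", "NEW"]
)
prices.loc[:"2022-06-30", "NEW"] = np.nan

monthly_prices = prepare_prices(prices, years=3)
monthly_returns = monthly_prices.pct_change(fill_method=None).dropna()
print(list(monthly_prices.columns), monthly_returns.shape)
```

If monthly prices like these are passed to `expected_returns.mean_historical_return` or `risk_models.sample_cov`, the `frequency` argument should be set to 12 rather than the daily default of 252 so that the annualization matches the data.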

ErfolgreichCharismatisch commented 5 months ago

So the idea is to hold a rather uneventful allocation that diversifies risk, so that the likelihood of losses is minimized, without trying to maximize returns?

baobach commented 5 months ago

This is a tricky question. Modern portfolio theory is applied to "optimize" a target function, which can take many forms: minimize risk, maximize returns, maximize the Sharpe ratio, etc. However, the input to the formula is the empirical returns vector, i.e., the expected returns. If you want to maximize returns, this input is not a reliable prediction of future returns. Today the expected annual return of SPY is 30%. How can I use this information to construct a portfolio now and hope that it will stay the same for 5 years?
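To illustrate why a risk-minimizing objective depends less on return forecasts, here is a minimal sketch of the unconstrained global minimum-variance portfolio, whose weights are proportional to the inverse covariance matrix times a vector of ones, so no expected-returns vector enters the calculation. The covariance numbers are made up. This closed form allows short positions, so it is not the same as `EfficientFrontier.min_volatility()`, which by default restricts weights to between 0 and 1.

```python
import numpy as np


def min_variance_weights(cov):
    """Closed-form global minimum-variance weights (shorting allowed, weights sum to 1)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # solves cov @ raw = ones
    return raw / raw.sum()


# Made-up annualized covariance matrix for three assets.
cov = np.array(
    [
        [0.0400, 0.0060, 0.0000],
        [0.0060, 0.0900, 0.0100],
        [0.0000, 0.0100, 0.0225],
    ]
)

weights = min_variance_weights(cov)
portfolio_variance = weights @ cov @ weights

equal_weights = np.full(3, 1 / 3)
equal_variance = equal_weights @ cov @ equal_weights

print(weights.round(4), portfolio_variance, equal_variance)
```

Only the covariance matrix is needed here, whereas `max_sharpe()` also depends on the estimated mean returns, which is the input questioned above.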