robertmartin8 / PyPortfolioOpt

Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
https://pyportfolioopt.readthedocs.io/
MIT License

Minimum Variance Portfolio not on efficient frontier for portfolios with # assets >= 50 #300

Closed MarcBoettinger closed 3 years ago

MarcBoettinger commented 3 years ago

Hi,

I just went through the Merton (1972) proof of the analytical efficient frontier and the minimum variance portfolio. Recall that the expected return of the minimum variance portfolio is (transpose(mu) S^-1 1) / (transpose(1) S^-1 1) and its volatility is (1 / (transpose(1) S^-1 1))^0.5, where 1 denotes a vector of ones.
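The closed-form expressions above can be sketched in NumPy as follows. This is only an illustration: the function name `analytical_min_variance` and the 3-asset inputs are hypothetical, and the linear system is solved with `np.linalg.solve` rather than by forming the inverse explicitly.

```python
import numpy as np

def analytical_min_variance(mu, S):
    """Closed-form minimum variance portfolio with only the budget constraint sum(w) = 1."""
    ones = np.ones(len(mu))
    s_inv_ones = np.linalg.solve(S, ones)  # S^-1 1 without an explicit inverse
    denom = ones @ s_inv_ones              # transpose(1) S^-1 1
    w = s_inv_ones / denom                 # weights; entries may be negative (shorts)
    ret = mu @ w                           # (transpose(mu) S^-1 1) / (transpose(1) S^-1 1)
    vol = np.sqrt(1.0 / denom)             # (1 / (transpose(1) S^-1 1))^0.5
    return w, ret, vol

# Hypothetical inputs, chosen only for illustration
mu = np.array([0.08, 0.10, 0.12])
S = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

w, ret, vol = analytical_min_variance(mu, S)
print(w, ret, vol)
```

Note that nothing in this formula keeps the weights inside (0, 1); only their sum is fixed.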

I then plotted the unconstrained efficient frontier for portfolios of 10, 50 and 100 assets using PyPortfolioOpt.

The analytical minimum variance portfolio lies on the efficient frontier with 10 assets, but deviates from it more and more as the number of assets increases. There is no doubt that the analytical value should be preferred. I checked how returns are interpreted (simple returns), since that could be a potential source of the difference.

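[Figures: markowitz100, markowitz50, markowitz10 (efficient frontier plots for 100, 50 and 10 assets, with the analytical minimum variance portfolio marked)]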

My guess is the following: the efficient frontier in pyportfolioopt is derived by a minimization of sigma given mu. The algorithm stops too early when the number of assets is > 10. If that is true the impact is significant.
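One way to test such a guess without relying on any solver is to trace the unconstrained frontier in closed form. The sketch below uses the usual textbook constants A, B, C, D for Merton's result (the names and the 3-asset inputs are assumptions made for illustration) and assumes that the only constraint is that the weights sum to 1.

```python
import numpy as np

def unconstrained_frontier_vol(mu, S, target_returns):
    """Volatility of the minimum variance portfolio for each target return,
    subject only to the budget constraint sum(w) = 1 (shorting allowed)."""
    ones = np.ones(len(mu))
    s_inv_ones = np.linalg.solve(S, ones)
    s_inv_mu = np.linalg.solve(S, mu)
    A = ones @ s_inv_mu   # transpose(1) S^-1 mu
    B = mu @ s_inv_mu     # transpose(mu) S^-1 mu
    C = ones @ s_inv_ones # transpose(1) S^-1 1
    D = B * C - A ** 2    # positive unless mu is proportional to the ones vector
    m = np.asarray(target_returns, dtype=float)
    return np.sqrt((C * m ** 2 - 2 * A * m + B) / D)

# Hypothetical inputs, chosen only for illustration
mu = np.array([0.08, 0.10, 0.12])
S = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

targets = np.linspace(0.05, 0.15, 11)
vols = unconstrained_frontier_vol(mu, S, targets)
print(np.column_stack([targets, vols]))
```

At the target return A / C the expression reduces to (1 / C)^0.5, which is the minimum variance volatility quoted earlier, so a frontier computed under the same constraints should pass through that point.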

My code:


import pandas as pd
import numpy as np
from pypfopt import EfficientFrontier
from pypfopt import HRPOpt
from pypfopt import plotting
import matplotlib.pyplot as plt
from contextlib import redirect_stdout
from scipy.cluster.hierarchy import leaves_list
from pypfopt import CLA

def import_stock_data():
    # Read long-format price data and pivot to one close-price column per ticker
    data = pd.read_csv("../../data/all_stocks_5yr.csv", header=0)
    prices = data.pivot_table(index="date", columns="Name", values="close")
    prices.index = pd.to_datetime(prices.index)
    return prices

prices = import_stock_data()

k = 100

# Simple daily returns; drop the first (all-NaN) row and any ticker with gaps
returns = prices / prices.shift(1) - 1
returns = returns.iloc[1:]
returns = returns.dropna(axis=1)
returns_samp = returns.sample(n=k, axis=1)
(l, N) = returns_samp.shape
# Annualised geometric mean return and annualised sample covariance
mu = np.exp(np.log(returns_samp + 1).sum() / (l / 252.)) - 1
S = np.cov(returns_samp.T) * 252

print("\nMarkowitz")

cla1 = CLA(mu, S)
print(cla1.max_sharpe())

mu_mv = (np.matmul(mu.T,np.linalg.inv(S)).sum() )/ (np.matmul(np.matmul(np.ones((1, k)),np.linalg.inv(S)), np.ones((k, 1))))
sigma_mv = np.sqrt(1 / (np.matmul(np.matmul(np.ones((1, k)),np.linalg.inv(S)), np.ones((k, 1)))))

with open('../markowitz_performance_weights.txt', 'w') as f:
    with redirect_stdout(f):
        cla1.portfolio_performance(verbose=False)

fig, ax = plt.subplots()
cla1.max_sharpe()
ret_tangent, std_tangent, _ = cla1.portfolio_performance()
ax = plotting.plot_efficient_frontier(cla1, ax=ax, points=500, showfig=False)  # to plot
ax.scatter(sigma_mv, mu_mv, marker=(10,1,0), color="g", s=100, label="Minimum Variance")
ax.figure.savefig('../markowitz100.eps', dpi=1200)

robertmartin8 commented 3 years ago

@MarcBoettinger

CLA has weight bounds of 0 to 1. Is this also true for your analytical result?

Could you check whether EfficientFrontier does a better job? For example:

from pypfopt import EfficientFrontier, plotting

ef = EfficientFrontier(mu, S)

fig,ax = plt.subplots()
plotting.plot_efficient_frontier(ef, ax=ax, show_assets=True)

# Return and std for the optimised min volatility portfolio
ef.min_volatility()
ret, std, _ = ef.portfolio_performance()

phschiele commented 3 years ago

@MarcBoettinger @robertmartin8 That's correct: Merton (1972) does not have the no-short-sale constraint.

For a valid comparison, one should specify the bounds as (-np.inf, np.inf) when creating the EfficientFrontier object.

MarcBoettinger commented 3 years ago

Thanks, that's correct, my fault. The function efficient_frontier is constrained by default to weights in (0, 1).

This can be closed.

MarcBoettinger commented 3 years ago

Weights are constrained by default to (0, 1), whereas Merton (1972) is unconstrained apart from the weights summing to 1.
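The effect of the bounds can be seen on a tiny example. The sketch below uses two hypothetical, highly correlated assets and compares the closed-form minimum variance portfolio with a long-only one found by a brute-force grid search. The grid search is only for illustration and is not how PyPortfolioOpt solves the problem.

```python
import numpy as np

# Hypothetical covariance: vols 0.2 and 0.3, correlation 0.9
S = np.array([[0.040, 0.054],
              [0.054, 0.090]])
ones = np.ones(2)

# Unconstrained (budget constraint only): closed-form solution
s_inv_ones = np.linalg.solve(S, ones)
w_free = s_inv_ones / (ones @ s_inv_ones)
vol_free = np.sqrt(w_free @ S @ w_free)

# Long-only: search w1 in [0, 1] with w2 = 1 - w1
grid = np.linspace(0.0, 1.0, 1001)
candidates = np.column_stack([grid, 1.0 - grid])
variances = np.einsum("ij,jk,ik->i", candidates, S, candidates)
w_long_only = candidates[np.argmin(variances)]
vol_long_only = np.sqrt(variances.min())

print("unconstrained:", w_free, vol_free)
print("long-only:    ", w_long_only, vol_long_only)
```

With these inputs the unconstrained solution shorts the second asset and reaches a volatility of about 0.176, while the long-only solution puts everything in the first asset at a volatility of 0.200. A point computed from the unconstrained formula therefore lies to the left of a frontier drawn under (0, 1) bounds, which matches the gap seen in the plots.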