robertmartin8 / PyPortfolioOpt

Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
https://pyportfolioopt.readthedocs.io/
MIT License

Calculating CVaR - can't replicate the results I see in the portfolio_performance #512

Open OriKatz1 opened 1 year ago

OriKatz1 commented 1 year ago

What are you trying to do? I am trying to understand the calculation of CVaR after optimization.

What have you tried? Multiplying the returns dataframe by the optimal weights and summing across columns with sum(axis=1) to get "returns_distribution", and then:

    var = returns_distribution.quantile(0.05)
    cvar = returns_distribution[returns_distribution <= var].mean()

Or, using a formula similar to the optimization code:

    beta = 0.95
    alpha = returns_distribution.quantile(1 - beta)
    cvar = alpha + 1 / (len(returns_distribution) * (1 - beta)) * np.sum((-returns_distribution + alpha).clip(lower=0))

Whatever I do, I just can't get the same number I see in portfolio_performance. I get the same number when I calculate the expected return, but not the CVaR.

What am I missing here?

robertmartin8 commented 1 year ago

The calculation is based on an approximation of the CVaR required to solve the problem using convex optimisation. The docs have more details on this.
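For readers following along, here is a minimal sketch of the kind of convex reformulation referred to above, assuming it is the Rockafellar-Uryasev objective. The function name `ru_objective` and the simulated returns are hypothetical, not part of PyPortfolioOpt. In this convention `alpha` is a loss threshold (positive means a loss), so a quantile taken from the returns distribution would enter with its sign flipped.

```python
import numpy as np

def ru_objective(portfolio_returns, alpha, beta=0.95):
    """Rockafellar-Uryasev objective for a fixed loss threshold alpha.

    Assumption: losses are the negated returns, and alpha is on the loss scale.
    """
    losses = -np.asarray(portfolio_returns, dtype=float)
    excess = np.clip(losses - alpha, 0.0, None)  # max(loss - alpha, 0)
    return alpha + excess.sum() / (len(losses) * (1 - beta))

# Simulated portfolio returns, purely illustrative
rng = np.random.default_rng(0)
rets = rng.normal(0.0005, 0.01, size=1000)
losses = -rets

# The objective is piecewise linear and convex in alpha, so its minimum
# is attained at one of the sample losses; a brute-force scan is enough here.
values = np.array([ru_objective(rets, a) for a in np.sort(losses)])
cvar_ru = values.min()

# Direct tail mean: average of the worst (1 - beta) share of losses
k = int(round(len(losses) * (1 - 0.95)))
tail_mean = np.sort(losses)[-k:].mean()
```

When the objective is minimised over `alpha`, it coincides with the tail mean in this example (where the tail holds a whole number of observations). Evaluated at any other `alpha`, it is only an upper bound on the CVaR.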

OriKatz1 commented 1 year ago

Yes, I read this, and I am trying to calculate it myself using the same formula, and I just don't get the same numbers. I get very different numbers.

robertmartin8 commented 1 year ago

I think PyPortfolioOpt is incorrect here. It is reporting the result of the optimisation problem (a problem which is equivalent to minimising CVaR), but that result is not equivalent to CVaR itself.

Instead, like you suggest, we should probably be taking the optimal weights, dotting them with returns to get a return distribution, then finding the expected shortfall.
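A rough sketch of that recalculation, assuming `returns` is a pandas DataFrame of historical asset returns and `weights` is a ticker-to-weight mapping such as the one produced by the optimiser. The helper name `historical_cvar` and the tickers are made up for illustration, and the result is reported as a positive loss, which is an assumed convention.

```python
import numpy as np
import pandas as pd

def historical_cvar(returns, weights, beta=0.95):
    """Expected shortfall of the weighted portfolio, reported as a positive loss."""
    w = pd.Series(weights, dtype=float)
    # DataFrame.dot aligns the columns of `returns` with the index of `w`
    # and raises ValueError if the two sets of labels differ.
    portfolio = returns.dot(w)
    var_threshold = portfolio.quantile(1 - beta)      # return-scale VaR
    tail = portfolio[portfolio <= var_threshold]      # worst outcomes
    return -tail.mean()

# Illustrative data only: three hypothetical tickers with simulated returns
rng = np.random.default_rng(42)
returns = pd.DataFrame(
    rng.normal(0.0005, 0.01, size=(500, 3)),
    columns=["AAA", "BBB", "CCC"],
)
weights = {"CCC": 0.2, "AAA": 0.5, "BBB": 0.3}  # order deliberately differs

cvar = historical_cvar(returns, weights)
print(f"Historical CVaR (95%): {cvar:.4%}")
```

Because the dot product is label-aligned, the order of the weights does not matter here, but a plain NumPy multiplication would silently pair weights with the wrong assets.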

Paging @phschiele @nknudde to double check.

OriKatz1 commented 1 year ago

Ok, thanks. And also thanks for the package in general, it's great.

OriKatz1 commented 1 year ago

Hi, I found the error: it turns out that the columns in my returns data were not in the same order as the rows in the expected returns data. You might want to add something to the optimization code that makes sure, in the case of pandas dataframes and series, that the covariance matrix, the returns dataframe and the expected returns series are all sorted in the same order.
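Until such a check exists, one way to guard against this on the user side is to reorder everything to the expected returns index before calling the optimiser. This is only a sketch: `align_inputs` is a hypothetical helper, not a PyPortfolioOpt function, and it assumes all inputs are labelled pandas objects.

```python
import numpy as np
import pandas as pd

def align_inputs(expected_returns, returns, cov_matrix=None):
    """Reorder returns (and optionally a covariance matrix) to match expected_returns."""
    tickers = list(expected_returns.index)
    if set(tickers) != set(returns.columns):
        raise ValueError("expected_returns and returns cover different assets")
    aligned_returns = returns.loc[:, tickers]
    aligned_cov = None
    if cov_matrix is not None:
        # Reorder both rows and columns so the matrix stays symmetric in labels
        aligned_cov = cov_matrix.loc[tickers, tickers]
    return expected_returns, aligned_returns, aligned_cov

# Illustrative data with deliberately mismatched column order
rng = np.random.default_rng(1)
returns = pd.DataFrame(rng.normal(0, 0.01, size=(100, 3)), columns=["C", "A", "B"])
mu = pd.Series({"A": 0.05, "B": 0.07, "C": 0.03})

mu, returns_aligned, cov_aligned = align_inputs(mu, returns, returns.cov())
print(list(returns_aligned.columns))
```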

robertmartin8 commented 1 year ago

I see, but the outcome is still not equal to the actual CVaR, as computed using returns[returns<quantile].mean(), right?

OriKatz1 commented 1 year ago

No, but it's very close. Before that I got two very different outcomes and couldn't understand why.

sorensenj50 commented 1 year ago

I am seeing similar issues with CDaR. For instance, after looping through a range of returns and getting the efficient return with min CDaR, the frontier is actually lower than one of my assets. When I implement my own CDaR function and calculate the CDaR on my end for each combination of weights, the frontier lines up. I am also having trouble making sense of the results when I implement Tracking Error constraints (wildly different when I recalculate with the output weights and my own tracking error function), but this could be a different issue. I can provide a code example if needed.
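For comparison, a minimal sketch of what such a hand-rolled CDaR check might look like. It is not the commenter's code: `historical_cdar` is a hypothetical name, and it assumes drawdowns are measured on the uncompounded cumulative sum of portfolio returns, starting from a peak of zero. A compounded definition would give different numbers.

```python
import numpy as np

def historical_cdar(portfolio_returns, beta=0.95):
    """Mean of the worst (1 - beta) share of drawdowns, as a positive number.

    Assumption: drawdown = running peak of cumulative (summed) returns
    minus the current cumulative return, with the initial peak at 0.
    """
    cum = np.cumsum(np.asarray(portfolio_returns, dtype=float))
    running_peak = np.maximum.accumulate(np.concatenate([[0.0], cum]))[1:]
    drawdown = running_peak - cum
    threshold = np.quantile(drawdown, beta)   # drawdown-at-risk level
    return drawdown[drawdown >= threshold].mean()

# Small illustrative series
example = [0.1, -0.2, 0.05, -0.1, 0.3]
cdar = historical_cdar(example, beta=0.8)
print(f"CDaR (80%): {cdar:.4f}")
```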