@nknudde Thanks for the contribution! CVaR has been bothering me for a while, so it's great to see a cvxpy implementation.
Some thoughts:

- As with the `EfficientSemivariance` PR, I'm thinking that it would make sense for `EfficientCVaR` to inherit from `EfficientFrontier`. This saves on duplication and logically makes sense. A full explanation behind this architecture decision is here.
- The `frequency` parameter doesn't seem to be connected to anything?

Regarding the first point, you don't have to do anything – it'll be easier if I merge your changes into a dev branch, change the architecture, then PR into master.
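For the first point, roughly what I have in mind (just a sketch, not the actual implementation; the constructor details and the zero-covariance placeholder are assumptions, mirroring what `EfficientSemivariance` does):

```python
import numpy as np

from pypfopt.efficient_frontier import EfficientFrontier


class EfficientCVaR(EfficientFrontier):
    """Minimise CVaR, reusing EfficientFrontier's weight variable and
    constraint machinery instead of duplicating it."""

    def __init__(self, expected_returns, returns, beta=0.95, weight_bounds=(0, 1)):
        # CVaR optimisation doesn't need a covariance matrix, so a zero
        # placeholder (an assumption here) satisfies the parent constructor.
        n = len(expected_returns)
        super().__init__(expected_returns, np.zeros((n, n)), weight_bounds=weight_bounds)
        self.returns = returns
        self._beta = beta
```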
Best, Robert
@nknudde Thanks for the quick turnaround re shorting!
I still need to think a bit more about whether this is processing return frequencies correctly.
For example, if a user passes in daily returns vs weekly returns vs monthly returns, does the current code still make sense?
Hi @robertmartin8, right now it does not take frequency into account. I have not had time yet to look at how to rescale it properly, but I'll try to as soon as possible.
Ok, great!
For consistency with `EfficientSemivariance`, we should probably move `EfficientCVaR` into the `efficient_frontier.py` file, so that (if they so choose) users can `from pypfopt.efficient_frontier import EfficientFrontier, EfficientCVaR`.

I realise this results in a large `efficient_frontier.py` file, which I guess is not ideal. The alternative is to create an `efficient_frontier` folder, then put `efficient_x.py` files in there, and do some clever `__init__.py` stuff. But I don't want to over-engineer it, so for now let's just put everything in the same big file.
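For the record, the clever `__init__.py` stuff would just be re-exports. A hypothetical sketch, with illustrative module names:

```python
# pypfopt/efficient_frontier/__init__.py (hypothetical layout)
# Re-exporting keeps existing imports working, e.g.
#   from pypfopt.efficient_frontier import EfficientFrontier, EfficientCVaR
from .efficient_frontier import EfficientFrontier
from .efficient_semivariance import EfficientSemivariance
from .efficient_cvar import EfficientCVaR

__all__ = ["EfficientFrontier", "EfficientSemivariance", "EfficientCVaR"]
```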
@nknudde
Just to let you know, I noticed a bug in the implementation:
```python
self._objective = self._alpha + 1.0 / len(self.tickers) / (
    1 - self._beta
) * cp.sum(self._u)
```
I believe the denominator is meant to contain `len(self.historic_returns)` instead of `len(self.tickers)`. From the paper, we have:

$$\tilde{F}_\beta(x, \alpha) = \alpha + \frac{1}{q(1 - \beta)} \sum_{k=1}^{q} \left[ -x^T y_k - \alpha \right]^+$$

where the sum indexes over the `q` historical scenarios, while each `y_k` is a vector of length `N` (where `N` is the number of assets). So the averaging should be over the `q` scenarios, not the `N` assets.
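For reference, the corrected line would be (same attribute names as the snippet above):

```python
# Average over the q historical scenarios, not the N assets
self._objective = self._alpha + 1.0 / len(self.historic_returns) / (
    1 - self._beta
) * cp.sum(self._u)
```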
I have merged your code into a dev branch, where I'm fixing this (and figuring out the frequency stuff).
EDIT: I've fixed this, and it now agrees with the manually computed historical CVaR:
```python
import numpy as np

from pypfopt import expected_returns
from pypfopt.efficient_frontier import EfficientCVaR
from tests.utilities_for_tests import get_data  # helper in the test suite that loads sample prices

beta = 0.95
df = get_data()
mu = expected_returns.mean_historical_return(df)
historical_rets = expected_returns.returns_from_prices(df).dropna()

cv = EfficientCVaR(mu, historical_rets, beta=beta)
cv.min_cvar()
_, cvar = cv.portfolio_performance()

# Manually compute the expected return below the VaR
portfolio_rets = historical_rets @ cv.weights
var_hist = portfolio_rets.quantile(1 - beta)
cvar_hist = -portfolio_rets[portfolio_rets < var_hist].mean()

# Check equal
np.testing.assert_almost_equal(cvar_hist, cvar, decimal=3)
```
Good news, however: `frequency` shenanigans aren't necessary. `EfficientCVaR` minimises (and calculates) daily CVaR if daily returns are provided, weekly CVaR if weekly returns are provided, monthly CVaR if monthly returns are provided, etc.
For example, in the test cases (daily), a 95%-CVaR of 0.017 means that the average daily return of all returns below the 5th percentile is -1.7%.
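And if someone does want, say, a monthly CVaR from daily prices, they can simply resample before computing returns. A sketch, reusing `df` from the snippet above:

```python
from pypfopt import expected_returns
from pypfopt.efficient_frontier import EfficientCVaR

# Resample daily prices to month-end, then compute monthly returns
monthly_prices = df.resample("M").last()
monthly_rets = expected_returns.returns_from_prices(monthly_prices).dropna()
mu_monthly = expected_returns.mean_historical_return(monthly_prices, frequency=12)

cv_m = EfficientCVaR(mu_monthly, monthly_rets, beta=0.95)
cv_m.min_cvar()
_, monthly_cvar = cv_m.portfolio_performance()  # now a monthly 95%-CVaR
```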
I just need to touch up the code/tests and write docs, and we should be good to go!
Thanks! Sorry, I have not really had time to look at it yet. About the frequency business: it is true that the resulting value matches the frequency of the dataset; I was just wondering whether there would be any way, e.g. if daily returns are given, to calculate monthly CVaR. Maybe I'm taking it a bit too far, though.
If there is anything I can help with, please let me know.
@nknudde I am quite content with the functionality as is.
One small thing I've noticed is that `historic_returns` doesn't actually have to be historic returns. Users can generate their own sample returns (e.g. using copulas) and provide them to `EfficientCVaR` to get a forward-looking estimate.

This applies to `EfficientSemivariance` too, I believe (@phschiele). So I will probably refactor `historic_returns` -> `returns` (this would be consistent with e.g. `HRPOpt`). Since it's not a kwarg, this shouldn't be a breaking change for the majority of users.
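For example, something along these lines works (a sketch only: the Gaussian sampler is a stand-in for a proper copula-based scenario generator, and all the numbers are made up):

```python
import numpy as np
import pandas as pd

from pypfopt.efficient_frontier import EfficientCVaR

rng = np.random.default_rng(0)
tickers = ["A", "B", "C"]  # illustrative
mean = np.array([0.0006, 0.0004, 0.0005])  # assumed daily means
cov = np.array([
    [2.5e-4, 1.0e-4, 8.0e-5],
    [1.0e-4, 3.0e-4, 9.0e-5],
    [8.0e-5, 9.0e-5, 2.0e-4],
])  # assumed daily covariance

# 10,000 simulated daily scenarios in place of a historical sample
simulated_rets = pd.DataFrame(
    rng.multivariate_normal(mean, cov, size=10_000), columns=tickers
)

mu = pd.Series(mean, index=tickers)  # expected returns consistent with the scenarios
cv = EfficientCVaR(mu, simulated_rets, beta=0.95)
weights = cv.min_cvar()  # forward-looking minimum-CVaR weights
```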
Yup totally true, I was actually using this in combination with copulas!
I added a methodology to do conditional value-at-risk optimization, based on this paper. Feel free to use it, or to give some suggestions.