robertmartin8 / PyPortfolioOpt

Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
https://pyportfolioopt.readthedocs.io/
MIT License

Idzorek's method for Black-Litterman #95

Closed: robertmartin8 closed this issue 4 years ago

robertmartin8 commented 4 years ago

Idzorek's method is a way of translating percentage confidence into the uncertainty matrix required by Black-Litterman. This can be seen in section 3.2 (pg 23) of Idzorek's Step-by-Step Guide to Black-Litterman.

This has been requested in #70, and selfishly, I now think it would be useful for my own investing :)

That being said, Idzorek's method is really quite involved, and I would thus like to open the floor to any suggestions regarding implementation/architecture.

A brief overview of the steps:

  1. Construct a BL returns vector for each view separately (assuming 100% confidence)
  2. Convert these return vectors to implied weights for each view, w_k
  3. Determine the deviation between each view's weight vector and the market weights
  4. Multiply this deviation by the user's confidence to get tilt_k, and compute target weights for each view using w_k% = w_mkt + tilt_k
  5. The kth uncertainty is the omega_k that minimises the sum of squared differences between w_k and w_k%.
  6. Put these k uncertainties on the diagonal of a matrix.
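
A rough NumPy sketch of steps 1-4 for a single view (all names, shapes and defaults here are placeholders, not the final API):

import numpy as np

def idzorek_target_weights(sigma, pi, w_mkt, p_k, q_k, conf, delta=1.0, tau=0.05):
    ts = tau * sigma
    # Step 1: BL returns for this view alone, at 100% confidence (omega_k -> 0)
    mu_100 = pi + ts @ p_k.T @ np.linalg.inv(p_k @ ts @ p_k.T) @ (q_k - p_k @ pi)
    # Step 2: implied weights for this view at 100% confidence
    w_k = np.linalg.inv(delta * sigma) @ mu_100
    # Steps 3-4: tilt away from the market weights in proportion to the confidence
    tilt_k = conf * (w_k - w_mkt)
    return w_mkt + tilt_k  # the target weights w_k%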

I think I've figured out how to do it on the backend. Have a static method within the BlackLittermanModel class (it must be static so that it can be called during construction, before the object is fully instantiated) that computes the Idzorek matrix.

I'm not quite sure what the API should be. Some ideas:

# 1. Recognise omega parameter as a string then automatically construct
bl = BlackLittermanModel(S, absolute_views=viewdict, omega="idzorek")
ret = bl.bl_returns()

# 2. Method to replace current omega with Idzorek omega
bl = BlackLittermanModel(S, absolute_views=viewdict)
bl.omega = bl.idzorek_omega() 
# or
bl.set_idzorek_omega()
ret = bl.bl_returns()

# 3. Put function in module rather than class
idzorek_omega = black_litterman.idzorek_uncertainty(absolute_views=viewdict)
bl = BlackLittermanModel(S, absolute_views=viewdict, omega=idzorek_omega)

I think option 1 is the nicest API, but I don't know if it's too confusing to allow the omega parameter in the constructor to accept a string, in addition to the pd.DataFrame and None options currently provided.
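
If we went with option 1, the dispatch inside the constructor is simple. A sketch, with `_idzorek_omega` as a hypothetical name for the static helper:

import numpy as np

class BlackLittermanModel:
    # Sketch of the string dispatch only; the real constructor takes many more parameters.
    def __init__(self, cov_matrix, absolute_views=None, omega=None, view_confidences=None):
        if isinstance(omega, str):
            if omega.lower() == "idzorek":
                omega = self._idzorek_omega(cov_matrix, view_confidences)
            else:
                raise ValueError("unrecognised omega string")
        self.omega = omega

    @staticmethod
    def _idzorek_omega(cov_matrix, view_confidences):
        # Placeholder for the actual computation (steps 1-6 above)
        return np.diag(np.ones(len(view_confidences)))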

@schneiderfelipe any thoughts?

schneiderfelipe commented 4 years ago

@robertmartin8 I think 1. is the best and not confusing at all. Some pandas functions/methods do something like that (e.g., axis=0 or axis="index").

robertmartin8 commented 4 years ago

@schneiderfelipe

I've made decent progress so far, and have written the code for steps 1-4. However, I'm having a bit of trouble with step 5:

From the paper:

$$\min_{\omega_k} \sum_i \left(w_{k,\%,i} - w_{k,i}\right)^2$$

where

$$w_k = (\delta \Sigma)^{-1} \left[ (\tau\Sigma)^{-1} + \frac{1}{\omega_k} p_k^T p_k \right]^{-1} \left[ (\tau\Sigma)^{-1} \Pi + \frac{1}{\omega_k} p_k^T q_k \right]$$

I think w_k represents the Black-Litterman weights for a single view with uncertainty omega_k (this uncertainty is what we are trying to solve for). Do you have any thoughts on how I should solve this optimisation problem?

Worst case I could use scipy.optimize lol

schneiderfelipe commented 4 years ago

@robertmartin8 w_k is a number, right? As such, the equality after "where" looks strange. Correct me if I'm wrong, but aren't numbers being summed to matrices in there?

With that being said, everything is linear there, so we could transform this optimisation problem into a linear one. On the other hand, this transformation is hard: it basically means taking the derivatives and equating them to zero. The problem is the large inverse in the middle of the expression, but this can be rewritten using this lemma. That, plus explicitly stating all matrix multiplications, will probably do the trick (the good thing is that a lot of stuff will not contribute to the derivative; Pi, for example, obviously won't appear in the derivative of w_k). I tried to sketch something, but it was no good. I'll try it again when I have time. But it might not be worth it.
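
If the lemma in question is Sherman-Morrison (the p_k^T p_k term is rank one), the middle inverse expands to

$$\left[(\tau\Sigma)^{-1} + \frac{1}{\omega_k} p_k^T p_k\right]^{-1} = \tau\Sigma - \frac{\tau\Sigma \, p_k^T p_k \, \tau\Sigma}{\omega_k + p_k \tau\Sigma p_k^T},$$

so the only remaining division is by the scalar omega_k + p_k tau Sigma p_k^T.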

For the time being, I would say to use scipy.optimize, especially for getting tests right, just to be sure. What do you think, is it worth it?

robertmartin8 commented 4 years ago

@schneiderfelipe w_k is a vector of weights, omega_k is a number. In this case, p_k is the 1xN row vector (a single row of the picking matrix). So we are actually adding NxN to NxN matrices in the middle bracket. Then the three brackets together become w_k = (N x N)(N x N)(N x 1) so w_k is (Nx1) as needed.
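
A quick toy shape check (identity matrices standing in for (tau Sigma)^-1 and (delta Sigma)^-1, made-up values elsewhere):

import numpy as np

N = 4
tau_sigma_inv = np.eye(N)     # stands in for (tau * Sigma)^-1, N x N
delta_sigma_inv = np.eye(N)   # stands in for (delta * Sigma)^-1, N x N
p_k = np.ones((1, N))         # one row of the picking matrix, 1 x N
pi = np.full((N, 1), 0.1)     # prior returns, N x 1
q_k = np.array([[0.3]])       # the view, 1 x 1
s = 2.0                       # stands in for 1 / omega_k

middle = tau_sigma_inv + s * p_k.T @ p_k      # (N x N) + (N x N)
last = tau_sigma_inv @ pi + s * p_k.T @ q_k   # (N x 1) + (N x 1)
w_k = delta_sigma_inv @ np.linalg.inv(middle) @ last
print(w_k.shape)  # (4, 1)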

I'll have a go at the maths – I'm happy to let cvxpy solve it if possible (and it should be possible since the problem is convex?). scipy.optimize is the worst case.

robertmartin8 commented 4 years ago

@schneiderfelipe I've worked on this a bit, and I'm not certain it's linear. The sum of squares is quadratic in w_k, and w_k is quadratic in 1/omega_k (because even after using the lemma, there is a 1/omega_k in the middle bracket and another in the last bracket). So actually we end up having to minimise a quartic function. I've attached my working – let me know if you spot any errors (there are bound to be a few)

IdzorekMethod.pdf

robertmartin8 commented 4 years ago

This is an implementation using scipy.optimize, optimising over s = 1/omega_k:

import numpy as np
import scipy.optimize as sco

def sum_square_deviations(s, w_target, tau, delta, sigma, p, q, pi):
    # s = 1 / omega_k
    tau_sigma_inv = np.linalg.inv(tau * sigma)
    middle_bracket = tau_sigma_inv + s * p.T @ p
    last_bracket = tau_sigma_inv @ pi + s * p.T @ q

    w_k = (
        np.linalg.inv(delta * sigma)
        @ np.linalg.inv(middle_bracket)
        @ last_bracket
    )
    return np.sum((w_target - w_k) ** 2)

result = sco.minimize(
    sum_square_deviations,
    x0=50,  # initial guess for s
    args=(w_target, tau, risk_aversion, cov_matrix, P_view, Q_view, pi),
    bounds=[(0, None)],
)

s_k = result["x"][0]
omega_k = 1 / s_k

schneiderfelipe commented 4 years ago

> @schneiderfelipe w_k is a vector of weights, omega_k is a number. In this case, p_k is the 1xN row vector (a single row of the picking matrix). So we are adding NxN to NxN matrices in the middle bracket. Then the three brackets together become w_k = (N x N)(N x N)(N x 1) so w_k is (Nx1) as needed.

Right, p_k is a row vector :smile:.

> @schneiderfelipe I've worked on this a bit, and I'm not certain it's linear. The sum of squares is quadratic in w_k, and w_k is quadratic in 1/omega_k (because even after using the lemma, there is a 1/omega_k in the middle bracket and another in the last bracket). So actually we end up having to minimise a quartic function. I've attached my working – let me know if you spot any errors (there are bound to be a few)

> IdzorekMethod.pdf

Excellent, I haven't found any errors. I thought the final system could be linear, but due to the inverses, that was asking too much. It isn't convex either, since the derivative is unbounded, right?

robertmartin8 commented 4 years ago

Yeah, sadly not convex. I'm using scipy.optimize for now.

Have pushed some code – it seems to give roughly correct answers, but unfortunately it's not working entirely as intended. For example, if you pass 100% confidence in all your views, I would have guessed that this would be equivalent to zero uncertainty (which we fixed in #70). But a quick test shows that this isn't the case. I'm wondering whether this is a characteristic of Idzorek's method or (more likely) I've made a mistake in the code. Reading through the paper, it isn't immediately obvious to me that passing a confidence of 1 results in omega_k being zero.

    mcaps = {
        "GOOG": 927e9, "AAPL": 1.19e12,
        "FB": 574e9, "BABA": 533e9, 
        "AMZN": 867e9, "GE": 96e9,
        "AMD": 43e9, "WMT": 339e9,
        "BAC": 301e9, "GM": 51e9,
        "T": 61e9, "UAA": 78e9,
        "SHLD": 0, "XOM": 295e9,
        "RRC": 1e9, "BBY": 22e9,
        "MA": 288e9,"PFE": 212e9,
        "JPM": 422e9, "SBUX": 102e9,
    }

    viewdict = {"AAPL": 0.20, "BBY": -0.30, "BAC": 0, "SBUX": -0.2, "T": 0.131321}
    # view_confidences = [1, 0.2, 0.5, 0.3, 0.7]
    view_confidences = [1, 1, 1, 1, 1]

    bl = BlackLittermanModel(
        S,
        pi="market",
        absolute_views=viewdict,
        omega="idzorek",
        view_confidences=view_confidences,
        market_caps=mcaps,
    )
    bl.omega
array([[0.02417931, 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.01347744, 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.00900962, 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.00762943, 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.01369031]])

schneiderfelipe commented 4 years ago

They may not be zero but are admittedly small. It certainly has to do with Pi and Sigma. Can you run a toy example with fictitious values for Pi and Sigma (like constant prior and identity)? And what happens with zero confidences?

robertmartin8 commented 4 years ago

@schneiderfelipe Good idea:

import numpy as np
import pandas as pd
from pypfopt.black_litterman import BlackLittermanModel

# Identity covariance
S = pd.DataFrame(np.diag(np.ones((5,))), index=range(5), columns=range(5))
# Constant view of 0.3 return
views = {k: 0.3 for k in range(5)}
# Prior
pi = pd.Series(0.1, index=range(5))

# Perfect confidence - should equal views
bl = BlackLittermanModel(
    S, pi=pi, absolute_views=views, omega=np.diag(np.zeros(5))  # perfect confidence
)
pd.testing.assert_series_equal(bl.bl_returns(), pd.Series([0.3] * 5))

# No confidence - should equal priors
bl = BlackLittermanModel(S, pi=pi, absolute_views=views, omega=S * 1e6)
pd.testing.assert_series_equal(bl.bl_returns(), pi)

It's reassuring to see that the current BL functions as intended 😅. Now for Idzorek:

    # Idzorek perfect confidence
    bl = BlackLittermanModel(
        S, pi=pi, absolute_views=views, omega="idzorek", view_confidences=[1] * 5
    )
    # All entries are the same so we only need to look at one of them
    print(bl.omega[0,0])   # 0.0454
    print(bl.bl_returns()[0])  # 0.205

    # Idzorek zero confidence
    bl = BlackLittermanModel(
        S, pi=pi, absolute_views=views, omega="idzorek", view_confidences=[0] * 5
    )
    print(bl.omega[0,0])  # 0.186
    print(bl.bl_returns()[0])  # 0.142

So we can see that it is scaling the correct way – lower confidence means higher uncertainty and returns closer to the prior. I verified this by looping over confidence values between 0 and 1, and it agrees with intuition. The only problem is that the extreme values shown above are not equal to the true zero/perfect-confidence cases.

> They may not be zero but are admittedly small.

I've gone through the paper again, but I still can't see why this has to be true. I may try working backwards and showing that omega_k = 0 does indeed minimise the sum of squares iff confidence is 100% (in which case w_k% = w_k,100%)

schneiderfelipe commented 4 years ago

Excellent! I have a gut feeling that Idzorek's method sort of "breaks the ties" with information from priors, what do you think? A test for this would be to change Pi to a constant value but keep the real Sigma.

robertmartin8 commented 4 years ago

On the plus side, I've checked that the code does give the right answer (i.e. a divide-by-zero error) when we pass zero confidence. In this case, omega_k should be infinity. I guess I will catch this error and return 1e6 (huge uncertainty)?
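
Something along these lines (a sketch; solve_omega_k is a hypothetical wrapper around the scipy.optimize routine above):

view_omegas = []
for conf in view_confidences:
    if conf == 0:
        # omega_k should be infinite, so cap it at a huge uncertainty instead
        view_omegas.append(1e6)
        continue
    view_omegas.append(solve_omega_k(conf))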

But unfortunately I can't seem to figure out whether or not a confidence of 100% is supposed to result in omega=0 using Idzorek's method. I guess I could put in a special handler: if the confidence for a view is 1, set that omega to zero. But then the confidence won't be "linear", because there will be a big drop in omega from 0.99 confidence to 1.00 confidence.

Anyway, I ran some more tests and think there's a bug somewhere in my implementation:

    df = get_data()  # test-suite helper that loads a sample price DataFrame
    S = risk_models.sample_cov(df[["GOOG", "AAPL", "FB", "BABA"]])
    viewdict = {"GOOG": 0.40, "AAPL": 0.40, "FB": 0.40, "BABA": 0.40}
    bl = BlackLittermanModel(
        S,
        pi=np.array([0.1] * 4),
        absolute_views=viewdict,
        omega="idzorek",
        view_confidences=[1, 1, 1, 1],
    )
    print(bl.omega)
    print(bl.bl_returns())
[[0.0495849  0.         0.         0.        ]
 [0.         0.92239836 0.         0.        ]
 [0.         0.         0.09367556 0.        ]
 [0.         0.         0.         0.05108767]]

GOOG    0.137354
AAPL    0.124426
FB      0.135541
BABA    0.138346

For reference, the covariance matrix is:

          GOOG      AAPL        FB      BABA
GOOG  0.093211  0.046202  0.030801  0.029793
AAPL  0.046202  0.207537  0.021141  0.023839
FB    0.030801  0.021141  0.137201  0.030073
BABA  0.029793  0.023839  0.030073  0.100958

The uncertainty matrix is pretty reasonable except for the value of AAPL. It does have the highest variance of the group, but certainly not enough to justify the 10x higher uncertainty in the view.

The returns seem to have the correct pattern (i.e. higher uncertainty means closer to the prior), but they are all wayyy too close to the prior (this is 100% confidence!).

I'll try to look it over with fresh eyes this weekend.

robertmartin8 commented 4 years ago

@schneiderfelipe I've been thinking of a different way to convert percentage confidence into omega. Let me know if it makes any sense.

In addition to providing the view, we get the user to provide a symmetric error bar and a percentage confidence, e.g. I expect AAPL to return 50%, and I am 95% sure that this return will be between 20% and 80%. With this, assuming a normal distribution, you can calculate the standard deviation: in this case, 95% corresponds to roughly 2 sigma, so the standard deviation of the view is 0.15 and the diagonal entry of omega is the square of that.
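
A sketch of the conversion (view_variance is a hypothetical helper; scipy gives the exact normal quantile rather than the 2-sigma rule of thumb):

from scipy import stats

def view_variance(half_width, confidence):
    # z such that P(-z <= Z <= z) = confidence for a standard normal Z
    z = stats.norm.ppf(0.5 + confidence / 2)
    sigma = half_width / z
    return sigma ** 2

# "AAPL returns 50% and I am 95% sure it is between 20% and 80%": half-width = 0.30
print(view_variance(0.30, 0.95))  # ~0.0234, vs 0.15 ** 2 = 0.0225 with the 2-sigma rule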

I'm still trying to get Idzorek's method to work, but I personally prefer the method I've just described (and I'd use that over Idzorek for my investments).

Regarding my question about whether 100% confidence under Idzorek is supposed to result in returns equal to the views, I think the answer is no. Page 21 of the paper seems to imply that the 0-100% confidence scale is in addition to the fundamental uncertainty that comes from the variance (i.e. the tau * p_k Sigma p_k^T term). I'm not sure how much I agree with that.

robertmartin8 commented 4 years ago

@schneiderfelipe I've just read Walters' Black-Litterman Model in Detail paper and it's basically solved all my problems. He gives a closed-form solution for Idzorek's method on page 30.

Additionally, the method I suggested above regarding standard deviations is described on page 15 😂.

Note to self: if you ever have a question about Black-Litterman, the first place to look is Walters' paper.

robertmartin8 commented 4 years ago

This is essentially all there is to it 🤦

import numpy as np

def idzorek_omega(view_confidences, P, Q, cov_matrix, tau):  # illustrative wrapper signature
    view_omegas = []
    for view_idx in range(len(Q)):
        conf = view_confidences[view_idx]
        P_view = P[view_idx].reshape(1, -1)
        alpha = (1 - conf) / conf  # formula (44); conf = 0 must be special-cased
        omega = tau * alpha * P_view @ cov_matrix @ P_view.T  # formula (41)
        view_omegas.append(omega.item())

    return np.diag(view_omegas)

It's got some very nice theoretical features. With an identity covariance and constant prior/views, the confidence gives a linear interpolation between the prior returns and the view returns as we go from 0 to 1.
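
A quick check of that claim, reusing the toy setup from before (conf = 1 gives omega = 0, relying on the #70 fix):

import numpy as np
import pandas as pd
from pypfopt.black_litterman import BlackLittermanModel

S = pd.DataFrame(np.eye(5), index=range(5), columns=range(5))
views = {k: 0.3 for k in range(5)}
pi = pd.Series(0.1, index=range(5))

for conf in [0.25, 0.5, 0.75, 1.0]:
    bl = BlackLittermanModel(
        S, pi=pi, absolute_views=views, omega="idzorek", view_confidences=[conf] * 5
    )
    print(conf, bl.bl_returns()[0])  # should be 0.1 + conf * 0.2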

schneiderfelipe commented 4 years ago

> @schneiderfelipe I've been thinking of a different way to convert percentage confidence into omega. Let me know if it makes any sense.
>
> In addition to providing the view, we get the user to provide a symmetric error bar and a percentage confidence, e.g. I expect AAPL to return 50%, and I am 95% sure that this return will be between 20% and 80%. With this, assuming a normal distribution, you can calculate the standard deviation: in this case, 95% corresponds to roughly 2 sigma, so the standard deviation of the view is 0.15 and the diagonal entry of omega is the square of that.
>
> I'm still trying to get Idzorek's method to work, but I personally prefer the method I've just described (and I'd use that over Idzorek for my investments).

Your idea is nice! I think it makes a good companion to Idzorek's method.