Open Marzio-USI opened 3 months ago
Looking ....
On Tue, May 7, 2024 at 12:10 PM Marzio-USI @.***> wrote:
When inspecting the Schur complementary portfolio, I encountered some issues.
In your presentation, the following formulation can be found:
[Screenshots of the relevant slides: https://github.com/microprediction/precise/assets/71531645/4f1192b3-f664-4cad-913d-fd110bc3c689 and https://github.com/microprediction/precise/assets/71531645/267db4ad-9939-429c-8f63-15ca39aad6d0]
with

$$A^{c}(\gamma)= A - \gamma B D^{-1}C$$

By following the third step of the algorithm (Augment and allocate), starting from
precise/precise/skaters/portfoliostatic/schurportfactory.py line 122
```python
# 3. Augment and allocate
Ag, Dg, info = schur_augmentation(A=A, B=B, C=C, D=D, gamma=gamma)  # <-------- HERE
```
precise/precise/skaters/portfoliostatic/schurportutil.py , line 13
```python
def schur_augmentation(A, B, C, D, gamma):
    """ Mess with A, D to try to incorporate some off-diag info """
    if gamma > 0.0:
        max_gamma = _maximal_gamma(A=A, B=B, C=C, D=D)
        augA, bA = pseudo_schur_complement(A=A, B=B, C=C, D=D, gamma=gamma * max_gamma)  # <------------- HERE
        augD, bD = pseudo_schur_complement(A=D, B=C, C=B, D=A, gamma=gamma * max_gamma)  # <--------- HERE TOO
        augmentation_fail = False
        if not is_positive_def(augA):
            try:
                Ag = nearest_pos_def(augA)
            except np.linalg.LinAlgError:
                augmentation_fail = True
        else:
            Ag = augA
        if not is_positive_def(augD):
            try:
                Dg = nearest_pos_def(augD)
            except np.linalg.LinAlgError:
                augmentation_fail = True
        else:
            Dg = augD
        if augmentation_fail:
            print('Warning: augmentation failed')
            reductionA = 1.0
            reductionD = 1.0
            reductionRatioA = 1.0
            Ag = A
            Dg = D
        else:
            reductionD = np.linalg.norm(Dg) / np.linalg.norm(D)
            reductionA = np.linalg.norm(Ag) / np.linalg.norm(A)
            reductionRatioA = reductionA / reductionD
    else:
        reductionRatioA = 1.0
        reductionA = 1.0
        reductionD = 1.0
        Ag = A
        Dg = D
    info = {'reductionA': reductionA, 'reductionD': reductionD, 'reductionRatioA': reductionRatioA}
    return Ag, Dg, info
```
We arrive at this pseudo_schur_complement function where $A^{c}(\gamma)= A - \gamma B D^{-1}C$ is computed:
precise/precise/skaters/portfoliostatic/schurportutil.py , line 57
```python
def pseudo_schur_complement(A, B, C, D, gamma, lmbda=None, warn=False):
    """ Augmented cov matrix for "A" inspired by the Schur complement """
    if lmbda is None:
        lmbda = gamma
    try:
        Ac_raw = schur_complement(A=A, B=B, C=C, D=D, gamma=gamma)
        nA = np.shape(A)[0]
        nD = np.shape(D)[0]
        Ac = to_symmetric(Ac_raw)
        M = symmetric_step_up_matrix(n1=nA, n2=nD)
        Mt = np.transpose(M)
        BDinv = multiply_by_inverse(B, D, throw=False)
        BDinvMt = np.dot(BDinv, Mt)
        Ra = np.eye(nA) - lmbda * BDinvMt
        Ag = inverse_multiply(Ra, Ac, throw=False, warn=False)
    except np.linalg.LinAlgError:
        if warn:
            print('Pseudo-schur failed, falling back to A')
        Ag = A
    n = np.shape(A)[0]
    b = np.ones(shape=(n, 1))
    return Ag, b
```
However after that the following operations are performed:
$$
\begin{aligned}
Ag &= Ra^{-1} \cdot Ac \\
&= (I - \lambda B D^{-1}M^T)^{-1} \cdot Ac \\
&= (I - \lambda B D^{-1}M^T)^{-1} \cdot \text{tosym}(Ac_{\text{raw}}) \\
&= (I - \lambda B D^{-1}M^T)^{-1} \cdot \text{tosym}(A^{c}(\gamma)) \\
&= (I - \lambda B D^{-1}M^T)^{-1} \cdot \text{tosym}(A - \gamma B D^{-1}C)
\end{aligned}
$$
So if we assume that $Ac$ is already symmetric (to simplify the process):
$$
Ag = (I - \lambda B D^{-1}M^T)^{-1} \cdot (A - \gamma B D^{-1}C)
$$
Which does not match what I expect from the presentation:
- "Before performing inter-group allocation we make a different modification. We multiply the precision of $A^c$ by $b_Ab_A^T$ element-wise (and similarly, multiply the precision of $D^c$ by $b_Db_D^T$)".
Meaning:
$$ A' = (A - \gamma B D^{-1}C)^{*b_A} $$
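For concreteness, here is a minimal sketch of what that element-wise scaling means (the function name is mine, not from the package). Because $b b^T$ has rank one, dividing the covariance element-wise by $b b^T$ is the same as conjugating by $\mathrm{diag}(1/b)$, which in turn means the precision gets multiplied element-wise by $b b^T$, matching the wording on the slide:

```python
import numpy as np

# Sketch of the slide notation A^c_{/b} = A^c / (b b^T) (element-wise).
# Since b b^T is rank one, this equals diag(1/b) @ A^c @ diag(1/b),
# so the precision is multiplied element-wise by b b^T.
# Names here are illustrative, not from the precise package.

def b_scaled(Ac, b):
    """A^c divided element-wise by the rank-one matrix b b^T."""
    return Ac / np.outer(b, b)

Ac = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([2.0, 0.5])

scaled = b_scaled(Ac, b)
# Equivalent form: diagonal conjugation by 1/b
assert np.allclose(scaled, np.diag(1 / b) @ Ac @ np.diag(1 / b))
# Precision view: inv(A^c_{/b}) == inv(A^c) * (b b^T) element-wise
assert np.allclose(np.linalg.inv(scaled), np.linalg.inv(Ac) * np.outer(b, b))
```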
Specifically, I don't understand where $M$, the symmetric step-up matrix, comes from, and why $b_A$ is never computed.
And again, when we perform the sub-allocation (precise/precise/skaters/portfoliostatic/schurportfactory.py, lines 132 and 138):

```python
# Sub-allocate
wA = hierarchical_schur_complementary_portfolio(cov=Ag, port=port, port_kwargs=port_kwargs,
                                                alloc=alloc, alloc_kwargs=alloc_kwargs,
                                                splitter=splitter, splitter_kwargs=splitter_kwargs,
                                                seriator=seriator, seriator_kwargs=seriator_kwargs,
                                                seriation_depth=seriation_depth - 1,
                                                delta=delta, gamma=gamma)
wD = hierarchical_schur_complementary_portfolio(cov=Dg, port=port, port_kwargs=port_kwargs,
                                                alloc=alloc, alloc_kwargs=alloc_kwargs,
                                                splitter=splitter, splitter_kwargs=splitter_kwargs,
                                                seriator=seriator, seriator_kwargs=seriator_kwargs,
                                                seriation_depth=seriation_depth - 1,
                                                delta=delta, gamma=gamma)
```
the same augmented matrix $Ag$ used to allocate in:
precise/precise/skaters/portfoliostatic/schurportfactory.py line 122
```python
# 3. Augment and allocate
Ag, Dg, info = schur_augmentation(A=A, B=B, C=C, D=D, gamma=gamma)
aA, aD = alloc(covs=[Ag, Dg])  # <---------- HERE
```
is also passed to the next "iteration", while on the slides I found:
- The intra-group allocation pertaining to block $A$ is determined by covariance matrix $A^c_{/b_A(\lambda)}$. In this notation the vector $b_A(\lambda)=\overrightarrow{1}-\lambda B D^{-1} \overrightarrow{1}$. The generalized Schur complement is $A^c(\gamma)=A-\gamma B D^{-1} C$. The notation $A^c_{/b}$ denotes $A^c /\left(b b^T\right)$ with division performed element-wise.
$$ A'' = A^c(\gamma)/b_Ab_A^T $$
My mistake; in case someone is having the same doubts, this might be helpful:
$$
\Sigma = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
$$
Where $\Sigma \in \mathbb{R}^{n \times n}$, with $A \in \mathbb{R}^{p \times p}$, $D \in \mathbb{R}^{q \times q}$ and $p + q = n$.
The minimum variance portfolio:
$$ w \propto \Sigma^{-1} \vec{1} $$
We can express any quantity of the form $\Sigma^{-1}\vec{1}$ in terms of the minimum variance portfolio $w(\Sigma)$ and its portfolio variance $\nu(\Sigma)$, viz:
$$ \Sigma^{-1}\vec{1} = \vec{1}^T \Sigma^{-1} \vec{1} \cdot w(\Sigma) = \dfrac{1}{\nu(\Sigma)}w(\Sigma) $$
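This identity is easy to check numerically. A minimal sketch, with function names of my own choosing (not from the package):

```python
import numpy as np

# Check Sigma^{-1} 1 = (1/nu(Sigma)) w(Sigma), where w(Sigma) is the
# minimum variance portfolio and nu(Sigma) its variance.

def min_var_portfolio(cov):
    """Minimum variance weights: proportional to cov^{-1} 1, summing to 1."""
    raw = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return raw / raw.sum()

def portfolio_variance(cov):
    """nu(Sigma) = w^T Sigma w for the minimum variance portfolio."""
    w = min_var_portfolio(cov)
    return float(w @ cov @ w)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
cov = np.cov(X, rowvar=False)

lhs = np.linalg.solve(cov, np.ones(4))                  # Sigma^{-1} 1
rhs = min_var_portfolio(cov) / portfolio_variance(cov)  # w(Sigma) / nu(Sigma)
assert np.allclose(lhs, rhs)
```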
In particular if $B = 0$ (and hence $C = 0$, since $\Sigma$ is symmetric) then the global minimum variance allocation is proportional to:

$$
w \propto \Sigma^{-1}\vec{1} = \begin{bmatrix} A & 0 \\ 0 & D \end{bmatrix}^{-1} \vec{1} = \begin{bmatrix} A^{-1} \vec{1} \\ D^{-1} \vec{1} \end{bmatrix} = \begin{bmatrix} \dfrac{1}{\nu(A)}w(A) \\ \dfrac{1}{\nu(D)}w(D) \end{bmatrix}
$$
Instead, with Schur allocation, we can use the block factorization:

$$
\Sigma^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} (A - BD^{-1}C)^{-1} & 0 \\ 0 & (D - CA^{-1}B)^{-1} \end{bmatrix} \begin{bmatrix} I_p & -BD^{-1} \\ -CA^{-1} & I_q \end{bmatrix}
$$
Where $I_p$, $I_q$ are identity matrices $\in \mathbb{R}^{p \times p}$ and $\in \mathbb{R}^{q \times q}$ respectively.
Thus, the global minimum variance allocation is proportional to:
$$
\begin{aligned}
w \propto \Sigma^{-1}\vec{1} &= \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} \vec{1} \\
&= \begin{bmatrix} (A - BD^{-1}C)^{-1} & 0 \\ 0 & (D - CA^{-1}B)^{-1} \end{bmatrix} \begin{bmatrix} I_p & -BD^{-1} \\ -CA^{-1} & I_q \end{bmatrix} \vec{1}
\end{aligned}
$$
With $A^c = (A- BD^{-1}C)$ and $D^c = (D - CA^{-1}B)$ then:
$$
\begin{aligned}
w \propto \Sigma^{-1}\vec{1} &= \begin{bmatrix} (A^c)^{-1} & 0 \\ 0 & (D^c)^{-1} \end{bmatrix} \begin{bmatrix} I_p & -BD^{-1} \\ -CA^{-1} & I_q \end{bmatrix} \vec{1} \\
&= \begin{bmatrix} (A^c)^{-1} & -(A^c)^{-1}BD^{-1} \\ -(D^{c})^{-1}CA^{-1} & (D^c)^{-1} \end{bmatrix} \vec{1}
\end{aligned}
$$
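This block expansion of $\Sigma^{-1}$ can be sanity-checked numerically; the sketch below (variable names are mine) builds a random positive definite $\Sigma$ and verifies the factorization:

```python
import numpy as np

# Verify Sigma^{-1} = diag((A^c)^{-1}, (D^c)^{-1}) @ [[I, -B D^{-1}], [-C A^{-1}, I]]
# with A^c = A - B D^{-1} C and D^c = D - C A^{-1} B.

rng = np.random.default_rng(1)
p, q = 3, 2
G = rng.normal(size=(p + q, p + q))
Sigma = G @ G.T + (p + q) * np.eye(p + q)   # well-conditioned SPD matrix

A, B = Sigma[:p, :p], Sigma[:p, p:]
C, D = Sigma[p:, :p], Sigma[p:, p:]

Ac = A - B @ np.linalg.solve(D, C)          # Schur complement of D
Dc = D - C @ np.linalg.solve(A, B)          # Schur complement of A

block_diag = np.block([
    [np.linalg.inv(Ac), np.zeros((p, q))],
    [np.zeros((q, p)), np.linalg.inv(Dc)],
])
elim = np.block([
    [np.eye(p), -B @ np.linalg.inv(D)],
    [-C @ np.linalg.inv(A), np.eye(q)],
])
assert np.allclose(block_diag @ elim, np.linalg.inv(Sigma))
```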
Which can be transformed as follows:
$$
\begin{aligned}
\begin{bmatrix} (A^c)^{-1} & -(A^c)^{-1}BD^{-1} \\ -(D^{c})^{-1}CA^{-1} & (D^c)^{-1} \end{bmatrix} \vec{1}
&= \begin{bmatrix} (A^c)^{-1} & -(A^c)^{-1}BD^{-1} \\ -(D^{c})^{-1}CA^{-1} & (D^c)^{-1} \end{bmatrix} \begin{bmatrix} \vec{1}_p \\ \vec{1}_q \end{bmatrix} \\
&= \begin{bmatrix} (A^c)^{-1} \vec{1}_p - (A^c)^{-1}BD^{-1} \vec{1}_q \\ (D^c)^{-1} \vec{1}_q - (D^c)^{-1}CA^{-1} \vec{1}_p \end{bmatrix}
\end{aligned}
$$
We can then define step-up matrices $M^{(1)} \in \mathbb{R}^{q \times p}$ and $M^{(2)} \in \mathbb{R}^{p \times q}$ such that:

$$
M^{(1)} \vec{1}_p = \vec{1}_q, \qquad M^{(2)} \vec{1}_q = \vec{1}_p
$$
To simplify it to:
$$
\begin{aligned}
\begin{bmatrix} (A^c)^{-1} & -(A^c)^{-1}BD^{-1} \\ -(D^{c})^{-1}CA^{-1} & (D^c)^{-1} \end{bmatrix} \vec{1}
&= \begin{bmatrix} (A^c)^{-1} \vec{1}_p - (A^c)^{-1}BD^{-1} \vec{1}_q \\ (D^c)^{-1} \vec{1}_q - (D^c)^{-1}CA^{-1} \vec{1}_p \end{bmatrix} \\
&= \begin{bmatrix} (A^c)^{-1} \vec{1}_p - (A^c)^{-1}BD^{-1} M^{(1)}\vec{1}_p \\ (D^c)^{-1} \vec{1}_q - (D^c)^{-1}CA^{-1} M^{(2)}\vec{1}_q \end{bmatrix}
\end{aligned}
$$
So that we can group the terms together:

$$
\begin{aligned}
\begin{bmatrix} (A^c)^{-1} & -(A^c)^{-1}BD^{-1} \\ -(D^{c})^{-1}CA^{-1} & (D^c)^{-1} \end{bmatrix} \vec{1}
&= \begin{bmatrix} (A^c)^{-1} \vec{1}_p - (A^c)^{-1}BD^{-1} M^{(1)}\vec{1}_p \\ (D^c)^{-1} \vec{1}_q - (D^c)^{-1}CA^{-1} M^{(2)}\vec{1}_q \end{bmatrix} \\
&= \begin{bmatrix} \Big((A^c)^{-1} - (A^c)^{-1}BD^{-1}M^{(1)}\Big)\vec{1}_p \\ \Big((D^c)^{-1} - (D^c)^{-1}CA^{-1}M^{(2)}\Big)\vec{1}_q \end{bmatrix} \\
&= \begin{bmatrix} (A^c)^{-1}\Big( I_p - BD^{-1}M^{(1)} \Big)\vec{1}_p \\ (D^c)^{-1}\Big( I_q - CA^{-1}M^{(2)} \Big)\vec{1}_q \end{bmatrix}
\end{aligned}
$$
Thus following the matrix inverse property:
$$ (A B)^{-1} = B^{-1}A^{-1} $$
We can do:
$$
\begin{aligned}
w \propto \Sigma^{-1}\vec{1} &= \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} \vec{1} \\
&= \begin{bmatrix} (A^c)^{-1}\Big( I_p - BD^{-1}M^{(1)} \Big)\vec{1}_p \\ (D^c)^{-1}\Big( I_q - CA^{-1}M^{(2)} \Big)\vec{1}_q \end{bmatrix} \\
&= \begin{bmatrix} \dfrac{1}{\nu(Ag)}w(Ag) \\ \dfrac{1}{\nu(Dg)}w(Dg) \end{bmatrix}
\end{aligned}
$$
where
$$ Ag = \bigg( (A^c)^{-1}\Big( I_p - BD^{-1}M^{(1)} \Big) \bigg)^{-1} = \Big( I_p - BD^{-1}M^{(1)} \Big)^{-1} \Big((A^c)^{-1}\Big)^{-1} = \Big( I_p - BD^{-1}M^{(1)} \Big)^{-1} A^c $$
and
$$ Dg = \bigg( (D^c)^{-1}\Big( I_q - CA^{-1}M^{(2)} \Big) \bigg)^{-1} = \Big( I_q - CA^{-1}M^{(2)} \Big)^{-1} \Big((D^c)^{-1}\Big)^{-1} = \Big( I_q - CA^{-1}M^{(2)} \Big)^{-1} D^c $$
In conclusion:
$$
\begin{aligned}
w \propto \Sigma^{-1}\vec{1} &= \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} \vec{1} \\
&= \begin{bmatrix} \dfrac{1}{\nu(Ag)}w(Ag) \\ \dfrac{1}{\nu(Dg)}w(Dg) \end{bmatrix} \\
&= \begin{bmatrix} \dfrac{1}{\nu\Big(\big( I_p - BD^{-1}M^{(1)} \big)^{-1} A^c\Big)}\,w\Big(\big( I_p - BD^{-1}M^{(1)} \big)^{-1} A^c\Big) \\ \dfrac{1}{\nu\Big(\big( I_q - CA^{-1}M^{(2)} \big)^{-1} D^c\Big)}\,w\Big(\big( I_q - CA^{-1}M^{(2)} \big)^{-1} D^c\Big) \end{bmatrix}
\end{aligned}
$$
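Putting it together, here is a numerical check of the whole construction (again, names are illustrative and the package's own helpers differ): the augmented matrices built from the Schur complements and step-up matrices should reproduce the two halves of $\Sigma^{-1}\vec{1}$ exactly.

```python
import numpy as np

# Check that Ag = (I_p - B D^{-1} M1)^{-1} A^c and
# Dg = (I_q - C A^{-1} M2)^{-1} D^c reproduce the two halves of Sigma^{-1} 1.

rng = np.random.default_rng(2)
p, q = 3, 2
G = rng.normal(size=(p + q, p + q))
Sigma = G @ G.T + (p + q) * np.eye(p + q)   # well-conditioned SPD matrix
A, B = Sigma[:p, :p], Sigma[:p, p:]
C, D = Sigma[p:, :p], Sigma[p:, p:]

Ac = A - B @ np.linalg.solve(D, C)          # Schur complement of D
Dc = D - C @ np.linalg.solve(A, B)          # Schur complement of A

# Step-up matrices satisfying M1 @ 1_p == 1_q and M2 @ 1_q == 1_p
M1 = np.ones((q, p)) / p
M2 = np.ones((p, q)) / q

Ag = np.linalg.solve(np.eye(p) - B @ np.linalg.inv(D) @ M1, Ac)
Dg = np.linalg.solve(np.eye(q) - C @ np.linalg.inv(A) @ M2, Dc)

full = np.linalg.solve(Sigma, np.ones(p + q))              # Sigma^{-1} 1
halves = np.concatenate([np.linalg.solve(Ag, np.ones(p)),  # Ag^{-1} 1_p
                         np.linalg.solve(Dg, np.ones(q))]) # Dg^{-1} 1_q
assert np.allclose(full, halves)
```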
One question that still remains is how to choose $M$.
From my perspective, $M$ can be defined as follows:
$$
M^{(1)} = \begin{cases} I_p & \text{if } p = q \\ \dfrac{1}{p} \vec{1}_q\vec{1}_p^T & \text{otherwise} \end{cases}
$$

$$
M^{(2)} = \begin{cases} I_q & \text{if } p = q \\ \dfrac{1}{q} \vec{1}_p\vec{1}_q^T & \text{otherwise} \end{cases}
$$
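Any $M^{(1)}$, $M^{(2)}$ with these mapping properties will do; a quick check of the rank-one averaging choice above (illustrative code):

```python
import numpy as np

# The averaging step-up matrices: M1 maps 1_p to 1_q and M2 maps 1_q to 1_p.
p, q = 4, 2
M1 = np.ones((q, p)) / p   # (1/p) 1_q 1_p^T
M2 = np.ones((p, q)) / q   # (1/q) 1_p 1_q^T

assert np.allclose(M1 @ np.ones(p), np.ones(q))
assert np.allclose(M2 @ np.ones(q), np.ones(p))
```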
Let me know if something is wrong or missing, and if there is a particular motivation for which $M$ is defined as the 'step-up matrix'.
That's exactly right; the choice of step-up matrix is a bit arbitrary here.