ahmadgazar / centroidal-MPC

GNU General Public License v3.0

On covariance gradient propagation in chance-constrained SCP #3

Open ahmadgazar opened 2 years ago

ahmadgazar commented 2 years ago

Recall that subproblem $j$ of the chance-constrained SCP is written as the following QP:

\begin{align} &\min_{\substack{\boldsymbol{x}_k,\ \boldsymbol{u}_{e,k}}} l^j_f(N) + \sum^{N-1}_{i=0} l^j(\boldsymbol{x}_k, \boldsymbol{u}_k) + \sum^{N}_{i=0} \boldsymbol{\gamma} \tag{1a}\newline & \quad \quad \boldsymbol{x}_{k+1} = \boldsymbol{f}(\boldsymbol{x}_k^j, \boldsymbol{u}_k^j) + \boldsymbol{A}_k(\boldsymbol{x}_k - \boldsymbol{x}_k^j) + \boldsymbol{B}_k (\boldsymbol{u}_k - \boldsymbol{u}^j_k), \tag{1b} \newline & \quad\quad \boldsymbol{G}^u_i \boldsymbol{f}_{e,k} + \phi^{-1}(1-\delta_u)(|| \boldsymbol{G}^u_i \boldsymbol{K}||_{\Sigma_k} + \nabla_{\boldsymbol{z}}|| \boldsymbol{G}^u_i \boldsymbol{K}||_{\Sigma_k} \cdot (\boldsymbol{z}-\boldsymbol{z}^j)) \leq b_i, \tag{1c} \newline & \quad \quad \omega_j |\boldsymbol{x}_k-\boldsymbol{x}^j_k| - \boldsymbol{\Delta}_j \leq \boldsymbol{\gamma}, \quad \boldsymbol{\gamma} \geq 0, \tag{1d}\newline &\quad\quad \boldsymbol{x}_0 = \boldsymbol{x}(0), \tag{1e}\newline &\quad\quad \boldsymbol{x}_f = \boldsymbol{x}(N), \tag{1f}\newline &\quad\quad \forall k = 0,1,..,N-1,\tag{1g} \end{align}

where $\boldsymbol{A}_k \overset{\Delta}{=} \nabla_{\boldsymbol{x}} \boldsymbol{f}(\boldsymbol{x}, \boldsymbol{u})|_{\boldsymbol{x}^j_k, \boldsymbol{u}^j_k}$, $\boldsymbol{B}_k \overset{\Delta}{=} \nabla_{\boldsymbol{u}} \boldsymbol{f}(\boldsymbol{x}, \boldsymbol{u})|_{\boldsymbol{x}^j_k, \boldsymbol{u}^j_k}$, $\boldsymbol{z} = \begin{bmatrix} \boldsymbol{x}^T & \boldsymbol{u}^T \end{bmatrix}^T$ is the concatenation of the states and controls, and finally $\phi$ is the CDF of the standard normal distribution, with $\delta_u$ being the risk of violation allocated to the individual chance constraint.
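As a minimal sketch of how $\boldsymbol{A}_k$ and $\boldsymbol{B}_k$ in (1b) can be obtained with JAX: the dynamics `f` below is a hypothetical toy pendulum, not the centroidal model of this repo, and the knot values `x_j`, `u_j` are made up for illustration.

```python
import jax
import jax.numpy as jnp

DT = 0.1  # hypothetical discretization step

def f(x, u):
    # Hypothetical toy dynamics (NOT the centroidal model):
    # damped pendulum, x = [angle, rate], u = [torque].
    next_angle = x[0] + DT * x[1]
    next_rate = x[1] + DT * (-jnp.sin(x[0]) - 0.1 * x[1] + u[0])
    return jnp.array([next_angle, next_rate])

# Linearization knot (x^j_k, u^j_k) from the previous SCP iterate (made-up values).
x_j = jnp.array([0.3, 0.0])
u_j = jnp.array([0.5])

A_k = jax.jacfwd(f, argnums=0)(x_j, u_j)  # df/dx at the knot, shape (2, 2)
B_k = jax.jacfwd(f, argnums=1)(x_j, u_j)  # df/du at the knot, shape (2, 1)

def f_lin(x, u):
    # Right-hand side of (1b): first-order expansion around the knot.
    return f(x_j, u_j) + A_k @ (x - x_j) + B_k @ (u - u_j)
```

The linearized model matches the nonlinear one exactly at the knot and is only a local approximation away from it, which is why (1d) penalizes large steps in the state.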

ahmadgazar commented 2 years ago

(1c) is the linearized individual chance constraint of the friction pyramid. Since the constraints are also linearized at every SCP iteration, there is an extra derivative term $\nabla_{\boldsymbol{z}}|| \boldsymbol{G}^u_i \boldsymbol{K}||_{\Sigma^j_k}$, which needs extra care. First, let's set aside $\boldsymbol{G}^u_i \boldsymbol{K}$, since they are constant, and focus on $\nabla_{\boldsymbol{z}}||\Sigma_k||$.
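For concreteness, here is a small sketch of the constant part of the back-off in (1c), $\phi^{-1}(1-\delta_u)|| \boldsymbol{G}^u_i \boldsymbol{K}||_{\Sigma_k}$. It assumes the weighted norm means $\sqrt{(\boldsymbol{G}^u_i \boldsymbol{K}) \Sigma_k (\boldsymbol{G}^u_i \boldsymbol{K})^T}$; the function name `backoff` and all numbers are made up for illustration.

```python
import jax.numpy as jnp
from jax.scipy.stats import norm

def backoff(G_row, K, Sigma, delta):
    # Assumption: ||G K||_Sigma = sqrt((G K) Sigma (G K)^T) for one row G of G^u.
    gk = G_row @ K
    return norm.ppf(1.0 - delta) * jnp.sqrt(gk @ Sigma @ gk)

# Made-up example: 2 states, 2 controls, 5% allowed violation.
G_row = jnp.array([1.0, -1.0])
K = jnp.eye(2)
Sigma = jnp.diag(jnp.array([0.04, 0.09]))
eta = backoff(G_row, K, Sigma, delta=0.05)
```

Because $\Sigma_k$ depends on the trajectory through the linearized dynamics, this scalar is a function of $\boldsymbol{z}$, and that dependence is what the gradient term in (1c) captures.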

Recall that the covariance at the next time step is computed as follows: \begin{align} \boldsymbol{\Sigma}_{k+1} = (\boldsymbol{A}_k+\boldsymbol{B}_k) \boldsymbol{\Sigma}_{xu_{k}} (\boldsymbol{A}_k+\boldsymbol{B}_k)^T + \boldsymbol{C}_k \Sigma_p \boldsymbol{C}_k^T + \boldsymbol{\Sigma}_w, \end{align} where $\boldsymbol{C}_k \overset{\Delta}{=} \nabla_{\boldsymbol{p}} \boldsymbol{f}(\boldsymbol{x}, \boldsymbol{u})|_{\boldsymbol{x}^j_k, \boldsymbol{u}^j_k}$ is the derivative of the dynamics w.r.t. the contact positions, since we are considering uncertainties in the contact positions, and $\boldsymbol{\Sigma}_w$ is the covariance of the additive white noise on all the dynamics.

The above equation can be written in the following block-matrix form: \begin{align} \boldsymbol{\Sigma}_{k+1} = \begin{bmatrix} \boldsymbol{A}_k & \boldsymbol{B}_k \end{bmatrix} \begin{bmatrix} \boldsymbol{\Sigma}_k & \boldsymbol{\Sigma}_k \boldsymbol{K}^T \newline \boldsymbol{K} \boldsymbol{\Sigma}_k & \boldsymbol{K}\boldsymbol{\Sigma}_k \boldsymbol{K}^T \end{bmatrix} \begin{bmatrix}\boldsymbol{A}_k^T \newline \boldsymbol{B}_k^T \end{bmatrix} + \boldsymbol{C}_k \Sigma_p \boldsymbol{C}_k^T + \boldsymbol{\Sigma}_w. \end{align}
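A minimal JAX sketch of one covariance step in the block-matrix form above, with the last factor taken as the transpose of $\begin{bmatrix} \boldsymbol{A}_k & \boldsymbol{B}_k \end{bmatrix}$ so that the dimensions agree. The function name `covariance_step` and all matrices are made-up placeholders. As a sanity check, multiplying out the blocks gives the closed-loop form $(\boldsymbol{A}_k + \boldsymbol{B}_k\boldsymbol{K}) \boldsymbol{\Sigma}_k (\boldsymbol{A}_k + \boldsymbol{B}_k\boldsymbol{K})^T$ for the first term.

```python
import jax.numpy as jnp

def covariance_step(A, B, C, K, Sigma, Sigma_p, Sigma_w):
    # [A B] * [[S, S K^T], [K S, K S K^T]] * [A B]^T + C Sigma_p C^T + Sigma_w
    AB = jnp.hstack([A, B])                                # (n, n + m)
    Sigma_xu = jnp.block([[Sigma, Sigma @ K.T],
                          [K @ Sigma, K @ Sigma @ K.T]])   # (n + m, n + m)
    return AB @ Sigma_xu @ AB.T + C @ Sigma_p @ C.T + Sigma_w

# Made-up numbers: n = 2 states, m = 1 control, 1 uncertain contact parameter.
A = jnp.array([[1.0, 0.1], [0.0, 1.0]])
B = jnp.array([[0.0], [0.1]])
C = jnp.array([[0.05], [0.02]])
K = jnp.array([[-1.0, -0.5]])
Sigma = jnp.diag(jnp.array([0.01, 0.02]))
Sigma_p = jnp.array([[0.004]])
Sigma_w = 1e-3 * jnp.eye(2)

Sigma_next = covariance_step(A, B, C, K, Sigma, Sigma_p, Sigma_w)

# Equivalent closed-loop form, used here only as a consistency check.
A_cl = A + B @ K
Sigma_next_cl = A_cl @ Sigma @ A_cl.T + C @ Sigma_p @ C.T + Sigma_w
```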

Given the above dynamics, we can take the derivative $\nabla_{\boldsymbol{z}}||\Sigma_k||$ w.r.t. the current state and control, $\boldsymbol{x}_k$ and $\boldsymbol{u}_k$ respectively, which I computed with JAX. Notice that since we are solving a nonlinear OCP, those derivatives are valid only at the linearization knots. However, there is another derivative term, which is usually ignored: we also need to consider the derivatives of the propagated covariance w.r.t. the previous states and controls up to the current state (${\boldsymbol{x}_0, ..., \boldsymbol{x}_{k-1}}$, ${\boldsymbol{u}_0, ..., \boldsymbol{u}_{k-1}}$).
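To illustrate the dependence on earlier knots, here is a rough sketch (not the code used in this repo) that rolls the covariance out along a made-up trajectory and lets `jax.grad` differentiate the weighted norm w.r.t. every state and control at once. The toy dynamics `f`, the gain `K`, the row `g`, the closed-loop recursion, and the omission of the contact-position term are all simplifying assumptions.

```python
import jax
import jax.numpy as jnp

DT = 0.1
K = jnp.array([[-1.0, -0.5]])   # hypothetical feedback gain, m = 1, n = 2
g = jnp.array([1.0])            # hypothetical single row of G^u
SIGMA_W = 1e-3 * jnp.eye(2)

def f(x, u):
    # Hypothetical toy dynamics whose Jacobians depend on both x and u.
    next_angle = x[0] + DT * x[1]
    next_rate = x[1] + DT * (-jnp.sin(x[0]) + u[0] * jnp.cos(x[0]))
    return jnp.array([next_angle, next_rate])

def backoff_norm(xs, us, k):
    # ||g K||_{Sigma_k}, with Sigma_k rolled out through knots 0..k-1.
    Sigma = 1e-2 * jnp.eye(2)
    for i in range(k):
        A = jax.jacfwd(f, argnums=0)(xs[i], us[i])
        B = jax.jacfwd(f, argnums=1)(xs[i], us[i])
        A_cl = A + B @ K
        Sigma = A_cl @ Sigma @ A_cl.T + SIGMA_W
    gk = g @ K
    return jnp.sqrt(gk @ Sigma @ gk)

# Made-up knots for k = 0..3.
xs = jnp.array([[0.3, 0.0], [0.4, 0.1], [0.5, 0.2], [0.6, 0.3]])
us = 0.5 * jnp.ones((4, 1))

# Gradients w.r.t. ALL states and controls, including the earlier knots.
grad_x, grad_u = jax.grad(backoff_norm, argnums=(0, 1))(xs, us, 3)
```

In this toy setup the rows of `grad_x` for knots 0 to 2 are generally nonzero, which is exactly the contribution from previous states that is lost if only the current knot is differentiated.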

That being said, the derivatives of the covariance w.r.t. the previous states and controls are propagated as follows: \begin{align} \nabla_{\boldsymbol{z}}||\Sigma_{k+1}|| = \boldsymbol{A}_k \nabla_{\boldsymbol{z}}||\Sigma_k|| \boldsymbol{A}^T_k.
\end{align}
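A quick numerical check of this propagation rule, read here as acting on the matrix derivative $\partial \Sigma_k / \partial \theta$ for a single earlier scalar variable $\theta$. The sketch assumes $\boldsymbol{A}_k$ and $\boldsymbol{\Sigma}_w$ do not themselves depend on $\theta$; the function `Sigma_k` and all numbers are hypothetical.

```python
import jax
import jax.numpy as jnp

A = jnp.array([[1.0, 0.1], [-0.2, 0.95]])  # made-up A_k, independent of theta
SIGMA_W = 1e-3 * jnp.eye(2)

def Sigma_k(theta):
    # Hypothetical smooth dependence of Sigma_k on an earlier scalar variable.
    L = jnp.array([[1.0 + theta, 0.0], [theta ** 2, 0.5]])
    return 1e-2 * (L @ L.T)

def Sigma_next(theta):
    # Open-loop covariance step with theta-independent A and noise.
    return A @ Sigma_k(theta) @ A.T + SIGMA_W

theta0 = 0.3
dSigma_k = jax.jacfwd(Sigma_k)(theta0)        # dSigma_k / dtheta, shape (2, 2)
dSigma_next = jax.jacfwd(Sigma_next)(theta0)  # dSigma_{k+1} / dtheta, shape (2, 2)

# Propagation rule: A * (dSigma_k / dtheta) * A^T
propagated = A @ dSigma_k @ A.T
```

The two results agree because the additive noise term drops out under differentiation. If the closed-loop recursion is used instead, the same argument would apply with $\boldsymbol{A}_k + \boldsymbol{B}_k\boldsymbol{K}$ in place of $\boldsymbol{A}_k$.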