Closed by dennisYatunin 8 months ago
If we imagine a system with $n$ prognostic variables, and denote the tendency due to the boundary term as $t_{bc}$, then $t_{bc}$ is a vector of length $n$. The derivative is then an object like $\partial t_{bc,j}/\partial Y_i$, where $i$ and $j$ range from 1 to $n$, and $Y_i$ is (for us) the prognostic state defined on cell centers at the top (or bottom) layer.
Two questions:
1. Is the derivative we need to set $\partial t_{bc,j}/\partial Y_i$, or something different (e.g., derivatives of boundary conditions with respect to state)? Based on what is written, we set values of ∂(boundary_field)/∂(field), and I think boundary_field is the tendency contribution, but I want to double check.
2. Is $\partial t_{bc,j}/\partial Y_i \propto \delta_{i,j}$? Or could we set nonzero values of the tendency of one variable with respect to another?
This describes the current issues well and clearly. The description of the solution could use detail (though the draft PR helps).
For us to be able to track progress, could you please add milestones with expected completion dates (roughly weekly milestones)?
It'll also be important for @sriharshakandala, @charleskawczynski, and @simonbyrne to review, especially with regard to GPU-friendliness of the proposed solutions.
Wherever this refactoring lands, it'll be important to keep the current Schur complement formulation around (e.g., for convection resolving simulations). It may be nice to allow this as an option that lives within this refactoring, rather than outside of it.
Also, please do not leave the simple array-based implementation test as an afterthought -- this will be helpful for evaluating the design before it's fully implemented, and it will be too late to help reviewers judge the quality of the design if it's the very last thing that is added. cc @dennisYatunin
@kmdeck @juliasloan25 After discussing the matter with @simonbyrne, it looks like we will be able to set the derivatives of operator boundary conditions without adding any new operators (and without any assumptions like $\partial t_{bc,j}/\partial Y_i \propto \delta_{i,j}$). However, we will no longer be able to attach boundary conditions to face-to-center operators like DivergenceF2C, and we will instead need to ensure that the arguments of these operators either have well-defined boundary values or are wrapped in a SetBoundaryOperator. This is because, if we allow operators to overwrite the face values of their inputs, then we also need to allow the corresponding operator matrices to overwrite the values of the matrices/vectors that they get multiplied by, and there is no good way to distinguish a matrix that overwrites values during multiplication from one that does not. (Since every value stored in a Field must have the same type, we cannot simply attach additional type information to indicate how values should be handled during multiplication.)
Here are some concrete examples:
First, suppose that flux³ is a field of scalars on the faces of three cells, and that top_flux³ is another scalar. Suppose that we define
div = Operators.DivergenceF2C()
set_boundary_flux = Operators.SetBoundaryOperator(; top = Operators.SetValue(CT3(top_flux³)))
flux = set_boundary_flux.(CT3.(flux³))
Here, CT3 is short for Contravariant3Vector, so flux is a field that represents a vector-valued flux through each cell face, where the flux through the top cell face is overwritten from CT3(flux³[3.5]) to CT3(top_flux³). If J[n] denotes the Jacobian of the metric tensor at index n, we can express div.(flux) as
\begin{gather}
\text{div}.(\text{flux}) =
\text{div}.\begin{pmatrix}
\text{CT3}(\text{flux³}[0.5]) \\
\text{CT3}(\text{flux³}[1.5]) \\
\text{CT3}(\text{flux³}[2.5]) \\
\text{CT3}(\text{top\_flux³})
\end{pmatrix} =
\begin{pmatrix}
\dfrac{1}{J[1]}\ (J[1.5]\ \text{flux³}[1.5] - J[0.5]\ \text{flux³}[0.5]) \\
\dfrac{1}{J[2]}\ (J[2.5]\ \text{flux³}[2.5] - J[1.5]\ \text{flux³}[1.5]) \\
\dfrac{1}{J[3]}\ (J[3.5]\ \text{top\_flux³} - J[2.5]\ \text{flux³}[2.5])
\end{pmatrix}
\end{gather}
Now, suppose that
\frac{\partial\text{top\_flux³}}{\partial\text{flux³}[n]} =
\begin{cases}
\text{top\_flux³\_deriv} & n = 3.5 \\
0 & n < 3.5
\end{cases}
This means that the derivative of div.(flux)
with respect to flux³
is the matrix
\begin{align}
\dfrac{\partial\text{div}.(\text{flux})}{\partial\text{flux³}} &= \begin{pmatrix}
\dfrac{\partial\text{div}.(\text{flux})[1]}{\partial\text{flux³}[0.5]} &
\dfrac{\partial\text{div}.(\text{flux})[1]}{\partial\text{flux³}[1.5]} &
\dfrac{\partial\text{div}.(\text{flux})[1]}{\partial\text{flux³}[2.5]} &
\dfrac{\partial\text{div}.(\text{flux})[1]}{\partial\text{flux³}[3.5]} \\
\dfrac{\partial\text{div}.(\text{flux})[2]}{\partial\text{flux³}[0.5]} &
\dfrac{\partial\text{div}.(\text{flux})[2]}{\partial\text{flux³}[1.5]} &
\dfrac{\partial\text{div}.(\text{flux})[2]}{\partial\text{flux³}[2.5]} &
\dfrac{\partial\text{div}.(\text{flux})[2]}{\partial\text{flux³}[3.5]} \\
\dfrac{\partial\text{div}.(\text{flux})[3]}{\partial\text{flux³}[0.5]} &
\dfrac{\partial\text{div}.(\text{flux})[3]}{\partial\text{flux³}[1.5]} &
\dfrac{\partial\text{div}.(\text{flux})[3]}{\partial\text{flux³}[2.5]} &
\dfrac{\partial\text{div}.(\text{flux})[3]}{\partial\text{flux³}[3.5]}
\end{pmatrix} = \\[1em]
&= \begin{pmatrix}
-\dfrac{J[0.5]}{J[1]} & \dfrac{J[1.5]}{J[1]} & 0 & 0 \\
0 & -\dfrac{J[1.5]}{J[2]} & \dfrac{J[2.5]}{J[2]} & 0 \\
0 & 0 & -\dfrac{J[2.5]}{J[3]} & \dfrac{J[3.5]}{J[3]}\ \text{top\_flux³\_deriv}
\end{pmatrix}
\end{align}
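To make the boundary-derivative bookkeeping concrete, here is a small pure-Python check of this matrix. The actual implementation lives in Julia/ClimaCore; the metric values Jc and Jf and the value of top_flux³_deriv below are hypothetical placeholders. The sketch evaluates div.(flux) directly and confirms, column by column, that its finite-difference Jacobian matches the band matrix above:

```python
# Hypothetical metric Jacobian values at cell centers and faces; any
# positive numbers work, since div_flux is linear in its inputs.
Jc = {1: 1.0, 2: 1.2, 3: 1.5}                  # J at cell centers
Jf = {0.5: 0.9, 1.5: 1.1, 2.5: 1.3, 3.5: 1.6}  # J at cell faces

top_flux_deriv = 0.7                            # ∂top_flux³/∂flux³[3.5]
top_flux = lambda f35: top_flux_deriv * f35     # linear, so FD is exact

def div_flux(f):
    """div.(flux) for f = [flux³[0.5], flux³[1.5], flux³[2.5], flux³[3.5]]."""
    # SetBoundaryOperator overwrites the top face value before the divergence.
    face = [f[0], f[1], f[2], top_flux(f[3])]
    return [(Jf[n + 0.5] * face[n] - Jf[n - 0.5] * face[n - 1]) / Jc[n]
            for n in (1, 2, 3)]

# The analytic 3x4 Jacobian from the text.
analytic = [
    [-Jf[0.5] / Jc[1], Jf[1.5] / Jc[1], 0.0, 0.0],
    [0.0, -Jf[1.5] / Jc[2], Jf[2.5] / Jc[2], 0.0],
    [0.0, 0.0, -Jf[2.5] / Jc[3], Jf[3.5] / Jc[3] * top_flux_deriv],
]

# Forward-difference check of each column.
f0, eps = [0.3, -0.2, 0.5, 0.1], 1e-6
for j in range(4):
    fp = list(f0)
    fp[j] += eps
    col = [(a - b) / eps for a, b in zip(div_flux(fp), div_flux(f0))]
    assert all(abs(col[i] - analytic[i][j]) < 1e-6 for i in range(3))
```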
After this SDI is implemented, the preferred way to specify this matrix will be div_matrix.(∂flux∂flux³)
, where
div_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(div)
set_boundary_flux_deriv = Operators.SetBoundaryOperator(;
top = Operators.SetValue(CT3(top_flux³_deriv)),
)
∂flux∂flux³ = set_boundary_flux_deriv.(CT3.(one.(flux³)))
The field ∂flux∂flux³ represents the derivative of each vector-valued flux with respect to the corresponding scalar in flux³:
\text{∂flux∂flux³} = \begin{pmatrix}
\dfrac{\partial\text{flux}[0.5]}{\partial\text{flux³}[0.5]} \\
\dfrac{\partial\text{flux}[1.5]}{\partial\text{flux³}[1.5]} \\
\dfrac{\partial\text{flux}[2.5]}{\partial\text{flux³}[2.5]} \\
\dfrac{\partial\text{flux}[3.5]}{\partial\text{flux³}[3.5]}
\end{pmatrix} = \begin{pmatrix}
\text{CT3}(1) \\
\text{CT3}(1) \\
\text{CT3}(1) \\
\text{CT3}(\text{top\_flux³\_deriv})
\end{pmatrix}
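The structure of div_matrix.(∂flux∂flux³) can be checked with plain Python (hypothetical metric values again): broadcasting the operator matrix over ∂flux∂flux³ scales column j of the bare divergence band matrix by the j-th entry of ∂flux∂flux³, which reproduces the Jacobian above:

```python
# Hypothetical metric values; only the structure of the product matters here.
Jc = {1: 1.0, 2: 1.2, 3: 1.5}
Jf = {0.5: 0.9, 1.5: 1.1, 2.5: 1.3, 3.5: 1.6}
d = 0.7  # top_flux³_deriv

# div_matrix applied to a field of ones: the bare divergence band matrix.
div_matrix = [
    [-Jf[0.5] / Jc[1], Jf[1.5] / Jc[1], 0.0, 0.0],
    [0.0, -Jf[1.5] / Jc[2], Jf[2.5] / Jc[2], 0.0],
    [0.0, 0.0, -Jf[2.5] / Jc[3], Jf[3.5] / Jc[3]],
]

# ∂flux∂flux³: 1 on every face except the top, where it is top_flux³_deriv.
dflux = [1.0, 1.0, 1.0, d]

# Broadcasting the operator matrix over ∂flux∂flux³ scales column j by dflux[j].
jacobian = [[div_matrix[i][j] * dflux[j] for j in range(4)] for i in range(3)]

assert jacobian[2][3] == div_matrix[2][3] * d  # boundary entry picks up the derivative
assert jacobian[0][:3] == div_matrix[0][:3]    # interior columns are unchanged
```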
Next, suppose that θ is a field of scalars on the centers of three cells, and that h, K, Fb, and Ft are all differentiable functions from scalars to scalars. In addition, suppose that
interp = Operators.InterpolateC2F()
grad = Operators.GradientC2F()
set_boundary_fluxes = Operators.SetBoundaryOperator(;
bottom = Operators.SetValue(C3(∂x³∂ξ³[0.5] * Fb(θ[1]))),
top = Operators.SetValue(C3(∂x³∂ξ³[3.5] * Ft(θ[3]))),
)
div = Operators.DivergenceF2C()
The values of ∂x³∂ξ³ are used to convert the boundary fluxes Fb(θ[1]) and Ft(θ[3]) from physical units (m/s if the values of θ are dimensionless) to "covariant units" (m^2/s if the values of θ are dimensionless). This makes it possible to specify the boundary fluxes as Covariant3Vectors, which are normal to the boundary faces. We can also project the boundary fluxes onto the ξ³-axis to turn them into Contravariant3Vectors, which are the native inputs of the div operator:
\begin{align}
\text{CT3}(\text{C3}(∂x³∂ξ³[0.5]\ Fb[1])) &= \text{CT3}(\text{C3}(1))\ ∂x³∂ξ³[0.5]\ Fb[1] = \\
&= \text{CT3}(g³³[0.5])\ ∂x³∂ξ³[0.5]\ Fb[1] = \\
&= \text{CT3}(g³³[0.5]\ ∂x³∂ξ³[0.5]\ Fb[1]) \\
\text{CT3}(\text{C3}(∂x³∂ξ³[3.5]\ Ft[3])) &= \text{CT3}(\text{C3}(1))\ ∂x³∂ξ³[3.5]\ Ft[3] = \\
&= \text{CT3}(g³³[3.5])\ ∂x³∂ξ³[3.5]\ Ft[3] = \\
&= \text{CT3}(g³³[3.5]\ ∂x³∂ξ³[3.5]\ Ft[3])
\end{align}
To simplify our notation, we will be using the following shorthands:
\begin{align}
h[n] &= h(θ[n]) \\
K[n] &= K(θ[n]) \\
Fb[n] &= Fb(θ[n]) \\
Ft[n] &= Ft(θ[n]) \\
h'[n] &= h'(θ[n]) \\
K'[n] &= K'(θ[n]) \\
Fb'[n] &= Fb'(θ[n]) \\
Ft'[n] &= Ft'(θ[n]) \\
\end{align}
We can express the tendency of θ
as
\begin{align}
θₜ &= \text{div}.(\text{set\_boundary\_fluxes}.(\text{interp}.(K.(θ))\ .*\ \text{grad}.(h.(θ)))) = \\
&= \text{div}.\left(\text{set\_boundary\_fluxes}.\left(
\text{interp}.\begin{pmatrix} K[1] \\ K[2] \\ K[3] \end{pmatrix}
\ .*\
\text{grad}.\begin{pmatrix} h[1] \\ h[2] \\ h[3] \end{pmatrix}
\right)\right) = \\[1em]
&= \text{div}.\left(\text{set\_boundary\_fluxes}.\left(
\begin{pmatrix}
\text{undefined} \\
\dfrac{1}{2}\ (K[1] + K[2]) \\
\dfrac{1}{2}\ (K[2] + K[3]) \\
\text{undefined}
\end{pmatrix}\ .*\ \begin{pmatrix}
\text{undefined} \\
\text{C3}(1)\ (h[2] - h[1]) \\
\text{C3}(1)\ (h[3] - h[2]) \\
\text{undefined}
\end{pmatrix}
\right)\right) = \\[1em]
&= \text{div}.\left(\text{set\_boundary\_fluxes}.\left(
\begin{pmatrix}
\text{undefined} \\
\dfrac{1}{2}\ (K[1] + K[2]) \\
\dfrac{1}{2}\ (K[2] + K[3]) \\
\text{undefined}
\end{pmatrix}\ .*\ \begin{pmatrix}
\text{undefined} \\
\text{CT3}(g³³[1.5])\ (h[2] - h[1]) \\
\text{CT3}(g³³[2.5])\ (h[3] - h[2]) \\
\text{undefined}
\end{pmatrix}
\right)\right) = \\[1em]
&= \text{div}.\left(\text{set\_boundary\_fluxes}.
\begin{pmatrix}
\text{undefined} \\
\text{CT3}\left(\dfrac{1}{2}\ g³³[1.5]\ (K[1] + K[2])\ (h[2] - h[1])\right) \\
\text{CT3}\left(\dfrac{1}{2}\ g³³[2.5]\ (K[2] + K[3])\ (h[3] - h[2])\right) \\
\text{undefined}
\end{pmatrix}
\right) = \\[1em]
&= \text{div}.
\begin{pmatrix}
\text{CT3}(g³³[0.5]\ ∂x³∂ξ³[0.5]\ Fb[1]) \\
\text{CT3}\left(\dfrac{1}{2}\ g³³[1.5]\ (K[1] + K[2])\ (h[2] - h[1])\right) \\
\text{CT3}\left(\dfrac{1}{2}\ g³³[2.5]\ (K[2] + K[3])\ (h[3] - h[2])\right) \\
\text{CT3}(g³³[3.5]\ ∂x³∂ξ³[3.5]\ Ft[3])
\end{pmatrix} = \\[1em]
&= \begin{pmatrix}
\dfrac{1}{J[1]}\ \begin{pmatrix}
\dfrac{1}{2}\ J[1.5]\ g³³[1.5]\ (K[1] + K[2])\ (h[2] - h[1]) -{} \\
J[0.5]\ g³³[0.5]\ ∂x³∂ξ³[0.5]\ Fb[1]
\end{pmatrix} \\
\dfrac{1}{J[2]}\ \begin{pmatrix}
\dfrac{1}{2}\ J[2.5]\ g³³[2.5]\ (K[2] + K[3])\ (h[3] - h[2]) -{} \\
\dfrac{1}{2}\ J[1.5]\ g³³[1.5]\ (K[1] + K[2])\ (h[2] - h[1])
\end{pmatrix} \\
\dfrac{1}{J[3]}\ \begin{pmatrix}
J[3.5]\ g³³[3.5]\ ∂x³∂ξ³[3.5]\ Ft[3] -{} \\
\dfrac{1}{2}\ J[2.5]\ g³³[2.5]\ (K[2] + K[3])\ (h[3] - h[2])
\end{pmatrix}
\end{pmatrix}
\end{align}
The derivative of the tendency with respect to θ
is the matrix
\begin{align}
\dfrac{\partial θₜ}{\partial θ} &= \begin{pmatrix}
\dfrac{\partial θₜ[1]}{\partial θ[1]} &
\dfrac{\partial θₜ[1]}{\partial θ[2]} &
\dfrac{\partial θₜ[1]}{\partial θ[3]} \\
\dfrac{\partial θₜ[2]}{\partial θ[1]} &
\dfrac{\partial θₜ[2]}{\partial θ[2]} &
\dfrac{\partial θₜ[2]}{\partial θ[3]} \\
\dfrac{\partial θₜ[3]}{\partial θ[1]} &
\dfrac{\partial θₜ[3]}{\partial θ[2]} &
\dfrac{\partial θₜ[3]}{\partial θ[3]}
\end{pmatrix} = \\[1em]
&= \tiny\begin{pmatrix}
\begin{matrix}
\dfrac{J[1.5]\ g³³[1.5]}{2J[1]}\begin{pmatrix}K'[1]\ (h[2] - h[1]) -{} \\ (K[1] + K[2])\ h'[1]\end{pmatrix} -{} \\
\dfrac{J[0.5]}{J[1]}\ g³³[0.5]\ ∂x³∂ξ³[0.5]\ Fb'[1]
\end{matrix} &
\dfrac{J[1.5]\ g³³[1.5]}{2J[1]}\begin{pmatrix}K'[2]\ (h[2] - h[1]) +{} \\ (K[1] + K[2])\ h'[2]\end{pmatrix} &
0 \\[1em]
-\dfrac{J[1.5]\ g³³[1.5]}{2J[2]}\begin{pmatrix}K'[1]\ (h[2] - h[1]) -{} \\ (K[1] + K[2])\ h'[1]\end{pmatrix} &
\begin{matrix}
\dfrac{J[2.5]\ g³³[2.5]}{2J[2]}\begin{pmatrix}K'[2]\ (h[3] - h[2]) -{} \\ (K[2] + K[3])\ h'[2]\end{pmatrix} -{} \\
\dfrac{J[1.5]\ g³³[1.5]}{2J[2]}\begin{pmatrix}K'[2]\ (h[2] - h[1]) +{} \\ (K[1] + K[2])\ h'[2]\end{pmatrix}
\end{matrix} &
\dfrac{J[2.5]\ g³³[2.5]}{2J[2]}\begin{pmatrix}K'[3]\ (h[3] - h[2]) +{} \\ (K[2] + K[3])\ h'[3]\end{pmatrix} \\[1em]
0 &
-\dfrac{J[2.5]\ g³³[2.5]}{2J[3]}\begin{pmatrix}K'[2]\ (h[3] - h[2]) -{} \\ (K[2] + K[3])\ h'[2]\end{pmatrix} &
\begin{matrix}
\dfrac{J[3.5]}{J[3]}\ g³³[3.5]\ ∂x³∂ξ³[3.5]\ Ft'[3] -{} \\
\dfrac{J[2.5]\ g³³[2.5]}{2J[3]}\begin{pmatrix}K'[3]\ (h[3] - h[2]) +{} \\ (K[2] + K[3])\ h'[3]\end{pmatrix}
\end{matrix}
\end{pmatrix}
\end{align}
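This 3x3 Jacobian can be verified numerically. The pure-Python sketch below (with hypothetical metric values and arbitrary smooth choices for h, K, Fb, and Ft) evaluates the tendency directly and compares a central-difference Jacobian against the chain-rule assembly; note that the subdiagonal entries ∂θₜ[2]/∂θ[1] and ∂θₜ[3]/∂θ[2] pick up a minus sign from the divergence stencil:

```python
import math

# Hypothetical metric values for the 3-cell column; any positive numbers work.
Jc  = {1: 1.0, 2: 1.2, 3: 1.5}
Jf  = {0.5: 0.9, 1.5: 1.1, 2.5: 1.3, 3.5: 1.6}
g33 = {0.5: 2.0, 1.5: 1.9, 2.5: 1.8, 3.5: 1.7}
dx  = {0.5: 0.8, 3.5: 0.75}  # ∂x³∂ξ³ at the boundary faces

# Hypothetical differentiable choices for h, K, Fb, Ft and their derivatives.
h,  hp  = math.sin,  math.cos
K,  Kp  = math.exp,  math.exp
Fb, Fbp = math.tanh, lambda x: 1 - math.tanh(x) ** 2
Ft, Ftp = math.cos,  lambda x: -math.sin(x)

def tendency(t):
    """θₜ for θ = t, with the boundary fluxes set from Fb and Ft."""
    flux = [  # contravariant flux through each face, boundaries overwritten
        g33[0.5] * dx[0.5] * Fb(t[0]),
        0.5 * g33[1.5] * (K(t[0]) + K(t[1])) * (h(t[1]) - h(t[0])),
        0.5 * g33[2.5] * (K(t[1]) + K(t[2])) * (h(t[2]) - h(t[1])),
        g33[3.5] * dx[3.5] * Ft(t[2]),
    ]
    return [(Jf[n + 0.5] * flux[n] - Jf[n - 0.5] * flux[n - 1]) / Jc[n]
            for n in (1, 2, 3)]

def jacobian(t):
    """Analytic ∂θₜ/∂θ: the divergence of the face-flux derivatives."""
    t1, t2, t3 = t
    dflux = [  # ∂flux[face]/∂θ[m], nonzero entries only
        {1: g33[0.5] * dx[0.5] * Fbp(t1)},
        {1: 0.5 * g33[1.5] * (Kp(t1) * (h(t2) - h(t1)) - (K(t1) + K(t2)) * hp(t1)),
         2: 0.5 * g33[1.5] * (Kp(t2) * (h(t2) - h(t1)) + (K(t1) + K(t2)) * hp(t2))},
        {2: 0.5 * g33[2.5] * (Kp(t2) * (h(t3) - h(t2)) - (K(t2) + K(t3)) * hp(t2)),
         3: 0.5 * g33[2.5] * (Kp(t3) * (h(t3) - h(t2)) + (K(t2) + K(t3)) * hp(t3))},
        {3: g33[3.5] * dx[3.5] * Ftp(t3)},
    ]
    return [[(Jf[n + 0.5] * dflux[n].get(m, 0.0)
              - Jf[n - 0.5] * dflux[n - 1].get(m, 0.0)) / Jc[n]
             for m in (1, 2, 3)] for n in (1, 2, 3)]

# Central-difference check of every entry.
t0, eps = [0.3, -0.2, 0.5], 1e-6
J_analytic = jacobian(t0)
for m in range(3):
    tp, tm = list(t0), list(t0)
    tp[m] += eps
    tm[m] -= eps
    col = [(a - b) / (2 * eps) for a, b in zip(tendency(tp), tendency(tm))]
    for n in range(3):
        assert abs(col[n] - J_analytic[n][m]) < 1e-6
```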
Now, suppose that
unit_CT3_field = CT3.(ones(Spaces.FaceExtrudedFiniteDifferenceSpace(axes(θ))))
interp_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(interp)
grad_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(grad)
set_matrix_boundary_rows = Operators.SetBoundaryOperator(;
bottom = Operators.SetValue(UpperBidiagonalMatrixRow(C3(∂x³∂ξ³[0.5] * Fb'(θ[1])))),
top = Operators.SetValue(LowerBidiagonalMatrixRow(C3(∂x³∂ξ³[3.5] * Ft'(θ[3])))),
)
div_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(div)
After this SDI is implemented, the preferred way to specify the derivative matrix will be
div_matrix.(unit_CT3_field) .⋅ set_matrix_boundary_rows.(
interp_matrix.(K'.(θ)) .* grad.(h.(θ)) .+
interp.(K.(θ)) .* grad_matrix.(h'.(θ))
)
The first term in this matrix-matrix multiplication is
\text{div\_matrix}.(\text{unit\_CT3\_field}) = \begin{pmatrix}
-\dfrac{J[0.5]}{J[1]} & \dfrac{J[1.5]}{J[1]} & 0 & 0 \\
0 & -\dfrac{J[1.5]}{J[2]} & \dfrac{J[2.5]}{J[2]} & 0 \\
0 & 0 & -\dfrac{J[2.5]}{J[3]} & \dfrac{J[3.5]}{J[3]}
\end{pmatrix}
The second term roughly corresponds to the matrix
\begin{gather}
\text{set\_matrix\_boundary\_rows}.\begin{pmatrix}
\text{interp\_matrix}.(K'.(θ))\ .*\ \text{grad}.(h.(θ))\ .+ \\
\text{interp}.(K.(θ))\ .*\ \text{grad\_matrix}.(h'.(θ))
\end{pmatrix} =\\[1 em]
= \text{set\_matrix\_boundary\_rows}.\tiny\begin{pmatrix}
\begin{pmatrix}
\text{undefined} & \text{undefined} & \text{undefined} \\
\dfrac{1}{2}\ K'[1] & \dfrac{1}{2}\ K'[2] & 0 \\
0 & \dfrac{1}{2}\ K'[2] & \dfrac{1}{2}\ K'[3] \\
\text{undefined} & \text{undefined} & \text{undefined}
\end{pmatrix}\ .*\ \begin{pmatrix}
\text{undefined} \\
g³³[1.5]\ (h[2] - h[1]) \\
g³³[2.5]\ (h[3] - h[2]) \\
\text{undefined}
\end{pmatrix}\ .+ \\[1em]
\begin{pmatrix}
\text{undefined} \\
\dfrac{1}{2}\ (K[1] + K[2]) \\
\dfrac{1}{2}\ (K[2] + K[3]) \\
\text{undefined}
\end{pmatrix}\ .*\ \begin{pmatrix}
\text{undefined} & \text{undefined} & \text{undefined} \\
-g³³[1.5]\ h'[1] & g³³[1.5]\ h'[2] & 0 \\
0 & -g³³[2.5]\ h'[2] & g³³[2.5]\ h'[3] \\
\text{undefined} & \text{undefined} & \text{undefined}
\end{pmatrix}
\end{pmatrix} \normalsize=\\[1 em]
= \text{set\_matrix\_boundary\_rows}.\tiny\begin{pmatrix}
\begin{pmatrix}
\text{undefined} & \text{undefined} & \text{undefined} \\
\dfrac{g³³[1.5]}{2}\ K'[1]\ (h[2] - h[1]) & \dfrac{g³³[1.5]}{2}\ K'[2]\ (h[2] - h[1]) & 0 \\
0 & \dfrac{g³³[2.5]}{2}\ K'[2]\ (h[3] - h[2]) & \dfrac{g³³[2.5]}{2}\ K'[3]\ (h[3] - h[2]) \\
\text{undefined} & \text{undefined} & \text{undefined}
\end{pmatrix}\ .+ \\[1em]
\begin{pmatrix}
\text{undefined} & \text{undefined} & \text{undefined} \\
-\dfrac{g³³[1.5]}{2}\ (K[1] + K[2])\ h'[1] & \dfrac{g³³[1.5]}{2}\ (K[1] + K[2])\ h'[2] & 0 \\
0 & -\dfrac{g³³[2.5]}{2}\ (K[2] + K[3])\ h'[2] & \dfrac{g³³[2.5]}{2}\ (K[2] + K[3])\ h'[3] \\
\text{undefined} & \text{undefined} & \text{undefined}
\end{pmatrix}
\end{pmatrix} \normalsize=
\end{gather}
\begin{gather}
= \text{set\_matrix\_boundary\_rows}.\tiny\begin{pmatrix}
\text{undefined} & \text{undefined} & \text{undefined} \\
\dfrac{g³³[1.5]}{2}\begin{pmatrix}K'[1]\ (h[2] - h[1]) -{} \\ (K[1] + K[2])\ h'[1]\end{pmatrix} &
\dfrac{g³³[1.5]}{2}\begin{pmatrix}K'[2]\ (h[2] - h[1]) +{} \\ (K[1] + K[2])\ h'[2]\end{pmatrix} &
0 \\
0 &
\dfrac{g³³[2.5]}{2}\begin{pmatrix}K'[2]\ (h[3] - h[2]) -{} \\ (K[2] + K[3])\ h'[2]\end{pmatrix} &
\dfrac{g³³[2.5]}{2}\begin{pmatrix}K'[3]\ (h[3] - h[2]) +{} \\ (K[2] + K[3])\ h'[3]\end{pmatrix} \\
\text{undefined} & \text{undefined} & \text{undefined}
\end{pmatrix} \normalsize= \\[1em]
= \tiny\begin{pmatrix}
g³³[0.5]\ ∂x³∂ξ³[0.5]\ Fb'(θ[1]) & 0 & 0 \\
\dfrac{g³³[1.5]}{2}\begin{pmatrix}K'[1]\ (h[2] - h[1]) -{} \\ (K[1] + K[2])\ h'[1]\end{pmatrix} &
\dfrac{g³³[1.5]}{2}\begin{pmatrix}K'[2]\ (h[2] - h[1]) +{} \\ (K[1] + K[2])\ h'[2]\end{pmatrix} &
0 \\
0 &
\dfrac{g³³[2.5]}{2}\begin{pmatrix}K'[2]\ (h[3] - h[2]) -{} \\ (K[2] + K[3])\ h'[2]\end{pmatrix} &
\dfrac{g³³[2.5]}{2}\begin{pmatrix}K'[3]\ (h[3] - h[2]) +{} \\ (K[2] + K[3])\ h'[3]\end{pmatrix} \\
0 & 0 & g³³[3.5]\ ∂x³∂ξ³[3.5]\ Ft'(θ[3])
\end{pmatrix}
\end{gather}
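Multiplying these two factors numerically (again with hypothetical pure-Python placeholders for the metric values and closures) confirms that the product reproduces the tridiagonal Jacobian ∂θₜ/∂θ, with the boundary-flux derivatives entering only the corner entries:

```python
import math

# Hypothetical metric values and closures; same conventions as in the text.
Jc  = {1: 1.0, 2: 1.2, 3: 1.5}
Jf  = {0.5: 0.9, 1.5: 1.1, 2.5: 1.3, 3.5: 1.6}
g33 = {0.5: 2.0, 1.5: 1.9, 2.5: 1.8, 3.5: 1.7}
dx  = {0.5: 0.8, 3.5: 0.75}
h, hp, K, Kp = math.sin, math.cos, math.exp, math.exp
Fbp = lambda x: 1 - math.tanh(x) ** 2
Ftp = lambda x: -math.sin(x)
t1, t2, t3 = 0.3, -0.2, 0.5

# First factor: div_matrix.(unit_CT3_field), a 3x4 band matrix.
D = [[-Jf[0.5] / Jc[1], Jf[1.5] / Jc[1], 0, 0],
     [0, -Jf[1.5] / Jc[2], Jf[2.5] / Jc[2], 0],
     [0, 0, -Jf[2.5] / Jc[3], Jf[3.5] / Jc[3]]]

# Second factor: the 4x3 matrix just derived, with boundary rows set by
# set_matrix_boundary_rows.
M = [[g33[0.5] * dx[0.5] * Fbp(t1), 0, 0],
     [g33[1.5] / 2 * (Kp(t1) * (h(t2) - h(t1)) - (K(t1) + K(t2)) * hp(t1)),
      g33[1.5] / 2 * (Kp(t2) * (h(t2) - h(t1)) + (K(t1) + K(t2)) * hp(t2)), 0],
     [0, g33[2.5] / 2 * (Kp(t2) * (h(t3) - h(t2)) - (K(t2) + K(t3)) * hp(t2)),
      g33[2.5] / 2 * (Kp(t3) * (h(t3) - h(t2)) + (K(t2) + K(t3)) * hp(t3))],
     [0, 0, g33[3.5] * dx[3.5] * Ftp(t3)]]

J = [[sum(D[i][k] * M[k][j] for k in range(4)) for j in range(3)]
     for i in range(3)]

# The product is tridiagonal, and the boundary-flux derivatives Fb' and Ft'
# appear only in the (1,1) and (3,3) entries.
assert J[0][2] == 0 and J[2][0] == 0
assert abs(J[2][2] - (Jf[3.5] / Jc[3] * g33[3.5] * dx[3.5] * Ftp(t3)
                      - Jf[2.5] * g33[2.5] / (2 * Jc[3])
                      * (Kp(t3) * (h(t3) - h(t2)) + (K(t2) + K(t3)) * hp(t3)))) < 1e-9
```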
Note that it is also possible to avoid using a SetBoundaryOperator. We can do this by moving the boundary conditions from set_boundary_fluxes to interp and grad, and by moving the corresponding matrix boundary conditions from set_matrix_boundary_rows to interp_matrix and grad_matrix. For example, we could set
interp = Operators.InterpolateC2F(;
bottom = Operators.SetValue(1),
top = Operators.SetValue(1),
)
grad = Operators.GradientC2F(;
bottom = Operators.SetGradient(C3(∂x³∂ξ³[0.5] * Fb(θ[1]))),
top = Operators.SetGradient(C3(∂x³∂ξ³[3.5] * Ft(θ[3]))),
)
We could then obtain the same tendency of θ
by evaluating
θₜ = \text{div}.(\text{interp}.(K.(θ))\ .*\ \text{grad}.(h.(θ)))
The preferred way to express the derivative matrix would then be
div_matrix.(unit_CT3_field) .⋅ (
interp_matrix.(K'.(θ)) .* grad.(h.(θ)) .+
interp.(K.(θ)) .* grad_matrix.(h'.(θ))
)
In this case, the operator matrices would be
interp_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(
interp;
bottom = Operators.SetValue(UpperBidiagonalMatrixRow(0)),
top = Operators.SetValue(LowerBidiagonalMatrixRow(0)),
)
grad_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(
grad;
bottom = Operators.SetValue(UpperBidiagonalMatrixRow(C3(∂x³∂ξ³[0.5] * Fb'(θ[1])))),
top = Operators.SetValue(LowerBidiagonalMatrixRow(C3(∂x³∂ξ³[3.5] * Ft'(θ[3])))),
)
The second term in the matrix-matrix multiplication would then be
\begin{gather}
\text{interp\_matrix}.(K'.(θ))\ .*\ \text{grad}.(h.(θ))\ .+\ \text{interp}.(K.(θ))\ .*\ \text{grad\_matrix}.(h'.(θ)) =\\[1 em]
= \tiny\begin{matrix}
\begin{pmatrix}
0 & 0 & 0 \\
\dfrac{1}{2}\ K'[1] & \dfrac{1}{2}\ K'[2] & 0 \\
0 & \dfrac{1}{2}\ K'[2] & \dfrac{1}{2}\ K'[3] \\
0 & 0 & 0
\end{pmatrix}\ .*\ \begin{pmatrix}
g³³[0.5]\ ∂x³∂ξ³[0.5]\ Fb(θ[1]) \\
g³³[1.5]\ (h[2] - h[1]) \\
g³³[2.5]\ (h[3] - h[2]) \\
g³³[3.5]\ ∂x³∂ξ³[3.5]\ Ft(θ[3])
\end{pmatrix}\ .+ \\[1em]
\begin{pmatrix}
1 \\
\dfrac{1}{2}\ (K[1] + K[2]) \\
\dfrac{1}{2}\ (K[2] + K[3]) \\
1
\end{pmatrix}\ .*\ \begin{pmatrix}
g³³[0.5]\ ∂x³∂ξ³[0.5]\ Fb'(θ[1]) & 0 & 0 \\
-g³³[1.5]\ h'[1] & g³³[1.5]\ h'[2] & 0 \\
0 & -g³³[2.5]\ h'[2] & g³³[2.5]\ h'[3] \\
0 & 0 & g³³[3.5]\ ∂x³∂ξ³[3.5]\ Ft'(θ[3])
\end{pmatrix}
\end{matrix}
\end{gather}
In order to simplify things, we will have operator matrices contain 0's at the boundaries by default, so the boundary conditions for interp_matrix
will not actually need to be specified.
@tapios @charleskawczynski I have made the following updates to the SDI based on your comments:
After a lengthy discussion with Simon and Sriharsha today, it has become clear that the "permuted band matrix solve" algorithm will require a fair bit of work to ensure that it is performant and GPU-compatible. However, all of the tasks that come before the implementation of this new algorithm (up to testing the interface in ClimaAtmos) should be good as-is. Per the revised time estimates, these tasks should take 3--4 weeks, which should be enough time for us to settle on a decent implementation of the new algorithm. If I start later this week, this means that the interface will be tested in ClimaAtmos by around this time in June.
@simonbyrne @sriharshakandala Here is a list of all the matrix sparsity patterns that BlockMatrixSystemSolver
will need to support in the near future. For each case, I will specify the sparsity pattern of $J = \partial Y_t/\partial Y$, where $Y_t$ is the implicit tendency of $Y$, though the actual matrix required by the implicit solver will be $W = -I + \Delta t \gamma J$, where $I$ is the identity matrix and $\Delta t \gamma$ is a scalar. I will also use the following shorthands:
For ClimaLSM, we only need to support the sparsity pattern of the matrix specified in the comment above,
J = \begin{pmatrix} \dfrac{\partial Y_t.c.\theta}{\partial Y.c.\theta} \end{pmatrix} = \begin{pmatrix} \mathbb{T} \end{pmatrix}
In the future, ClimaLSM will also include the prognostic variable $Y.c.\psi$, but this is still several months away.
For the ClimaAtmos dycore, the sparsity pattern of $J$ with $T$ tracers (and no FCT, just upwinding or central differencing) is
J = \begin{pmatrix}
\dfrac{\partial Y_t.c.\rho}{\partial Y.c.\rho} & \dfrac{\partial Y_t.c.\rho}{\partial Y.c.\rho e_{tot}} & \dfrac{\partial Y_t.c.\rho}{\partial Y.c.\rho\chi_1} & \cdots & \dfrac{\partial Y_t.c.\rho}{\partial Y.c.\rho\chi_T} & \dfrac{\partial Y_t.c.\rho}{\partial Y.f.u_3\_\text{data}} \\
\dfrac{\partial Y_t.c.\rho e_{tot}}{\partial Y.c.\rho} & \dfrac{\partial Y_t.c.\rho e_{tot}}{\partial Y.c.\rho e_{tot}} & \dfrac{\partial Y_t.c.\rho e_{tot}}{\partial Y.c.\rho\chi_1} & \cdots & \dfrac{\partial Y_t.c.\rho e_{tot}}{\partial Y.c.\rho\chi_T} & \dfrac{\partial Y_t.c.\rho e_{tot}}{\partial Y.f.u_3\_\text{data}} \\
\dfrac{\partial Y_t.c.\rho\chi_1}{\partial Y.c.\rho} & \dfrac{\partial Y_t.c.\rho\chi_1}{\partial Y.c.\rho e_{tot}} & \dfrac{\partial Y_t.c.\rho\chi_1}{\partial Y.c.\rho\chi_1} & \cdots & \dfrac{\partial Y_t.c.\rho\chi_1}{\partial Y.c.\rho\chi_T} & \dfrac{\partial Y_t.c.\rho\chi_1}{\partial Y.f.u_3\_\text{data}} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\dfrac{\partial Y_t.c.\rho\chi_T}{\partial Y.c.\rho} & \dfrac{\partial Y_t.c.\rho\chi_T}{\partial Y.c.\rho e_{tot}} & \dfrac{\partial Y_t.c.\rho\chi_T}{\partial Y.c.\rho\chi_1} & \cdots & \dfrac{\partial Y_t.c.\rho\chi_T}{\partial Y.c.\rho\chi_T} & \dfrac{\partial Y_t.c.\rho\chi_T}{\partial Y.f.u_3\_\text{data}} \\
\dfrac{\partial Y_t.f.u_3\_\text{data}}{\partial Y.c.\rho} & \dfrac{\partial Y_t.f.u_3\_\text{data}}{\partial Y.c.\rho e_{tot}} & \dfrac{\partial Y_t.f.u_3\_\text{data}}{\partial Y.c.\rho\chi_1} & \cdots & \dfrac{\partial Y_t.f.u_3\_\text{data}}{\partial Y.c.\rho\chi_T} & \dfrac{\partial Y_t.f.u_3\_\text{data}}{\partial Y.f.u_3\_\text{data}}
\end{pmatrix} =
= \begin{pmatrix}
\mathbb{T}/\mathbb{P} & 0 & 0 & \cdots & 0 & \mathbb{B} \\
\mathbb{T}/\mathbb{P} & \mathbb{T}/\mathbb{P} & 0/\mathbb{T}/\mathbb{P} & \cdots & 0/\mathbb{T}/\mathbb{P} & \mathbb{Q} \\
\mathbb{T}/\mathbb{P} & 0 & \mathbb{T}/\mathbb{P} & \cdots & 0 & \mathbb{B} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbb{T}/\mathbb{P} & 0 & 0 & \cdots & \mathbb{T}/\mathbb{P} & \mathbb{B} \\
\mathbb{B} & \mathbb{B} & 0/\mathbb{B} & \cdots & 0/\mathbb{B} & \mathbb{T}
\end{pmatrix}
However, instead of using the exact value of $J$, we approximate it so that its sparsity pattern simplifies to
J \approx \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & \mathbb{B} \\
0 & 0 & 0 & \cdots & 0 & \mathbb{B} \\
0 & 0 & 0 & \cdots & 0 & \mathbb{B} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & \mathbb{B} \\
\mathbb{B} & \mathbb{B} & 0 & \cdots & 0 & \mathbb{T}
\end{pmatrix}
For AMIP, we only need to support $T = 1$, with $\chi = q_{tot}$. After AMIP, though, we will need to support $T \gg 1$, with $q_{liq}$, $q_{ice}$, $q_{rai}$, $q_{sno}$, and every aerosol represented by a prognostic variable. So, BlockMatrixSystemSolver
needs to perform well for the dycore with large values of $T$. At the moment, the best algorithm we have for this is the "Schur complement solve".
When we add EDMF to ClimaAtmos, we end up with a significantly more complicated Jacobian. We are not entirely certain of how sparse we can afford to make our approximation of $J$, but our best guess at the moment (with only one draft in the EDMF model, which corresponds to only one sub-grid-scale copy of the grid-scale state) is
\begin{gather}
J = \begin{pmatrix}
\dfrac{\partial\ \text{GS Tendency}}{\partial\ \text{GS State}} & \dfrac{\partial\ \text{GS Tendency}}{\partial\ \text{SGS State}} \\
\dfrac{\partial\ \text{SGS Tendency}}{\partial\ \text{GS State}} & \dfrac{\partial\ \text{SGS Tendency}}{\partial\ \text{SGS State}}
\end{pmatrix},\ \text{where}\ \dfrac{\partial\ \text{SGS Tendency}}{\partial\ \text{GS State}} \approx 0, \\[1 em]
\dfrac{\partial\ \text{GS Tendency}}{\partial\ \text{GS State}} \approx \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & \mathbb{B} \\
0 & \mathbb{T} & 0 & \cdots & 0 & \mathbb{B} \\
0 & 0 & \mathbb{T} & \cdots & 0 & \mathbb{B} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \mathbb{T} & \mathbb{B} \\
\mathbb{B} & \mathbb{B} & 0 & \cdots & 0 & \mathbb{T}
\end{pmatrix},\ \text{and}\ \dfrac{\partial\ \text{SGS Tendency}}{\partial\ \text{SGS State}} \approx \begin{pmatrix}
\mathbb{D} & 0 & 0 & \cdots & 0 & \mathbb{B} \\
\mathbb{D} & \mathbb{D} & 0 & \cdots & 0 & \mathbb{B} \\
\mathbb{D} & 0 & \mathbb{D} & \cdots & 0 & \mathbb{B} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbb{D} & 0 & 0 & \cdots & \mathbb{D} & \mathbb{B} \\
\mathbb{B} & 0 & 0 & \cdots & 0 & \mathbb{T}
\end{pmatrix}
\end{gather}
Since we plan to approximate the bottom-left matrix block as 0, our strategy for solving the linear problem $W\ \Delta Y = b$ will be to first solve a linear problem that corresponds to the bottom-right block, and to then solve another linear problem that corresponds to the top-left block. That is to say,
\begin{gather}
W\ \Delta Y = b \implies \\[1 em]
\begin{pmatrix}
-I + \Delta t\gamma\ \dfrac{\partial\ \text{GS Tendency}}{\partial\ \text{GS State}} & \Delta t\gamma\ \dfrac{\partial\ \text{GS Tendency}}{\partial\ \text{SGS State}} \\
0 & -I + \Delta t\gamma\ \dfrac{\partial\ \text{SGS Tendency}}{\partial\ \text{SGS State}}
\end{pmatrix} \begin{pmatrix} \Delta\text{ GS State} \\ \Delta\text{ SGS State} \end{pmatrix} = \begin{pmatrix} \text{GS RHS} \\ \text{SGS RHS} \end{pmatrix} \implies \\[1 em]
\left(-I + \Delta t\gamma\ \dfrac{\partial\ \text{SGS Tendency}}{\partial\ \text{SGS State}}\right)\ \Delta\text{ SGS State} = \text{SGS RHS}\ \text{and} \\
\left(-I + \Delta t\gamma\ \dfrac{\partial\ \text{GS Tendency}}{\partial\ \text{GS State}}\right)\ \Delta\text{ GS State} = \text{GS RHS} - \Delta t\gamma\ \dfrac{\partial\ \text{GS Tendency}}{\partial\ \text{SGS State}}\ \Delta\text{ SGS State}
\end{gather}
This means that our implicit solver will first compute changes at the sub-grid-scale ($\Delta\text{ SGS State}$), and then it will compute changes at the grid-scale ($\Delta\text{ GS State}$). We do not yet know how we will approximate the top-right matrix block of the Jacobian, so we do not know what its sparsity structure will be. Fortunately, this doesn't actually matter, since that block will only be used for computing a matrix-vector product, rather than for solving a linear problem.
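The two-stage solve can be illustrated with a toy upper block-triangular system (plain Python, no external libraries; the 2x2 blocks below are arbitrary stand-ins for the GS and SGS blocks):

```python
# A minimal sketch of the two-stage solve for an upper block-triangular
# system [[A, B], [0, C]] [x_gs, x_sgs] = [b_gs, b_sgs].

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(M, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

A = [[2.0, 1.0], [0.5, 3.0]]   # -I + Δtγ ∂(GS tendency)/∂(GS state)
B = [[0.3, 0.0], [0.1, 0.2]]   # Δtγ ∂(GS tendency)/∂(SGS state)
C = [[4.0, 1.0], [1.0, 5.0]]   # -I + Δtγ ∂(SGS tendency)/∂(SGS state)
b_gs, b_sgs = [1.0, 2.0], [3.0, 4.0]

# Stage 1: SGS increment from the bottom-right block alone.
x_sgs = solve(C, b_sgs)
# Stage 2: GS increment, with the SGS coupling moved to the right-hand side.
rhs = [b_gs[i] - sum(B[i][j] * x_sgs[j] for j in range(2)) for i in range(2)]
x_gs = solve(A, rhs)

# Check against solving the full 4x4 block system directly.
full = [A[0] + B[0], A[1] + B[1], [0, 0] + C[0], [0, 0] + C[1]]
x_full = solve(full, b_gs + b_sgs)
assert all(abs(a - b) < 1e-12 for a, b in zip(x_gs + x_sgs, x_full))
```

Because the bottom-left block is approximated as 0, solving the SGS block first and back-substituting into the GS block matches a direct solve of the full system exactly, as the final assertion confirms.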
In the near future, we only plan to use EDMF with $T = 1$ or $T \approx 1$, since there is still a fair bit of work that needs to be done before we are ready to test it with large numbers of tracers. So, we can afford to have BlockMatrixSystemSolver
scale poorly with $T$ when we use EDMF, which justifies using the "permuted matrix solve" algorithm proposed in this SDI.
I appreciate your thoroughness, @dennisYatunin!
One important simplification, though: In the dycore, we can treat all tracers fully explicitly, so T=0 is fine, or, if you want to include moisture, T=1 (but even that is not strictly necessary). We only need to treat the fast waves (acoustic and gravity) implicitly, and tracers play no role in them.
@tapios Thanks! We actually stopped making that assumption in ClimaAtmos sometime last year. I think the intent was to ensure that all forms of moisture get treated roughly the same way as energy in the implicit solve, but neither Daniel nor Zhaoyi nor I remember how significant the resulting timestep increase was. Should we revert to making that approximation, or add a flag to toggle between the two options? In either case, the "Schur complement solve" scales very well with $T$, so whether or not the vertical advection of tracers is treated implicitly should not have a significant impact on performance.
@dennisYatunin It's ok to treat the moisture tracers (q_t, q_l, q_i) implicitly and in the same way as energy. But no need to do the same for other tracers.
@dennisYatunin : Is there a reason we are deviating from the more traditional band storage format (https://www.ibm.com/docs/en/essl/6.2?topic=representation-blas-general-band-storage-mode) for storing band matrices, which essentially stores the bands? We can use an offset array and eliminate storing the "unused data" for off-diagonal bands!
@sriharshakandala We are indeed storing the matrices in a traditional band storage format, except we are using "row-major" storage instead of "column-major" storage (the link you sent uses "column-major" storage). We are storing the matrices row-by-row in order to encode them as Fields, and we only store the nonzero entries in each row. The top and bottom rows are exceptions to this; since they have fewer nonzero entries than the interior rows, we need to pad them with several "dummy entries" that do not lie in the matrix (these are equivalent to the *s in the link you sent). Also, we want our matrices to be Fields so that we can include them in Field broadcasts, which makes it much easier to specify Jacobian blocks.
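As a rough pure-Python analogue of this row-major band storage (illustrative only; the real rows are BandMatrixRow elements of a Field), each row stores just its nonzero band entries, with dummy padding entries at the boundaries playing the role of the '*' entries in the BLAS band format:

```python
# A 4x4 tridiagonal matrix stored row by row as (sub, main, super) entries.
# None marks the dummy padding in the boundary rows.
rows = [
    (None, 2.0, 1.0),
    (0.5, 2.0, 1.0),
    (0.5, 2.0, 1.0),
    (0.5, 2.0, None),
]

def matvec(rows, v):
    """Multiply the row-major band matrix by a vector."""
    out = []
    for i, (lo, di, up) in enumerate(rows):
        s = di * v[i]
        if lo is not None:
            s += lo * v[i - 1]
        if up is not None:
            s += up * v[i + 1]
        out.append(s)
    return out

# Compare against the equivalent dense matrix.
dense = [[2.0, 1.0, 0.0, 0.0],
         [0.5, 2.0, 1.0, 0.0],
         [0.0, 0.5, 2.0, 1.0],
         [0.0, 0.0, 0.5, 2.0]]
v = [1.0, -1.0, 2.0, 0.5]
expected = [sum(dense[i][j] * v[j] for j in range(4)) for i in range(4)]
assert matvec(rows, v) == expected
```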
Thanks, @dennisYatunin. For the quaddiagonal and pentadiagonal matrices, I am assuming we are either losing them in the Jacobian matrix approximation or the bands are located contiguously!
@tapios Time estimates have been updated!
Thanks, @dennisYatunin. Looking good!
Purpose
The interface we use to specify matrices and solve linear systems of equations for the implicit solver needs to be refactored and extended. This interface, which is currently spread across several files in ClimaCore and ClimaAtmos (ClimaCore.jl/src/Operators/stencilcoefs.jl, ClimaCore.jl/src/Operators/pointwisestencil.jl, ClimaCore.jl/src/Operators/operator2stencil.jl, ClimaAtmos.jl/src/tendencies/implicit/wfact.jl, and ClimaAtmos.jl/src/tendencies/implicit/schur_complement_W.jl), is unnecessarily convoluted and involves a large number of hardcoded assumptions, which makes it extremely challenging for new users to experiment with the implicit solver and to add new implicit tendencies. In particular, the next stage of EDMF development will involve adding many implicit tendencies to the atmosphere model, and our approximation of the total implicit tendency's Jacobian matrix will end up with a highly non-trivial sparsity pattern. In order for more than a single developer to be able to understand and implement the implicit solver for the dycore+EDMF, we will first need to make the changes outlined in this SDI. In addition, the land modeling team has been experimenting with their own implicit solver, and these changes will speed up their development and add the new functionality they require.

There is currently a draft PR that contains a detailed sketch of all the proposed changes: #1190
Cost/benefits/risks
The cost/risk is development time. The benefit will be a significantly reduced complexity of implicit solver implementations, both in ClimaAtmos and in ClimaLSM. There will be a simple, well-documented interface to all of the numerical algorithms required by implicit solvers, which will allow user-facing code to be relatively short and easily extensible.
Producers
@dennisYatunin
Components
Inputs
BandMatrixRow and MultiplyColumnwiseBandMatrixField
The type we currently use to represent an element of a band matrix field is called StencilCoefs, and it is extremely confusing and poorly designed. Upon refactoring, this will become BandMatrixRow, which will have the following improvements: convenient constructors, such as DiagonalMatrixRow(1) and TridiagonalMatrixRow(1, 2, 3), and arithmetic between BandMatrixRows and LinearAlgebra.UniformScalings, so that different types of matrices can be mixed together; e.g., LinearAlgebra.I / 2 - TridiagonalMatrixRow(1, 2, 3) + 2 * PentadiagonalMatrixRow(1, 2, 3, 4, 5) == PentadiagonalMatrixRow(2, 3, 4.5, 5, 10). These improvements will also be reflected in fields of BandMatrixRows, which will be aliased as ColumnwiseBandMatrixFields for dispatch. (The alias name is meant to indicate that every column has its own set of BandMatrixRows, which, when taken together, can be interpreted as a band matrix.) So, for example, users will be able to write (@. LinearAlgebra.I / 2 - tridiagonal_matrix_field + 2 * pentadiagonal_matrix_field) == (@. PentadiagonalMatrixRow(field1, field2, field3, field4, field5)).
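As a sanity check of the mixed-type arithmetic, here is a pure-Python sketch (names and representation are illustrative, not the proposed Julia API): a row is a tuple of entries on bands -b..b, addition zero-pads the narrower row to the wider bandwidth, and a UniformScaling term touches only the main diagonal:

```python
def pad(row, b):
    """Promote a row with bands -k..k to bands -b..b by zero-padding."""
    k = len(row) // 2
    return (0.0,) * (b - k) + tuple(row) + (0.0,) * (b - k)

def add(r1, r2):
    """Add two rows, promoting to the wider bandwidth."""
    b = max(len(r1), len(r2)) // 2
    return tuple(x + y for x, y in zip(pad(r1, b), pad(r2, b)))

def scale(c, row):
    return tuple(c * x for x in row)

def add_scaling(c, row):
    """row + c*I: a UniformScaling only affects the main diagonal."""
    b = len(row) // 2
    return row[:b] + (row[b] + c,) + row[b + 1:]

# I/2 - TridiagonalMatrixRow(1, 2, 3) + 2 * PentadiagonalMatrixRow(1, 2, 3, 4, 5)
result = add_scaling(0.5, add(scale(-1, (1, 2, 3)), scale(2, (1, 2, 3, 4, 5))))
assert result == (2, 3, 4.5, 5, 10)  # PentadiagonalMatrixRow(2, 3, 4.5, 5, 10)
```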
The operators we currently use for matrix-matrix and matrix-vector multiplication are Operators.ComposeStencils() and Operators.ApplyStencil(), respectively. Again, these are confusing and implemented in a rather roundabout way. Upon refactoring, these will both become ⋅, which is an alias for Operators.MultiplyColumnwiseBandMatrixField(). This will allow users to write something like @. matrix_field1 ⋅ matrix_field2 ⋅ field, instead of needing to write @. apply(compose(matrix_field1, matrix_field2), field). In addition, the amount of code used to implement matrix multiplication can be reduced roughly by a factor of 3 (as shown in the sketch), and this simplified code will be easier to update for GPUs in the near future.

FiniteDifferenceOperatorTermsMatrix
When a finite difference operator is applied to a field (`@. op(field)`), the result is equivalent to multiplying some matrix by that field (`@. matrix_field ⋅ field`). The operator we currently use to generate this matrix is `Operators.Operator2Stencil(op)`; in order to clarify what this operator is doing, it will be renamed to `Operators.FiniteDifferenceOperatorTermsMatrix(op)`. If `op_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(op)` and `ones_field = ones(axes(field))`, users will be able to confirm that `(@. op(field)) == (@. op_matrix(ones_field) ⋅ field)`. As a quirk of our implementation, it is also the case that `(@. op_matrix(ones_field) ⋅ field) == (@. op_matrix(field) ⋅ ones_field)`, which allows us to somewhat simplify expressions involving products with operator matrices.

Aside from the name change, there are two new features that we need to add to `FiniteDifferenceOperatorTermsMatrix`. First, for EDMF development, we need to add support for multi-argument operators, so that `FiniteDifferenceOperatorTermsMatrix(op)` will always generate the matrix that corresponds to the last argument of `op`. For example, given a two-argument operator `op` (such as `WeightedInterpolateF2C` or `Upwind3rdOrderBiasedProductC2F`), users will be able to define `op_matrix` and confirm that `(@. op(field1, field2)) == (@. op_matrix(field1, ones_field2) ⋅ field2) == (@. op_matrix(field1, field2) ⋅ ones_field2)`.
Second, for land model development, we need to add support for specifying the corner elements of matrices by adding special boundary conditions for `FiniteDifferenceOperatorTermsMatrix`. The simple expression presented earlier, `(@. op(field)) == (@. op_matrix(ones_field) ⋅ field)`, is only true when `op` has trivial boundary conditions; i.e., when `op` is a center-to-face operator with boundary conditions that cause it to return 0 on the top and bottom faces, or when `op` is a face-to-center operator without any boundary conditions (which means that it uses the values at the top and bottom faces as-is). In general, operators are affine transformations at the boundaries, not linear transformations. This means that, for every `op` and `field`, there is some `boundary_field` that is zero everywhere except at the boundaries, and `(@. op(field)) == (@. op_matrix(ones_field) ⋅ field + boundary_field)`. However, we only use matrices to represent derivatives of operators with respect to their inputs, and, up until now, it has always been the case that `boundary_field` is a constant that does not depend on `field`. More specifically, if `boundary_matrix_field` represents `∂(boundary_field)/∂(field)`, then `∂(@. op(field))/∂(field)` can be expressed as `@. op_matrix(ones_field) + boundary_matrix_field`. So, if `boundary_field` is a constant, then `boundary_matrix_field` is the zero matrix and can be ignored in our computations. This amounts to assuming that the corner elements of our matrices are always zero. Due to new requirements from the land model, we will no longer be able to make this assumption, so we will need to add support for the `Operators.SetValue` boundary condition for `FiniteDifferenceOperatorTermsMatrix`, which will allow users to specify nonzero elements of `boundary_matrix_field` that should be added to `@. op_matrix(ones_field)`.

Additionally, it would be good to refactor how finite difference operators are implemented so that the code for every operator does not need to be duplicated in order to implement the `FiniteDifferenceOperatorTermsMatrix` for that operator. Currently, every operator implements a method for `stencil_interior`, `stencil_left_boundary`, and `stencil_right_boundary`, and each of these methods returns some value (the value of the operator's result at a particular point in space). The `FiniteDifferenceOperatorTermsMatrix` for every operator also implements the same three methods, but each of its methods returns a `BandMatrixRow` (or, rather, a "`StencilCoefs`") whose entries add up to the value returned by the operator's corresponding method. There are currently almost 700 lines of code required to implement these duplicated methods, and more will need to be added in order to support multi-argument operators. It should be fairly straightforward to refactor things so that:

- The versions of `stencil_interior`, `stencil_left_boundary`, and `stencil_right_boundary` that get implemented for every operator return a `BandMatrixRow`.
- When the operator is used as a `FiniteDifferenceOperatorTermsMatrix`, this `BandMatrixRow` gets returned as-is. If matrix boundary conditions are specified, the first or last value of this `BandMatrixRow` may need to be modified.
- When the operator is used on its own, rather than as a `FiniteDifferenceOperatorTermsMatrix`, the entries of this `BandMatrixRow` get added together before being returned.

However, this last change would only improve things "under the hood", without any immediate benefit to users. In order to avoid unnecessarily delaying EDMF and land model development, this change will be the last step of this SDI.
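To make the affine-boundary discussion concrete, here is a hedged sketch using a toy one-dimensional difference operator (not ClimaCore code; `A` and `B` play the roles of `op_matrix(ones_field)` and `boundary_matrix_field`). With a constant `SetValue`-style boundary value, the operator's Jacobian is just the linear part; once the boundary value depends on the state, as in the new land-model requirements, a nonzero corner element must be added:

```python
import numpy as np

n = 4

def op(field, bc):
    """Toy difference operator: out[i] = field[i+1] - field[i], using
    bc(field) as a ghost value above the top entry."""
    ext = np.append(field, bc(field))
    return ext[1:] - ext[:-1]

# Linear part of the operator: op_matrix(ones_field) in the text's notation.
A = -np.eye(n) + np.diag(np.ones(n - 1), 1)

const_bc = lambda f: 7.0          # boundary value independent of the state
state_bc = lambda f: 2.0 * f[-1]  # boundary value depending on the state

f = np.arange(1.0, n + 1)
df = np.full(n, 1e-6) * np.arange(1, n + 1)  # small state perturbation

# Constant BC: the Jacobian is exactly A (boundary_matrix_field is zero).
assert np.allclose(op(f + df, const_bc) - op(f, const_bc), A @ df)

# State-dependent BC: A alone is wrong; the corner element fixes it.
B = np.zeros((n, n))
B[-1, -1] = 2.0                   # boundary_matrix_field's corner entry
assert not np.allclose(op(f + df, state_bc) - op(f, state_bc), A @ df)
assert np.allclose(op(f + df, state_bc) - op(f, state_bc), (A + B) @ df)
```

The corner entry `B[-1, -1]` is exactly the kind of element that the proposed `Operators.SetValue` boundary condition for `FiniteDifferenceOperatorTermsMatrix` would let users supply.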
#### `ColumnwiseBlockMatrix`
The type we currently use to represent block matrices is the `SchurComplementW` object, which is hardcoded to only work for the dycore in `ClimaAtmos`. This can be generalized to a `ColumnwiseBlockMatrix`, which will be a simple dictionary that maps pairs of field names (one name for the row and another name for the column) to their corresponding matrix blocks, each of which can be a `ColumnwiseBandMatrixField` or a `LinearAlgebra.UniformScaling`. The sketch in the draft PR shows how the `ColumnwiseBlockMatrix` will be used.

The only challenge in implementing the `ColumnwiseBlockMatrix` is ensuring type stability, particularly when it gets used in a `BlockMatrixSystemSolver` (see the next section); without type stability, we will have long compilation times and unnecessary allocations. Fortunately, this has already been worked out and tested in the sketch. In particular, the `@block_name` macro will return a type-stable generalization of what we currently call a `property_chain` for both the row and column names, and the corresponding row and column fields will be accessed by using a type-stable generalization of `Fields.single_field`.

#### `BlockMatrixSystemSolver`
We are currently solving the linear system of equations specified by the `SchurComplementW` object by reducing it to a smaller tridiagonal system of equations, solving the reduced problem, and using the reduced problem's solution to compute the original problem's solution. However, this strategy will not work when we add the new implicit EDMF tendencies, because the new sparsity pattern of the matrix will not allow us to specify the reduced problem without performing a computationally expensive dense matrix inversion. So, we will need to implement a new algorithm for solving sparse block matrix systems. Per @simonbyrne's advice, this algorithm will first permute the `ColumnwiseBlockMatrix` so that, instead of blocks that correspond to pairs of variables, we end up with blocks that correspond to pairs of cells.

To illustrate how this permutation will work, consider a simplified example: the Jacobian of a `FieldVector` $Y$ with total implicit tendency $Y_t$ that is defined on a single column with two cells, with a field $c$ defined on cell centers and a field $f$ defined on cell faces. In this example, the permutation requires us to drop all of the matrix elements that correspond to the top or bottom cell face from the linear solve, which we can do as long as the only nonzero matrix element for that cell face is the "identity element". In ClimaAtmos, we know that this will always be the case for the top cell face; in the example above, the "identity element" for this cell face is $\partial Y_t.f[2.5]/\partial Y.f[2.5]$. In our code, the value of $\partial Y_t.f[2.5]/\partial Y.f[2.5]$ (or, rather, the value of `-1 + Δtγ * ∂(Yₜ.f.w.components.data.:1[2.5])/∂(Y.f.w.components.data.:1[2.5])`, and the corresponding EDMF updraft velocity terms) will typically be $-1$, so, using the somewhat hand-wavy notation from above, we will typically have that $\partial Y_t.\square/\partial Y.f[2.5] = \partial Y_t.f[2.5]/\partial Y.\square = -\delta_{\square,\, f[2.5]}$. In ClimaLSM, there are currently no prognostic variables defined on cell faces, but, if there were, this would be the case for the bottom cell face, rather than the top cell face. In general, as long as we deal with the nonzero identity element separately, we can drop all of the matrix elements related to one cell face from the linear solve. There are also two other types of variables whose corresponding matrix elements we will be able to drop from the linear solve (because the only nonzero elements will be the identity elements): variables that do not have a nonzero implicit tendency, like `Y.c.uₕ` in ClimaAtmos, and variables that do not lie on cell centers or on cell faces, like those related to the river model in ClimaLSM.

After the matrix is permuted, the new linear system will be solved using a band matrix solver. Both the dycore and the land model will require a tridiagonal solver, and EDMF may also require a pentadiagonal solver. If $N$ denotes the number of cells and $V$ denotes the number of variables, then the band matrix solver will be applied to a matrix of $N \times N$ blocks, where each block is itself a $V \times V$ matrix. To simplify our initial implementation, we can treat each block as a dense matrix, which we will represent using a `StaticArrays.SMatrix`. Although this means that we will need to perform dense matrix inversions, these shouldn't be too expensive as long as $V$ is relatively small. If we were to instead modify the current "Schur complement solve" algorithm for EDMF, we would end up needing to perform a dense matrix inversion on a large block matrix, where each block is one of the original $N \times N$ blocks that represent pairs of variables. Since it will generally be the case that $N > V$, the new "permuted band matrix solve" algorithm will be significantly more performant for EDMF than the current algorithm.

Unfortunately, it will not be possible to eliminate dense matrix inversions from the new algorithm altogether by specializing on the sparsity structure of the $V \times V$ blocks. Even though each $V \times V$ block in ClimaAtmos will be an arrowhead matrix (both for the dycore and for EDMF), the band matrix solver will need to evaluate linear combinations of products of the blocks and their inverses, which will not have any particularly nice sparsity structure. Specifically, the inverse of an arrowhead matrix is a diagonal-plus-rank-one (DPR1) matrix, and all of the following matrix-matrix products are neither arrowhead nor DPR1 matrices: arrowhead times arrowhead, DPR1 times DPR1, and arrowhead times DPR1 (unless they are inverses of each other). This means that the new algorithm is likely to be slower than the current one for the dycore, since the current one does not involve any dense matrix inversions. So, in addition to implementing the new algorithm for `BlockMatrixSystemSolver`, we will also need to port over the algorithm currently implemented in `ClimaAtmos.jl/src/tendencies/implicit/schur_complement_W.jl`. This will require updating the current algorithm to use the new interface for `BandMatrixRow` and `MultiplyColumnwiseBandMatrixField`, and it will also require allowing the `BlockMatrixSystemSolver` to determine which of the two algorithms it should use based on the type of the `ColumnwiseBlockMatrix` given to it.

### Task breakdown
1. Implement `BandMatrixRow` and `MultiplyColumnwiseBandMatrixField`. Ensure that these objects have good docstrings.
2. Test `BandMatrixRow` and `MultiplyColumnwiseBandMatrixField` in CI. These tests should ensure correctness, GPU compatibility, and type stability (i.e., no unexpected allocations). The tests should also be helpful for performance analysis.
3. Implement `FiniteDifferenceOperatorTermsMatrix`. Add a helpful docstring, and add more unit tests to the test suite. PR: #1399 Estimated Completion Date: July 31st
4. Implement `ColumnwiseBlockMatrix` and `BlockMatrixSystemSolver`, but only with the current "Schur complement solve" algorithm from ClimaAtmos. Add docstrings and unit tests. PR: #1436 Estimated Completion Date: July 25th
5. Add the new "permuted band matrix solve" algorithm to `BlockMatrixSystemSolver`. Expand the docstring and add unit tests. ~~Skipping this in favor of JFNK with clever preconditioning.~~ Skipping this in favor of the `SchurComplementReductionSolve` algorithm in conjunction with a `BlockArrowheadSchurComplementPreconditioner`. PR: #1551

### Reviewers