thisrod opened this issue 4 years ago
> To handle multidimensional grids, these three types will have a method `contract!(y, A, x, axis)` as well as the usual `*(A, x)` and `mul!(y, A, x)`. The `contract!` method applies the matrix `A` along the specified axis of the array `x`, and writes the result to the array `y`. This method will wrap an efficient convolution routine where possible.
This makes sense, but the main difficulty with the convolution is that you want to do each direction all in one pass.
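To make the intended semantics concrete, here is a pure-Python sketch of an out-of-place `contract` (the helper names are mine, not the proposed API, and this ignores the in-place and efficiency aspects): it applies a matrix along one axis of a 2D array, independently at each point along the other axis.

```python
# Hypothetical sketch of contract!(y, A, x, axis) semantics, out of place,
# on a 2D array represented as a list of rows.

def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def contract(A, x, axis):
    """Apply A along the given axis of the 2D array x."""
    if axis == 0:                      # act down the columns
        out_cols = [matvec(A, list(c)) for c in zip(*x)]
        return [list(r) for r in zip(*out_cols)]
    else:                              # axis == 1: act along the rows
        return [matvec(A, row) for row in x]

# Second-difference matrix on 4 points (boundary rows simply truncated):
D = [[-2, 1, 0, 0],
     [1, -2, 1, 0],
     [0, 1, -2, 1],
     [0, 0, 1, -2]]

x = [[float(i + j) for j in range(4)] for i in range(4)]  # linear data
dxx = contract(D, x, axis=0)   # second difference along the first axis
dyy = contract(D, x, axis=1)   # second difference along the second axis
lap = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(dxx, dyy)]
```

Because the sample data is linear, the interior second differences vanish; only the truncated boundary rows are nonzero. Wrapping this over a convolution routine instead of a dense matvec is the point of the structured matrix types.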
> Along with the existing `BandedMatrix`, I'm proposing a `BandedToeplitzMatrix` and `BandedCirculantMatrix` to make evenly spaced grids efficient. There will also be an `EdgeAffineArray` to handle boundary conditions.
Makes sense.
> This can do periodic boundary conditions properly. The current design, with a single ghost point, can't do that.
You could do more than a single ghost point.
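For concreteness, a pure-Python sketch of the circulant case (names are mine): a banded circulant matrix is just a stencil applied with wrap-around indexing, and a stencil of half-width `w` needs `w` ghost points per end, which is why one ghost point is not enough for wider stencils.

```python
# Sketch: applying a centred stencil with periodic wrap-around is equivalent
# to multiplying by a banded circulant matrix.

def circulant_apply(stencil, u):
    n = len(u)
    w = len(stencil) // 2            # half-width of the stencil
    return [sum(s * u[(i + k - w) % n] for k, s in enumerate(stencil))
            for i in range(n)]

# Fourth-order second-derivative stencil: half-width 2, so a single ghost
# point per end cannot impose periodicity for it.
stencil = [-1/12, 4/3, -5/2, 4/3, -1/12]
u = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]   # a periodic sample
d2 = circulant_apply(stencil, u)
```

On this periodic sample the stencil acts as a pure scaling, which is a quick sanity check that the wrap-around indexing is right.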
> What's a minimum viable solution to the curl problem? The sweet spot might be “contraction over x + contraction over y + contraction over z”. This is closed under composition, and it's not that hard to roll your own if you want something more general. Let someone else do a general lazy compositions and linear combinations library.
I'd just directly make a new operator type for it.
> Also, the types are just matrices with special forms, not exotic things like `GhostDerivativeOperator`.
Seems like that might be hard to generate GPU code for though?
Thanks for reading that Chris.
> the main difficulty with the convolution is you want to do each direction all in one pass
Then there should be `contract!(y, x, A, axis1, B, axis2, ...) == contract!(y, x, A, axis1) + contract!(y, x, B, axis2) + ...`
That sounds hard to implement efficiently. Have the machine learning people already written it?
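A pure-Python sketch of that summed form (helper names are mine, out of place rather than in place): the multi-operator call should equal the elementwise sum of the single-axis contractions.

```python
# Hypothetical sketch: contract_many(x, (A, axis1), (B, axis2), ...) equals
# the elementwise sum of single-axis contractions.

def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def contract1(A, x, axis):
    if axis == 0:
        return [list(r) for r in zip(*(matvec(A, list(c)) for c in zip(*x)))]
    return [matvec(A, row) for row in x]

def contract_many(x, *ops):
    """Sum of contractions: one term per (A, axis) pair."""
    terms = [contract1(A, x, axis) for A, axis in ops]
    return [[sum(t[i][j] for t in terms) for j in range(len(x[0]))]
            for i in range(len(x))]

D = [[-2, 1, 0], [1, -2, 1], [0, 1, -2]]      # second difference, 3 points
x = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [4.0, 8.0, 16.0]]
lap = contract_many(x, (D, 0), (D, 1))        # both directions in one call
```

Doing this efficiently means fusing the per-axis passes rather than materialising each term, which is exactly the hard part being discussed.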
> Seems like that might be hard to generate GPU code for though?
I hadn't thought about that. The `contract!` methods for the structured arrays would need to call a GPU convolution kernel. No doubt that exists? The dense contractions can be done with stride fiddling, so a strided GPU matmul would be enough.
I'll start prototyping some of this.
> Then there should be `contract!(y, x, A, axis1, B, axis2, ...) == contract!(y, x, A, axis1) + contract!(y, x, B, axis2) + ...` That sounds hard to implement efficiently. Have the machine learning people already written it?
`conv`. That's why it's all about not building matrices, but instead building representations of kernels to then call the right stencil compiler.
> I hadn't thought about that. The `contract!` methods for the structured arrays would need to call a GPU convolution kernel. No doubt that exists? The dense contractions can be done with stride fiddling, so a strided GPU matmul would be enough.
The optimal thing here is still `conv`, which means you don't want to send any memory to the GPU. The exception of course is when `dx` is not fixed, in which case we really need a better stencil compiler.
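The point that `conv` is optimal on an evenly spaced grid can be seen directly: the interior of a derivative operator is Toeplitz, i.e. a convolution with a fixed stencil. A minimal sketch (pure Python, valid-mode convolution; names are mine):

```python
# On an evenly spaced grid, applying the interior of a derivative operator
# is a convolution with a fixed stencil.

def conv_valid(u, kernel):
    k = len(kernel)
    return [sum(kernel[m] * u[i + m] for m in range(k))
            for i in range(len(u) - k + 1)]

dx = 0.5
stencil = [1 / dx**2, -2 / dx**2, 1 / dx**2]    # second-order d²/dx²
u = [x * x for x in (0.0, 0.5, 1.0, 1.5, 2.0)]  # samples of f(x) = x²
d2u = conv_valid(u, stencil)                    # all 2.0: exact for a quadratic
```

When `dx` varies, the kernel is no longer shift-invariant and this reduction to `conv` breaks down, hence the need for a stencil compiler.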
I'm raising this issue for two reasons. The refactoring I'm proposing makes sense to me. But perhaps I am mistaken, because I have missed some important design goals or implementation constraints of DiffEqOperators, in which case it would be good to clarify and document them.

## Requirements of DiffEqOperators
The basic requirement is that the package represents finite-difference derivatives and operators that impose boundary conditions, and provides efficient ways to apply those operators to arrays of function values. There are also some more subtle requirements:
- For compatibility with the interface exported by DifferentialEquations, the operators act on arrays, and can operate in place with `mul!`. They can absorb multiplication by a constant.
- The package is intended to implement the method of lines, where the operators will be reused for millions of time steps. For efficiency, applications of the operators must avoid allocating storage, or be able to reuse storage that was allocated previously.
- The operators can be contracted efficiently along any axis of a multidimensional array. Here I'm using “contraction” in the sense of “tensor contraction”, to mean applying an operator along one axis, independently at each point along the other axes. This normally refers to linear operators, but the generalisation to affine operators is straightforward. In practice, this means using the efficient convolution routines that are available for neural network applications.
- There can be a different boundary condition at each point on the boundary, but the uniform case should be efficient.
- Given operators for `∂/∂x`, `∂/∂y` and `∂/∂z`, it is easy to construct an efficient quantum angular momentum operator `(x, y, z) × (∂/∂x, ∂/∂y, ∂/∂z)`. I'll refer to this as “the curl problem”. It strikes me as a good candidate to have its own package, and Sheehan Olver's LazyArrays might be halfway there. Solving the problem is beyond the scope of this issue, but the requirement for it to be solvable lurks in the background.

## Current design
The operators that are evaluated in user code have type `GhostDerivative`. This associates a derivative `D` of type `DerivativeOperator` with a boundary condition `B` of type `RobinBC` or `PeriodicBC`. The `GhostDerivative` inputs an array `u`, applies `B` and `D`, then outputs `D(B(u))`. The value `B(u)` is a `BoundaryPaddedVector`, containing the vector `u` and a ghost node value for each end. The operator `D` inputs this `BoundaryPaddedVector` and outputs a vector of derivative values.

Multidimensional arrays are handled by `DerivativeOperator` using a type parameter. It is unclear how they are or will be handled by boundary conditions. There are also some composition and linear combination types that address the curl problem.
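The `D(B(u))` pipeline can be sketched in a few lines of pure Python (the function names here are illustrative stand-ins, not the actual DiffEqOperators types):

```python
# Sketch of the current pipeline: B pads u with one ghost value per end,
# then D applies the interior stencil to the padded vector.

def pad_boundary(u, left, right):
    """Role of B / BoundaryPaddedVector: u plus one ghost value per end."""
    return [left] + list(u) + [right]

def second_derivative(padded, dx):
    """Role of D: second-difference stencil over the padded vector."""
    return [(padded[i - 1] - 2 * padded[i] + padded[i + 1]) / dx**2
            for i in range(1, len(padded) - 1)]

dx = 0.5
u = [0.25, 1.0, 2.25]                 # interior samples of x² at 0.5, 1.0, 1.5
padded = pad_boundary(u, 0.0, 4.0)    # boundary values of x² at 0.0 and 2.0
d2u = second_derivative(padded, dx)   # D(B(u)) -> [2.0, 2.0, 2.0]
```

Note the single ghost value per end, which is the limitation discussed below for periodic boundary conditions with wider stencils.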
## Limitations of this design
- There are a lot of types, and it takes days to learn them well enough to help maintain the package.
- Periodic boundary conditions can't be properly implemented, because they require multiple ghost nodes at each end of the axis.
- To use these types efficiently, users will need to be very careful about allocations of `BoundaryPaddedVector`.

## Proposed refactoring
I like the design of ApproxFun, where the number crunching is done by sparse matrix types, and these are wrapped in zero-cost operator abstractions to provide meaningful constructors, pretty printing, and other ergonomics. In DiffEqOperators, the abstractions should provide in-place constructors, to allow time-varying boundary conditions to reuse previously allocated sparse matrices. This issue focuses on the sparse matrices, and does not discuss the operator abstractions in detail.

Along with the existing `BandedMatrix`, I'm proposing a `BandedToeplitzMatrix` and `BandedCirculantMatrix` to make evenly spaced grids efficient. There will also be an `EdgeAffineArray` to handle boundary conditions.

To handle multidimensional grids, these three types will have a method `contract!(y, A, x, axis)` as well as the usual `*(A, x)` and `mul!(y, A, x)`. The `contract!` method applies the matrix `A` along the specified axis of the array `x`, and writes the result to the array `y`. This method will wrap an efficient convolution routine where possible.

Conceptually, an `EdgeAffineArray` is an array of `EdgeMatrix{T}`. An `EdgeMatrix{T}` has the structure shown in the diagram above. (I'm going to omit some element type parameters here.) Most elements come from a bulk matrix of type `T`, except that dense blocks `A` and `B` are added at the corners to allow for irregular elements near the boundary. I'm assuming that circular boundary conditions have the form of a `BandedCirculantMatrix`, and that there is never a need for dense blocks in the other two corners. This structure has two important features: it is closed under composition, and it can represent the linear part of every differential and boundary condition operator supplied by DiffEqOperators.

An `EdgeMatrix` lacks two things that are required for boundary conditions: the `A` and `B` blocks might vary at different points on the boundary, and operators with boundary conditions are generally affine, not linear. These are the responsibilities of `EdgeAffineArray{N,M}`. This represents an `N-1` dimensional array of affine operators, whose linear parts have the edge matrix structure. These operators act along the `M`th axis of an `N` dimensional array. It would make sense to implement `EdgeMatrix` as an `EdgeAffineArray` with a single element.

In general, the `A` and `B` parts of an `EdgeAffineArray` are `2+(N-1)` dimensional arrays, interpreted as `N-1` dimensional arrays of dense matrices. Uniform boundary conditions are included free with broadcasting.

## How this better satisfies the requirements
- This can do periodic boundary conditions properly. The current design, with a single ghost point, can't do that.
- It's simpler. It doesn't have special padded types, just ordinary arrays. If those types don't exist, their values will never have to be garbage collected. Also, the types are just matrices with special forms, not exotic things like `GhostDerivativeOperator`.
- Easier to teach and learn. What I've written here would suffice to document how it works. I have tried to document how DiffEqOperators works now, and that isn't easy.

## Open questions
Should an `EdgeAffineArray{N}` store an `N` dimensional array of coefficients separately? The benefit is that things like `f(x,y,z)·∂/∂x` might be evaluated with better cache efficiency and fewer allocations. The downside is that this structure is not generally closed under composition. It is for the special case of a coefficient function times a first derivative operator, because its commutator with a coefficient function is another coefficient function. Is that still the case with boundary conditions? Higher derivatives might still be closed, but will be more complicated. That might be a Version 2 feature.

What's a minimum viable solution to the curl problem? The sweet spot might be “contraction over x + contraction over y + contraction over z”. This is closed under composition, and it's not that hard to roll your own if you want something more general. Let someone else do a general lazy compositions and linear combinations library.