SciML / DiffEqOperators.jl

Linear operators for discretizations of differential equations and scientific machine learning (SciML)
https://docs.sciml.ai/DiffEqOperators/stable/

Refactoring to allow periodic boundary conditions #210

Open thisrod opened 4 years ago

thisrod commented 4 years ago

I'm raising this issue for two reasons. The refactoring I'm proposing makes sense to me. But perhaps I am mistaken, because I have missed some important design goals or implementation constraints of DiffEqOperators, in which case it would be good to clarify and document them.

Requirements of DiffEqOperators

The basic requirement is that the package represents finite-difference derivatives and operators that impose boundary conditions, and provides efficient ways to apply those operators to arrays of function values. There are also some more subtle requirements.

Current design

The operators that are evaluated in user code have type GhostDerivative. This associates a derivative D of type DerivativeOperator with a boundary condition B of type RobinBC or PeriodicBC. The GhostDerivative inputs an array u, applies B and D, then outputs D(B(u)). The value B(u) is a BoundaryPaddedVector, containing the vector u and a ghost node value for each end. The operator D inputs this BoundaryPaddedVector and outputs a vector of derivative values.
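To make the B-then-D pipeline concrete, here is a minimal NumPy sketch, assuming homogeneous Dirichlet conditions and a second-order centered stencil. The function names are mine, not the package's; in the package, B would build a BoundaryPaddedVector and D a DerivativeOperator.

```python
import numpy as np

# Sketch of D(B(u)): B pads u with one ghost value per end, then D applies
# a centered stencil to the padded vector. With homogeneous Dirichlet
# conditions, both ghost values are simply zero.
def pad_dirichlet0(u):
    return np.concatenate(([0.0], u, [0.0]))

def second_derivative(padded, dx):
    # 3-point centered second difference, one output per interior node.
    return (padded[:-2] - 2.0 * padded[1:-1] + padded[2:]) / dx**2

dx = 0.01
x = dx * np.arange(1, 100)      # interior grid; boundaries at 0 and 1
u = np.sin(np.pi * x)           # vanishes at both boundaries
d2u = second_derivative(pad_dirichlet0(u), dx)
# d2u ≈ -π² sin(πx), up to O(dx²) truncation error
```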

Multidimensional arrays are handled by DerivativeOperator using a type parameter. It is unclear how they are or will be handled by boundary conditions.

There are also some composition and linear combination types that address the curl problem.

Limitations of this design

There are a lot of types, and it takes days to learn them well enough to help maintain the package.

Periodic boundary conditions can't be properly implemented, because they require multiple ghost nodes at each end of the axis.
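To see why: a fourth-order centered first derivative uses a five-point stencil, so a periodic axis needs two wrapped ghost values at each end. A NumPy sketch under that assumption (function names are illustrative):

```python
import numpy as np

# Wrap `halfwidth` values from each end of u around to the other end.
def periodic_pad(u, halfwidth):
    return np.concatenate((u[-halfwidth:], u, u[:halfwidth]))

def d1_order4(u, dx):
    # 5-point, 4th-order centered first-derivative weights (1, -8, 0, 8, -1)/12;
    # a single ghost point per end would truncate this stencil.
    p = periodic_pad(u, 2)
    return (p[:-4] - 8.0 * p[1:-3] + 8.0 * p[3:-1] - p[4:]) / (12.0 * dx)

n = 64
dx = 2 * np.pi / n
x = dx * np.arange(n)
du = d1_order4(np.sin(x), dx)   # ≈ cos(x) to 4th order
```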

To use these types efficiently, users will need to be very careful about allocations of BoundaryPaddedVector.

Proposed refactoring

I like the design of ApproxFun, where the number crunching is done by sparse matrix types, and these are wrapped in zero-cost operator abstractions to provide meaningful constructors, pretty printing, and other ergonomics. In DiffEqOperators, the abstractions should provide in-place constructors, to allow time-varying boundary conditions to reuse previously allocated sparse matrices. This issue focuses on the sparse matrices, and does not discuss the operator abstractions in detail.

Along with the existing BandedMatrix, I'm proposing a BandedToeplitzMatrix and BandedCirculantMatrix to make evenly spaced grids efficient. There will also be an EdgeAffineArray to handle boundary conditions.
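A BandedCirculantMatrix would be fully determined by one stencil row, and its matvec is a cyclic convolution. A hypothetical NumPy sketch of the idea (the names are mine):

```python
import numpy as np

# y[i] = Σ_k stencil[k] * x[(i + offsets[k]) mod n] -- the action of a
# banded circulant matrix, i.e. a periodic constant-coefficient stencil.
def circulant_matvec(stencil, offsets, x):
    y = np.zeros_like(x)
    for w, off in zip(stencil, offsets):
        y += w * np.roll(x, -off)   # roll(x, -off)[i] == x[(i + off) mod n]
    return y

# Second-difference stencil (1, -2, 1) on a periodic grid:
x = np.array([0.0, 1.0, 0.0, 0.0])
y = circulant_matvec([1.0, -2.0, 1.0], [-1, 0, 1], x)
# y == [1, -2, 1, 0]
```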

To handle multidimensional grids, these three types will have a method contract!(y, A, x, axis) as well as the usual *(A, x) and mul!(y, A, x). The contract! method applies the matrix A along the specified axis of the array x, and writes the result to the array y. This method will wrap an efficient convolution routine where possible.
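In NumPy terms, an out-of-place version of the proposed contract! contract could look like this sketch (the name and signature follow the proposal above; the implementation is illustrative, not optimized):

```python
import numpy as np

# Apply the matrix A along one axis of x, leaving the other axes alone.
def contract(A, x, axis):
    moved = np.moveaxis(x, axis, 0)               # target axis to the front
    out = np.tensordot(A, moved, axes=([1], [0])) # matmul over that axis
    return np.moveaxis(out, 0, axis)              # restore the axis order

x = np.ones((3, 5, 7))
A = 2.0 * np.eye(5)
y = contract(A, x, axis=1)    # doubles x along its second axis
```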

[diagram: EdgeMatrix structure, a banded bulk with dense corner blocks A and B]

Conceptually, an EdgeAffineArray is an array of EdgeMatrix{T}. An EdgeMatrix{T} has the structure shown in the diagram above. (I'm going to omit some element type parameters here.) Most elements come from a bulk matrix of type T, except that dense blocks A and B are added at the corners to allow for irregular elements near the boundary. I'm assuming that circular boundary conditions have the form of a BandedCirculantMatrix, and there is never a need for dense blocks in the other two corners. This structure has two important features. It is closed under composition, and it can represent the linear part of every differential and boundary condition operator supplied by DiffEqOperators.
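One plausible reading of that structure as a matvec, sketched in NumPy: the bulk stencil fills the interior rows, and the dense corner blocks supply the boundary rows the stencil cannot reach. All names and shapes here are assumptions.

```python
import numpy as np

def edge_matvec(stencil, A, B, x):
    n = len(x)
    half = len(stencil) // 2
    y = np.zeros(n)
    # Bulk: a constant (Toeplitz) stencil applied wherever it fits.
    for i in range(half, n - half):
        y[i] = np.dot(stencil, x[i - half : i + half + 1])
    # Corners: dense blocks for the irregular boundary rows.
    y[: A.shape[0]] = A @ x[: A.shape[1]]
    y[-B.shape[0] :] = B @ x[-B.shape[1] :]
    return y

# Second derivative (dx = 1) with one-sided second-order boundary rows:
v = np.arange(6.0) ** 2                      # v'' ≡ 2
top = np.array([[2.0, -5.0, 4.0, -1.0]])     # one-sided stencil, left edge
bottom = np.array([[-1.0, 4.0, -5.0, 2.0]])  # mirrored, right edge
y = edge_matvec(np.array([1.0, -2.0, 1.0]), top, bottom, v)
# y ≈ [2, 2, 2, 2, 2, 2]
```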

An EdgeMatrix lacks two things that are required for boundary conditions: the A and B blocks might vary at different points on the boundary, and operators with boundary conditions are generally affine, not linear. These are the responsibilities of EdgeAffineArray{N,M}. This represents an N-1 dimensional array of affine operators, whose linear parts have the edge matrix structure. These operators act along the Mth axis of an N dimensional array. It would make sense to implement EdgeMatrix as an EdgeAffineArray with a single element.

In general, the A and B parts of EdgeAffineArray are 2+(N-1) = N+1 dimensional arrays, interpreted as N-1 dimensional arrays of dense matrices. Uniform boundary conditions come for free with broadcasting.
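A toy version of those semantics for N = 2, M = 2, sketched in NumPy: one affine map per row, with uniform boundary data as the broadcast special case. All names are mine.

```python
import numpy as np

def edge_affine_apply(L, b, x):
    # L: (m, n) linear part applied along the second axis of x;
    # b: (rows, m) per-row affine offsets, or (m,) to broadcast a
    #    uniform boundary condition across every row.
    return x @ L.T + b

x = np.ones((4, 5))
L = np.eye(5)
uniform = np.full(5, 2.0)                       # same data on every row
per_row = np.arange(4.0)[:, None] * np.ones(5)  # data varying along the edge
y1 = edge_affine_apply(L, uniform, x)           # every entry becomes 3.0
y2 = edge_affine_apply(L, per_row, x)           # row i holds 1 + i
```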

How this better satisfies the requirements

Open questions

Should an EdgeAffineArray{N} store an N dimensional array of coefficients separately? The benefit is that things like f(x,y,z)·∂/∂x might be evaluated with better cache efficiency and fewer allocations. The downside is that this structure is not generally closed under composition. It is for the special case of coefficient function times first derivative operator, because its commutator with a coefficient function is another coefficient function. Is that still the case with boundary conditions? Higher derivatives might still be closed, but will be more complicated. That might be a Version 2 feature.

What's a minimum viable solution to the curl problem? The sweet spot might be “contraction over x + contraction over y + contraction over z”. This is closed under composition, and it's not that hard to roll your own if you want something more general. Let someone else write a general lazy-composition-and-linear-combination library.
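That sweet spot can be prototyped as a list of (matrix, axis) terms whose action is the sum of per-axis contractions. A sketch, with illustrative names:

```python
import numpy as np

# Apply A along one axis of x (as in the contract! proposal above).
def contract(A, x, axis):
    moved = np.moveaxis(x, axis, 0)
    return np.moveaxis(np.tensordot(A, moved, axes=([1], [0])), 0, axis)

def sum_of_contractions(terms, x):
    # terms: list of (matrix, axis) pairs; the operator is the sum of
    # each matrix applied along its own axis of x.
    return sum(contract(A, x, axis) for A, axis in terms)

x = np.ones((4, 4, 4))
I = np.eye(4)
y = sum_of_contractions([(I, 0), (2.0 * I, 1), (3.0 * I, 2)], x)
# y == (1 + 2 + 3) · x
```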

ChrisRackauckas commented 4 years ago

> To handle multidimensional grids, these three types will have a method contract!(y, A, x, axis) as well as the usual *(A, x) and mul!(y, A, x). The contract! method applies the matrix A along the specified axis of the array x, and writes the result to the array y. This method will wrap an efficient convolution routine where possible.

This makes sense, but the main difficulty with the convolution is you want to do each direction all in one pass.

> Along with the existing BandedMatrix, I'm proposing a BandedToeplitzMatrix and BandedCirculantMatrix to make evenly spaced grids efficient. There will also be an EdgeAffineArray to handle boundary conditions.

Makes sense.

> This can do periodic boundary conditions properly. The current design, with a single ghost point, can't do that.

You could do more than a single ghost point.

> What's a minimum viable solution to the curl problem? The sweet spot might be “contraction over x + contraction over y + contraction over z”. This is closed under composition, and it's not that hard to roll your own if you want something more general. Let someone else write a general lazy-composition-and-linear-combination library.

I'd just directly make a new operator type for it.

> Also, the types are just matrices with special forms, not exotic things like GhostDerivativeOperator.

Seems like that might be hard to generate GPU code for though?

thisrod commented 4 years ago

Thanks for reading that Chris.

> the main difficulty with the convolution is you want to do each direction all in one pass

Then there should be contract!(y, A, x, axis1, B, axis2, ...) == contract!(y, A, x, axis1) + contract!(y, B, x, axis2) + ... That sounds hard to implement efficiently. Have the machine learning people already written it?

> Seems like that might be hard to generate GPU code for though?

I hadn't thought about that. The contract methods for the structured arrays would need to call a GPU convolution kernel. No doubt that exists? The dense contractions can be done with stride fiddling, so a strided GPU matmul would be enough.

I'll start prototyping some of this.

ChrisRackauckas commented 4 years ago

> Then there should be contract!(y, A, x, axis1, B, axis2, ...) == contract!(y, A, x, axis1) + contract!(y, B, x, axis2) + ... That sounds hard to implement efficiently. Have the machine learning people already written it?

`conv`. That's why it's all about not building matrices, but instead building representations of kernels to then call the right stencil compiler.

> I hadn't thought about that. The contract methods for the structured arrays would need to call a GPU convolution kernel. No doubt that exists? The dense contractions can be done with stride fiddling, so a strided GPU matmul would be enough.

The optimal thing here is still `conv`, which means you don't want to send any memory to the GPU. The exception, of course, is when `dx` is not fixed, in which case we really need a better stencil compiler.