SciML / DiffEqBase.jl

The lightweight Base library for shared types and functionality for defining differential equation and scientific machine learning (SciML) problems

Create DAE types and a promotion structure #1

Closed ChrisRackauckas closed 7 years ago

ChrisRackauckas commented 7 years ago

As discussed in https://github.com/JuliaDiffEq/Roadmap/issues/5, we have set out for the DAE types something like DAEProblem for ODEs defined by an implicit function f(t,u,du,resid), where resid is the residual (does anyone have a better common name for that arg? Or is that good?), and MassMatrixODEProblem for problems with a mass matrix (extends the ODE problem to have a spot for the mass matrix). What should we name that field? mass_matrix? We would then implement a promotion structure.

Ideally you'd then just write your algorithm at the top level with enough promotion information exposed so that you don't lose performance. So say my Rosenbrock algorithms can solve a MassMatrixODEProblem; I would write those in a manner such that, if someone does

solve(prob,Rosenbrock23;kwargs)

and prob is an ODEProblem, it will be promoted to a MassMatrixODEProblem with mass_matrix = I and given to my Rosenbrock routine.
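
Roughly, the conversion I am picturing is something like this (the struct layouts and promote_problem here are hypothetical, just to illustrate the idea; how to hook it into Julia's actual promotion machinery is the open question below):

using LinearAlgebra  # for the UniformScaling identity I

# Hypothetical stand-ins for the problem types discussed above
struct ODEProblem{F,U,T}
    f::F
    u0::U
    tspan::T
end

struct MassMatrixODEProblem{F,U,T,M}
    f::F
    u0::U
    tspan::T
    mass_matrix::M
end

# Promotion just fills in the identity mass matrix
promote_problem(prob::ODEProblem) =
    MassMatrixODEProblem(prob.f, prob.u0, prob.tspan, I)

# so a mass-matrix-aware solver could accept either:
# solve(promote_problem(prob), Rosenbrock23; kwargs)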

I don't really know how to do promotions at all, so could someone explain this @mauro3?

Note that some starter code is in AlgebraicDiffEq.jl, but it is probably easier to just rewrite this given what we have from ODEProblem in DiffEqBase, since a lot has changed. For each of these we should also have an appropriate TestProblem and build_..._problem function for testing.

I can do this over the weekend if needed, but maybe someone else who's more interested in DAEs should take a stab at this to get to know the infrastructure well and identify potential problems. @pwl or @mauro3?

Once this is together, I'll make a common interface dispatch for Sundials' idasol to move that code out of AlgebraicDiffEq.jl. Once that's done, AlgebraicDiffEq.jl will be a shell package, which is fine because it will be ready for people to add methods if they so please (I'll be leaving my Rosenbrock functions and creating new ones in OrdinaryDiffEq.jl since I tend to consider mass matrix problems more as ODEs. That can be a discussion though).

mauro3 commented 7 years ago

I'll see whether I can look into it tomorrow night, but I'm pretty short on time.

ChrisRackauckas commented 7 years ago

I understand, it's fine. I really just wanted to get the ODEs on lock; DAEs can take their time. With the interface hammered out and the ODEs together, I'm just going to be spending time improving the Rosenbrock and SDE methods for a bit. The interface discussion is more important. I just wanted to pull this out to a separate thread.

ChrisRackauckas commented 7 years ago

Hey, I set up the DAEProblem to match the ODEProblem but with the implicit ODE form f(t,u,du,out) (and an initial du0). I did this because, with the function overloading stuff worked out, the last thing for "stability" was to do this and move the AlgebraicDiffEq.jl stuff to Sundials. So there's also a PR I'm putting into Sundials to get IDA onto the common interface through this, and that should be the last large change.

I'll be holding off on the promotion structure and the extra types and let you handle that, because again, the only reason I wanted to speed up a little on the full DAEs was to get Sundials down (and the function overloading for ParameterizedFunctions).

ChrisRackauckas commented 7 years ago

I think we were missing two cases:

Constant mass matrices show up in lots of discretizations of PDEs, and you can do quite a few more optimizations on them. Also, some methods only work with constant mass matrices (which is why some methods in MATLAB require the mass matrix to be constant).

Constrained ODEs came up in the discussion https://github.com/ApproxFun/SpectralTimeStepping.jl/issues/1#issuecomment-261814725. Constrained ODEs allow for more methods (projection methods) than the full implicit DAE, so it would be good to have them separately definable so that methods can target those problems.

I think it should promote as ODEProblem -> Constant Mass Matrices -> General Mass Matrices -> DAEProblem, since it's easy to handle the promotion there, and I don't see a use for going down the other branch, which would be Constrained ODE -> DAE.

Additionally we should probably talk about IMEX. Does any intermediate make sense for IMEX? Or just IMEX -> DAE?

mauro3 commented 7 years ago

I don't really know what "constrained ODEs" are. Are the constraints equality or inequality constraints?

Regarding IMEX, I know of two ways to define them: ARKode like M u' = f_e(t,u) + f_i(t,u) and PETSc like f_e(t,du) = f_i(t,u,du). Presumably for the first there is a distinction between no M, constant M and non-constant M. So, IMEX_noM -> IMEX_constM -> IMEX_M -> IMEX_PETSc. And also promoting to non-IMEX problems: IMEX_noM->ODEProblem, IMEX_constM -> Constant Mass Matrices, IMEX_M -> General Mass Matrices, IMEX_PETSc -> DAE.

ChrisRackauckas commented 7 years ago

I don't really know what "constrained ODEs" are. Are the constraints equality or inequality constraints?

Semi-explicit ODE. I think they're always equality constraints. du = f(t,u) and 0=g(t,u) (with some conditions). This formulation is sufficient for things like the boundary conditions on the Funs, differential equations on manifolds, and many physical problems. There are certain projection methods which can make these much easier to solve than a full DAE.
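
As a quick concrete instance (just an illustration, not a proposed API), a pendulum in Cartesian coordinates fits this semi-explicit form, with λ (the scaled rod tension) as the purely algebraic variable:

# Differential part: u' = f(t, u, λ), with u = (x, y, vx, vy)
function pendulum_f(t, u, λ)
    x, y, vx, vy = u
    grav = 9.81
    return [vx, vy, -λ[1] * x, -λ[1] * y - grav]
end

# Algebraic constraint: 0 = g(t, u, λ), here the unit rod length x^2 + y^2 = 1
pendulum_g(t, u, λ) = [u[1]^2 + u[2]^2 - 1.0]

A projection method can integrate the differential part with a standard ODE method and use the constraint to project back onto the manifold each step, which is exactly what makes these easier than a full DAE.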

So, IMEX_noM -> IMEX_constM -> IMEX_M -> IMEX_PETSc. And also promoting to non-IMEX problems: IMEX_noM->ODEProblem, IMEX_constM -> Constant Mass Matrices, IMEX_M -> General Mass Matrices, IMEX_PETSc -> DAE.

Makes complete sense.

mauro3 commented 7 years ago

Are those constraints then not equivalent to a singular mass matrix? Edit: I guess it's more restrictive. g can be a vector-valued function?

ChrisRackauckas commented 7 years ago

Ahh, I see it now. I think there's an extra constraint that they have on the Jacobian of g? Would it be useful to keep them separate and promote in most cases? I know the projection methods for these have to solve the algebraic equation on g and then use that to fix some variables and solve the ODE, which allows any ODE method to be used in that case.

ChrisRackauckas commented 7 years ago

I think we can cut down on the types a little bit by using some standard dispatches. Another thing that came up for me is that a semilinear IMEX problem is quite common, like a discretization of a semilinear PDE (a reaction-diffusion equation). Knowing that u' = f(u) + g(u) has f linear means Crank-Nicolson (Trapezoid) uses a linear solve instead of a nonlinear solve, greatly enhancing performance.

So I think that we can designate whether a part is linear via dispatch. I was thinking of using LinearMap from LinearMaps.jl since that's what IterativeSolvers.jl will be going towards. In DiffEqBase, we could allow a user to pass in a matrix and automatically wrap it into a linear map, or allow them to pass a LinearMap type (so that they can pass a function which computes A*v for matrix-free methods; this would come in handy with Laplacians). Since it's all type information, the solvers could then simply do:

if typeof(f) <: LinearMap
  # Use linear solve
else
  # Use nonlinear solve
end

As long as this is part of the agreed on interface, solvers can choose to support it. I think it's a conservative extension which would be quite useful.
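
On the user side, a sketch of how declaring the linear part could look (the automatic wrapping of a plain matrix is the hypothetical part; LinearMap itself is LinearMaps.jl's type):

using LinearMaps, SparseArrays

# Explicit (sparse) operator, e.g. a 1D Laplacian stencil
A = spdiagm(-1 => ones(9), 0 => fill(-2.0, 10), 1 => ones(9))
f_linear = LinearMap(A)                # wrapped matrix

# Matrix-free version: only the action v -> A*v is provided
f_matfree = LinearMap(v -> A * v, 10)

f_linear isa LinearMap   # true, so a solver would take the linear-solve branch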

Also, instead of specifying constant mass matrices and such, we should just allow mass matrices to be either a function or a matrix. The solvers can do a type check and they'll know whether it's constant or not, and that has no runtime cost.

Lastly, there's no need in this style to distinguish between having a mass matrix and not having one, since the default mass matrix is I, the UniformScaling operator. This is just a small immutable so there's no cost to having it around, and again a compile-time type check for typeof(prob.M) <: UniformScaling tells you whether there's a mass matrix or not.

Of course, all of this can be made easier and more sensible with a traits interface or at least a dispatch interface for things like is_constant_massmatrix(prob), but this shows that this can mostly be achieved by dispatch.
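
For instance, a minimal dispatch-based version of that trait could look like this (the mass_matrix field name and these function names are placeholders, not an agreed-on API):

using LinearAlgebra

# Constant mass matrix: a plain matrix, or the UniformScaling default I
is_constant_massmatrix(prob) =
    prob.mass_matrix isa Union{AbstractMatrix, UniformScaling}

# "Has a mass matrix" here just means it is not the default identity
has_massmatrix(prob) = !(prob.mass_matrix isa UniformScaling)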

ChrisRackauckas commented 7 years ago

If we then go with this dispatch-based format, then we can change ODEProblem to have an optional argument for the mass matrix and another type parameter based off of that. Using that, our types are:

ODEProblem -> DAEProblem

ImplicitIMEXProblem (Is this a good name for the PETSc version? SemiImplicitIMEXProblem is too long) -> DAEProblem

ConstrainedODEProblem -> DAEProblem

(we may need to check for singularity of the mass matrix to actually say if a mass-matrix ODE problem is not a DAE.)

With that, the remaining question would be, what to do with IMEXProblem?

IMEXProblem -> ODEProblem

or

IMEXProblem -> ImplicitIMEXProblem?

Can you have two different promotions? Both make sense. The first would be nice because then you can just stick the problem into RK4() without care if needed (say you suddenly needed it to support some weird array type). But the second would be nice because I presume that would make it work well with PETSc?

Then of course the type information on the mass matrix determines whether there is no mass matrix, a constant one, or a function of time: M(t,y,dy) for not in-place and M(t,y,dy,m) for in-place (and can it depend on dy? May it need to depend on dy or just M(t,y,m)?).
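
A sketch of how a solver could act on that type information (the names here are placeholders, and the callable signature is whichever one we settle on):

using LinearAlgebra

# Pick the mass matrix at (t, y) via dispatch; the constant branches are no-ops
# that the compiler can elide.
current_massmatrix(M::UniformScaling, t, y) = M   # "no mass matrix": just I
current_massmatrix(M::AbstractMatrix, t, y) = M   # constant mass matrix
current_massmatrix(M, t, y) = M(t, y)             # time-dependent: assumed callable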

Then for linearity, we determine it the same way from the type of each function, e.g. whether it is a LinearMap as above.

Then we document the problems, how to declare linear functions, and how to declare mass matrices. From there it's up to the solvers to handle as much or as little as they want.

I think this really cleans it up to something easily manageable and simple for users, but also lets a solver extract as much performance out as possible. Thoughts?

pwl commented 7 years ago

May it need to depend on dy or just M(t,y,m)?

I have never heard of a problem with a mass matrix depending on dy. I guess you can't call it a mass matrix anymore if it depends on dy, as the equation loses the mass matrix structure. I think the most general mass matrix equation would be M(t,y)*dy = G(t,y) (at least this is the most general equation I've seen specialized methods for in Hairer & Wanner VI.6).

And for the promotions, this creates two distinct paths of promotion, which may be confusing and lead to difficult-to-maintain code. Either both paths give identical objects, or you could be solving different problems depending on the path an algorithm/user chose. If the instances are supposed to be identical it would also imply that the intermediate types ODEProblem and ImplicitIMEXProblem hold the same amount of information, which defeats the purpose of having both types.

ChrisRackauckas commented 7 years ago

I have never heard of a problem with a mass matrix depending on dy. I guess you can't call it a mass matrix anymore if it depends on dy, as the equation loses the mass matrix structure. I think the most general mass matrix equation would be M(t,y)*dy = G(t,y) (at least this is the most general equation I've seen specialized methods for in Hairer & Wanner VI.6).

Alright, then just stick to M(t,y) and M(t,y,m) in-place.

If the instances are supposed to be identical it would also imply that the intermediate types ODEProblem and ImplicitIMEXProblem hold the same amount of information, which defeats the purpose of having both types.

But they aren't identical. One simplifies the IMEX problem by putting the functions together; the other incorporates the du into the implicit part (in the way shown by @mauro3). Those are very different ways to change an IMEX problem.

The rule for what works is pretty simple: an algorithm can solve any type below it. So given what @mauro3 said about the PETSc methods, they solve ImplicitIMEXProblem, and therefore they also solve IMEXProblem. The traditional ODE solvers can solve an IMEXProblem, but they cannot solve an ImplicitIMEXProblem. Algorithms which solve a DAEProblem can solve any problem (but maybe not efficiently).

For developers, you just choose a problem type in this hierarchy, and everything else is handled by the promotions. You can additionally have different code paths if you want to specialize on a part of the hierarchy, using typeof(prob) <: ODEProblem in a DAE solver, or checking for linearity/constant mass matrices as shown above. But without doing any of this, a DAE solver will solve an IMEXProblem.
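
As a sketch of that specialization (all names here are placeholders, and it assumes the problem types are in scope), a DAE solver could keep one generic path and opt into a cheaper one by checking the concrete problem type:

# Hypothetical DAE solver that specializes on part of the hierarchy
function solve_dae(prob, alg; kwargs...)
    if prob isa ODEProblem
        # cheaper path: no algebraic variables, the mass matrix is just I,
        # so e.g. the consistent-initialization step for du0 can be skipped
    else
        # general fully implicit path f(t,u,du,resid)
    end
end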

Can you give an example of how this may lead to difficult to maintain code? I don't see how the existence of another branch affects an algorithm. Also, you left out the ConstrainedODE branch (which is important for manifold methods).

ChrisRackauckas commented 7 years ago

Should pull @michakraus into this discussion as well.

Note that what I had as the ConstrainedODEProblem is like:

u' = f(t,u,v)
0 = g(t,u,v)

where I am writing v as the purely-algebraic variables. The GeometricIntegrators.jl version as featured here:

https://ddmgni.github.io/GeometricIntegrators.jl/latest/modules/equations.html

splits the first function:

u' = f(t,u) + h(t,u,v)
0 = g(t,u,v)

What's the reason for this splitting? Are there methods which take advantage of this? If there are, then it may make sense to just have this split version as ConstrainedODEProblem.

Also, I notice that your IODE is similar to what we have for DAEProblem, except you split it into the dynamical variables q, p, and p'. Do methods use this to some degree, or is this just a convenience feature you've implemented?

michakraus commented 7 years ago

What's the reason for this splitting?

The reason for the splitting is indeed that there are methods taking advantage of this like additive Runge-Kutta methods, most importantly SPARK methods.

The implicit ODE is mostly for convenience as this is the typical form of a variational Runge-Kutta method. In contrast to a standard symplectic partitioned Runge-Kutta method, where you have

q' = \partial H / \partial p
p' = - \partial H / \partial q

with H the Hamiltonian, for the variational kind (which is also symplectic), you have

p = \partial L / \partial v
p' = \partial L / \partial q

with L the Lagrangian. Such implicit ODEs also frequently occur in the realm of Dirac mechanics and the Hamilton-Pontryagin principle. Of course, you could write this as a standard DAE, but this kind of problem occurs so frequently in my work that it makes sense to have a special type.

ChrisRackauckas commented 7 years ago

The reason for the splitting is indeed that there are methods taking advantage of this like additive Runge-Kutta methods, most importantly SPARK methods.

Okay, then we should keep to that for ConstrainedODEProblem. Looking at the manifold projection methods that Mathematica uses, it should be easy to match those onto this style, so there's no problem with having this as the default.

Such implicit ODEs also frequently occur in the realm of Dirac mechanics and the Hamilton-Pontryagin principle. Of course, you could write this as a standard DAE, but this kind of problem occurs so frequently in my work that it makes sense to have a special type.

I see, but do you know of methods which exploit this structure? Or do you just solve it like a normal DAE?

I think that things which are just sugar for defining a problem could be a whole package. The ability to define a problem directly from a Hamiltonian or Lagrangian (from https://github.com/JuliaDiffEq/DiffEqBase.jl/issues/14) could also be there, to form a PhysicsModels.jl. Similar to FinancialModels.jl, it could just be a package which makes it easy to define common problems (FinancialModels.jl is for common SDEs in financial mathematics; PhysicsModels.jl could be for these physics-based ODEs/DAEs).

Then DiffEqBase.jl would be left with the Problem types which solvers directly use. This would also let PhysicsModels.jl depend on DiffEqBase.jl and do the autodifferentiation there, without adding an autodiff dependency to the base package.

ChrisRackauckas commented 7 years ago

I created a set of types which encompasses these ideas. Thank you everyone for the input! I have learned a lot about how Julia's compiler works since then, so I made a few changes to accommodate that new knowledge and simplify this quite a bit. Specifically:

1) Type checks against UniformScaling are compile-time, so a parametrically typed mass matrix has essentially zero cost in the problem types. Thus each of the non-DAE problem types now has a field for it. Its parameter is not in the abstract type because I would like these to not have an "easy option" for throwing method errors: these should throw explicit errors if specific mass matrices (or the lack of one) are required.

2) IMEX was generalized to Split. Split problems can have a tuple of functions. The tuple size and the functions are compile-time constants, so it's essentially free to check that the problem fits the algorithm. Algorithms can then request specific setups; for example, IMEX can error if f is not a Tuple of length 2 (see the sketch after this list). The docs for the algorithm will specify the form necessary (it can be split by classes of forms). This allows very complex PDE methods to be specified as ODE methods and get the whole adaptivity + event handling for free.

3) Partitioned problems use the same tuple setup.

4) Constrained problems have split/partitioned forms as well.
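
A sketch of the tuple-of-functions idea from 2) (SplitODEProblem and check_imex are hypothetical names here):

# The split functions live in a tuple, so the tuple length and the element
# types are compile-time constants.
struct SplitODEProblem{F<:Tuple,U,T}
    f::F        # e.g. (f_linear, f_nonlinear) for an IMEX / semilinear split
    u0::U
    tspan::T
end

function check_imex(prob::SplitODEProblem)
    # an IMEX algorithm can demand exactly two parts and error otherwise
    length(prob.f) == 2 ||
        error("this IMEX algorithm requires f to be a Tuple of length 2")
    return prob.f[1], prob.f[2]
end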

These problem types encompass everything I can think of, both in ODE/DAE methods, and every PDE method I could find could be translated to one of these categories. Thus this setup gives us quite a bit to work with!

I'll close this because its first idea, to specify DAEs, is done. I'll open a new thread specifically for conversion/promotion. Conversion will not be ready for DiffEq ecosystem 2.0, so I'm letting that continue on in its own thread.