SciML / DifferentialEquations.jl

Multi-language suite for high-performance solvers of differential equations and scientific machine learning (SciML) components. Ordinary differential equations (ODEs), stochastic differential equations (SDEs), delay differential equations (DDEs), differential-algebraic equations (DAEs), and more in Julia.
https://docs.sciml.ai/DiffEqDocs/stable/

DiffEq's PDE Story 2nd Try: Domains, BCs, Operators, and Problems #260

Closed ChrisRackauckas closed 5 years ago

ChrisRackauckas commented 6 years ago

DiffEq's first PDE attempt (which started the package) was a failure. The issue was that it was tied directly to an FEM subpackage I made internally, but then it couldn't scale to all of the problems it needed to face. The rest of DiffEq was then built up around decentralization. The core idea is that if you separate the data that specifies a problem and leave the solver support to dispatch, many different approaches and packages can step in to offer unique (and fast) ways to solve the problems. That has worked splendidly, so the FEM support was dropped in DiffEq 4.0 since it didn't conform to this.

What I want to return with is a PDE story worthy of the rest of DiffEq. So the goals it needs to meet are:

  1. It should specify problems in a manner that is distinct from implementation. But,
  2. It should hold the information required for algorithms to be fully optimized with their unique features.
  3. All of the components to "do it yourself" as a toolbox should be exposed. And
  4. It should interface with many different approaches and packages. But,
  5. There should be high-level automation to make the standard cases easy and efficient for "most users".

Here is a prototype for a design that could hopefully handle this.

General Flow: Heat Equation FDM

Let's talk about the Heat Equation. First we start with a domain type. Domain types are specified by types <:AbstractDomain. For example,

domain = RegularGrid((0,5),dx)

is a regular grid of [0,5] with spacing dx. But now we have to start talking about boundaries. It's the domain that knows what boundaries it has, and thus what has to be specified. So to the domain we need to add boundary functions.

Boundary conditions are specified by types <:AbstractBoundaryCondition. So we could have:

lbc = Dirichlet(0)
rbc = Neumann(g(t,x))

Now we'd package that together:

space = Space(domain,lbc,rbc)

Those domains are compatible with those boundary conditions, so it works without error. Once we have a space, our Heat Equation is specified via:

u' = D(t,x,u)*A*u + f(t,x,u)

so we specify:

tspan = (0,15)
prob = HeatProblem(D,f,space,tspan)

This is just a bundle of information that describes what we want to solve. But how do we want to solve it? We need to do a call that converts it into something with usable operators. Let's do it method of lines via a conversion call:

ode_prob = ODEProblem(prob,FiniteDifference())

where FiniteDifference dispatches this conversion to DiffEqOperators.jl with default order. FiniteDifference(order=4) can set the spatial discretization order, or we can choose other packages with adaptive space operators etc. This we can take to the ODE solver and solve.
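
For concreteness, here's the whole flow in one hypothetical script. Every constructor is just the proposed API from above (none of it exists yet), and the coefficient choices are made-up placeholders:

dx = 0.1
domain = RegularGrid((0,5),dx)        # 1D regular grid on [0,5]
lbc = Dirichlet(0)                    # u = 0 on the left
rbc = Neumann((t,x) -> sin(t))        # prescribed flux on the right
space = Space(domain,lbc,rbc)

D(t,x,u) = 1.0                        # diffusion coefficient
f(t,x,u) = 0.0                        # source term
tspan = (0.0,15.0)
prob = HeatProblem(D,f,space,tspan)

ode_prob = ODEProblem(prob,FiniteDifference(order=2))
# sol = solve(ode_prob)               # hand off to any ODE solver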

Details

Now let's talk about details.

Domains

Domains are a specification of the domain to solve on. The basic ones will be provided by DiffEqPDEBase.jl. Wrapper libraries such as DiffEqApproxFun.jl or FEniCS.jl could provide extras, like SpectralGrid or FEMMesh. It can be as lean as possible, though in some cases like FEM it will need to hold node/mesh pairings.

Boundary Conditions

These are defined in DiffEqPDEBase.jl and have the minimal information describing a specific kind of boundary condition. Either holding constants, arrays, or functions, they just encapsulate the idea of the BC choice.

Space

Space is how domains and BCs come together. For a given domain, it will make sure the BCs are compatible with the domain. Can this be merged with Domains?

Operators

This part wasn't shown in the simple example. Once we have a space, that is enough information to define operators. The high-level interface is something like

DerivativeOperator(space,alg,derivative_order,approximation_order)

etc., and then via alg it dispatches to the appropriate package to spit out DiffEqOperators (this stuff is already defined) appropriate for that space. DiffEqOperators itself would just need to be changed to implement that dispatch.

Problems

Problems are just an abstraction over operators which basically knows what operators it should be building given the space, alg, and problem. They don't do anything until the conversion call, in which case they build these operators to then build something to solve.

Conversions

The obvious conversions are to ODE/SDE/DAE/DDEProblems, and we can keep going. We need two more problem types though: LinearProblem and NonlinearProblem. Then we cover it all. Everything instantiates operators through some conversion.

High-Level Conversions

Then there can also be some standard conversions like DiffusionAdvection(prob,space,tspan) where prob=SDEProblem. This is just high level sugar to make everything boil down to conversions on PDE problems to then give linear, nonlinear, or diffeqs out in the end.

Big Questions

Documentation

I've learned by now to plan documentation levels at the same time as the interface. Where are things documented? I think the general workflow, boundary conditions, some general domains, the operator calls, and problems are all documented package-wide generically. Then unique domains, spaces, operator algorithms, and conversions are documented per package. DiffEqOperators describes no unique domains; it describes what BCs it can support, choices for operator construction, and problem conversions it can perform. DiffEqApproxFun links over to ApproxFun for choices of domains, has a single Spectral() for generating the spectral operators (with choices in there), and then can generate ODEs for the Heat Equation, or a fully implicit discretization to a LinearProblem via a 2D discretization (can it?).

Representation Information?

When you get this ODEProblem... what does it actually represent? We need a way to describe back to the user what u0 means. Coefficients of ...? u0[i,j,k] means what (x,y,z)? Not sure how to handle this.

Time?

Time is kept separate because it usually acts differently when it exists. For algorithms like a fully implicit discretization, LinearProblem(prob,FullyFiniteDifference(dt)) or something.

Passing space

This is just minor bikeshedding, but instead of making the Heat Equation require the user to give f(u,p,t,x,y,z) for 3D, should it be f(u,p,t,x) for x a tuple/number? That keeps it the same regardless of dimension.

Conclusion

This allows us to specify our problem at a high level, interface with different packages in an extendable way to generate appropriate FDM/FVM/FEM/spectral discretizations, and spit out linear/nonlinear/diffeq problems to take to solvers. The hope is that this gives enough places to inject information and package-specific components that it can be efficient, yet gives a good level of automation and transferability to other setups.

Thanks. Pulling in some people to bikeshed this high level story. @dextorious @dlfivefifty @alanedelman @jlperla

ChrisRackauckas commented 6 years ago

To make this complete, let me describe the highly related DiffEqOperators interface. Basically, a DiffEqOperator is the discretization of an operator into some L. There are many instantiations it can have, but the interface is that it can A_mul_B! and *, and it can do conversions:

full(L) # Get a dense matrix
sparse(L) # Get a sparse matrix
banded(L) # Get a banded matrix
...

it can possibly expm or expmv, it acts as a standard DiffEq ODE function L(du,u,p,t) (for conversion purposes), and it has a dispatch for the following function:

update_coefficients!(L,u,p,t,x)

which updates internal coefficients of L to that setup. For example, the DiffEqArrayOperator can hold

L = a(u,p,t,x)*A

some quasilinear or semilinear PDE discretization operator with a matrix A, and when you call update_coefficients! it will set it to the current time point and the current u values.

The reason why this is nice is that quasilinear, semilinear, and time-dependent operators can all be appropriately evaluated via the sequence:

update_coefficients!(L,u,p,t,x)
L*u

Have a linear PDE and want the stationary solution (i.e. it's not time-dependent)?

update_coefficients!(L,nothing,p,nothing,x)
# Solve L*u = b

Want to write an Euler loop for a quasilinear PDE?

for i in 1:10
  t = t + dt
  update_coefficients!(L,u,p,t,x)
  u = u + dt*L*u
end

Etc. You see that these codes are operator independent, and independent of the implementation of the underlying operator. All it needs to know is that L* will be correct, and then it can use it as it needs to. You can update_coefficients and then call GMRES, or throw it into a nonlinear solver making sure you update_coefficients! each time. And this would take care of time-dependent BCs. While I showed this for a DiffEqArrayOperator which has an underlying matrix, the RegularFDMOperator from DiffEqOperators (it's getting a rename) is a lazy function for the FDM stencil, and update_coefficients!(L,u,p,t,x) sets the boundary conditions to the correct time to make L*u perform correctly (and then of course you can get sparse matrices and everything out as well). Since it holds the coefficients, RegularUpwindOperator can use update_coefficients! to calculate the directions and perform that correctly.
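
As a minimal sketch of what a type obeying this interface can look like (illustrative only; the real DiffEqArrayOperator differs in its details):

mutable struct ScaledMatrixOperator{T,F}
    A::Matrix{T}   # fixed stencil matrix
    coeff::F       # a(u,p,t,x), the state/time-dependent coefficient
    scale::T       # cached value of the coefficient at the last update
end

# Refresh the cached coefficient to the current state, per the interface above.
function update_coefficients!(L::ScaledMatrixOperator, u, p, t, x)
    L.scale = L.coeff(u, p, t, x)
    return L
end

# L*u then evaluates a(u,p,t,x)*A*u at whatever state was last set,
# so the Euler loop above runs unchanged against this type.
Base.:*(L::ScaledMatrixOperator, u::AbstractVector) = L.scale .* (L.A * u)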

We plan to support special matrix types like BlockBandedMatrices.jl as well if that can be done generically, or that can be dispatching on the DiffEqOperator concrete type.

jlperla commented 6 years ago

Looks like a great start. I will put up more thoughts soon, but a few to start... some are on the general framework while some are on the gen:

dlfivefifty commented 6 years ago

Some comments:

  1. using lbc and rbc are bad ideas: they don't scale to higher dimensions, and many functional constraints are not "left" or "right".
  2. Space is not the right word if you are specifying non-zero boundary conditions.
  3. Why does Neumann(g(t,x)) depend on x?
  4. I don't understand why the domain is called RegularGrid but doesn't have a discretisation size.

jlperla commented 6 years ago

Sorry, just saw that you answered the discretization question, but would love to see examples of alg.

jlperla commented 6 years ago

It would be nice to have Space separated from the discretization... spectral methods could operate on the space and the boundaries in different ways. These are only combined with a Grid for finite difference methods? i.e.

domain = CartesianGrid((0,5)) #presumably could take the cartesian product of multiple tuples
space = CartesianSpace(domain, (lbc,rbc)) #Calling it cartesian because this would be specific to a particular higher dimension

Then the key thing I am missing is whether the DerivativeOperator is intended to be the abstract derivative, or to have the particular discretization involved (which I think is what your alg has in it). Would be nice to separate...

grid = RegularGrid(space, dx) #Potentially `ChebyshevDiscretization`, etc. 
op = DerivativeOperator(space, derivative_order)
L = FiniteDifferenceOperator(op, grid, alg, approximation_order)
update_coefficients!(L,u,p,t,x)

I don't have strong feelings about the separation of the DerivativeOperator and the particular discretization, but the space vs. domain would be helpful.

ChrisRackauckas commented 6 years ago

using lbc and rbc are bad ideas: they don't scale to higher dimensions, and many functional constraints are not "left" or "right".

Yes, that was just for RegularGrid which had only one tuple, i.e. a 1D regular grid. Space(domain,bcs...) would have to check to make sure there's enough BCs. For example, for RegularGrid((0,1),(0,1),dx,dy) it's a 2D regular grid, so

Space(domain,lbc,rbc,tbc,bbc,corner1,corner2,corner3,corner4)

would be what's required. Documenting this needs to be done by pattern for nD domain/space pairs, so there must be a better idea. But that crucial step is why I wanted to leave open the possibility to separate domain and space, maybe it's easier like this? What's gained? I don't really know that fully yet.

Space is not the right word if you are specifying non-zero boundary conditions.

I agree, but I didn't want the lack of a good name to pause the release of the idea any more.

Why does Neumann(g(t,x)) depend on x?

In case it's for a whole side of the domain. Left of a 2D grid needs a function.

I don't understand why the domain is called RegularGrid but doesn't have a discretisation size.

It has dx.

I am going to leave off any comments on the higher level interface of the solvers such as HeatProblem and ODEProblem since those wouldn't be directly useful for the sorts of applications I have in mind at this time.

I'm sure most people who read this won't be using that part, just the operators (or they will be building this level of the abstraction 👍 ). But this is how we support users who say "how do I solve the Diffusion-Advection equation?" Obviously they can do it themselves by building the operators, but it's not hard with this tooling to give them a function that spits out an ODE to solve. Also, this is a great place for undergrad and GSoC projects IMO.

As examples of the boundaries, I think that Dirichlet0 and Neumann0 are worthwhile as special cases because they are not affine (which can help do specialized algorithms when dispatching solutions).

Yes. They will be there.

What are examples of the alg you would have in mind

The "standard" one will be the current DiffEqOperators one. It might even good to have defaults for this. But for example, someone can write a package which computes fast Laplacians for regular girds, and so they can let FastLaplacian() dispatch to their package (or some wrapper over their package) and spit out a DiffEqOperator that does the discretization their way. Since we support lazy operators in this, there are so many different ways to define the same mathematical operator that we want to leave open the possibility of better data structures. DiffEqOperators.jl would do the 2D Laplacian for the Heat Equation via two of its operators and then Ay*u + u*Ax. Someone can write a FastLaplacians.jl could be created to make FastLaplacian() write the fused loop version for some very specific 2D operator choices, and thus it can be a good dispatch to use in that specific case. That's the point there.

How do I get the discretized operator from it?

See the stuff on the <:AbstractDiffEqOperator interface.

Then the key thing I am missing is whether the DerivativeOperator is intended to be the abstract derivative, or to have the particular discretization involved (which I think is what your alg has in it

If I make it lowercase, would it be easier to understand that it's supposed to be a high level "give me a discretization of this derivative to this order that's appropriate for this space using this method"? derivative_operator(space,2,2,FiniteDifference()).

dlfivefifty commented 6 years ago

It's not possible to know what boundary conditions are needed without knowing the differential operator, and even if you know the differential operator it's non-trivial.

ChrisRackauckas commented 6 years ago

I see what you mean. So then it'll just have to error at the problem/operator construction call instead of the space construction time, meaning there's no reason for Space and we might as well add the BCs to the Domain?

dlfivefifty commented 6 years ago

BCs should either be in the operator, or in the basis/space (for zero conditions). Putting them in the domain doesn't really make sense.

Also, there's some work on a general purpose Domains.jl package: https://github.com/daanhb/Domains.jl

jlperla commented 6 years ago

For stochastic processes (I can't say much about physics applications) the boundaries are connected to the domain: reflecting or absorbing barriers.

The problem with associating them with the operator itself is that Chris has the idea of composing the operators lazily. It is only at the level of the completely composed operator that you can apply the boundary values (for stochastic processes at least).

Are we sure the operator composition approach would work (for boundaries which end up affine)?

ChrisRackauckas commented 6 years ago

The problem with associating them with the operator itself is that Chris has the idea of composing the operators lazily. It is only at the level of the completely composed operator that you can apply the boundary values (for stochastic processes at least).

Yeah, this is where the idea comes from. The first approach I had in mind was:

domain = RegularGrid((0,5),dx)
lbc = Dirichlet(0)
rbc = Neumann(g(t,x))
bcs = BoundarySet(lbc,rbc)
coeff(u,p,t,x) = x^2 + t^2
A = derivative_operator(domain,bcs,FiniteDifference(),coeff,2,2)
L = derivative_operator(domain,bcs,LazyUpwind(),coeff,2,2)

but @jlperla asked whether there's a way to make sure the same BCs are satisfied, in which case I thought we might as well package the BCs into the "domain". Maybe that's still fine but we need a new name?

Are we sure the operator composition approach would work (for boundaries which end up affine)?

Yes. There's the odd issue that LinearProblem will have to modify your b to make the "real" linear operator actually linear, but that's just a detail. Whether composing lazy operators is a good idea is what I'm not so sure about.

dlfivefifty commented 6 years ago

As soon as you associate the boundary conditions with the domain, it (1) breaks generality, as you can't impose constraints like \int_Ω u(x) dx = 0, and (2) doesn't make sense in the ∞-dimensional setting, where BCs are either side constraints, or incorporated in the function space.

ChrisRackauckas commented 6 years ago

breaks generality, as you can't impose constraints like \int_Ω u(x) dx = 0

Wouldn't that just be IntegralConstraint(0) and stick that in the domain? Of course domain construction won't be able to error since it won't know whether it has enough BCs for something like that, but then it can error with that information at operator construction? It would simply have the BCs tagging along in the domain type without checking whether that's a sensible grouping, just to make sure every operator gets the same BCs. Is there a case where you'd want different operators to have different constraints? If there's a case of that, then yes let's split it out.

doesn't make sense in the ∞-dimensional setting, where BCs are either side constraints, or incorporated in the function space.

That's covered by the answer to the first, as either BCs are just a collection sent to every operator build (so it doesn't matter if there's some weird ones or none), or it needs to be at the operator instantiation call.

dlfivefifty commented 6 years ago

I don't think I understand: an FD second-derivative maps n points to n-2 points, then you add whatever boundary conditions you want (by convention at the top and bottom). So the boundary conditions should always be an afterthought?

ChrisRackauckas commented 6 years ago

Thinking less mathematically and more practically,

A = derivative_operator(domain,bcs,...)
L = derivative_operator(domain,bcs,...)

would I ever want those bcs (or more generally, constraints) to be different? If every operator has to always satisfy the same constraints, do they need to be kept separate?

jlperla commented 6 years ago

The main question I have is whether the generality of lazily composing operators is worth the interface complexity and the short-term performance cost. With loop fusion, etc., it may be possible to generate an overhead-free implementation, but it is going to be tricky.

For economics and finance applications at least, there are very few finite difference discretizations we would need (e.g. for a diffusion process: central differences on a diffusive second derivative and upwind on the first-derivative drift).

dlfivefifty commented 6 years ago

I don’t understand why the derivative takes in bcs: it seems to be conflating the application of the operator with the inversion of the operator.

jlperla commented 6 years ago

For example, we could have a specialized operator for a diffusion process which generates the coefficients given the passed in boundary functions:

domain = RegularGrid((0,5),dx)
mu(x) = ...     # my drift function
sigma(x) = ...  # my variance function
mu_x = mu.(domain.grid)
sigma_x = sigma.(domain.grid)
A = upwind_diffusion_operator(domain, mu_x, sigma_x, (Neumann0, Neumann0))

ChrisRackauckas commented 6 years ago

The main question I have is whether the generality to allow lazily composing operators is worth the complexity in interface complexity and short-term performance.

What can be dropped/simplified if we aren't composing operators lazily? None of the PDE story has that in it, that's the DiffEqOperator stuff which users can pretty much ignore.

ChrisRackauckas commented 6 years ago

I don’t understand why the derivative takes in bcs: it seems to be conflating the application of the operator with the inversion of the operator.

Even forward application of the operator requires the BCs. A*u for the Laplacian has a different result for periodic BCs vs Neumann vs Dirichlet etc. I'm not sure it can be well-defined without it.
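
To make that concrete, a hand-rolled sketch (not DiffEqOperators code): the interior stencil is identical, but the BC choice changes the corner entries, so A*u differs.

using SparseArrays

function laplacian_1d(n, dx; periodic=false)
    A = spdiagm(-1 => ones(n-1), 0 => fill(-2.0,n), 1 => ones(n-1))
    if periodic
        A[1,n] = 1.0   # wrap-around coupling
        A[n,1] = 1.0   # (Dirichlet0 leaves these corners at zero)
    end
    return A ./ dx^2
end

u = rand(8)
laplacian_1d(8, 0.1; periodic=true)*u == laplacian_1d(8, 0.1)*u   # false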

jlperla commented 6 years ago

DiffEqOperator stuff which users can pretty much ignore.

For me at least, I have no use for the higher level "heat equation" and "odeproblem" solution methods, so it is the DiffEqOperator which I would directly use at this point.

What could be simplified:

Now, if in the underlying code for the operator you want to use lazily decomposed operators, that is an implementation detail, but we are discussing the higher level interface here.

dlfivefifty commented 6 years ago

The derivative of a function can’t “see” a single point (or rather it’s just not defined at a point where it’s not differentiable), so I don’t agree that the notion of derivative is different for Dirichlet and Neumann.

Maybe I’m missing the point of the conversation: is the point of this discussion to make a system for doing finite differences (and only finite differences), or is the point to make something general purpose where you can swap out the Discretization method?

ChrisRackauckas commented 6 years ago

this whole discussion of the connection between boundary values from the operators goes away as users only use the fully composed operator along with the boundary values they specify.

The stencil matrix still has to be modified at the edges to satisfy the BCs. That's what the naive loop is essentially doing.

We don't have to worry about the details of loop fusion when all we want is the sparse(A) etc.

Where do you have to worry about that? Show me an example of where you have to worry about the details of loop fusion. Just get and sparse the operator if all you want is the sparse operator?

I am still not convinced that boundary values that create affine setups are that easy to generalize with decomposed operators.

full(A) -> (A,B), which is A*u + B when it has to be.

People who are not generic programming geniuses can add to the library of discretization of differential operators if they just work through the finite-difference algebra.

Is this about the lazy implementation of DiffEqOperators via Fornberg's algorithm? That has nothing to do with the interface... but anyways, even that has been mostly built by undergrads so it's at least at that level. Most of the code is just a loop over a stencil array. But this has nothing to do with the interface.

ChrisRackauckas commented 6 years ago

Maybe I’m missing the point of the conversation: is the point of this discussion to make a system for doing finite differences (and only finite differences), or is the point to make something general purpose where you can swap out the Discretization method?

The latter. For example, this should be generic enough that I can have a wrapper over an ApproxFun domain, specify some BCs in the DiffEq style, derivative_operator kicks out some matrices for the discretized operator, and then use that to build an ODEProblem etc. If it can't do that, then it failed.

The derivative of a function can’t “see” a single point (or rather it’s just not defined at a point where it’s not differentiable), so I don’t agree that the notion of derivative is different for Dirichlet and Neumann.

But the differential operator is the derivative at each point in the discretization. At some point that operator has to be instantiated, and in order to know its action it has to have the BCs in it.

dlfivefifty commented 6 years ago

But in FD it shouldn’t be each point: each derivative drops one point. So D maps values at n points to values at n-1 points, D^2 maps values at n points to n-2 points and so on.

dlfivefifty commented 6 years ago

Only in the periodic case (which really is a different domain since the topology is a 1-torus) do you get a map from n points to n points.
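
(A quick shape check of this in base Julia, with the difference matrices built by hand:)

n = 5
Df = [(j == i+1) - (j == i) for i in 1:n-1, j in 1:n]     # first difference: n -> n-1
Db = [(j == i+1) - (j == i) for i in 1:n-2, j in 1:n-1]   # next difference: n-1 -> n-2
D2 = Db*Df                                                # rows are [1 -2 1]: n -> n-2
size(Df), size(D2)                                        # ((4, 5), (3, 5))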

jlperla commented 6 years ago

When I said

this whole discussion of the connection between boundary values from the operators goes away as users only use the fully composed operator along with the boundary values they specify.

What I meant was that if the user only directly works with fully composed operators in their higher level interface, then we don't have the current issue we are discussing with the boundary values being connected to the derivative. If the user works directly with the fully composed differential operator, then it makes sense to group that with the appropriate boundaries.

ChrisRackauckas commented 6 years ago

But in FD it shouldn’t be each point: each derivative drops one point. So D maps values at n points to values at n-1 points, D^2 maps values at n points to n-2 points and so on.

Only in the periodic case (which really is a different domain since the topology is a 1-torus) do you get a map from n points to n points.

Not necessarily. If you know the BCs then you can incorporate them into the operator to make the map not drop points. The periodic case is well known: you just put stencil coefficients in the upper right and lower left corners of the matrix (thinking about it as a matrix). But for Dirichlet/Neumann, in order for the BCs to be satisfied it defines what the values have to be at the boundaries, giving you an n -> n map by filling in the ends via the BCs. I'm trying to think of whether there's a case for which that can't be true.

dlfivefifty commented 6 years ago

That incorporation is exactly equivalent to a step of Gaussian elimination... I don't see why you would want to try doing Gaussian elimination before you've set up the problem.

jlperla commented 6 years ago

For a user in my field, what they really want is to (1) specify the stochastic process and (2) specify what happens at the boundaries of that stochastic process . There are only two types of boundaries that matter: (a) a reflecting barrier and (b) an absorbing barrier.

I would love to specify these things in an abstract way independent of the discretization, but it isn't necessary. In fact, there are practical considerations (i.e. in solving HJBE the drift changes all the time as part of the algorithm, and you solve for the new drifts at grid points) that make sticking with finite differences reasonable for this exact problem.

So, in my way of thinking you are having me specify the infinitesimal generator (https://en.wikipedia.org/wiki/Infinitesimal_generator_(stochastic_processes)) for the stochastic process directly, which is fine. If you want me to specify it in parts and have lazy composition then I think that is fine as well, but somehow you are going to have to attach the boundary types to it when you generate the full matrix that puts things together. The general algebra behind that is not at all obvious to me, let alone the numeric implementation or what happens when I want to take the adjoint (as you do when moving between solving the KBE and KFE).

dlfivefifty commented 6 years ago

In more detail: for u_xx = f with zero Dirichlet we can construct the "full" problem as [B_{-1}; D^2; B_1]*u = [0 ; f ; 0], or written out with h = 1:

[ 1                  ;
  1 -2  1            ;
     1 -2  1         ;
        …  …  …      ;
           1 -2  1   ;
                 1 ] * u = [0; f; 0]

Or we can use the fact that 0 = u[1] = u[end], do two steps of Gaussian elimination to reduce the problem to

[ -2  1              ;
   1 -2  1           ;
      …  …  …        ;
         1 -2 ] * u[2:end-1] = f

But I think once you've done this Gaussian elimination step, composing operators, etc., no longer has meaning (you would need to justify why the Gaussian elimination commutes).
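
(A quick numeric check of that equivalence in base Julia with h = 1: the bordered system and the eliminated tridiagonal system agree on the interior.)

using LinearAlgebra

n = 7
A_full = zeros(n,n)
A_full[1,1] = 1.0; A_full[n,n] = 1.0          # the boundary rows B₋₁ and B₁
for i in 2:n-1
    A_full[i,i-1] = 1.0; A_full[i,i] = -2.0; A_full[i,i+1] = 1.0
end
f = rand(n-2)
u_full = A_full \ [0.0; f; 0.0]

A_red = SymTridiagonal(fill(-2.0,n-2), ones(n-3))   # after Gaussian elimination
u_red = A_red \ f

u_full[2:end-1] ≈ u_red    # true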

jlperla commented 6 years ago

@dlfivefifty I see. So you are saying that the boundary should only be applied when you use it in a particular PDE/ODE? If so, then it is the update_coefficients!(L,u,p,t,x) which needs to be "boundary aware", not the operator itself?

jlperla commented 6 years ago

OK. I am 100% with you now. The boundaries should be left outside of the operator itself. In some ways, this works better for the stochastic process specification as well, since you would basically be writing down the infinitesimal generator for the stochastic process in the "continuation" or interior.

jlperla commented 6 years ago

I also see that the lazy composition of the operators is much more reasonable for the interior. It was always the boundaries which scared me, so if those are attached at the end when constructing the full discretized operator, then that is less scary. So @ChrisRackauckas geometric Brownian motion might be something like:

domain = RegularGrid((0,5),dx)
variance(u,p,t,x) = sigma^2 * x^2
drift(u,p,t,x) = mu * x
L_diffusion = derivative_operator(FiniteDifferences(), variance, 2, 2)
L_drift = derivative_operator(LazyUpwind(), drift,1,1)
L_composed = L_diffusion + L_drift

#Then to update the `A` matrix with this operator subject to the boundaries,
lbc = Dirichlet0
rbc = Neumann0
update_coefficients!(A, L_composed, lbc, rbc, p, t, domain.grid)

And the update_coefficients would be overloaded for different types of domain sizes. The lbc and rbc would make sense here since it is a cartesian grid.

ChrisRackauckas commented 6 years ago

for u_xx = f with zero Dirichlet we can construct the "full" problem as [B_{-1}; D^2; B_1]*u = [0 ; f ; 0]

We are defining our u vector to include the boundary values for that purpose. Dirichlet then means having the constant on the first/last columns. This also makes it more consistent with the way Neumann has to be done anyways, and it needs to be done in order to apply the operator anyways.

But I see what @dlfivefifty is getting at. I made a bad error. For the Dirichlet problem u_xx = f, defining the boundary conditions as part of the variables, you get

A = [1 -2 1 0 0 0 0
     0 1 -2 1 0 0 0
     0 0 1 -2 1 0 0
     0 0 0 1 -2 1 0
     0 0 0 0 1 -2 1]

For the Dirichlet problem u_x = f, defining the boundary conditions as part of the variables, you get something similar again with [-1,1] and the trailing 1 at the end. The question is then, what's the operator for u_x + u_xx? If it was Dirichlet0, then they can just be added together. If it's not Dirichlet0, I was thinking that by superposition you can just add the operators. But it's pretty clear if you do this you won't satisfy the boundary when it's not Dirichlet0, since you'll double the BC... (but the interior is fine).

So indeed the problem is that boundaries are only sensible on the composed operator.
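
(A toy numeric illustration of the double-counting: suppose the affine part of each operator bakes in the same Dirichlet constant c at the left boundary; the vectors here are made up for illustration.)

c = 3.0
b1 = [c, 0.0, 0.0, 0.0]    # BC constant embedded in the u_x operator
b2 = [c, 0.0, 0.0, 0.0]    # the same constant embedded in the u_xx operator
(b1 .+ b2)[1] == 2c        # true: superposition enforces 2c at the boundary, not c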

Another Attempt

Keep trying until it works?

Now I'm going further towards ApproxFun and FEniCS. I think we need a DSL so we can understand operator composition. I want to see if I can keep away complexity though. How about:

domain = Interval(0,5)
space = Space(domain,PointSpace(),dx) # PointSpace is an alias to HatFunction()
variance(u,p,t,x) = sigma^2 * x^2
drift(u,p,t,x) = mu * x
Dxx = derivative_operator(2,2,CentralDifference())
Dx = derivative_operator(1,1,Upwind())
bcs = BoundarySet(Dirichlet0(),Neumann0()) # Except more like FEniCS, don't feel like writing that out
L = DifferentialOperator(drift*Dx + variance*Dxx,Sparse(),space,bcs) # Sparse() makes it allocate a sparse matrix as the backend. Banded() is the default; other choices can be Dense(), Lazy(), etc.

# Using the operator
update_coefficients!(L, u, p, t)
L*u == L.A*u + L.b
L\f == L.A\(f - L.b)
L(u,p,t) == (update_coefficients!(L, u, p, t); L*u)
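
A minimal sketch of those affine semantics, assuming L just stores a dense A and the BC vector b (names illustrative):

struct AffineOperator{T}
    A::Matrix{T}   # discretized differential operator
    b::Vector{T}   # affine part carrying the boundary values
end

Base.:*(L::AffineOperator, u::AbstractVector) = L.A*u .+ L.b
Base.:\(L::AffineOperator, f::AbstractVector) = L.A \ (f .- L.b)

# Round trip: if u = L\f then L*u ≈ f, since A*(A\(f - b)) + b == f.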

full(L_diffusion), sparse(L_diffusion), etc. should work on their own, but just return the stencil matrix. And then to get spectral operators from ApproxFun, that can dispatch at the operator generation:

domain = Interval(0,5)
space = Space(domain,Chebyshev(),100) # Truncated Chebyshev space to 100
variance(u,p,t,x) = sigma^2 * x^2
drift(u,p,t,x) = mu * x
Dxx = derivative_operator(2, 2, ApproxFun())
Dx = derivative_operator(1,1, ApproxFun())
lbc = Dirichlet0()
rbc = Neumann0()
bcs = BoundarySet(lbc,rbc)
L = DifferentialOperator(drift*Dx + variance*Dxx,space,bcs)

This also leaves open variational formulations down the line:

domain = ... # Some FEM Mesh
space = Space(domain,FEM_P())
dx = DifferentialElement()
u = TrialFunction(space)
v = TestFunction(space)
variance(u,p,t,x) = sigma^2 * x^2
drift(u,p,t,x) = mu * x
gu = grad_operator()
gv = trial_grad_operator()
f = ... # Constant
bcs = BoundarySet(Dirichlet0(),Neumann0())
L = DifferentialOperator(u*v*dx + dt*dot(gu, gv)*dx == (u0 + dt*f)*v*dx,domain,bcs)

This means that we can do away with the problems and just have "easy operators"

domain = Interval(0,5)
space = Space(domain,PointSpace(),dx)
variance(u,p,t,x) = sigma^2 * x^2
drift(u,p,t,x) = mu * x
bcs = BoundarySet(Dirichlet0(),Neumann0()) # Except more like FEniCS, don't feel like writing that out
order = 2
L = DiffusionAdvection(drift,variance,order,space,bcs)

The questions are then:

  1. How do we calculate boundary conditions on these composed finite difference operators?
  2. What do we need in the DSL for higher dimensional operators?

I think we start by dropping the lazy default in the composed operators unless getting the BCs lazily there is easy enough. Probably a lot more questions, but I'm just going to throw this out there.

jlperla commented 6 years ago

Looking like a great step forward, and I always love DSLs. A few initial thoughts on what you have written

Finite differences discretization:

space = Space(domain,PointSpace(),dx) # PointSpace is an alias to HatFunction()

Dxx = derivative_operator(D^2; CentralDifference(), approx_order=1)
Dx = derivative_operator(D, Upwind())
A_composed = drift*Dx + variance*Dxx # Can compose linear operators
L = DifferentialOperator(A_composed, Sparse(), space, bcs)

Using the operator:

update_coefficients!(L, u, p, t)
L*u == L.A*u + L.b
L\f == L.A\(f - L.b)
L(u,p,t) == (update_coefficients!(L, u, p, t); L*u)

jlperla commented 6 years ago

Now, to put the first wrinkle in the testing of the interface, I want my student to implement jump-diffusion processes. To keep things very simple, with an arrival rate $\lambda$, the value grows by a step size $d$. With this, the infinitesimal generator of the stochastic process is $$A v = \mu x D_x v + \frac{\sigma^2}{2} x^2 D_{xx} v + \lambda (v((1+d) x) - v(x))$$ To write this jump-diffusion in the code,

d = Interval(0,5)
D = Derivative(d)
variance(u,p,t,x) = sigma^2/2 * x^2
drift(u,p,t,x) = mu * x
arrival_rate(u,p,t,x) = lambda
jump_size(u,p,t,x) = (1 + d)*x #For deterministic jumps.  Otherwise need convolution.
Dj= JumpProcess(arrival_rate, jump_size)

# Finite differences discretization
space = Space(domain,PointSpace(),dx) # PointSpace is an alias to HatFunction()

Dxx = derivative_operator(D^2; CentralDifference(), approx_order=1)
Dx = derivative_operator(D ,Upwind())
A_composed = drift*Dx + variance*Dxx + Dj # Can compose linear operators
L = DifferentialOperator(A_composed, Sparse(), space, bcs) 

# Using the operator
update_coefficients!(L, u, p, t)
L*u == L.A*u + L.b
L\f == L.A\(f - L.b)
L(u,p,t) == (update_coefficients!(L, u, p, t); L*u)

Now this all sounds good, but the issue is that while a discretized D maps n to n-1 points and a discretized D^2 maps n to n-2 points, I believe the discretized jump process maps n to n points in general. Moreover, I don't think the boundary value just changes the discretized matrix locally if there is a reflection. If there is an absorbing vs. a reflecting barrier, it will change the values in the matrix within the radius of the jump near the boundaries?
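
(To see why, a sketch that discretizes the jump part lambda*(v((1+d)x) - v(x)) by rounding each jump target to the nearest grid node; illustrative only, no interpolation:)

n = 11; xs = range(0, 5; length=n)
λ, d = 0.5, 0.3
J = zeros(n,n)
for (i,x) in enumerate(xs)
    j = clamp(round(Int, (1+d)*x / step(xs)) + 1, 1, n)   # nearest node to (1+d)*x
    J[i,j] += λ     # rate of arriving at the jump target
    J[i,i] -= λ     # rate of leaving the current state
end
# J couples nodes far from the diagonal, so unlike D or D² it is not banded,
# and a reflecting boundary would touch every row whose jump leaves the grid.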

ChrisRackauckas commented 6 years ago

I have completely switched over to @dlfivefifty's thinking that the boundaries should only be applied when the DifferentialOperator is "used". So update_coefficients! or whatever. I think that what is listed as DifferentialOperator should be more like DiscretizedOperator, because it only makes sense discretized and with the boundaries applied.

Consider separating out the discretization from the abstract definition of the operator. For example, maybe exactly mimicking ApproxFun is the way to go - and could lead to a common wrapper interface down the road.

Yes, this would happen just by formalizing the DSL statement. It was relevant so I went ahead and unveiled that right after the change here: https://github.com/JuliaDiffEq/DifferentialEquations.jl/issues/261 . So the DSL statement makes an abstract definition of a differential operator, and then I was using DifferentialOperator to instantiate it. But DiscretizedOperator is probably a better term for that.

Now the blemish I thought you'd mention is:

Dxx = derivative_operator(D^2; CentralDifference(), approx_order=1)
Dx = derivative_operator(D ,Upwind())

Those are the abstract statements about what the operator is, and there we're throwing in discretization information. Of course, the reason is because

A_composed = drift*Dx + variance*Dxx #Can compose linear operators
L = DifferentialOperator(A_composed, Sparse(), space, bcs) 

at this stage it would be hard to apply that information individually. But it still feels odd.

dlfivefifty commented 6 years ago

If I were writing this in ApproxFun, backward Euler in time would look like the following:

S₀ = SplineSpace(0:Δx:5)  # solution space as linear splines
S₁ = SplineSpace(Δx:Δx:5)  # derivative space
S₂ = SplineSpace(Δx:Δx:5-Δx) # second derivative space

D₀ = Difference() : S₀ → S₁  # upwind
D₁ = Difference() : S₁ → S₂  # downwind
R₀ = I : S₀ → S₁  # restrict to sub grid
R₁ = I : S₁ → S₂  # restrict to sub grid
D² = D₁*D₀  # central differences, S₀ → S₂

A  = drift*S₁*Dx + variance*Dxx   # not sure about Dj

B₀ = Evaluation(0)
B₅ = Evaluation(5)
u[n+1] = [B₀ ; R₁*R₀ - Δt*A ; B₅] \ [0 ; u[n] ; 0]

Using the automatic space promotion, this could be simplified a bit more:

S₀ = SplineSpace(0:Δx:5)  # solution space as linear splines
S₁ = SplineSpace(Δx:Δx:5)  # derivative space
S₂ = SplineSpace(Δx:Δx:5-Δx) # second derivative space

D₀ = Difference() : S₀ → S₁  # upwind
D₁ = Difference() : S₁ → S₂  # downwind
D² = D₁*D₀  # central differences, S₀ → S₂

A  = drift*Dx + variance*Dxx   # the R₁ would get added automatically 

B₀ = Evaluation(0)
B₅ = Evaluation(5)
u[n+1] = [B₀ ; I - Δt*A ; B₅] \ [0 ; u[n] ; 0]

ChrisRackauckas commented 6 years ago

Moreover, I don't think the boundary value does not just change the discretized matrix locally if there is a reflection. If there is an absorbing vs. a reflecting barrier, it will change the values in the matrix within the radius of the jump near the boundaries?

Yes. The last n entries are affected, where n = (m-1)/2 with m the stencil length for a central difference discretization, and n = m for forward differencing.

ChrisRackauckas commented 6 years ago

If I were writing this in ApproxFun, backward Euler in time would look like the following:

Yes, that method of having to handle \ is exactly why I want to pack it into the discretized operator so it can be handled uniformly. Otherwise we cannot generically write that solver statement. This seems to be how FEniCS does it.

But how do you do the forward Euler?

u[n+1] = [B₀ ; Δt*A ; B₅] * [0 ; u[n] ; 0]

?

dlfivefifty commented 6 years ago

Forward Euler needs the Gaussian elimination step to reduce the number of unknowns...for Dirichlet this is straightforward as Gaussian elimination just removes the first and last column, so we would have:

à = A[:,2:end]
u[n+1] = Δt*Ã*u[n]

For general boundary conditions it becomes more complicated because the Gaussian elimination step changes the entries. You could imagine something like the following working:

Q = nullspace([B₀ ; B₅]) 
à = A*Q
u[n+1] = Δt*Ã*u[n]
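
(A quick check that this works, in base Julia: the columns of Q span the null space of the constraints, so any u = Q*c satisfies the BCs automatically.)

using LinearAlgebra

n = 6
B = zeros(2,n)
B[1,1] = -1.0; B[1,2] = 1.0        # left Neumann difference
B[2,n-1] = -1.0; B[2,n] = 1.0      # right Neumann difference
Q = nullspace(B)                   # n × (n-2)
c = rand(n-2)
norm(B*(Q*c)) < 1e-12              # true: B*u = 0 for every u in the range of Q
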
ChrisRackauckas commented 6 years ago

The question though is how to get that uniform with Neumann conditions. For Neumann the boundaries are unknowns as well, so then A is nxn. So that's why I proposed before to just have the boundaries in u. This would mean that A for Dirichlet is nxn with the first and last columns zero for a derivative operator. This is like FEM where you keep the boundary nodes because you need them for plotting and other things anyways.

Then the Neumann operator is nxn, but the issue is it's eluding me how to build it generally. Of course I can write down how to do it for specific discretizations, but the general "here's a composed operator, you add Neumann on the ends by operator X" is what I'm missing. Neumann0 is easy since you can just reflect the stencil over. Otherwise one way to do it is to use the order-matching one-sided stencil near the boundary, and then you just have to handle the last point to somehow compose the derivative operations.

jlperla commented 6 years ago

By "that" I assume you mean the jump diffusion boundary?

I am not sure that the boundary value for the reflected jump diffusion is really a Neumann0 (or that the absorbing barrier is really a Dirichlet)... Maybe that is the issue.

This might be a general issue with differential operators that have things like delays and convolutions in them... Even if they are linear operators, the "boundaries" can affect the whole discretized matrix.

ChrisRackauckas commented 6 years ago

BTW I found this book the other day:

https://atmos.washington.edu/~breth/classes/AM585/lect/rjl_585.pdf

It's one of the only ones I've found that treats higher order discretizations along with stencil generation via Fornberg. @jlperla you might want to take a look at it. They seem to take a similar approach to what I'm saying, where once you go to higher orders you just de-center the stencil to grab only the boundary value. If the boundary is in the array, that's a simple operation.

By "that" I assume you mean the jump diffusion boundary?

Let's not discuss jump discretizations quite yet. Those need an interpolation formula if you don't discretize by the jump size. That needs a lot more coefficient generation but isn't terrible. We'd still end up in the same place where we interpolate to get stencils which now have one boundary value in them. Though maybe the true solution to the FDM stencil generation is just to go back to coefficient generation via the interpolating polynomials at the boundaries.

ChrisRackauckas commented 6 years ago

I am not sure if you the boundary value for the reflected jump diffusion is really a Neumann0 (and that the absorbing barrier is really a dirichlet)... Maybe that is the issue.

What do you mean? Example?

jlperla commented 6 years ago

Now, if the grid size is n and after discretization the jumps can be up to m1 backwards and m2 forwards (note they are frequently not symmetric), then I get that the differential operator maps n to max(n - m1 - m2, 0) nodes...

Of course, the m1 and m2 can only be known when the discretization is applied.

Furthermore, when you have compound Poisson processes it is frequent for the support of the jumps to be the entire space. So the operator maps n nodes to 0 nodes (i.e. the whole matrix is affected by boundary conditions).

dlfivefifty commented 6 years ago

The way I think of Neumann (and other BCs) is you start with the full system

R = eye(n)[2:end-1,:]
R u_t = D² u
B = [-1 1 0 … 0; 0 … 0 -1 1]
B*u = 0

We then use the fact that B*u = 0 to remove degrees of freedom via:

R u_t = (D² -  [-1 0; zeros(n-4,2); 0 1]*B)*u

The right-hand side operator A = D² - [-1 0; zeros(n-4,2); 0 1]*B is then

[ 0 -1  1             ;
  0  1 -2  1          ;
        …  …  …       ;
           1 -2  1  0 ;
             -1  1  0 ]

This means u[1] and u[n] do not actually appear in the time-evolution part, so we get for Ã = A[:,2:end-1] the system without boundary conditions:

u_t[2:n-1] = Ã u[2:end-1]
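
(Checking that construction numerically with plain dense matrices, h = 1:)

n = 6
D2 = zeros(n-2,n)
for i in 1:n-2
    D2[i,i] = 1.0; D2[i,i+1] = -2.0; D2[i,i+2] = 1.0
end
B = zeros(2,n)
B[1,1] = -1.0; B[1,2] = 1.0
B[2,n-1] = -1.0; B[2,n] = 1.0
E = [-1.0 0.0; zeros(n-4,2); 0.0 1.0]    # the [-1 0; … ; 0 1] border matrix
A = D2 - E*B
all(iszero, A[:,1]) && all(iszero, A[:,n])   # true: u[1] and u[n] drop out
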
ChrisRackauckas commented 6 years ago

Now, if the grid size is n and after discretization the jumps can be up to m1 backwards and m2 forwards (note they are frequently not symmetric) then I get the differential operator maps n to min(n - m1 - m2, 0) nodes...

Yes, there would need to be a dispatch to catch when there's a jump operator. What you described is why people normally don't solve the master equation for jump operators anyways since it's just impossible in most real cases (it's also quite common to have infinite states and combinations of m variables, https://arxiv.org/pdf/1410.1934.pdf). But we can add that coefficient generation later. The DSL already has a way to catch that (https://github.com/JuliaDiffEq/DifferentialEquations.jl/issues/261) and we should try to specialize the operator discretization as much as possible.

(Even then, I think it's good to try and specialize on the banded case for jumps at first. The dense operator case is a very very special case that we wouldn't want to impose on the entire setup just because it has a possibility of occurring. This is the whole point of dispatch and specialization!)

The pure differential operators are our very first goal. Let's keep the discussion there, and when we can actually sanely use them we can go back to asking how to auto-discretize operators which have jumps.

ChrisRackauckas commented 6 years ago

B = [-1 1 0 … 0; 0 … 0 -1 1]

That's quite nice because then it's easy to extend to higher order discretizations just by putting a higher order one-sided difference on the end. I am trying to see how to perform that lazily though (of course we can do sparse and banded first, just thinking).

The inverse of the restriction operator is then given by the BCs since you know that the ends have to add up to 1. I would just make that operator C and do

u_t = C*(D² - [-1 0; zeros(n-4,2); 0 1]*B)*u

L = C*(D² - [-1 0; zeros(n-4,2); 0 1]*B)
u_t = L*u
B*C*R*u = 0

I think we're getting to the point that Alex was making about Chebfun doing

Q*P*A*Q*P*u_n

to always make the differencing satisfy the BCs even when all the DoF are there. We're definitely touching on the same issue in a different context.

The theory works, but the one thing I'm starting to worry about is whether this is still numerically stable. Even if u_t makes a BC-satisfying u map to another BC-satisfying u, I wonder how numerical error builds up at the boundaries, or if the projection somehow "starts it over".

After the update, would we need u = C*R*u to fix the boundaries? I wonder how Chebfun handles it.