SciML / DifferentialEquations.jl

Multi-language suite for high-performance solvers of differential equations and scientific machine learning (SciML) components. Ordinary differential equations (ODEs), stochastic differential equations (SDEs), delay differential equations (DDEs), differential-algebraic equations (DAEs), and more in Julia.
https://docs.sciml.ai/DiffEqDocs/stable/

DiffEq's PDE Story 2nd Try: Domains, BCs, Operators, and Problems #260

Closed ChrisRackauckas closed 5 years ago

ChrisRackauckas commented 6 years ago

DiffEq's first PDE attempt (which started the package) was a failure. The issue was that it was tied directly to an FEM subpackage I made internally, but then it couldn't scale to all of the problems it needed to face. The rest of DiffEq was then built up around decentralization. The core idea is that if you separate the data that specifies a problem and leave the solver support to dispatch, many different approaches and packages can step in to offer unique (and fast) ways to solve the problems. That has worked splendidly, so the FEM support was dropped in DiffEq 4.0 since it didn't conform to this.

What I want to return with is a PDE story worthy of the rest of DiffEq. So the goals it needs to meet are:

  1. It should specify problems in a manner that is distinct from implementation. But,
  2. It should hold the information required for algorithms to be fully optimized with their unique features.
  3. All of the components to "do it yourself" as a toolbox should be exposed. And
  4. It should interface with many different approaches and packages. But,
  5. There should be high-level automation to make the standard cases easy and efficient for "most users".

Here is a prototype for a design that could hopefully handle this.

General Flow: Heat Equation FDM

Let's talk about the Heat Equation. First we start with a domain type. Domain types are specified by types <:AbstractDomain. For example,

domain = RegularGrid((0,5),dx)

is a regular grid of [0,5] with spacing dx. But now we have to start talking about boundaries. It's the domain that knows what boundaries it has, and thus what has to be specified. So to the domain we need to add boundary functions.

Boundary conditions are specified by types <:AbstractBoundaryCondition. So we could have:

lbc = Dirichlet(0)
rbc = Neumann(g(t,x))

Now we'd package that together:

space = Space(domain,lbc,rbc)

That domain is compatible with those boundary conditions, so this works without error. Once we have a space, our Heat Equation is specified via:

u' = D(t,x,u)*A*u + f(t,x,u)

so we specify:

tspan = (0,15)
prob = HeatProblem(D,f,space,tspan)

This is just a bundle of information that describes what we want to solve. But how do we want to solve it? We need to do a call that converts it into something with usable operators. Let's do method of lines via a conversion call:

ode_prob = ODEProblem(prob,FiniteDifference())

where FiniteDifference dispatches this conversion to DiffEqOperators.jl with default order. FiniteDifference(order=4) can set the spatial discretization order, or we can choose other packages with adaptive space operators etc. This we can take to the ODE solver and solve.
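Putting the whole flow together, a hypothetical end-to-end sketch (to be clear, every type here is part of the proposal, not existing code):

# Hypothetical sketch of the proposed workflow; none of these types exist yet.
using DifferentialEquations

dx     = 0.1
domain = RegularGrid((0, 5), dx)            # regular grid on [0,5]
lbc    = Dirichlet(0)                       # u(t,0) = 0
rbc    = Neumann((t, x) -> sin(t))          # u_x(t,5) = g(t,x)
space  = Space(domain, lbc, rbc)

D(t, x, u) = 1.0                            # diffusion coefficient
f(t, x, u) = 0.0                            # forcing term

prob     = HeatProblem(D, f, space, (0.0, 15.0))
ode_prob = ODEProblem(prob, FiniteDifference(order = 2))  # method of lines
sol      = solve(ode_prob, Tsit5())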

Details

Now let's talk about details.

Domains

Domains are a specification of the domain to solve on. The basic ones will be provided by DiffEqPDEBase.jl. Wrapper libraries such as DiffEqApproxFun.jl or FEniCS.jl could provide extras, like SpectralGrid or FEMMesh. It can be as lean as possible, though in some cases like FEM it will need to hold node/mesh pairings.

Boundary Conditions

These are defined in DiffEqPDEBase.jl and have the minimal information describing a specific kind of boundary condition. Either holding constants, arrays, or functions, they just encapsulate the idea of the BC choice.

Space

Space is how domains and BCs come together. For a given domain, it will make sure the BCs are compatible with the domain. Can this be merged with Domains?

Operators

This part wasn't shown in the simple example. Once we have a space, that is enough information to define operators. The high level interface is something like

DerivativeOperator(space,alg,derivative_order,approximation_order)

etc., and then via alg it dispatches to the appropriate package to spit out DiffEqOperators (this part is already defined) appropriate for that space. DiffEqOperators.jl would just need to be changed to implement that dispatch.

Problems

Problems are just an abstraction over operators which basically knows what operators should be built given the space, alg, and problem. They don't do anything until the conversion call, at which point they build these operators and then build something to solve.

Conversions

The obvious conversions are to ODE/SDE/DAE/DDEProblems and keep on going. We need two more problem types though: LinearProblem and NonlinearProblem. Then we cover it all. Everything instantiates operators through some conversion.

High-Level Conversions

Then there can also be some standard conversions like DiffusionAdvection(prob,space,tspan) where prob is an SDEProblem. This is just high level sugar so that everything boils down to conversions on PDE problems, which then give linear, nonlinear, or diffeq problems out in the end.

Big Questions

Documentation

I've learned by now to plan documentation levels at the same time as the interface. Where are things documented? I think the general workflow is documented, and then boundary conditions, some general domains, the operator calls, and problems are all documented generically package-wide. Then unique domains, spaces, operator algorithms, and conversions are documented per-package. DiffEqOperators describes no unique domains; it describes what BCs it can support, choices for operator construction, and problem conversions it can perform. DiffEqApproxFun links over to ApproxFun for choices of domains, has a single Spectral() for generating the spectral operators (with choices in there), and then can generate ODEs for the Heat Equation, or a fully implicit discretization to a LinearProblem via a 2D discretization (can it?).

Representation Information?

When you get this ODEProblem... what does it actually represent? We need a way to describe back to the user what u0 means. Coefficients of ...? u0[i,j,k] means what (x,y,z)? Not sure how to handle this.

Time?

Time is kept separate because it usually acts differently when it exists. For algorithms like a fully implicit discretization, LinearProblem(prob,FullyFiniteDifference(dt)) or something.

Passing space

This is just minor bikeshedding, but instead of making the Heat Equation require the user to give f(u,p,t,x,y,z) for 3D, should it be f(u,p,t,x) with x a tuple/number? That keeps the signature the same regardless of dimension.

Conclusion

This allows us to specify our problem at a high level, interface with different packages in an extendable way to generate appropriate FDM/FVM/FEM/spectral discretizations, and spit out linear/nonlinear/diffeq problems to take to solvers. The hope is that this gives enough places to inject information and package-specific components that it can be efficient, yet still gives a good level of automation and transferability to other setups.

Thanks. Pulling in some people to bikeshed this high level story. @dextorious @dlfivefifty @alanedelman @jlperla

ChrisRackauckas commented 6 years ago

Oh duh, your way makes it trivial to do lazily

ChrisRackauckas commented 6 years ago

Maybe we just solve the interior and just auto apply the extension operator at plotting time and stuff like that.

dlfivefifty commented 6 years ago

I think it works for higher-dimensional tensor product problems too: if we have

(R ⊗ R) u_t = (D² ⊗ R + R ⊗ D²) u
(B ⊗ I) u = 0 # left/right bcs
(I ⊗ B) u = 0 # top/bottom bcs

then we can do elimination to get

(R ⊗ R) u_t = ((D² - U*B) ⊗ R + R ⊗ (D² - U*B)) u

where U = [-1 0; zeros(n-4); 0 1] (or similar).
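In Julia, kron plays the role of ⊗; for the special case of zero Dirichlet BCs (where the elimination term vanishes) the 2D assembly is just (untested sketch):

# Untested sketch: 2D Laplacian on the interior via tensor products,
# specialized to zero Dirichlet BCs so the elimination step disappears.
using LinearAlgebra

n  = 10
Δx = 2 / (n - 1)

A1 = Matrix(Tridiagonal(ones(n - 3), -2ones(n - 2), ones(n - 3))) / Δx^2
Id = Matrix(1.0I, n - 2, n - 2)

A2 = kron(A1, Id) + kron(Id, A1)   # D² ⊗ I + I ⊗ D² on the interior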

jlperla commented 6 years ago

Yes, there would need to be a dispatch to catch when there's a jump operator. What you described is why people normally don't solve the master equation for jump operators anyways

No, if by master equation you mean the one coming from a continuous-time Markov chain of a discrete number of variables, that is not what I mean.

A jump diffusion is a diffusion + a Poisson jump process on a single dimension. Here is something that describes an example of the generator that comes out of it: https://math.stackexchange.com/questions/1652322/jump-diffusion-infinitesimal-generator where you will notice that in the general (and frequently common) case there is an integral in the setup over all possible jump sizes.

As these operators are linear, you can treat the Poisson jump process separately (i.e. https://en.wikipedia.org/wiki/Compound_Poisson_process), but you can see that this is where the boundary conditions get tricky.

For what the boundary values look like, see http://homepages.math.uic.edu/~hanson/pub/Slides/bk07fkebkefinal.pdf pages 15-17. It is a little out of my pay grade, but for reflecting boundaries at least, it seems to follow the "stochastic reflecting principle".

The pure differential operators are our very first goal. Let's keep the discussion there, and when we can actually sanely use them we can go back to asking how to auto-discretize operators which have jumps.

Sanity is relative! The first application I need this stuff for involves a 1-dimensional jump-diffusion model, so this is not a theoretical concern. If there is no hope of being able to extend this stuff to jump-diffusions anytime soon (even with an interface redesign), then I need to know that now.

The second application I have requires a continuous + a discrete state (which I think comes back to the master equation style, with a continuous-time Markov chain describing the jumps between discrete states, i.e. not the same as jumps within a continuous state).

In fact, most economists would say the place to start is not a 2nd order diffusion, but rather 1 continuous dimension X with upwinding of the drift and a 2nd discrete dimension (where only two states, L vs. H, are absolutely necessary), and no 2nd order diffusive term at all.

jlperla commented 6 years ago

A big question I have for DSLs here: We have been talking about having a DSL for the differential operators, with the assumption that they would be used to describe the infinitesimal generator of the stochastic process and the appropriate boundary values - which perfectly describe any reasonable stochastic process.

But for people solving stochastic optimal control problems, is that the right approach? Would a higher level DSL for stochastic processes be more appropriate, where in the background it could swap out various implementation details of the underlying operators?

ChrisRackauckas commented 6 years ago

In fact, most economists would say the place to start is not a 2nd order diffusion, but rather 1 continuous dimension X with upwinding of the drift and a 2nd discrete dimension (where only two states, L vs. H, are absolutely necessary), and no 2nd order diffusive term at all.

Yeah, and that's just tensor product with the transition matrix. Still, before getting to that, let's just focus on derivative operators. If we have derivative operators, then we can tensor and what not to other things.

I think it works for higher-dimensional tensor product problems too

Yes, and now this is pretty much where the DiffEqOperators.jl implementation is, with different handling of the boundary conditions.

So I think we've converged. I had the boundary conditions in the non-trivial cases incorrect, but the other parts of the operator can be salvaged. Basically the idea is:

  1. The dependent variable needs to cut off the boundary values.
  2. All operator actions are to be done on a composed operator.
  3. The operator needs to define an extension operator Q*u which adds in the BCs. Since the DoFs have to line up, this is determined by the BCs.
  4. The interior stencil is just the Fornberg stencils added together.
  5. The full multiplication appends to the stencil something for the BCs. The appending is done to the composed operator. Example is Neumann0 subtracts off [-1 0; zeros(n-4,2); 0 1]*B. Dirichlet just applies R*A*Q*u, i.e. uses the boundary points computed by the extension operator, etc.
  6. The user's solution is the interior points, but then Q*u(t) gives the boundaries.
  7. The extension operator is actually just the inverse of the restriction operator R in this space and, as @dlfivefifty was writing above, the restriction operator is just the mass matrix of the ODE. So we can allow the problems to specify an inverse mass matrix operator for this purpose, and the plotting methods can tie in and apply it to make everything look nice.
  8. This explicit use of restriction and extension will make it easy to use ApproxFun because that's Chebfun's solution to doing this.
  9. No numerical error accumulation at the boundaries since those are always defined by extension, so the discretized BC is always satisfied.
  10. Higher orders are just tensor products like we had before.

The only issue is that when defining operations like this the lazy operators won't naturally fuse loops, but that's just the same problem from before. If we can make this (a) act correctly and (b) have proper full(A), sparse(A), etc., then this is highly useful.

@jlperla To give your student something concrete to get started on, it would be good to have code which does exactly this but non-lazily. I.e. let's first start with the Heat Equation, and use this methodology (extended to non-zero Dirichlet, non-zero Neumann, and non-zero Robin) to have code that generates the operators directly on the [1 -2 1] stencil matrix, not worrying about arbitrary order or any of that. Then do it on the upwinding operator (assuming the coefficient is always positive). Then apply this on the composed stencil. The resulting forms should be compared against the by-hand versions of the BC handling (the loop versions from things like your notes) and it should all match up. Those will become the test cases for our operator.

ChrisRackauckas commented 6 years ago

But for people solving stochastic optimal control problems, is that the right approach? Would a higher level DSL for stochastic processes be more appropriate, where in the background it could swap out various implementation details of the underlying operators?

Go comment on what's needed in the DSL thread. https://github.com/JuliaDiffEq/DifferentialEquations.jl/issues/261 . I didn't plan for optimal control and I'm not sure what you'd actually need, or what to do from it.

In fact, most economists would say the place to start is not a 2nd order diffusion, but rather 1 continuous dimension X with upwinding of the drift and a 2nd discrete dimension (where only two states, L vs. H, are absolutely necessary), and no 2nd order diffusive term at all.

But they aren't doing software development. The way it's taught, learned, or even used doesn't matter for the better software development strategy. Before we start composing derivative operators with some non-derivative operator, we should get the derivative operator working. If that can compose with an arbitrary operator and satisfy the BCs, then adding a transition matrix or some jump operator is fine. If the derivative operator doesn't work and tests fail, then you're screwed from a software development point of view since you won't know what's wrong. That's just not the right way to do it. Let's get the basic thing solid and then grow.

jlperla commented 6 years ago

I don't think we have converged on how the proposed DSL and lazy approach will deal with the Poisson jump process boundaries. In the immediate application, I need both reflecting on the top and absorbing on the bottom. For now, all I need is a deterministic but state-dependent jump size and a constant arrival rate.

Or at least I don't see the path right now, and it is essential for me to know we are implementing something which can fulfill that need.

For the combined continuous+discrete state, I have a much better feel for how to do it manually.

jlperla commented 6 years ago

Before we start composing derivative operators with some non-derivative operator, we should get the derivative operator working.

All of your examples of using the operators and boundary values have been with ones where the boundary can just be added as an additional one or two localized equations at the corners. I believe that works fine.

But you have not convinced me that the design and approach will be able to handle the key requirement I have: where the boundary values can affect more than just the corners. This is the paper I need to have written. I am perfectly fine helping to fund detours outside of my immediate needs with spillovers to the rest of the community, as long as the trajectory is clear.

If you can convince me that your approach works, then (of course) one starts with the simplest example of derivative operators. But I just don't see it yet.

dlfivefifty commented 6 years ago

What I wrote works fine for dense boundary conditions

ChrisRackauckas commented 6 years ago

What I wrote works fine for dense boundary conditions

Well there is a little bit of an issue here. If I'm understanding that source correctly, the stencil itself is dense, and to define the columns of said stencil requires taking into account the boundary conditions. Assuming the jumps do perfectly line up on the grid, what it seems to say is that the probability of jumping to b+ndx would then get added to b-ndx for reflecting, and would be ignored (lost) if absorbing (and then it can keep reflecting, so if it was an infinite state space then the discretized space would actually be the result of an infinite sum). This process of constructing the discretized jump transition matrix is exact if the points are on the grid, and if the points don't line up on the grid then you interpolate to place the probability on the grid (I would guess in practice this is usually just via an O(dx) linear interpolation? i.e. weighting theta = how close it is to the left point, so the left gets theta*prob while the right gets (1-theta)*prob?)
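For concreteness, the interpolation step I have in mind would be something like this (a hypothetical helper, just to pin down the weighting):

# Hypothetical sketch of placing jump probability mass onto a regular grid
# by linear interpolation. `x` is a range (the grid), `target` the off-grid
# jump destination, `prob` its probability, `row` a row of the jump matrix.
function deposit!(row, x, target, prob)
    dx = step(x)
    i  = clamp(floor(Int, (target - first(x)) / dx) + 1, 1, length(x) - 1)
    θ  = (x[i + 1] - target) / dx      # closeness to the left node
    row[i]     += θ * prob             # left node gets theta*prob
    row[i + 1] += (1 - θ) * prob       # right node gets (1-theta)*prob
    return row
end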

I am lost on how @dlfivefifty 's method would handle this. [-1 0; zeros(n-4,2); 0 1]*B locally makes the derivative at the boundary work, and an arbitrarily higher order choice of B can be used, but is that really the same thing as that reflection process? The other issue is that if I don't know how to get the stencil on its own, then how do we compose it?

@jlperla do you have a piece of Kushner and Dupuis or Karlin and Taylor that you can share, or can you describe how you manually compose the operators?

Edit: the reason why the stencil cannot be computed is because there's new information which depends on the BCs. If all of the jumps are larger than the domain, then it's trivial to see that the "interior stencil" is zero or non-zero depending on reflection or not.

ChrisRackauckas commented 6 years ago

You can make it use the same technique. Basically, let's say that the furthest jump goes over the boundary by 2. Then what you can do is simply write the operator on the expanded space (i.e. do all of the jumps in the space even if they jump over, just add to the array for the over bits). The restriction operator is then I and then the reflected I, i.e.

R = [1 0 0 0
     0 1 0 0
     0 0 1 0
     0 0 0 1
     0 0 1 0
     0 1 0 0]

if the furthest jump could go two over the boundary. The differential operators trivially expand their reflection operator to add zeros past the boundary. Thus for the jump stencil A, R*A is how it would restrict to the interior.
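Programmatically, that reflected-identity operator could be built with something like this (untested sketch; reflection at the right boundary only, as in the matrix above):

# Untested sketch: build the reflected-identity operator shown above for n
# interior points when jumps can overshoot the right boundary by at most m.
function reflection_operator(n, m)
    R = zeros(Int, n + m, n)
    for i in 1:n
        R[i, i] = 1          # identity on the interior
    end
    for k in 1:m
        R[n + k, n - k] = 1  # an overshoot by k lands back at point n - k
    end
    return R
end

reflection_operator(4, 2)    # reproduces the 6x4 matrix above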

So now

R*u_t = D²*u + R*A*u
B = [0 0 -1 1 0 … 0; 0 … 0 -1 1 0 0]
B*u = 0

But @dlfivefifty 's thing is just subtracting off what should be a zero at the boundary:

R*u_t = (D² + R*A -  [zeros(2,2);-1 0; zeros(n-4,2); 0 1; zeros(2,2)]*B)*u

Now that's an equation of change in the interior.

Now one thing I missed before though was that to actually use this in the integrator, we would be having R*u, so for the integrator we would be writing this as:

R*u_t = (D² + R*A -  [zeros(2,2);-1 0; zeros(n-4,2); 0 1; zeros(2,2)]*B)*Q*R*u

# v = R*u
v_t = (D² + R*A -  [zeros(2,2);-1 0; zeros(n-4,2); 0 1; zeros(2,2)]*B)*Q*v

And Q is uniquely determined by the BC (if the PDE has a unique solution; there's the whole Poisson equation Neumann0 thing that I think comes up here, but if the PDE is sane then that should be fine).

jlperla commented 6 years ago

Thanks. I will think about this during the week. I think I see how to do the Neumann0 condition, and because it is not affine we seem to be able to apply it to the individual operator.

I will try to understand the restriction to the interior approach with the R matrix (which might be the trick I didn't know).

Will this framework for the boundary values also support nonzero Dirichlet conditions for the jump diffusion process (imagine the compound Poisson process can jump backwards and hit an absorbing barrier)?

Just so you know where this comes from: consider pricing a bond that pays a constant amount forever, but where default can occur. Default might occur when a particular state gets below a certain level, where the state evolves according to a jump diffusion process. If default occurs, the bondholders get a positive liquidation value of the firm. As for the top of the support of the state, we impose an artificial reflecting barrier to keep the support finite.

ChrisRackauckas commented 6 years ago

Thanks. I will think about this during the week. I think I see how to do the Neumann0 condition, and because it is not affine we seem to be able to apply it to the individual operator.

I think we would just have to find out how the restriction operator is defined there. The "trick" is essentially that in the expanded space this is not affine but linear again (all of the information is in there, it's just adding more of itself), so you keep the operator in its "good space" and change the space that you're applying it against via restriction and extension operators. These restriction and extension operators are boundary dependent (extension has to use the fact that there's fewer DoF due to the BCs to fill in the boundary values). Then @dlfivefifty 's BC "trick" is just subtracting off B*Qb*u = 0 where Qb is the expansion operator that omits non-boundary terms (i.e. this is the boundary residual in some discretization), and so the boundary conditions have to be satisfied after doing this.

This kind of formulation is usually how FEM stuff tends to be written out, so it's good that we are matching there. Spectral methods also need to be explicit about space conversions, so we're essentially just doing this in FDM.

Seems like it all checks out, but I'll let @dlfivefifty comment. The thing we need to do right now though @jlperla is see how this compares to the operator you get when you "by hand" do the discretization, and ensure that the Neumann0 case preserves probability (we may need to do a numerical error fix somewhere, re-writing one of the coefficients as 1-... to remove the possible offset like what's done in Fornberg).

Will this framework for the boundary values also support nonzero Dirichlet conditions for the jump diffusion process (imagine the compound Poisson process can jump backwards and hit an absorbing barrier)?

It should, we just need to find out how to write the restriction operator for a set non-zero boundary value.

ChrisRackauckas commented 6 years ago

Oops, two corrections. Qb*B*u since it could potentially need the whole space to compute the discretized BCs. Also, it's the restriction and elongation operators that could be affine in the non-zero case @jlperla. But if we write out the abstract operator expressions and push the algebra around, then derive the discrete restriction and elongation operators, that gives the solution. So really, the solution is about being clear with every change of space.

jlperla commented 6 years ago

This all sounds excellent. I love the idea of extending the space to include jumping off the grid and thinking through the boundary values along those lines. It fits completely with the Kushner and Dupuis approach of interpreting the finite differences as generating a continuous-time Markov chain.

If I understand it, with the approach you are describing, the resulting discretization of absorbing barriers can be interpreted as absorbing in the resulting Markov chain on the expanded space (which we can test to make sure is a stochastic matrix in order to check that it preserves probability, and we can check that it is ergodic for reflecting barriers and non-ergodic for any absorbing barriers).
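Concretely, the checks I have in mind could be as simple as this (hypothetical helper functions):

# Hypothetical helpers for the checks described above. A discretized
# generator L preserves probability iff each row sums to zero (a transition
# matrix would have rows summing to one instead), and an absorbing state
# shows up as an all-zero row of the generator.
preserves_probability(L; tol = 1e-12) = all(abs.(sum(L, dims = 2)) .< tol)
is_absorbing(L, i; tol = 1e-12) = all(abs.(L[i, :]) .< tol)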

I will see about double-checking some of your sketch over the next few weeks, but you have convinced me that there is a general approach to dealing with the boundaries when we need it.

dlfivefifty commented 6 years ago

Maybe we should implement some examples by hand in the proposed unified approach and see how it goes. Maybe:

dlfivefifty commented 6 years ago

We should also make sure different matrix types fit into this framework:

dlfivefifty commented 6 years ago

Zero Dirichlet heat:

n = 100;
x = linspace(-1,1,n)
Δx = step(x)

B = [1 zeros(1,n-2) 0;
     0 zeros(1,n-2) 1]

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
à = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)
A = Ã[:,2:n-1]

function heat(du,u,A,t)
    du .=  A*u
end

u₀ = @. (1-x^2)*exp(x)
prob = ODEProblem(heat, u₀[2:n-1], (0.0,8.0), A)
@time u  = solve(prob, Rodas4(autodiff=false); reltol=1e-5,abstol=1e-5)

plot(x, [0; u(8.0); 0])

dlfivefifty commented 6 years ago

Zero Neumann heat:

n = 100;
x = linspace(-1,1,n)
Δx = step(x)

B = [-1 1 zeros(1,n-4)  0 0;
      0 0 zeros(1,n-4) -1 1]/Δx

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
à = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)
A = Ã[:,2:n-1]

function heat(du,u,A,t)
    du .=  A*u
end
u₀ = @. exp(x) + ((-2 + x)*x)/(4*e) - e*x*(2 + x)/4 # satisfies zero Neumann (e = Base.e)

prob = ODEProblem(heat, u₀[2:n-1], (0.0,8.0), A)
@time ũ  = solve(prob, Rodas4(autodiff=false); reltol=1e-5,abstol=1e-5)

u = [B; eye(n)[2:n-1,:]] \ [0; 0; ũ(8.0)]

plot(x, u) # ≈ mean(u₀)

ChrisRackauckas commented 6 years ago

Where is this Toeplitz coming from? The one in SpecialMatrices.jl doesn't seem to work for your code.

dlfivefifty commented 6 years ago

ToeplitzMatrices.jl

Didn't notice SpecialMatrices.jl had its own Toeplitz (with a different input format)

dlfivefifty commented 6 years ago

That was just a quick way to make the central-difference matrix anyways. So you could use:

Δ = zeros(n-2,n)
Δ[diagind(Δ)] = 1/Δx^2
Δ[diagind(Δ,1)] = -2/Δx^2
Δ[diagind(Δ,2)] = 1/Δx^2

ChrisRackauckas commented 6 years ago

Yeah, it's cool 👍 .

dlfivefifty commented 6 years ago

By the way, replacing solve(prob, Rodas4(autodiff=false); reltol=1e-5,abstol=1e-5) with solve(prob) or solve(prob, Tsit5()) produces something wrong: it has oscillation.


ChrisRackauckas commented 6 years ago

Yeah that's just CFL. This shows why:

using OrdinaryDiffEq, ToeplitzMatrices, Plots
gr()

n = 100;
x = linspace(-1,1,n)
Δx = step(x)

B = [1 zeros(1,n-2) 0;
     0 zeros(1,n-2) 1]

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
à = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)
A = Ã[:,2:n-1]

function heat(du,u,A,t)
    A_mul_B!(du,A,u)
end

u₀ = @. (1-x^2)*exp(x)
prob = ODEProblem(heat, u₀[2:n-1], (0.0,8.0), A)
@time u  = solve(prob, Tsit5(); reltol=1e-5,abstol=1e-5)
plot(x,[0;u[1];0])
plot!(x,[0;u(0.25);0])
plot!(x,[0;u(0.5);0])
plot!(x,[0;u(0.75);0])
plot!(x,[0;u(1.0);0])
plot!(x,[0;u(2.0);0])
plot!(x,[0;u(4.0);0])
plot!(x,[0;u(8.0);0])


then

plot(x,[0;u(8.0);0])


Basically what happens is that as the solver approaches steady state, the error in its steps goes to zero. Because the error in its steps goes to zero, over time it will increase its step size more and more, since it knows it's not getting much error. But it will increase the step size past the CFL limit, which will sooner or later start causing oscillation. Setting a maximum dt stops that:

@time u  = solve(prob, Tsit5(); dtmax = Δx^2/2)
plot(x,[0;u(8.0);0])


That's actually a pretty interesting and practical example I might want to use as a homework problem haha (since the explanation requires knowing a little bit about the behavior of adaptive time stepping mixed in with the theory of stability of RK methods on PDE discretizations, good stuff!)

So we should throw dtmax on there for any tests with RK methods just to not mix CFL instability with numerical instability due to the operator discretization. In fact, for the latter we should try some really precise calculations (with BigFloats).

dlfivefifty commented 6 years ago

Cool! And adding non-zero BCs avoids seeing this:


## Non-zero Dirichlet

n = 100;
x = linspace(-1,1,n)
Δx = step(x)

B = [1 0 zeros(1,n-4)  0 0;
      0 0 zeros(1,n-4) 0 1]/Δx

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
à = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)
A = Ã[:,2:n-1]

u₀ = @. exp(x)
r = Δ[:,[1,n]]*(B[:,[1,n]]\(B*u₀))

p = (A,r)

function heat(du,u,p,t)
    (A,r) = p
    du .=  A*u .+ r
end

prob = ODEProblem(heat, u₀[2:n-1], (0.0,8.0), p)
@time ũ  = solve(prob, Rodas4(autodiff=false); reltol=1e-5,abstol=1e-5)
u_r = [B; eye(n)[2:n-1,:]] \ [B*u₀;  ũ(8.0)]
plot(x, u_r; label="Rodas4")

@time ũ  = solve(prob)
u_s = [B; eye(n)[2:n-1,:]] \ [B*u₀;  ũ(8.0)]
plot!(x, u_s; label="solve")

norm(u_r-u_s) # < 0.0009

ChrisRackauckas commented 6 years ago

I translated some of @dlfivefifty 's calculations over to the affine operator algebra. Part of this is to explain it, but most of it is to explain it to myself haha.

Dirichlet0 of course is trivial since restriction and extension are trivial:

n = 100;
x = linspace(-1,1,n)
Δx = step(x)

B = [1 zeros(1,n-2) 0;
     0 zeros(1,n-2) 1]

Q = [zeros(n-2)';I;zeros(n-2)']
R = [zeros(n-2) I zeros(n-2)]
u₀ = @. (1-x^2)*exp(x)
R*u₀
Q*R*u₀ == u₀

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
A = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)

function heat(du,u,A,t)
    du .=  A*Q*u
end

prob = ODEProblem(heat, R*u₀, (0.0,8.0), A)
@time u  = solve(prob, Rodas4(autodiff=false); reltol=1e-5,abstol=1e-5)

plot(x, Q*u(8.0))

The interesting case is non-zero Dirichlet. Here's what it looks like:

using OrdinaryDiffEq, ToeplitzMatrices, Plots
gr()

n = 100;
x = linspace(-1,1,n)
Δx = step(x)

B = [1 0 zeros(1,n-4)  0 0;
      0 0 zeros(1,n-4) 0 1]/Δx
u₀ = @. exp(x)

QA = [zeros(n-2)';I;zeros(n-2)']
Qb = zeros(n)
Qb[1] = u₀[1]
Qb[end] = u₀[end]

R = [zeros(n-2) I zeros(n-2)]
R*u₀
QA*R*u₀ + Qb == u₀

Q = (QA,Qb)

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
à = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)
A = Ã[:,2:n-1]

Δ*QA == A

r = Δ[:,[1,n]]*(B[:,[1,n]]\(B*u₀))
r ≈ Δ*Qb # true

p = (A,r)

function heat(du,u,p,t)
    (A,r) = p
    du .=  Δ*QA*u .+ Δ*Qb
end

prob = ODEProblem(heat, R*u₀, (0.0,8.0), p)
@time ũ  = solve(prob, Rodas4(autodiff=false); reltol=1e-5,abstol=1e-5)
u_r = [B; eye(n)[2:n-1,:]] \ [B*u₀;  ũ(8.0)]
QA*ũ(8.0) + Qb ≈ u_r # true
plot(x,QA*ũ[1] + Qb)
plot!(x,QA*ũ(1.0) + Qb)
plot!(x,QA*ũ(8.0) + Qb)

Basically all that's done is that I manually wrote out Q*u = QA*u + Qb since Q is an affine operator. What very naturally happens is that Δ*Qb is what @dlfivefifty defined as r and Δ*QA is the stencil restricted to the interior, A.

Now the way that this extends is that Δ can be any composed operator. Δ[:,[1,n]]*(B[:,[1,n]]\B) is just the Gaussian elimination step to restrict the operator to the interior as @dlfivefifty explained above. For specific boundary conditions we can analytically/lazily compute B[:,[1,n]]\B since that expression is only dependent on the BCs. If we write C = (B[:,[1,n]]\B), and X is the operator which restricts to the boundary nodes, then we get via this algebra:

Δ*Q*R*u = Δ*QA*R*u + Δ*X*C*Qb

as the generic way to write down this discretization process. So then the differential equation

u_t = Lu

with discretized L ≈ Δ, we have that

u_t = Q*(Δ*QA*R*u + Δ*X*C*Qb)

which is not a good way to write it numerically because it has more unknowns than degrees of freedom, and so we instead use the identity in this space that Q = R^(-1) (that's BC dependent!) to get:

R u_t = Δ*QA*R*u + Δ*X*C*Qb

# Define v = R*u
v_t = Δ*QA*v + Δ*X*C*Qb

gives the evolution on the interior, with the full solution given by u(t) = Q*v(t).

Side question: are there any books or such on affine operators? I've never read any theory on this but am going off of what seems to be correct. I'm sure this can get formalized in function-space theory, and the space transformations to Dirichlet/Neumann 0 which are done in theory essentially bring in correction terms which are these.

jlperla commented 6 years ago

Thanks guys, and sorry I am slow. These fulfill all the "forward equations" needs that I can see. For the backwards equations it is best to start with the stationary setup. I have a simple setup of one example in https://github.com/JuliaDiffEq/DiffEqOperators.jl/blob/univariate_tests/docs/example_operators_for_economics.md (which you should see correctly with the https://chrome.google.com/webstore/detail/github-with-mathjax/ioemnmodlmafdkllaclgeombjnmnbima plugin, or which can be converted with pandoc/etc.).

A simple example for the stationary ODE to solve is: r u(x) = x^alpha + A u(x) where A = sigma * D_xx and reflecting barriers u'(0) = 0 and u'(1)=0.

Here is an attempt to modify the code you gave me

using ToeplitzMatrices, Plots
n = 100;
x = linspace(0.0, 1.0, n)
Δx = step(x)

B = [-1 1 zeros(1,n-4)  0 0;
      0 0 zeros(1,n-4) -1 1]

Q = [zeros(n-2)' ;I; zeros(n-2)']
R = [zeros(n-2) I zeros(n-2)]

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
A = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)

#Now solve the stationary ODE r u(x) = x^alpha + \sigma * D_xx u(x)
alpha = 0.5
r = 0.05
sigma = 0.1

#Did I do this wrong?
C = r * I - sigma^2 * A * Q
u = C \ x[2:end-1].^alpha
plot(x[2:end-1], u)
gr()

You can take a look at it to tell me if it looks correct? (In particular, the construction of the B.) If I did things in the right order, then just to verify: the function v is not defined at x[1] and x[end]? Is this because we are finding a solution for the augmented state (i.e. x should contain the ghost nodes and not just the ones we are solving for)?

ChrisRackauckas commented 6 years ago

If I did things in the right order, then just to verify: the function v is not defined at x[1] and x[end]? Is this because we are finding a solution for the augmented state (i.e. x should contain the ghost nodes and not just the ones we are solving for)?

Yeah, and there's a few ways to think about it. It came up first in @dlfivefifty 's representation

R u_t = Dxx*u

since notice that the operator applied to u actually changes the function space. In function space theory, it's mapping H^n to H^(n-2). In the discrete space, Dxx maps n points to n-2. So its output is the restriction to the interior. This makes sense because in the specification, the PDE actually only holds in the interior. So you have to have the R restriction operator in the R u_t because otherwise you're stating equality between two different function spaces, or between vectors of two different sizes.

But then another way to think about it is that the boundary conditions get rid of two degrees of freedom. For the first order discretization of the Neumann BCs, u[1] = u[2] and u[end-1] = u[end] have to hold in order for the BCs to be satisfied. Because of those two equalities, the n-2 interior points determine the full n points in the space. Because of that, the computation should only take place on the n-2 interior points. That's why we then define v = R*u, and then using Q = R^(-1) we write:

R u_t = Dxx*u
R u_t = Dxx*Q*R*u
v_t = Dxx*Q*v
v_t = A*v

as the operator for the interior (and the algebra is the same when it's affine, you just have to keep track of the additive term when applying Q). So in some sense this is ghost nodes since we "create" the boundary points when we need them, i.e. we go from v -> u to do some calculations that naturally go u -> v.

I think I am leaving this conversation with a completely different understanding of PDE discretizations than when I started haha.

When we go to the stationary equation, it's

b = Dxx*u

where b is in H^(n-2), or b is n-2 points. If you don't have a forcing function, then b is n-2 zeros, so it's the same as before; in this sense we're already writing that b is in the restricted space here. So to get a linear equation we can solve, we get a linear map of n-2 points to n-2 points via

b = Dxx*Q*v

For your example, let me work it out a bit. Neumann0 for the Poisson equation only has a unique solution if you satisfy some integral constraint though, which I thought had to be zero? (Maybe it just has to be some c constraint, in which case this makes more sense. Gonna go check old notes...)

ChrisRackauckas commented 6 years ago

I think you just had the extension operator incorrect, but everything else is the same.

using ToeplitzMatrices, Plots
gr()
n = 100;
x = linspace(0.0, 1.0, n)
Δx = step(x)

B = [-1 1 zeros(1,n-4)  0 0;
      0 0 zeros(1,n-4) -1 1]

Q = [zeros(n-2)' ;I; zeros(n-2)']
Q[1] = 1
Q[end] = 1
R = [zeros(n-2) I zeros(n-2)]

u = exp.(x)

Δ = Toeplitz([1; zeros(n-3)], [1; -2; 1; zeros(n-3)])/Δx^2
A = Δ - Δ[:,[1,n]]*(B[:,[1,n]]\B)

#Now solve the stationary ODE r u(x) = x^alpha + \sigma * D_xx u(x)
alpha = 0.5
r = 0.05
sigma = 0.1

B = r * I - sigma^2 * A * Q
u = B \ x[2:end-1].^alpha

# Double check the operators
R*Q*u == u # true

plot(x, Q*u)

For Dirichlet0 you have the trivial extension that you just add zeros to the end of the vector. For Neumann0 you have the extension that, to get a zero derivative at the ends, you have to have u[1] = u[2] and u[end-1] = u[end], which is how you get Q.

dlfivefifty commented 6 years ago

It works for Spectral methods too, but so far with dense matrices (Chebyshev -> Chebyshev instead of Chebyshev -> Ultraspherical(2)):

S = Chebyshev()

n = 50

D = Derivative() : S   #  Chebyshev() -> Ultraspherical(1)  derivative (∞ x ∞)
C = eye(D^2)    #  Chebyshev() -> Ultraspherical(2)  conversion (∞ x ∞)

B = Matrix((Dirichlet() : S)[:,1:n])   # Chebyshev Dirichlet conditions (2 x n)
Cn = Matrix(C[1:n-2,1:n-2])   #  Chebyshev() -> Ultraspherical(2)  conversion (n-2 x n-2)
D2n = Matrix((D^2)[1:n-2,1:n])  #  Chebyshev() -> Ultraspherical(2)  2nd derivative (n-2 x n)
Δ = inv(Cn)*D2n # Chebyshev -> Chebyshev  2nd derivative  (n-2 x n)
R = eye(n)[1:n-2,:]  # Chebyshev -> Chebyshev identity (n-2 x n)

à = Δ - Δ[:,n-1:n]*(B[:,n-1:n]\B)  # remove last two columns of Δ (they are zero in R already)  (n-2 x n)
A = Ã[:,1:n-2]  # chop off zero columns (n-2 x n-2)

function heat(du,u,p,t)
    (A,r) = p
    du .=  A*u .+ r
end

x = Fun()
u₀ = Fun(x -> exp(x), Chebyshev(), n).coefficients # n Chebyshev coefficients of exp(x)

r = Δ[:,n-1:n]*(B[:,n-1:n]\(B*u₀)) 

prob = ODEProblem(heat, u₀[1:n-2], (0.0,8.0), (A,r))
@time ũ  = solve(prob, Rodas4(autodiff=false); reltol=1e-5,abstol=1e-5)
u = [B; eye(n)[1:n-2,:]] \ [B*u₀; ũ(2.0)]
plot(Fun(Chebyshev(), u))

dlfivefifty commented 6 years ago

@jaurentz was working on fast algorithms using the semi-separability of Chebyshev -> Chebyshev operators, but I couldn't find the preprint.

EDIT: http://emis.ams.org/journals/ETNA/vol.44.2015/pp281-288.dir/pp281-288.pdf

dlfivefifty commented 6 years ago

If you support mass matrices, then we can adapt this trick to Chebyshev -> Ultraspherical, as we have that

B*u = c  ⇒ B*u_t = 0

Hence (I'll assume B[:,n-1:n] == eye(2) to simplify the expression)

R*u_t = Δ*u  ⇒ (R-R[:,n-1:n]*B)*u_t = (Δ-Δ[:,n-1:n]*B)*u + c

ChrisRackauckas commented 6 years ago

Yeah the methods for stiff equations support mass matrices (well, a lot of them; some others have it as coming "soon"). Since this handling is just on the ends, it's probably nice to do it without mass matrices though. That's why I've been defining and computing Q = R^(-1) in this space. I'm not entirely sure how to get those relations for Chebyshev space though. When going between Chebyshev and point space, the BCs are given by sum conditions and so it's simple to write down the expansion operation, but I'm not sure how to query ApproxFun for that. But yeah, what we should end up with is almost exactly what Chebfun is doing here with its Q*P.
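For reference, a mass matrix problem M*u' = f(u,p,t) can be specified like this (a generic sketch with placeholder matrices, not the spectral operators from above):

# Generic sketch of a mass-matrix problem M*u' = A*u; M and A are stand-ins.
using OrdinaryDiffEq, LinearAlgebra

n = 4
M = Matrix(Diagonal([1.0, 1.0, 1.0, 0.0]))  # singular: last row is algebraic
A = -Matrix(1.0I, n, n)

f = ODEFunction((du, u, p, t) -> mul!(du, A, u); mass_matrix = M)
u0 = [1.0, 1.0, 1.0, 0.0]   # consistent with the constraint 0 = -u[4]
prob = ODEProblem(f, u0, (0.0, 1.0))
sol = solve(prob, Rodas4()) # Rosenbrock methods handle mass matrices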

I have to say though, I don't quite understand the last example you have here so I'll need a minute to digest it.

dlfivefifty commented 6 years ago

I've added comments to my example that should clarify. The coefficient space analogue of removing the first and last grid points is to remove the last two coefficients.

ChrisRackauckas commented 6 years ago

The operator that's missing is Q: Chebyshev -> Chebyshev (n x n-2) s.t. Q*R*u = u for any u that satisfies the BCs. That way it's just Q*u (it has an affine part though, but just do the affine operator algebra as it would intuitively be defined) instead of u = [B; eye(n)[1:n-2,:]] \ [B*u₀; ũ(2.0)]. For FDM I've been deriving it by hand, but am trying to find out how to systematically create it.

jlperla commented 6 years ago

I have put in a few of these tests into the https://github.com/JuliaDiffEq/DiffEqOperators.jl/tree/univariate_tests branch (under the /docs directory). We will try to get the math/etc. written up there to reflect the algebra.

ChrisRackauckas commented 6 years ago

Adding this Gist of ApproxFun + DiffEq for spectral solving:

https://gist.github.com/ChrisRackauckas/ddbeac88cdaa779c8da991f66e1c3823

At the bottom you see that the old way of doing the BCs didn't work (projecting directly to the BC-satisfying space after each step, instead of keeping every term BC-satisfying). The bottom of the script should get updated to add in the Q and R operators to perform the projection like described in this issue and it should work.

ChrisRackauckas commented 6 years ago

So here's the gist of it. I think I have most of it clear now. Copying my notes for future reference.

The domain is discretized and u is a vector of points on that space. The indices for the boundary are bidxs defined by the domain and for the interior iidxs s.t. their union is the full u.

R is the restriction operator which is defined by the domain. It is the operation which is defined as R*u = u[iidxs]. This is unique since it's given by the definition of the domain's discretization, and is thus I with the rows corresponding to the boundary removed. u is in the space of (discretized) functions which satisfy the boundary conditions, and v = R*u is in the space of interior functions which could satisfy the BCs. For a 1D discretization, it's n-2 x n.

B is the boundary condition operator. In 1D, it is defined as satisfying B*u = [bc1;bc2] for any u in the space of functions that satisfy the BCs, for the conditions bc1 and bc2. B is not necessarily unique and is instead a choice. The choice of B is exactly the choice of boundary value discretization, i.e. choosing to do 1st or second order Neumann BCs is simply the choice of the operator B (first order for example is the choice that bc1 = u[1] - u[2] for Neumann0). For Dirichlet, or after a discretization of the derivative is chosen in Neumann or Robin, this B is unique. From looking at the relation it has to satisfy, you see it's 2 x n.

Q is the operator that is defined as Q*R*u = u for any u that satisfies the BCs (or Q=R^(-1), but be careful there since it's not square and this is only true as maps on functions which satisfy the boundary). It has a second relation, Q*v = [bc1;v;bc2] in the Dirichlet case, basically saying that it's a map from discretizations of functions in the interior to functions which satisfy the BCs. Q is actually in general affine, meaning that Q*u = Qa*u + Qb. You can show some nice properties of affine operators, like that the definition given of * will make Q*x=b in an iterative solver converge to Qa*x = b-Qb (so if we actually define a type that does this in Julia, it will work with iterative solvers without a fuss; just the same idea as writing in residual form). From the relations you get that Qa is n x n-2 and Qb is of length n. In order for Q*R*u = u to hold for trivial u, you need that Qb is zero except on Qb[bidxs] (so 2 points for 1D). For the relation Q*v = [bc1;v;bc2] to hold on trivial v, you need that Q is I on the interior, so it's defined by its first and last rows. I believe that makes it 4 linear conditions to map to 4 points so Q is unique.

Now let's say we want to solve u_t = A*u, abuse notation so that it means both the discretized and non-discretized forms. Then we have two relations:

A[:,bidxs]*(B[:,bidxs]\[bc1;bc2]) = A*Qb

and

R*(A - A[:,bidxs]*(B[:,bidxs]\B)) = A*Qa

I don't have a good way of showing why that should be true though, so maybe @dlfivefifty can chime in on how that definition actually works. So that can be used to derive Q systematically, or in special cases it can be done more directly (later).

But once we're at this point, to solve the actual differential equation we only want to solve the interior, since the boundaries are given directly by the interior u = Q*v and it would be numerically unstable to solve an equation with more points than there are degrees of freedom in the problem. Therefore we actually want to solve v_t = A*Q*v. Notice that the discretized A maps from the full domain to the interior (since the PDE is only defined on the interior). Notably, that means it's not square. By the operator algebra we have:

v_t = A*Qa*v + A*Qb

is how the operator is applied. For a steady state problem, it's the same idea. Discretize some f in the domain for f = A*u, and you get f = A*Qa*v + A*Qb. Those are the linear equations which define the ODE or whose solution is the solution to the PDE.

This shows that what's important and non-trivial is Q. It contains all of the information about both our boundary and our domain from the two relations. So what is Q? Here are all of the relevant cases for finite differencing.

Dirichlet0. This case is easy. If you have any v in the interior, its unique extension to satisfying the BCs is sticking 0s on the end, so Q = [0;I;0] is linear, meaning Qa=Q and Qb=0.

Dirichlet. The boundary is not dependent on any points in the interior, and therefore Qa = [0;I;0], but to satisfy the BCs you need Qb=[bc1;0s;bc2].

Neumann. You have to pick a discretization of B. This defines a linear relation between the interior points and the BC. For example, a 1st order discretization of the boundary derivative means that du/dx ≈ (u[2] - u[1])/dx = bc1, and therefore u[1] = u[2] - dx*bc1. You can see that this means Qb = [-dx*bc1;0s;dx*bc2], while Qa is I for the interior but [1,0,0,...] for the first row (so that the first entry of Qa*v is v[1] = u[2]) and [...,0,0,1] for the last row.

Robin. Left as an exercise for the reader :P. It's just the same ideas from before to write out the affine relation. It needs a choice of B just like Neumann.
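To make the Neumann case above concrete, here's a quick numerical check of the affine extension (untested sketch):

# Untested check of the first-order Neumann Q described above.
using LinearAlgebra

n, dx = 6, 0.1
bc1, bc2 = 2.0, -3.0

Qa = [zeros(n - 2)'; I; zeros(n - 2)']
Qa[1, 1] = 1                        # u[1] determined by v[1]
Qa[end, end] = 1                    # u[end] determined by v[end]
Qb = zeros(n); Qb[1] = -dx * bc1; Qb[end] = dx * bc2

B = zeros(2, n)
B[1, 1:2] = [-1, 1] / dx            # (u[2] - u[1])/dx = bc1
B[2, n-1:n] = [-1, 1] / dx          # (u[end] - u[end-1])/dx = bc2

v = randn(n - 2)
u = Qa * v + Qb                     # the affine extension
B * u ≈ [bc1, bc2]                  # true: the extension satisfies the BCs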

And that's all of the FDM discretizations. Let's take a quick look at the spectral discretizations then. In this case, let u be the discretization using n coefficients. We then define the interior as chopping off the last two coefficients, so R = I[1:n-2,:]. Now for Q the boundary determining nodes are at the end. Q needs to take into account the space. Let c[i] = T[i](-1) for T[i] the ith Chebyshev function (or it could be any expansion function). Let's do Dirichlet dot(v,c) = bc1 (the evaluation relation) and for d on the other end dot(v,d) = bc2. Thus the relation of extension is linear, meaning Qb = 0 and Qa = [I;c;d] (Qa*v = [v;bc1;bc2]). What's nice is that in this space, arbitrary Robin is actually linear. Using dc[i] = dT[i](-1) for the derivative, it's clear how that works out, and then Robin is just a linear combination of those two sums, so then it's just a linear combination of those two Qa.

We can actually make clear where the affine-ness comes from with this. If you let T[i] be the hat function basis, then the spectral discretization is the same as the FDM discretization. Why does it come out affine then? It's because it's degenerate: T[i](bc)=0 for every hat function for the interior (since they are zero at every node other than themselves), so this relation comes out to dot(v,0) = 0 ?= bc1 and so you have to go affine to add the bc on. So, in other words, Q is linear unless its interior basis expansion is degenerate at the boundaries and the condition is non-zero Dirichlet. We can also imagine this as being true for Neumann if the basis has zero derivative at the boundary points.

Of course this extends to finite element methods since it's usually discussed in an operator form much like this. So this covers the different discretizations as long as you can find Q (and maybe that way above is a nice way to generally do it?). Either way, what's left is just getting the interior stencil for the discretization for A. For even and uneven grid FDM that's just the Fornberg method which is the interior part in DiffEqOperators.jl, and for spectral discretizations that's just querying ApproxFun.jl.

ChrisRackauckas commented 6 years ago

I wanted to finish up with the discussion of how exponentiation of the operator works. Using the notation of the previous post, we have that our diffeq is given via:

v_t = A*Q*v

where Q is possibly affine, but assume it's linear right now. Then A*Q is a square n-2 x n-2 operator, so if we define that as L then

v_t = L*v

and it's clear that we just use exp(dt*L) in the exponential integrators.

But now let's take Q being affine. We can expand it out. If the original PDE was semilinear, then we would get

v_t = A*Qa*v + A*Qb + f(t,v)

but then if we define L=A*Qa and f̃(t,v) = A*Qb + f(t,v), then we get

v_t = L*v + f̃(t,v)

which is the semilinear ODE with a square L that exponential integrators want. Thus in this framework it's not difficult to understand why it would work, how to change the implementations, and how to handle the affine part (we'll need an expmv to only do the linear part, and have a way to query the Qb part and have the integrator add it to the user's f. Annoying but not difficult).
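As a dense sanity check of that algebra (untested sketch; L and b below are stand-ins for A*Qa and A*Qb, with no nonlinear f):

# Exact exponential step for v' = L*v + b with constant affine part b:
# v(t+h) = exp(h*L)*v(t) + L\((exp(h*L) - I)*b)
using LinearAlgebra

n = 8
L = Matrix(Tridiagonal(ones(n - 1), -2ones(n), ones(n - 1)))  # stand-in for A*Qa
b = ones(n)                                                   # stand-in for A*Qb

function exp_step(v, L, b, h)
    E = exp(h * L)            # dense matrix exponential; expmv in the lazy version
    return E * v + L \ ((E - I) * b)
end

v1 = exp_step(zeros(n), L, b, 0.1)  # one step of size h = 0.1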

I think that covers the questions for how to get it right. Now we need to implement it all....

Reference: the Chebfun paper https://academic.oup.com/imajna/article/36/1/108/2363563

ChrisRackauckas commented 6 years ago

The abstract definition of the operator, along with domains, has been punted to SciCompDSL.jl. Operator instantiation will be the dispatches discussed above for DiffEqOperators.jl and DiffEqApproxFun.jl.

dlfivefifty commented 6 years ago

At some point we should talk about unifying ApproxFun.Operator with DiffEqOperators, etc., with an eye to both lazy and non-lazy variants. Even if it's only at the level of an OperatorsBase.jl that standardises the interface. (I always mean linear operator between spaces when I say "operator", which is different than LinearOperators.jl which seems to mean lazy matrices.)

ChrisRackauckas commented 6 years ago

I think we're going to have to make such an interface in SciCompDSL.jl, so that's where it will likely happen. It's getting hard to predict though, I think it just needs to get done and then refactored after we have something working.

dlfivefifty commented 6 years ago

Maybe the best steps are:

  1. You design things the way you think they should be
  2. I see how your design works and try to find common ground
  3. I make an OperatorsBase.jl and submit a PR to SciCompDSL.jl
  4. We iterate on the PR

ChrisRackauckas commented 6 years ago

Yup, sounds good.

dlfivefifty commented 6 years ago

FYI I think this approach to boundary conditions works for eigenvalue problems too.

ranocha commented 6 years ago

I am not sure whether this approach works for general PDEs and discretisations. As far as I've understood the discussion above, the boundary conditions are imposed in a strong sense, i.e. the numerical solution has to fulfil the boundary conditions exactly. For hyperbolic conservation laws (linear advection, wave equation, Burgers' equation, Euler, ...), the imposition of boundary conditions might be a little bit different. Such a strong enforcement of boundary conditions can result in unstable methods. Therefore, summation-by-parts (SBP) operators using simultaneous approximation terms (SATs) have been designed to construct provably stable semidiscretisations (https://doi.org/10.1016/j.jcp.2014.02.031). The basic idea is to mimic integration by parts discretely by SBP operators, i.e. (finite difference) derivative operators that are defined on the complete grid, including the boundary nodes. In the interior, these derivative operators can be just the classical centered derivative operators. Near the boundary, special boundary stencils are chosen to mimic integration by parts discretely. Thus, the derivative is computed without using boundary conditions.

To enforce boundary conditions, the SATs are added at the boundary nodes. In the simplest case, these terms contain some difference of the value of the numerical solution at the boundary and the desired boundary value. Thus, the numerical solution does not necessarily satisfy the boundary conditions exactly and the numerical solution has to be computed at all nodes (including the boundary nodes). However, this procedure allows the creation of provably stable semidiscretisations for hyperbolic equations.

If I understood the discussion above correctly, I don't know how such an approach can be represented in the framework discussed above. For hyperbolic equations, well-posed boundary conditions might be specified only at certain boundaries and not everywhere (inflow or outflow). Moreover, whether it is possible to enforce (Dirichlet) boundary conditions can also depend on the numerical solution itself. But this might be handled by the corresponding semidiscretisation.

I can see how the approach discussed above works for parabolic PDEs. Please, correct me if I'm wrong, but I don't see immediately how this approach can be extended to weakly imposed boundary conditions using SATs.
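To illustrate, here is a small untested sketch of a second-order SBP-SAT semidiscretisation of linear advection u_t + u_x = 0 with a weakly imposed inflow condition at the left boundary:

# Untested sketch: SBP-SAT for u_t + u_x = 0 on [0,1], inflow u(0,t) = g(t).
using OrdinaryDiffEq, LinearAlgebra

n  = 101
Δx = 1 / (n - 1)
x  = range(0, stop = 1, length = n)

# Diagonal-norm second-order SBP first-derivative operator: D = H⁻¹ * Qs
H  = Δx * Diagonal([0.5; ones(n - 2); 0.5])
Qs = diagm(1 => fill(0.5, n - 1), -1 => fill(-0.5, n - 1))
Qs[1, 1] = -0.5; Qs[n, n] = 0.5      # so that Qs + Qs' = diag(-1, 0, …, 0, 1)
D  = H \ Qs                          # derivative with no boundary conditions built in

g(t) = sin(2π * t)                   # inflow data

function advection!(du, u, p, t)
    du .= -(D * u)
    du[1] -= (u[1] - g(t)) / H[1, 1] # SAT: weak penalty at the inflow boundary
end

prob = ODEProblem(advection!, sin.(2π .* x), (0.0, 1.0))
sol = solve(prob, Tsit5())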

ChrisRackauckas commented 5 years ago

@ranocha brings up a good point. In fact, what it basically points out is that this version of boundary condition handling we have been discussing is only compatible with linear boundary conditions, basically boundary conditions of the form

c_0*u(x) + c_1*u'(x) + c_2*u''(x) + ... = b

since then you can discretize the derivative terms and then the expansion operator has to be a linear operator. But if you have a global boundary condition or something which is nonlinear, like ∫ |u(x)|^2 dx = 1, then it cannot be handled like this.

But I think it highlights that what really needs to be discussed is what we want out of the PDE story. (Note: I'm mostly using this to prepare my thoughts for today's workshop). Essentially, we want two things:

  1. A high level user interface so that way people who don't know PDEs well can solve the ones that they need to.
  2. Developer level tools that make it easy to discretize a PDE.

What we tried to lay out was a formulation for (2), which ran into multiple issues which we could get different answers for, but I think that it's ultimately doomed to failure because the developers need the flexibility in order to handle whatever wonky PDEs and boundary conditions that they are interested in. So I think it makes sense to abandon (2) as seemingly impossible without giving up both performance and feature-completeness. FEM, FDM, etc. libraries are going to have to act differently in order to exploit their unique features, and anyone working at that level will need to make tough but isolated decisions. DiffEq can hold a variety of tools like this, with DiffEqOperators being the FDM operators with linear boundary conditions which are core to many discretizations, while others like ApproxFun play a central role in the surrounding ecosystem as well.

But (1) is the goal we should be trying to attain. The question is stated as: how can someone specify at a high level what PDE they want to solve without simultaneously specifying a discretization and solution method? Mathematically the PDE doesn't have a discretization, that's just the numerical method, so it should be possible to set up an interface where a user writes down a PDE, chooses a method, and then a solver method runs. There's two possible levels this can be written at. First of all, it can be written at a level like AbstractHeatProblem (Semilinear Heat Equation with Robin BCs), where there's set problems to discretize and solve, and people can offer up methods for those specific PDEs. The domain can be a required type with standard domains set in the standard library, and other packages (like FEM packages) can add domains as necessary. This method is simple to implement, everyone can easily join in and use their favorite methods/tools to solve the problem, etc. The issue here is granularity: what PDEs do you supply?

The second option is to use a DSL like ModelingToolkit.jl to abstractly specify the PDEs. It can define D = Differential(x), and then users can build, in the parlance of ModelingToolkit.jl, a PDE system with equation D^2 * u + D * u = f and boundary conditions a*D*u + b*u = c and give a domain type. Then someone could write dispatches to solve for AbstractPDEProblem which check if they can handle the domain and boundary conditions and automatically discretizes and solves. This is infinitely harder to do arbitrarily, but is the PDE solver that Julia deserves.
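To sketch what that second option could look like (hypothetical: PDESystem, Interval, discretize, and FiniteDifference as used here are aspirational names, not existing API):

# Hypothetical sketch of the envisioned DSL front-end; nothing here exists yet.
using ModelingToolkit

@parameters t x a b c
@variables u(..)
Dt  = Differential(t)
Dx  = Differential(x)
Dxx = Differential(x)^2

eq  = Dt(u(t, x)) ~ Dxx(u(t, x)) + Dx(u(t, x))    # D^2*u + D*u = u_t
bcs = [u(0, x) ~ cos(x),                          # initial condition
       a * Dx(u(t, 0)) + b * u(t, 0) ~ c]         # linear BC: a*D*u + b*u = c
domains = [t ∈ Interval(0.0, 1.0),
           x ∈ Interval(0.0, 1.0)]

pde = PDESystem(eq, bcs, domains, [t, x], [u(t, x)])
prob = discretize(pde, FiniteDifference())        # dispatch on the discretizer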

So I think the intermediate solution is to do both. Set up the AbstractPDEProblem interface and add some concrete problems. JuliaDiffEq can start by offering some FEM and (pseudo)spectral methods to solve it. Then we can build GenericPDEProblem which uses the ModelingToolkit.jl stuff to abstractly specify that, and we can have DiffEqOperators.jl be the first package to handle that mostly automatically if the domain is square and if the BCs are linear. That will be quite a project but can be a next year GSoC goal.

From there a nice ecosystem could evolve. More AbstractPDEProblems for specific well-known PDEs can be proposed and people can offer solution methods, and then as we get more advanced we can build out the more fully automated generic pipelines. This then doesn't restrict anything about how the PDE is actually solved, or what PDEs can be solved, just how the PDE is specified and what the output looks like (i.e. make sure the solutions all subscribe to some solution interface so it's easy to swap around).

I think we will want to separate out the PDE docs though since I imagine this will grow quickly. I see it sectioning out like the DiffEq docs: each AbstractPDEProblem has a page describing the PDE to be solved and the problem options, and then there's a page describing all of the solvers available, and anyone can PR to add their solver package to the list with a description. There needs to be some tutorials specifically on specifying, solving, and handling PDE solutions. There needs to be pages describing the abstract domain and the concrete types. And there needs to be pages describing whatever extra options. That seems to need its own documentation since otherwise half of the DiffEqDocs would be PDEs all of a sudden (of course it would take over!)

dlfivefifty commented 5 years ago

Integral constraints like $\int_a^b u(x) dx = c$ are linear so should be fine. Non-linear constraints are harder without a full DAE solver.

Are there actually cases where people have non-linear constraints that are reducible to explicit solvers in an ad hoc way?