SciML / PDERoadmap

A repository for the discussion of PDE tooling for scientific machine learning (SciML) and physics-informed machine learning
https://tutorials.sciml.ai/

First major target: diffusions with scalar coefficients, irregular grids, and homogeneous boundary conditions #33

Open jlperla opened 6 years ago

jlperla commented 6 years ago

There is a specific example that we can implement in a paper immediately. Take the linear operator,

L = r I - L_diffusion 

where

L_diffusion = mu(t) L_1 + sigma^2/2 L_2

where $L_1$ and $L_2$ are the first- and second-derivative operators.

Furthermore, we need this defined on an irregular grid, though I understand if you want to implement the regular grid first as a test case.

Finally, for the boundary conditions, can you implement the generic homogeneous Robin boundary conditions? We would need them in the target paper. That is,

a u(x_min) + b du(x_min)/dx = 0

and

a u(x_max) + b du(x_max)/dx = 0

where the a and b coefficients could differ between the two boundaries.

MSeeker1340 commented 6 years ago

Let me first get the Robin BC sorted out, since it is needed anyways.

I did some research on the topic (e.g. http://www.math.toronto.edu/mpugh/Teaching/Mat1062/notes2.pdf and http://iacs-courses.seas.harvard.edu/courses/am205/fall13/AM205_unit_3_chapter_4.pdf) and it seems that the common way to handle the Robin BC is as follows:

@jlperla I'm not familiar with Robin BC so can you confirm this is correct? The implementation of this should not be any different from absorbing and reflecting though. Originally I thought it would be possible to implement only Robin and have reflecting/absorbing as special cases ($a = 0$ / $b = 0$); however, this is not the case:

jlperla commented 6 years ago

No idea. I think we just used either forward or backward differences, depending on which corner we were at, but this sounds better grounded if you found it in a book. I have also always used forward or backward differences, depending on the corner, for reflecting barriers, but maybe that isn't right either? It is certainly what people in economics seem to have used.

I didn't even realize the boundary condition was a Robin at first... It comes out of a change of variables with a reflecting barrier.

Regardless, I don't think it is possible to nest all of them at once, and they probably require different implementations. From an API point of view, we would probably want some sugar like in ApproxFun.jl sooner rather than later for the absorbing and reflecting barriers (which shouldn't be called that in the API, of course).

ChrisRackauckas commented 6 years ago

Then $u_{-1}$ can be solved as $u_{-1} = u_1 + \frac{2a\Delta}{b}u_0$, and by extension we can work out $Q$ if needed.

Yeah, this is how Q is defined for Robin conditions, and the linearity of this statement is exactly why Q is well-defined for these types of boundary conditions. The type of differencing done here is the choice of B: the choice of boundary value discretization. Using a first order discretization at the boundary is consistent with getting a second order result, but it's usually more stable to do a second order discretization with a second order operator. I think it would be best to just offer it all and generate it via Fornberg, just like the derivative operators themselves.

And yes you do need to handle the zero cases differently in some ways. That's why before we just found it easiest to have a few BCs.
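The ghost-node formula above is easy to sanity-check numerically. Here is a minimal sketch in plain Julia (not DiffEqOperators API), assuming a regular grid with spacing `dx` and the homogeneous Robin condition `a*u + b*u'` = 0 at the left boundary:

```julia
# Ghost-node handling of a homogeneous Robin BC  a*u + b*u' = 0  at x_min.
# Central differencing at the boundary node u_0 with ghost node u_{-1}:
#   a*u_0 + b*(u_1 - u_{-1})/(2*dx) = 0   =>   u_{-1} = u_1 + (2*a*dx/b)*u_0
# Check against u(x) = exp(-x), which satisfies u + u' = 0 (a = b = 1) at x = 0.

a, b = 1.0, 1.0
dx = 1e-3
u(x) = exp(-x)                       # exact solution of the BC at x = 0
u0, u1 = u(0.0), u(dx)
ughost = u1 + (2a * dx / b) * u0     # ghost value implied by the Robin condition
@assert abs(ughost - u(-dx)) < 1e-6  # O(dx^3) local error from the second-order difference
```

Setting `a = 0` recovers the reflecting (Neumann) extension `u_{-1} = u_1`, which is why the zero cases still need separate handling only on the `b = 0` (Dirichlet) side, where the division by `b` breaks down.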

jlperla commented 6 years ago

Is central differences also the better way to deal with the Neumann boundary conditions in nearly all circumstances? Does it interact in any way with the upwind procedure, or are the upwind convergence results to the viscosity solution independent of the boundary value discretization?

ChrisRackauckas commented 6 years ago

No, you usually can't do it since central differences higher than order 2 will use points further off the boundary. Non-centered differencing schemes are perfectly fine there though.

Does it interact in any way with the upwind procedure, or is the upwind convergence results to the viscosity solution independent of the boundary value discretization?

As long as the information flow follows the characteristics then it's fine.

MSeeker1340 commented 6 years ago

Here's my thoughts on the low-level interface:

Chris thinks building the square operator $L^B$ is easier and I agree with him. From slack:

I meant it's easier to handle since you don't need an intermediate cache. In most cases we only need to vcat scalars on the sides though, so as long as we still have special boundary handling it's still performant, and we do need special boundary handling for any higher-order stencil.

Basically we take what is currently implemented in DiffEqOperators and make them more consistent, e.g. having all operators be defined on the interior. An operator defined in this way does not actually have clear-cut $L$ and $Q^B$ components; rather, they are fused together. As an example: https://github.com/JuliaDiffEq/DiffEqOperators.jl/blob/master/src/derivative_operators/abstract_operator_functions.jl#L123.

As for generic derivative operators that do have a separate $L$ and $Q^B$ component, a type like this could work:

struct GenericDerivativeOperator{LT,QT}
  L::LT
  QB::QT
  # other fields
end
Base.:*(LB::GenericDerivativeOperator, x) = LB.L * (LB.QB * x)

The user-constructed stencil operator L shares much of its structure with the square derivative operators, except that no BC information is encoded. In particular, the construction of the stencil is the same via Fornberg. Stencil operators of the same size can be added using the linear combination functionality of DiffEqOperators. We also need to provide a "zero-padding" method to extend smaller operators, for example

struct ZeroPaddedOperator{LT}
  L::LT
  idx  # index range of the sub-block the inner operator acts on
end
Base.:*(L::ZeroPaddedOperator, x) = L.L * x[L.idx]
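Since the stencil construction via Fornberg comes up repeatedly above, here is a self-contained sketch of the classic Fornberg recursion for finite-difference weights on an arbitrary (possibly irregular) grid. This is a standalone reimplementation for illustration, not the DiffEqOperators internals:

```julia
# Fornberg's recursion: weights of the m-th derivative at point z,
# using the (arbitrary, possibly irregular) grid points in x.
function fornberg(z, x, m)
    n = length(x) - 1
    C = zeros(n + 1, m + 1)   # C[:, k+1] holds weights for the k-th derivative
    c1 = 1.0
    c4 = x[1] - z
    C[1, 1] = 1.0
    for i in 1:n
        mn = min(i, m)
        c2 = 1.0
        c5 = c4
        c4 = x[i+1] - z
        for j in 0:i-1
            c3 = x[i+1] - x[j+1]
            c2 *= c3
            if j == i - 1
                for k in mn:-1:1
                    C[i+1, k+1] = c1 * (k * C[i, k] - c5 * C[i, k+1]) / c2
                end
                C[i+1, 1] = -c1 * c5 * C[i, 1] / c2
            end
            for k in mn:-1:1
                C[j+1, k+1] = (c4 * C[j+1, k+1] - k * C[j+1, k]) / c3
            end
            C[j+1, 1] = c4 * C[j+1, 1] / c3
        end
        c1 = c2
    end
    return C[:, m+1]
end

# Central second derivative on a regular grid: the familiar [1, -2, 1] stencil.
@assert fornberg(0.0, [-1.0, 0.0, 1.0], 2) ≈ [1.0, -2.0, 1.0]
# One-sided first derivative (the non-centered case useful at boundaries).
@assert fornberg(0.0, [0.0, 1.0, 2.0], 1) ≈ [-1.5, 2.0, -0.5]
```

The same routine covers the non-centered boundary stencils discussed earlier, since nothing in the recursion assumes the evaluation point is interior to the grid.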

Alternatively, we can specialize on the addition of stencil operators and instead of lazy combination, create a new stencil operator with fused-together coefficient. However this might be difficult if things are time-dependent, as the fused update function would be difficult to write.

As for QB, I think we should make it as generic as possible and just leave it as a generic diffeq operator. We can certainly provide pre-defined QBs for common BCs (and make them efficient by using LazyArrays.jl), but this isn't really necessary because the common BCs are handled by the square derivative operators that have special BC handling.

Now one thing missing is constructing an L with special BC handling but a composite stencil, e.g. $L^B = (L_1 + L_2)Q^B$ with $Q^B$ say the Q for a reflecting boundary. What we can do is distribute $Q^B$ and construct $L^B$ as $L_1Q^B + L_2Q^B$. This is enough in application, though I'd like to see if this can be handled without distributing.
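The distribution step can be checked with plain dense matrices standing in for the lazy operators (illustrative only; the stencil rows and the reflecting extension `Q` below are hand-written, not package output):

```julia
# Distributing Q^B over a composite stencil: (L1 + L2)*Q*u == L1*Q*u + L2*Q*u.
# Q maps 3 interior nodes to 5 extended nodes via the reflecting (Neumann)
# extension u_0 = u_1, u_4 = u_3.
L1 = [-1.0  1.0  0.0  0.0  0.0;     # rows of a first-difference stencil
       0.0 -1.0  1.0  0.0  0.0;
       0.0  0.0 -1.0  1.0  0.0]
L2 = [ 1.0 -2.0  1.0  0.0  0.0;     # rows of a second-difference stencil
       0.0  1.0 -2.0  1.0  0.0;
       0.0  0.0  1.0 -2.0  1.0]
Q = [1.0 0.0 0.0;                   # ghost node copies the first interior value
     1.0 0.0 0.0;
     0.0 1.0 0.0;
     0.0 0.0 1.0;
     0.0 0.0 1.0]                   # ghost node copies the last interior value
u = [1.0, 2.0, 3.0]
@assert (L1 + L2) * (Q * u) == L1 * (Q * u) + L2 * (Q * u)
```

Distributivity is immediate from matrix algebra, so the two forms agree exactly; the open question in the comment above is only whether the fused `(L1 + L2)` stencil can be built lazily without materializing both products.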

jlperla commented 6 years ago

Sounds good. I think that the key is that those can't be composed with scalar multiplication and addition. We do not need to have a state dependent vector times the derivative operators quite yet.

Are you planning to defer doing the upwind first difference which automatically changes direction? I would prefer it to be avoided for now... If you really want to do it, though, I realized that the design with composition is slightly more nuanced than we discussed before. Better to wait, as far as I am concerned.

MSeeker1340 commented 6 years ago

Are you planning to defer doing the upwind first difference which automatically changes direction? I would prefer it to be avoided for now... If you really want to do it, though, I realized that the design with composition is slightly more nuanced than we discussed before. Better to wait, as far as I am concerned.

Yeah that complicates things a bit, and seeing that it's not used in the example I think it would be best to defer this for now.

jlperla commented 6 years ago

And don't forget to have the (generic) grid as part of the GenericDerivativeOperator struct or interface, even for the regular ones. I assume you meant for it to be in the # other fields, but I didn't see any of your comments on the suggested interface for the AbstractRange vs AbstractArray on the other issue.

ChrisRackauckas commented 6 years ago

I think @MSeeker1340 and I are on the same page now. Whether it's easier to do the square operators or do the LazyArray Q is still up in the air. The advantage of the square operators is that then you just have a single stencil operation and no other things to handle, but then you have to implement the BCs for each derivative operator (central, upwind, and jump). Having the Q separate handles it in every case, but then you have this funky length change with LazyArrays that possibly allocates. I am thinking the first may be worth the extra work since the latter could run into some odd type issues.

Are you planning to defer doing the upwind first difference which automatically changes direction? I would prefer it to be avoided for now... If you really want to do it, though, I realized that the design with composition is slightly more nuanced than we discussed before. Better to wait, as far as I am concerned.

With composition that fuses loops it's more nuanced. But with just standard composition it's fine. And there's already a start for the upwinding operators in the repo if you want to do them in the square form. Now that we have a solid way to derive each of the BCs more generically, we can get that corrected, since I think the Neumann parts are incorrect (and it again has the definition issues).
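The square-operator versus separate-Q trade-off discussed above can be made concrete on a toy case. Here dense matrices stand in for both designs (illustrative only): a square second-difference operator with the reflecting BC folded into its boundary rows, versus the rectangular stencil `L` composed with the extension map `Q`.

```julia
# N interior nodes, unit spacing, reflecting (Neumann) BC on both sides.
N = 4

# Design 2: rectangular stencil on the extended vector [ghost; u; ghost] ...
L = zeros(N, N + 2)
for i in 1:N
    L[i, i:i+2] = [1.0, -2.0, 1.0]
end
# ... composed with the extension map Q (ghosts copy the adjacent interior node).
Q = zeros(N + 2, N)
Q[1, 1] = 1.0
for i in 1:N
    Q[i+1, i] = 1.0
end
Q[N+2, N] = 1.0

# Design 1: the square operator with the BC folded into the boundary rows
# (first row: -u_1 + u_2; last row: u_3 - u_4).
LB = [-1.0  1.0  0.0  0.0;
       1.0 -2.0  1.0  0.0;
       0.0  1.0 -2.0  1.0;
       0.0  0.0  1.0 -1.0]

u = [1.0, 4.0, 9.0, 16.0]
@assert LB * u == L * (Q * u)   # the two designs agree on the interior
```

The square form needs only one stencil sweep but bakes the BC into `LB`; the `L*Q` form keeps the BC in `Q` at the cost of the length-changing intermediate, which is exactly the allocation concern raised above.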

ChrisRackauckas commented 6 years ago

I'd say leave the loop fusion as an optional performance gain that can be done if you have time, and get the operators with all of the BCs implemented. I am looking for a GSoC student next year to work on performance aspects of this. I assumed it would be to make the operators work well on GPUArrays, DistributedArrays, and MPIArrays, but adding loop fusion of composed derivative operators would be a nice feature as well.

And as @jlperla mentioned, we should make sure we have a good way to handle the coefficient scalar/vector as well. I think it needs to be part of the operator so that way that loop can trivially fuse, and then it can be referenced directly for the upwinding term. The current implementation of said upwinding term doesn't match with how the operator interface evolved so it will need a rework, but the general idea is there.

jlperla commented 6 years ago

I think it needs to be part of the operator so that way that loop can trivially fuse, and then it can be referenced directly for the upwinding term.

Remember when we discussed this at length in May? What we realized was that it was way better to simply dispatch on the multiplication overload. Having the scalar and vector terms in the operators themselves makes recursion of update_coefficients! etc. an unnecessarily complicated case... and the DSL for composing operators ugly. I believe the proof of concept was in that old PR.

ChrisRackauckas commented 6 years ago

We played with it a lot more since then though, with the lazy W compositions. I think we reversed the decision after actually trying to implement it?

jlperla commented 6 years ago

I hope not. I still don't understand how you can handle it with arbitrary expression trees creating the multiplicative constants, if that tree needs to be maintained and stored with each operator... Please rethink this carefully because we spent a whole lot of time exploring it before

ChrisRackauckas commented 6 years ago

We spent quite a bit of time revisiting it to get the implementation of the composed operator as well.

jlperla commented 6 years ago

Oh, you are saying that it already has scalar multiplication built into the operator definition? I guess what I just don't understand is what this looks like if the composition has a nontrivial expression for the scalar or vector multiplication...

Everything about this smells like a design inconsistent with how generic linear operators can work with the Julia expression syntax

ChrisRackauckas commented 6 years ago

Everything about this smells like a design inconsistent with how generic linear operators can work with the Julia expression syntax

Not really. FillArrays is doing this. UniformScaling is doing this in Base. You can keep finding more and more examples of this. The only issue is then you have to implement the coefficient handling with each type, which kind of sucks but does have its upsides in terms of how it fuses.

MSeeker1340 commented 6 years ago

I hope not. I still don't understand how you can handle it with arbitrary expression trees creating the multiplicative constants, if that tree needs to be maintained and stored with each operator...

The current implementation does not require a multiplicative constant to be stored in each operator. What we do is have regular diffeq operators and DiffEqScaledOperator, which stores a time-dependent scalar coefficient and a diffeq operator (https://github.com/JuliaDiffEq/DiffEqOperators.jl/blob/master/src/composite_operators.jl#L14). I think this is lightweight enough and does not complicate the expression tree much.

Another thing is that linear combinations now are just additions, i.e. instead of `A = c1*A1 + c2*A2 + ... + cn*An` we just have `A = A1 + A2 + ... + An` in the expression tree. The scalar coefficients are handled using DiffEqScaledOperator.

Edit: The rationale is that the previous prototype, which stores both the component operators and the coefficients as in LinearMaps.jl, would not fit well with time-dependent coefficients. The hope is that "composite" types such as DiffEqScaledOperator and DiffEqLinearCombination do not contain separate time-dependent entities directly, so update_coefficients! can just propagate down the expression tree. In the case of time-dependent coefficients, for example, the only thing that requires an update function is the coefficients themselves, and neither DiffEqScaledOperator nor DiffEqLinearCombination require an update function.
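The propagation idea above can be sketched with mock types (these are illustrative stand-ins, not the actual DiffEqOperators types): only the scalar leaf carries an update function, and `update_coefficients!` simply recurses through the composite nodes.

```julia
# Mock of the expression-tree design: the time dependence lives only in the
# scalar leaves; composite nodes just forward the update call.
mutable struct MockScalar
    val::Float64
    update_func   # t -> new coefficient value
end
struct MockScaled                 # analogue of DiffEqScaledOperator
    c::MockScalar
    A::Matrix{Float64}
end
struct MockCombination            # analogue of a linear-combination node
    ops::Vector{MockScaled}
end

update_coefficients!(s::MockScaled, t) = (s.c.val = s.c.update_func(t); s)
update_coefficients!(s::MockCombination, t) =
    (foreach(op -> update_coefficients!(op, t), s.ops); s)

apply(s::MockScaled, u) = s.c.val * (s.A * u)
apply(s::MockCombination, u) = sum(apply(op, u) for op in s.ops)

# A = c1(t)*A1 + c2(t)*A2 with c1(t) = t and c2(t) = 2t.
A1, A2 = [1.0 0.0; 0.0 1.0], [0.0 1.0; 1.0 0.0]
L = MockCombination([MockScaled(MockScalar(0.0, t -> t), A1),
                     MockScaled(MockScalar(0.0, t -> 2t), A2)])
update_coefficients!(L, 3.0)          # one call refreshes every leaf
@assert apply(L, [1.0, 0.0]) == [3.0, 6.0]
```

This is the point of the rationale above: neither composite node stores a time-dependent entity of its own, so the recursion is trivial.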

jlperla commented 6 years ago

Maybe I am misunderstanding what you are saying then. Uniform scaling is exactly the sort of thing I like, because it composes well. As far as I am understanding you, it is the opposite... Uniform scaling is great because you can write a * I, but I thought you were saying the scalar a would have to be part of the operator, not just part of how multiplication is defined.

And if, using uniform scaling, I wanted to do (a + b(x)) *I, it also works great.

Furthermore, let's say I wanted to have a vector y times the I and add a matrix A; then I could write (a * y) * (I + A) with no problem.

MSeeker1340 commented 6 years ago

We still have a diffeq operator type representing the identity operator:

https://github.com/JuliaDiffEq/DiffEqOperators.jl/blob/master/src/basic_operators.jl

So compositions like (a + b(x)) * I pose no problem.

jlperla commented 6 years ago

Another thing is that linear combinations now are just additions, i.e. instead of A = c1A1 + c2A2 + ... + cnAn we just have A = A1 + A2 + ... + An in the expression tree. the scalar coefficients are handled using DiffEqScaledOperator.

In that trivial example, it doesn't look so bad. But what if the math is naturally (c1 + c2(t)) * (A1 + c3 * A2)? If I am understanding that correctly, then I would say we do not really have composition of linear operators with the current design.

jlperla commented 6 years ago

So compositions like (a + b(x)) *I has no problem.

My point was that it doesn't work for the differential operators, like it does for I. Or am I missing something?

MSeeker1340 commented 6 years ago

I don't see a problem with (c1 + c2(t)) * (A1 + c3 * A2). The expression tree:

DiffEqScaledOperator
  c12 # DiffEqScalar representing c1 + c2(t)
  DiffEqOperatorCombination
    A1
    DiffEqScaledOperator
      c3
      A2

jlperla commented 6 years ago

Maybe the disconnect is the difference between the types you use to store expression trees (which I don't care about) vs how the user constructs the pieces of the operators (which I care deeply about)

From this, are you saying that if I want to create the operator L = a * D_x + c * D_xx that it would be something like

c = 0.01
a = -0.02
x = range(0.0, stop = 1.0, length = 10)
L_2 = Derivative(x, 2)  # or Derivative(x, 2, direction = :central) or Diffusion(x)
L_1 = Derivative(x, 1, direction = :backwards)  # or perhaps with upwind default then Derivative(x, 1)
L = a * L_1 + c * L_2

As opposed to

c = 0.01
a = -0.02
x = range(0.0, stop = 1.0, length = 10)
L_2 = Derivative(x, 2, c)  # note I am storing the constant in the operator at construction
L_1 = Derivative(x, 1, a, direction = :backwards)  # again, has the "a" in it
L = L_1 + L_2

Which I absolutely, positively, despise. If the combinations end up stored like that somewhere after the user interface, though, then that is an internal design decision.

This was my interpretation of chris's comment that "we should make sure we have a good way to handle the coefficient scalar/vector as well.". If I interpreted it incorrectly, then nothing to worry about.

MSeeker1340 commented 6 years ago

The interface for constructing derivative operators is still WIP at the moment.

c = 0.01
a = -0.02
x = range(0.0, stop = 1.0, length = 10)
L_2 = Derivative(x, 2)  # or Derivative(x, 2, direction = :central) or Diffusion(x)
L_1 = Derivative(x, 1, direction = :backwards)  # or perhaps with upwind default then Derivative(x, 1)
L = a * L_1 + c * L_2

I'm also in favor of this interface (currently we do need to wrap a and c in DiffEqScalar though but this is a minor inconvenience, and I can probably overload * so that left multiplication by a scalar is automatically handled). The interface is already in place and defined here:

https://github.com/JuliaDiffEq/DiffEqOperators.jl/blob/master/src/composite_operators.jl#L19

There shouldn't be a case where DiffEqScaledOperator's constructor needs to be explicitly called, nor should any embedded coefficient be needed in a derivative operator's constructor as in the second example you provided. They shouldn't have coefficients to begin with; all coefficients can be handled by DiffEqScaledOperator.
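The left-multiplication sugar described above can be sketched with mock types (the names `DerivOp`, `Scaled`, and `OpSum` are illustrative, not the package's): overloading `*` so a plain number wraps the operator automatically, which is what keeps the scaled-operator constructor out of user code.

```julia
# Mock operator types: `a * L_1 + c * L_2` builds the tree without the user
# ever touching the scaled-operator constructor.
struct DerivOp
    A::Matrix{Float64}        # stand-in for a lazy derivative stencil
end
struct Scaled                 # analogue of DiffEqScaledOperator
    c::Float64
    op::DerivOp
end
struct OpSum                  # analogue of a linear-combination node
    terms::Vector{Scaled}
end

Base.:*(c::Number, op::DerivOp) = Scaled(float(c), op)   # the sugar itself
Base.:+(s1::Scaled, s2::Scaled) = OpSum([s1, s2])

apply(s::Scaled, u) = s.c * (s.op.A * u)
apply(s::OpSum, u) = sum(apply(t, u) for t in s.terms)

L_1 = DerivOp([0.0 1.0; -1.0 0.0])
L_2 = DerivOp([2.0 0.0; 0.0 2.0])
L = -0.02 * L_1 + 0.01 * L_2      # reads exactly like the preferred interface
@assert apply(L, [1.0, 1.0]) ≈ [0.0, 0.04]
```

Dispatching on `*` like this is what keeps the coefficient out of the derivative operator's own fields, matching the "they shouldn't have coefficients to begin with" point above.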

jlperla commented 6 years ago

OK, I think I misunderstood what Chris was saying, as this is consistent with what we had discussed previously as the intended interface. And just to make sure that I am not missing anything with arbitrary compositions, the following can be supported as well?

c = 0.01
a = -0.02
x = range(0.0, stop = 1.0, length = 10)
L_2 = Derivative(x, 2)
L_1 = Derivative(x, 1, direction = :backwards)
L = a * L_1 + c * L_2
f(t, x, u, p) = 0.01 * x
d = 0.05
L3 = (0.01 + d) * L + f * L_2  # note composing multiple operators

The 4 parameters for f are based on what I thought was discussed before to be compatible with update_coefficients!, but hopefully you get the point. And I understand you would need an overload of both scalar and function left multiplication.

Of course, in this case I could do the algebra to simplify it myself, but if the expression trees you are implementing can handle that sort of code. If so, then no worries at all from my side.

MSeeker1340 commented 6 years ago

The 4 parameters for f are based on what I thought was discussed before to be compatible with update_coefficients!, but hopefully you get the point. And I understand you would need an overload of both scalar and function left multiplication.

Yep, adding the overload would be easy, as essentially we are just creating a `DiffEqScalar(zero(T); update_func=f)`.

BTW I'm working on a prototype for regular grids using the generic $L Q^B$ route; I'll let you know of my progress.

jlperla commented 6 years ago

See https://github.com/JuliaDiffEq/DiffEqOperators.jl/pull/76 for the proof of concept.