SciML / PDERoadmap

A repository for the discussion of PDE tooling for scientific machine learning (SciML) and physics-informed machine learning
https://tutorials.sciml.ai/
18 stars 6 forks source link

Coefficients embedded operators #9

Closed MSeeker1340 closed 6 years ago

MSeeker1340 commented 6 years ago

section link

The main point is that as @ChrisRackauckas points out, separating $\mu(x)\partialx$ into $\mu+(x)\partial+(x) + \mu-(x)\partial_-(x)$ does not generalize well to multiple dimensions (also it's not very elegant and slower by a factor of two). What we should really do is something similar to the existing implementation of upwind operators, i.e. include the directions/coefficients in the operator itself.

Now there are two ways going forward from here: either we use this as a basis and have every differential operators include a field for coefficients, or we constrain this approach only to upwind operators and keep the central differencing operators coefficients-free. The first way is more consistent in terms of interface, but would result in a lot of redundant code (after all, the need to embed coefficients only exist for upwind operators). Personally I prefer the second approach, but I should leave this up to discussion.

Do note that although the construction of L in the last cell looks very concise, in general we still need to use operator compositions now and then. For example, for the forward Kolmogorov operators the derivatives come after the coefficients are applied, so we need something like L = DiffusionOperator(dx,N) * DiffEqArrayOperator(Diagonal(D)).

(By the way, for the forward Kolmogorov equation, how does upwind operators work? Specifically, do we use Leibniz rule on $\partial_x(\mu(x)u)$ = $\mu(x)\partial_xu + \partial_x\mu(x)u$ and convert to the case when the coefficients are prepended? And what direction should we take to evaluate $\partial_x\mu(x)$?)

@jlperla I have included an example of how an iterative algorithm with coefficients update would work. As I discussed in the last PR, because mu and D are stored as references their changes will be reflected in L, so there's no need for update_coefficients!. On the other hand if you're worried about the external dependency, we can go back to the update_coefficients! route.

jlperla commented 6 years ago

First from some of the discussion like: https://github.com/JuliaDiffEq/PDERoadmap/pull/1#issuecomment-385038564 I think that the L = DiffusionOperator(dx,N) * DiffEqArrayOperator(Diagonal(D)) style approach is much less preferred than one of the other two approach, which cover most cases:

Those leave them as functions, which is the real goal. If we needed to have it assigned to a vector then in the short term we could

By the way, for the forward Kolmogorov equation, how does upwind operators work?

I don't think you do. The L = DiffusionOperator(dx,N) * DiffEqArrayOperator(Diagonal(D)) is used in solving the Kolmogorov Backward Equation directly. If someone wants to solve the KFE, they should do the calculus themselves... which is not meaningful with the DiffEqArrayOperator(Diagonal(D)) approach.

The other problem is that when you solve the KFE in many applications you need to change the operator around for birth/death, so even if an adjoint of L was possible, it would be incomplete. Also, for the generic framework it would need to handle jump-diffusions, continuous-time markov-chains, etc. where the algebra for state/time varying jump sizes and arrival rates gets tricky.

Down the road there are two possible approaches to automate them:

  1. We can have lazy adjoints of the operator post-discretization So they could specify the L, and call a Lstar = adjoint(L) + birth + death terms. I am not sure of exactly how that would look (and the algebra for adding in the birth/death terms) so better to leave it off.
  2. Presumably a symbolic DSL could come at some point, but
jlperla commented 6 years ago

Looking at the code, my general feeling is that the interface in the current example is too tied to underlying vectors of coefficients, and that it would be very difficult to make this work with irregular grids.

Also, is there no generic way to get this to work (ultimately lazy) without actually storing the vector?

For example, imagine the following interface

mu(u,p,t,x) = x - 0.5 #An example where the drift switches at x = 0.5
L = mu * UpwindDriftOperator(dx,N)

Now, why don't you push the calculation of the directions to the dispatch of the * operator, where you probably push it to the grid-point? If calls mu(u,p,t,x) at whatever x it is looking at in the grid and the sign of it

Does this store the coefficients? Yes, in the mu wrapper... The upwind part is in the implementation of the *, which makes sense since it is how upwind is defined in a mathematical sense, in my understanding?

MSeeker1340 commented 6 years ago

If the operator is σ(x)∂xx then, σ(u,p,t,x) = 1.1 x^2 L = σ DiffusionOperator(dx,N) Those leave them as functions, which is the real goal. If we needed to have it assigned to a vector then in the short term we could

Seems I missed this in the original comment. Let me see if I can go along with this route then. (If σ is a time-independent scalar then σ * DiffusionOperator(dx,N) already works.)

What I'm not comfortable with though is writing a *(::Function, ::DiffusionOperator)... There's just too little we can control over what the function argument can be, whether its signature conforms with what we have in mind, etc. Might be better if we wrap σ in a functor of sort.

Also since we have x in the fray, σ needs to know what the underlying grid is used, which DiffusionOperator currently don't know (but this is a minor detail; for example if we're using non-uniform grid, then we need the grid points in the derivative operators anyway).

jlperla commented 6 years ago

What I'm not comfortable with though is writing a *(::Function, ::DiffusionOperator)... There's just too little we can control over what the function argument can be, whether its signature conforms with what we have in mind, etc. Might be better if we wrap σ in a functor of sort.

Up to you guys. When I suggested this to @ChrisRackauckas , he seemed to suggest it isn't necessary, but I don't think calling a single wrapper function is a problem at all.

Also since we have x in the fray, σ needs to know what the underlying grid is used, which DiffusionOperator currently don't know

Yes. I really think that we should associate a grid with all operators. See my suggested notation in https://github.com/JuliaDiffEq/PDERoadmap/pull/1#issuecomment-384721027 but to copy the example (without the upwind part)

x̃ = Interval(1.0, 10.0) #Continuous domain on the interior.  Later can take product with discrete
N = 10
x = RegularGrid(x̃, N) #Note: I think it is better to give points than dx, but I am open.
L⁺ = DriftOperator(x, :forwards) #Note passing in grid, so interface could allow irregular later.
L⁻ = DriftOperator(x, :backwards)
L₂ = DiffusionOperator(x)

Note that by separating out the grid, the interface is much cleaner and easier to imagine extending to irregular grids.

jlperla commented 6 years ago

Also, see https://github.com/JuliaDiffEq/PDERoadmap/issues/4 for the place to discuss the design of the domain and grid interface itself... which we eventually want to have compatible with ApproxFun.jl

MSeeker1340 commented 6 years ago

Not sure if I have got this wrong, but isn't the FKE in the no-jump case just the Fokker-Planck equation? Specifically

$$ \partial_tu = L^*u = -\partial_x(\mu(x)u) + \partial^2_x(\frac{\sigma(x)^2}{2}u) $$

Not doing anything fancy like adjoint(L) at the moment, I'm just asking from a numerical methods point how we should discretize $-\partial_x(\mu(x)u)$.

ChrisRackauckas commented 6 years ago

What I'm not comfortable with though is writing a *(::Function, ::DiffusionOperator)... There's just too little we can control over what the function argument can be, whether its signature conforms with what we have in mind, etc. Might be better if we wrap σ in a functor of sort.

The interface would just have to impose that the function has a specific signature. This isn't bad if we don't change it.

Yes. I really think that we should associate a grid with all operators.

Agreed. We've only gotten away without it since we were doing uniform grids. But multi-dimension, unstructured grids (FEM), etc. all need to have this information. In fact, we also need to have a function that can specify which points are in the boundary.

Not sure if I have got this wrong, but isn't the FKE in the no-jump case just the Fokker-Planck equation?

There's an FKE for any stochastic equation. It's just the weak formulation of the stochastic process, talking about evolution of probability distributions instead of trajectories.

jlperla commented 6 years ago

Not sure if I have got this wrong, but isn't the FKE in the no-jump case just the Fokker-Planck equation

Very close. For economists we would then need to add in birth and death terms to the right hand side - which are not present in the KBE. My point is simply that if you want to solve the KFE, you can just do the calculus in the `$-\partial_x(\mu(x)u) + \partial^2_x(\frac{\sigma(x)^2}{2}u) $ yourself, get a linear differential operator and implement that directly. We don't need an automatic way to do that adjoint calculation.

MSeeker1340 commented 6 years ago

Closed in favor of the function composition interface #10.