SciML / PDERoadmap

A repository for the discussion of PDE tooling for scientific machine learning (SciML) and physics-informed machine learning
https://tutorials.sciml.ai/

Function composition interface #10

Closed MSeeker1340 closed 6 years ago

MSeeker1340 commented 6 years ago

notebook link

Main focus is to support the function composition interface suggested by @jlperla:

α1 = (u,p,t,x) -> ...
α2 = (u,p,t,x) -> ...
L = α1 * FirstDifferenceUpwind(grid) + α2 * SecondDifferenceCentral(grid)

For SecondDifferenceCentral this is mostly syntactic sugar, but for FirstDifferenceUpwind it is required. The idea is that FirstDifferenceUpwind itself is just a placeholder (it contains the relevant information such as the stencil coefficients) but by itself cannot be applied to a vector, because the upwind direction is unknown. The application of an upwind operator happens at the * level, which in the background is converted to an UpwindOperator.

UpwindOperator is never exposed to the user, so the user can use the function composition interface intuitively as if the central and upwind operators work the same.
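Concretely, a minimal sketch of how that dispatch could look (the type names FirstDifferenceUpwind and UpwindOperator are from the discussion above; everything else here is illustrative, not the actual implementation):

```julia
struct FirstDifferenceUpwind{G}   # placeholder: holds the grid, cannot act on a vector
    grid::G
end

struct UpwindOperator{F,G}        # concrete: pairs the coefficient with the stencil info
    coeff::F
    grid::G
end

# The `*` between a coefficient function and the placeholder is where the
# concrete operator is built; the user never constructs UpwindOperator directly.
Base.:*(coeff::Function, L::FirstDifferenceUpwind) = UpwindOperator(coeff, L.grid)

grid = 0.0:0.1:1.0
α = (u, p, t, x) -> 2.0
L = α * FirstDifferenceUpwind(grid)
L isa UpwindOperator  # true
```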

MSeeker1340 commented 6 years ago

Now one concern I have with the function composition interface, for the iterative HJBE solver, is how we can incorporate the iterative algorithm in α(u,p,t,x).

@jlperla Correct me if I'm wrong: at a high level, the iterative solver chooses some initial drift and diffusion terms, constructs an operator from them, updates the drift and diffusion using calculations involving the operator, and repeats. But we always do the calculations at a discretized level, which means the new drift and diffusion fields are also given at the grid points. If we use the L = DiffEqArrayOperator(Diagonal(mu)) * SecondDifferenceCentral(grid) approach, then we can just substitute the old (discretized) mu with the new mu; but what about

mu = (u,p,t,x) -> ...
L = mu * SecondDifferenceCentral(grid)

? How do we define the function mu? The only way I can think of is to plug the new discretized mu values in as p and define mu to be a linear interpolation of p on the grid... but would that be too convoluted?

I'm not saying that the function composition interface shouldn't be used --- it is a very natural way to construct the operators, and it naturally fits PDE problems with time-dependent coefficients/fields. It's just that in the case of iterative HJBE solvers it may not be the best way to formulate the problem.
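To make the interpolation idea concrete: there is no interpolation function in base Julia, so the lininterp helper below is hand-rolled purely for illustration (written in current Julia syntax). p carries the grid and the current discretized drift values, and mu interpolates between the nodes:

```julia
# Piecewise-linear interpolation of (xs[i], ys[i]); illustrative helper only.
function lininterp(xs, ys, x)
    x <= xs[1]   && return ys[1]
    x >= xs[end] && return ys[end]
    i = searchsortedlast(xs, x)              # xs must be sorted
    θ = (x - xs[i]) / (xs[i+1] - xs[i])
    (1 - θ) * ys[i] + θ * ys[i+1]
end

# mu reads the discretized drift out of p and interpolates it at x
mu = (u, p, t, x) -> lininterp(p.grid, p.MU, x)

p = (grid = collect(0.0:0.5:2.0), MU = [0.0, 1.0, 2.0, 3.0, 4.0])
mu(nothing, p, 0.0, 0.75)  # 1.5: halfway between the nodes at 0.5 and 1.0
```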

jlperla commented 6 years ago

@MSeeker1340 For the iterative algorithms, I think we can do it in three stages.

  1. If we really need an iterative approach because we have a nonlinear setup (e.g. an HJBE with a Hamiltonian choosing the drift), then we can put the current drifts into the p parameter structure.
    • The mu = (u,p,t,x) function could then access it as required, looking into the vector of drifts through various means. Linear interpolation, etc. is fine for this for now.
    • Between iterations, we could either completely reconstruct the composed operator, or we could just change the vector in p and call update_coefficients! on the composed operator - which I believe would fill in the different drifts when it recursively hits the wrapped section.
    • But if the underlying constants are only being used when we call as_array or use similar functions, then I am not even sure we need to call update_coefficients! or recompose the operator manually. Shouldn't it call the mu(...) function again whenever it needs to?
  2. We could write a special case of a wrapped vector exactly at the nodes rather than a function.
    • I think this is basically the DiffEqArray type you had before in the first version of the prototype.
    • As before, we could either completely recompose the operator between iterations, or could somehow call update_coefficients! if required to tell things to refresh. Same comment as before, though, that I am not sure it is necessary.
  3. Chris mentioned putting fixed point iterations/etc. directly into the interface for calling the operator. We can examine that one down the road.
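A toy sketch of option 1's update path (this mirrors the update_coefficients! convention from the discussion, but ScaledOp and its fields are stand-ins, not the actual SciML types):

```julia
# Stand-in for a coefficient-times-operator term: caches the coefficient
# values at the grid nodes and refreshes them from p on demand.
struct ScaledOp{F}
    func::F                    # (u,p,t,x) -> coefficient
    coeffs::Vector{Float64}    # cached values at the grid nodes
    grid::Vector{Float64}
end

function update_coefficients!(L::ScaledOp, u, p, t)
    # re-evaluate the coefficient function with the (possibly new) p
    L.coeffs .= [L.func(u, p, t, x) for x in L.grid]
    L
end

grid = collect(0.0:0.5:1.0)
L = ScaledOp((u, p, t, x) -> p.drift[findfirst(==(x), grid)], zeros(3), grid)
p = (drift = [1.0, 2.0, 3.0],)
update_coefficients!(L, nothing, p, 0.0)
L.coeffs  # [1.0, 2.0, 3.0]
```

Between iterations one would overwrite p.drift (or rebuild p) and call update_coefficients! again; the composition structure never changes.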
ChrisRackauckas commented 6 years ago

Looks good. Looks like it will work.

We could write a special case of a wrapped vector exactly at the nodes rather than a function.

Yes, that should be compatible with what's here and that would be nice for the "manual updating" version.

Chris mentioned putting fixed point iterations/etc. directly into the interface for calling the operator. We can examine that one down the road.

Not exactly, but with what we have we should be able to build tooling to do that as a library function.

jlperla commented 6 years ago

This all looks great!

A small thing on the interface, which doesn't change much of your implementation, is to support Ranges for the grids rather than forcing things into a vector. This is consistent with the discussions in #4, especially https://github.com/JuliaDiffEq/PDERoadmap/issues/4#issuecomment-385744935

Without testing it fully, I think this is done with something like the following (for the SecondDifference example)

#Grid is now parametric for any AbstractArray
struct SecondDifferenceCentral{Grid <: AbstractArray} <: DiffEqLinearOperator
    grid::Grid
end

#This is a specialization for the Range type, which has a step() function associated with it
#Here x lives on the extended grid, so length(x) == m + 2 and the result has length m
function *(L::SecondDifferenceCentral{Grid}, x::Vector{Float64}) where {Grid <: Range}
    m = length(L.grid)
    dx = step(L.grid)
    [x[i] - 2*x[i+1] + x[i+2] for i in 1:m] / dx^2
end

function as_array(L::SecondDifferenceCentral{Grid}) where {Grid <: Range}
    m = length(L.grid)
    dx = step(L.grid)
    #m × (m+2) matrix mapping the extended grid to the interior
    spdiagm((ones(m), -2*ones(m), ones(m)), (0,1,2), m, m+2) / dx^2
end

Note that 1.0:.01:2.0 and linspace(0.0, 2.0, 100) are both of type StepRangeLen <: Range, and the step function is valid for any Range.

Then later with the irregular grid, which falls back to the AbstractArray interface. Something like

function *(L::SecondDifferenceCentral{Grid}, x::Vector{Float64}) where {Grid <: AbstractArray}
    m = length(L.grid)
    dx = diff(L.grid)
    #... Use the dx vector
end
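One possible completion of that irregular-grid branch, using the standard three-point central second difference on a nonuniform grid; treat it as a sketch in current Julia syntax rather than the final interface (and note that, unlike the extended-grid convention above, here x is sampled on the grid itself and the result covers only the interior nodes):

```julia
# Nonuniform central second difference at the interior nodes of `grid`.
# With h_i = grid[i+1] - grid[i], the stencil at node i+1 is
#   u'' ≈ 2 * ( (u[i+2]-u[i+1])/h_{i+1} - (u[i+1]-u[i])/h_i ) / (h_i + h_{i+1})
function second_diff_irregular(grid::AbstractVector, x::AbstractVector)
    dx = diff(grid)
    m = length(grid)
    [2 * ((x[i+2] - x[i+1]) / dx[i+1] - (x[i+1] - x[i]) / dx[i]) / (dx[i] + dx[i+1])
     for i in 1:m-2]
end

# On a uniform grid this reduces to (x[i] - 2x[i+1] + x[i+2]) / dx^2.
# Sampling t^2 (whose second derivative is 2 everywhere):
second_diff_irregular([0.0, 0.5, 1.0, 1.5], [0.0, 0.25, 1.0, 2.25])  # [2.0, 2.0]
```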
ChrisRackauckas commented 6 years ago

Then later with the irregular grid, which falls back to the AbstractArray interface.

AbstractVector with elements of scalars or abstract vectors (SVectors). But yes, we should keep the grids lazy and that's a pretty simple fix.

jlperla commented 6 years ago

AbstractVector with elements of scalars or abstract vectors (SVectors). But yes, we should keep the grids lazy and that's a pretty simple fix.

Perfect. Just to check (for my own knowledge as much as anything else)

julia> typeof(@SVector [1,2,3]) |> supertype
StaticArrays.StaticArray{Tuple{3},Int64,1}

julia> typeof(@SVector [1,2,3]) |> supertype |> supertype
AbstractArray{Int64,1}

julia> typeof(linspace(1.0, 2.0, 10) |> collect) <: AbstractVector
true

julia> AbstractVector == AbstractArray{T,1} where T
true

I think you are right: if the fallback for Grid is the AbstractVector alias, it seems to be the winner for ensuring that static vectors are covered. But uniform grids are perfectly fine for now.

jlperla commented 6 years ago

Sorry, I forgot the scalar check you talked about. Is the best check for scalars that they are Real? The issue with Number is that it contains Complex.

ChrisRackauckas commented 6 years ago

Complex domains are fine.

jlperla commented 6 years ago

I am not sure how that works, but I will take your word for it. I assume this requires the IrregularGrid interface, because isless is not defined for Complex, and hence the lazy : ranges are not defined. It is certainly low priority for any applications I work with.

ChrisRackauckas commented 6 years ago

isless isn't part of the interface. It can't be because it won't work with points in any N-dimensional domain where N>1. Complex is just isomorphic to 2D reals (with a multiplication). Of course that means that the operators for complex are quite different (so our current operators would require real grids), but there's nothing that makes defining an operator on complex impossible. If your points are a square in the complex plane, then the 2nd derivative operator is equivalent to the Laplacian and has a lot of nice properties for complex.

MSeeker1340 commented 6 years ago

But if the underlying constants are only being used when we call as_array or use similar functions, then I am not even sure we need to call update_coefficients! or recompose the operator manually. Shouldn't it call the mu(...) function again whenever it needs to?

Is what you are suggesting something like

interp_func = interp1d(grid, MU)
mu(u,p,t,x) = interp_func(x)

? (I'm not sure if there is an interpolation function in base Julia; just assume interp1d(X,Y) returns an interpolation function on the points (X[i], Y[i]).) MU is an external vector, so changing MU directly changes mu. And the hope is that there will be no need for update_coefficients! in this case.

To do this, as you have suggested, the operator should be modified to call mu whenever it needs to (in L*u and as_array(L)). This would definitely work, but the problem is that there may be situations where we want to evaluate L*u for different u but the same L. Having L*u call mu repeatedly is not very efficient.

We could write a special case of a wrapped vector exactly at the nodes rather than a function.

I think this is the way to go for the iterative algorithm. The external dependency works without problems in this case (i.e. no update_coefficients! is needed if the wrapped vector is changed directly; any changes are immediately reflected in the operator).
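A minimal illustration of the "wrapped vector at the nodes" idea: the operator aliases the drift vector rather than copying it, so mutating MU in place is immediately visible in the operator's action, with no explicit update call. The names here are illustrative only:

```julia
# Stand-in for a node-wise coefficient operator that aliases an external vector.
struct NodeCoeffOp
    coeffs::Vector{Float64}   # aliased, not copied
end

# Apply the coefficients node-by-node (the stencil part is omitted for brevity)
apply(L::NodeCoeffOp, u::Vector{Float64}) = L.coeffs .* u

MU = [1.0, 1.0, 1.0]
L = NodeCoeffOp(MU)
apply(L, [1.0, 2.0, 3.0])   # [1.0, 2.0, 3.0]

MU .= 2.0                   # iterate: overwrite the drift in place
apply(L, [1.0, 2.0, 3.0])   # [2.0, 4.0, 6.0] -- no update_coefficients! needed
```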

MSeeker1340 commented 6 years ago

A small thing on the interface, which doesn't change all that much in most of your implementation, is to support Ranges for the grids rather than forcing things into a vector.

Yes I haven't given much thought to grids, and the Vector{Float64} grid in the notebook is just an ad-hoc implementation.

I see a lot of discussion going on at #4; I can take a closer look and maybe join the conversation.

jlperla commented 6 years ago

Is what you are suggesting something like interp_func = interp1d(grid, MU) mu(u,p,t,x) = interp_func(x)

Yes, something like that. Or if you wanted to have the vectors in the parameters, mu(u,p,t,x) = interp1d(p.grid, p.MU)(x)

Certainly there are much cleaner ways to do this, but since we have a workaround, it is lower priority. My guess is that just re-composing the lazy operator between iterations is good enough for now.

jlperla commented 6 years ago

Yes I haven't given much thought to grids, and the Vector{Float64} grid in the notebook is just an ad-hoc implementation.

I think the key for now is that we should only write the version with uniform grids. As long as we know the specializations that we would add down the road for irregular grids, then we are golden.

The main issue with using Vector{Float64} as the grid type is that there is no way to know if it is a regular or irregular grid. Luckily, we can rely on the Range types which are necessarily uniform because they have step defined.
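In dispatch terms, the uniform/irregular distinction could look like the sketch below (written in current Julia, which spells the abstract type AbstractRange rather than the 0.6-era Range; the spacing name is illustrative):

```julia
# Ranges are necessarily uniform, so a single scalar spacing suffices;
# any other AbstractVector falls back to a vector of gaps.
spacing(grid::AbstractRange)  = step(grid)    # uniform grid
spacing(grid::AbstractVector) = diff(grid)    # irregular grid

spacing(0.0:0.25:1.0)            # 0.25
spacing([0.0, 0.5, 1.5, 2.0])    # [0.5, 1.0, 0.5]
```

Since AbstractRange <: AbstractVector, the more specific method wins for ranges and the fallback covers everything else.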

MSeeker1340 commented 6 years ago

Yes, something like that. Or if you wanted to have the vectors in the parameters, mu(u,p,t,x) = interp1d(p.grid, p.MU)(x)

The difference is that this requires update_coefficients! to bring in the p. I actually prefer this approach: having a few update_coefficients! here and there doesn't complicate things much, but logically this is clearer than using external dependencies.

My guess is that just re-composing the lazy operator between iterations is good enough for now.

Not sure what you mean by re-composing. The operator only needs to be constructed once and that's it. All we need is call update_coefficients! along the way to introduce modifications, but the structure of the composition remains the same.

jlperla commented 6 years ago

Not sure what you mean by re-composing.

Basically, in pseudocode

mu = mu_initial_guess
sigma = 1.1 #Leave constant
while !converged
    L = mu * UpwindOperator + sigma^2/2 * CentralDifferences
    c = ... #create payoffs
    Q = ... #create boundary operator

    # Solve the equation
    A = Q * (r - L)
    u = A\c

    # Use the u output to find the new mu
    mu .= ... #new calculations, maybe filling in an interpolated function
end

Since it is cheap to compose the operator, there isn't much worry about just recreating the L and Q in the loop. Later we could get fancier.
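A runnable toy version of that loop, just to exercise its structure: each pass rebuilds A from the current drift, solves A*u = c, and recomputes the drift from u. The operators and the drift-update rule are stand-ins (a diagonal "operator", no boundary handling), not an actual HJBE model:

```julia
using LinearAlgebra

# Toy value-function iteration with stand-in operators. solve_iteratively and
# its model are illustrative only; real code would compose the lazy operators.
function solve_iteratively(n = 5, r = 0.1)
    c  = ones(n)      # stand-in payoffs
    mu = zeros(n)     # initial drift guess
    u  = zeros(n)
    for _ in 1:100
        A = r * I - Diagonal(mu)          # stand-in for Q * (r - L)
        u_new = A \ c                     # solve the linear system
        norm(u_new - u) < 1e-10 && return u_new
        mu = clamp.(u_new .- u, 0.0, 0.05)  # stand-in drift update
        u = u_new
    end
    u
end

solve_iteratively()  # converges to ≈ 20.0 at every node for these stand-ins
```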

MSeeker1340 commented 6 years ago

Archiving this.