Closed: jlperla closed this issue 6 years ago.
If the operators naturally operate on different-size vectors, then we can choose one of the following:

- You can say `r*I_M*C - A`, where this `C` restricts to the interior so that it is sized correctly.
- You can say `r*R*I_M - A`.
- Or you can work with `P*u`, where `P*u = r*I_M*u - A*Q2*u`, so that `Q1 = I_M` extends exactly as far as is needed for the first operator.
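These choices can be sketched with plain dense arrays (a toy illustration, not the package's operator types; the reflecting `Q2` below is just one possible extension):

```julia
using LinearAlgebra

M, Δ, r = 5, 0.1, 0.05

# Central differences on the extended grid: M rows, M + 2 columns.
A = zeros(M, M + 2)
for i in 1:M
    A[i, i], A[i, i + 1], A[i, i + 2] = 1/Δ^2, -2/Δ^2, 1/Δ^2
end

IM = Matrix(1.0I, M, M)
# C restricts the extended domain to the interior, so r*I_M*C conforms with A.
C  = hcat(zeros(M), IM, zeros(M))
L1 = r * IM * C - A                   # M × (M + 2): acts on the extended u

# Q2 extends an interior u to the boundary (reflecting ghost points here),
# giving the square P = r*I_M - A*Q2 that acts on the interior u directly.
Q2 = vcat(IM[1:1, :], IM, IM[M:M, :])
P  = r * IM - A * Q2                  # M × M

u = rand(M)
@assert L1 * (Q2 * u) ≈ P * u         # the two formulations agree
```

Here `C*Q2 = I_M`, which is exactly why the restricted and extended formulations coincide.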
My gut doesn't just say that the "L is the largest square" method is the cleanest. Take an arbitrary L operator composed of all sorts of things:

`L = r*I - L_2 - L_3`

where `r*I` is just a constant times the identity, `L_2` is a jump diffusion going backwards, and

`L_3 = mu*L_UW + sigma^2/2*L_CD + L_4`

with `L_4` another jump diffusion going backwards and forwards, `L_UW` the upwind first differences, and `L_CD` the central differences.

Now, when a user defines that operator, are you saying that they would have to manually go through and figure out the right interior restriction `C` to put in front of every one of those compositions?

`L = r*I*C_1 - L_2*C_2 - L_3*C_3`
`L_3 = mu*L_UW*C_4 + sigma^2/2*L_CD*C_5 + L_4`

Alternatively, you could define it lazily as specified above, but then any time you want to use the expression tree, some insane generic code would have to go through and figure out the right `C_1`, etc. to put in there? I just don't see how either of those is tractable without insane effort.
My alternative: when the `*` operator is called with the `L`, it passes in the size of the domain it needs (i.e., the fully extended domain). It is only with `Q` and `R` that it becomes important what part of the domain was an extension.

Here is an implementation of the generic stencil for the central-differences object (which creates the square). Note that it has to take in a grid of interest when it actually wants to generate things. In the case we are describing, this would be the extended domain.
julia> using ToeplitzMatrices

julia> L_2(grid) = SymmetricToeplitz([-2.0; 1.0; zeros(length(grid)-2)])/(step(grid)^2)
julia> x_bar = -0.1:0.1:1.1 #Note that it has 13 points.
-0.1:0.1:1.1
julia> L_2(x_bar)
13×13 Array{Float64,2}:
-200.0 100.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
100.0 -200.0 100.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 100.0 -200.0 100.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 100.0 -200.0 100.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 100.0 -200.0 100.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 100.0 -200.0 100.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 100.0 -200.0 100.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 100.0 -200.0 100.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 100.0 -200.0 100.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 100.0 -200.0 100.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 100.0 -200.0 100.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 100.0 -200.0 100.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 100.0 -200.0
@dlfivefifty did I get the "square" version of the central differences correct here? Of course, in any reasonable calculation the top and bottom rows would not enter any calculations.
If it isn't clear from this discussion, we are having trouble composing different operators of different "sizes". My suggestion was to find the biggest domain we would need that encompasses all of them, tell each operator to give its stencil on the whole domain, and then we can put in restrictions/boundary values on the composed and extended operator.
> did I get the "square" version of the central differences correct here? Of course, in any reasonable calculation the top and bottom rows would not enter any calculations.
It's not unique. You can also just append a column of zeros on the sides and it'll work.
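For instance, a zero-padded version agrees with the Toeplitz square above on every row that would actually enter a calculation (a sketch with plain arrays; here the 11 × 13 interior stencil is padded with zero rows top and bottom):

```julia
using LinearAlgebra

n, Δ = 13, 0.1

# The 13 × 13 "square" central-difference stencil from above.
square = diagm(0 => fill(-2.0, n), 1 => fill(1.0, n - 1), -1 => fill(1.0, n - 1)) / Δ^2

# The natural 11 × 13 interior stencil, padded with zero rows to make it square.
interior = zeros(n - 2, n)
for i in 1:n-2
    interior[i, i], interior[i, i + 1], interior[i, i + 2] = 1/Δ^2, -2/Δ^2, 1/Δ^2
end
padded = vcat(zeros(1, n), interior, zeros(1, n))

# The two squares agree on every interior row; only the first and last differ.
@assert square[2:end-1, :] == padded[2:end-1, :]
```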
You require every operator you compose to be able to automatically work on an arbitrarily sized grid (note that I did not say "extend" because they do not know or care what is an extension and what is the original domain).
This is already false for arrays, unless you just append a column of zeros as needed, which is memory intensive.
> Now, when a user defines that operator, are you saying that they would have to manually go through and figure out the right interior restriction C to put in front of every one of those compositions?
You don't have to do this if you just use a separate `Q`. There can be a `Q` for derivative operators, and a `Q` for jump diffusions. That BC works for both. Then the full operator is `L = L1*Q1 + L2*Q2`. That works perfectly fine.
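A toy sketch of that composition (random stand-ins for the stencils; the reflecting extensions are an illustrative choice, not the package's `Q`):

```julia
using LinearAlgebra

M = 6
IM = Matrix(1.0I, M, M)
# Reflecting extension with k ghost points on each side (illustrative choice).
E(k) = vcat(repeat(IM[1:1, :], k), IM, repeat(IM[M:M, :], k))

Q1, Q2 = E(1), E(2)            # (M+2) × M and (M+4) × M extensions
L1 = randn(M, M + 2)           # stand-in for a second-difference stencil
L2 = randn(M, M + 4)           # stand-in for a width-2 jump stencil

L = L1 * Q1 + L2 * Q2          # both terms conform: M × M, so they just add
@assert size(L) == (M, M)
```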
The nice thing that is gained by @jlperla's approach is that operators do just add together with an `O(1)` extra computation in the lazy case. In the non-lazy case, you just `AbstractMatrix(L)` the whole thing and the issue goes away anyway. I don't like it because of the non-uniqueness of the operators in this approach, but it does have some nice usability properties for asymptotically zero cost.
Two things to mention here. One is that the `O(1)` cost is essentially nothing for large matrices, and this would allow us to very easily automatically generate things like efficient GPU and distributed-parallel kernels, so that's a great gain.
However, there is still more that would need to be done to make @jlperla's way work. It assumes that we can infinitely expand any operator. That is hard to do with uneven grids. If you extend, you have to know the grid that `Q` is defined on, not the original grid. This may not even be known to the user. These other operators need to be defined on that extended grid, not the original grid.
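A small illustration of why the grid matters: on a non-uniform grid the second-derivative stencil weights depend on the local spacings, so they must be precomputed on whatever grid the operator will actually see, extension included. A sketch using the standard non-uniform three-point formula:

```julia
# Second-derivative weights at x0 from the non-uniform points (xm, x0, xp).
function second_deriv_weights(xm, x0, xp)
    hm, hp = x0 - xm, xp - x0
    (2/(hm*(hm + hp)), -2/(hm*hp), 2/(hp*(hm + hp)))
end

x = [0.0, 0.1, 0.25, 0.45, 0.7]          # a non-uniform grid
wl, wc, wr = second_deriv_weights(x[1], x[2], x[3])

# The formula is exact for quadratics: u(x) = x^2 has u'' = 2 everywhere.
u = x .^ 2
@assert wl*u[1] + wc*u[2] + wr*u[3] ≈ 2.0
```

If the extension adds points with yet other spacings, all of these weights change, which is why `Q`'s grid has to be known up front.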
> This is already false for arrays, unless you just append a column of zeros as needed, which is memory intensive.
I am not sure the expansion ever needs to happen for concrete arrays. Looking at our basic multiplication, we start on the RHS with a vector on the domain (and not the extension) and then the `Q` extrapolates it. I don't think we would want to (or need to) have arrays expand.
It seems a little like using `I`: it eventually interacts with a concretely sized vector, but those vectors themselves don't automatically change size.
> That is hard to do with uneven grids. If you extend, you have to know the grid that Q is defined on, not the original grid.
My feeling is that whenever a lazy operator is hit with a concretely sized vector, the operator itself would be able to get the grid associated with that vector. It wouldn't need to know what was an extrapolation vs. the domain itself, which could make the implementation of that operator easy enough.
But this doesn't make the implementation of Q any easier... Given a nonuniform grid, the extrapolation needs to figure out the number of points to add, etc.
Getting the number of points to add could come from recursively calling a function given the L (as discussed in #11) but that is only part of the design.
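That recursive computation might look something like the following (hypothetical expression-tree types, purely illustrative of the idea in #11):

```julia
# Toy expression tree for composed operators.
abstract type Op end
struct Stencil <: Op; halfwidth::Int; end   # e.g. central differences: 1
struct Plus <: Op; a::Op; b::Op; end
struct Scale <: Op; c::Float64; a::Op; end

# Recursively compute how many extension points the composed operator needs.
extension_width(o::Stencil) = o.halfwidth
extension_width(o::Plus)    = max(extension_width(o.a), extension_width(o.b))
extension_width(o::Scale)   = extension_width(o.a)

# Composition of r*I (width 0), central differences (1), and a width-2 jump.
L = Plus(Scale(0.05, Stencil(0)), Plus(Stencil(1), Stencil(2)))
@assert extension_width(L) == 2
```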
I think we need to try to write the algebra down for the diffusion cases with this approach and see what the Q, R, etc look like.
Regardless, it sure feels like in order to create the affine Q operator automatically, it would need to know:
> You don't have to do this if you just use a separate Q. There can be a Q for derivative operators, and a Q for jump diffusions. That BC works for both. Then the full operator is L = L1*Q1 + L2*Q2. That works perfectly fine.
But then we are basically back to having the boundary values associated with individual components of the operator, rather than with the composed operator, which is what we needed to avoid when we started on this project and what @dlfivefifty convinced us wouldn't work. You need to have the boundary values applied when you actually solve things, and for the operator as a whole. How else can you sanely write the boundary condition?
A reflecting or absorbing barrier for a jump diffusion requires all of the possible ways that the boundary could be hit, and I am not sure there is a natural way to decompose them...
> But then we are basically back to having the boundary values associated with individual components of the operator, rather than with the composed operator, which is what we needed to avoid when we started on this project and what @dlfivefifty convinced us wouldn't work. You need to have the boundary values applied when you actually solve things, and for the operator as a whole. How else can you sanely write the boundary condition?
No, using this algebra we just proved that this Q1 and Q2 approach is equivalent. It just so happens that `Q1 = R*Q`, where `R` is an appropriately sized restriction operator. There are some differences between this and the naive approach, but this approach with appropriate Q1 and Q2 is exactly equivalent and can be checked/defined via the same operator algebra; it is just a different way of implementing it.
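The claim `Q1 = R*Q` can be checked numerically on a toy reflecting extension (plain arrays, illustrative choices only):

```julia
using LinearAlgebra

M = 6
IM = Matrix(1.0I, M, M)
# Reflecting extension with k ghost points on each side (illustrative choice).
extend(k) = vcat(repeat(IM[1:1, :], k), IM, repeat(IM[M:M, :], k))

Q  = extend(2)                     # (M+4) × M: the widest extension needed
# R drops the outermost point on each side of the wide extended grid.
R  = hcat(zeros(M + 2), Matrix(1.0I, M + 2, M + 2), zeros(M + 2))
Q1 = extend(1)                     # (M+2) × M: the narrower extension

@assert R * Q == Q1                # restricting the wide Q recovers Q1
```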
> My feeling is that whenever a lazy operator is hit with a concretely sized vector, the operator itself would be able to get the grid associated with that vector. It wouldn't need to know what was an extrapolation vs. the domain itself, which could make the implementation of that operator easy enough.
No, because we want to precompute and cache the non-uniform grid coefficients. For example, what would you change in https://github.com/JuliaDiffEq/DiffEqOperators.jl/pull/49 such that you wouldn't get hit with a ton of costs due to re-running Fornberg? How would you know what to run it on?
We know that writing down Q and R's will work, so that's too trivial of an exercise. Instead, could you think about how to modify that PR such that it could handle the non-local grid change determined by Q, but without a direct interaction by Q in `*`? That's the real hard question.
> No, using this algebra we just proved that this Q1 and Q2 approach is equivalent.
Fair enough, there will be some Q1 and Q2 (of different sizes, depending on how much extension L1 and L2 require) that makes it equivalent.

But since the combination of all of that is required to make a particular affine boundary value hold for the ODE, it could be that the Q1 and Q2 are extremely complicated and require details of the individual L1 and L2 (note: both!), or that the particular Q1 and Q2 may need to be jointly determined given the boundaries. Not sure, but it sounds scary.
> We know that writing down Q and R's will work, so that's too trivial of an exercise.
Trivial for you, maybe 😀 I am not yet convinced that the math for this can work in the uniform-grid case with jump diffusions. Let's convince ourselves everything lines up correctly there first.
But I hear you on the need to cache things for the irregular grid. We can add an issue to check what the algebra would look like before we get too far.
After a great deal of discussion, the idea will be to first try expanding the lazy operators with rows of 0s. The hope is that this will still let the boundary conditions be reasonably easy to write.
FYI, https://github.com/JuliaApproximation/InfiniteArrays.jl has a lazy `Vcat` and `Hcat`:
julia> Vcat(Zeros(1,10), Diagonal(Ones(10)))
11×10 Vcat{Float64,2,Tuple{Zeros{Float64,2,Tuple{Int64,Int64}},Diagonal{Float64,Ones{Float64,1,Tuple{Int64}}}}}:
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
Though it requires Julia v0.7.
Thanks! I think I am finally getting closer to understanding what is going on here. I am now starting to think that, while having arbitrary infinite expansion would make it easier to compose the operators, it might make generating the boundary-extrapolation operator trickier in many cases, but that lazily padding with zeros might make it much more sane to do the "Gaussian elimination" steps you are talking about. Or it might be irrelevant in the general case. Let me type things up.
I am getting closer...
Closing, since #18 (expanding operators with 0s) is the preferred method.
From a long Slack discussion with @ChrisRackauckas and @mschauer, and to make #11 possible, we want to write out the algebra more carefully.
The basic idea is to try to make it so the underlying operators do not need to be aware of whether they are on the boundary or the extension, which makes composition of operators on different stencil sizes possible.
At first we thought this was unique to the jump-diffusion example, but now it is clear that it comes up in other places. The easiest one is

$$L = r - \partial_{xx}$$

and solving

$$L u(x) = x$$

subject to reflecting barriers at x = 0 and x = 1. Discretizing the composed $L = r - \partial_{xx}$ operator by pure composition is a little tricky. Given the current discretization, it would just be `r I - L_CD`, with `L_CD` the central-differences discretization, but the sizes don't conform. What if we just expand everything to be as large (and as square) as we need it to be? Think of the central differences as an infinite stencil and take a square piece of it. This is evident in the algebra for the $P$ matrix in the https://www.overleaf.com/read/bxdzjxbjdyvp#/59040615/ document, which has special cases.

A few notes:
This also is related to the
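The reflecting-barrier example above can be sketched numerically; here the barriers are folded in as ghost points equal to their neighbors (a toy dense discretization, not the package's API):

```julia
using LinearAlgebra

M, r = 50, 0.05
x = range(0.0, 1.0, length = M)
Δ = step(x)

# Square central-difference matrix on the grid...
L_CD = diagm(0 => fill(-2.0, M), 1 => fill(1.0, M - 1), -1 => fill(1.0, M - 1)) / Δ^2
# ...with reflecting barriers folded in: each ghost point equals its neighbor.
L_CD[1, 1] += 1/Δ^2
L_CD[M, M] += 1/Δ^2

# Solve (r - ∂_xx) u = x with the composed, square operator.
L = r * I - L_CD
u = L \ collect(x)

@assert all(isfinite, u)
```

Since `L` here is a nonsingular M-matrix and the right-hand side is nonnegative, the solution should come out nonnegative as well.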