SciML / StochasticDiffEq.jl

Solvers for stochastic differential equations which connect with the scientific machine learning (SciML) ecosystem
Other
237 stars 65 forks source link

Automatic ggprime handling #54

Open ChrisRackauckas opened 6 years ago

ChrisRackauckas commented 6 years ago

https://github.com/JuliaDiffEq/StochasticDiffEq.jl/pull/53

@onoderat 's Pr introduces ggprime which is g times the Jacobian of g. This is defined in the PCE paper (see https://github.com/JuliaDiffEq/DifferentialEquations.jl/issues/247 )

screen-shot-2018-02-10-at-5 31 29-pm
ChrisRackauckas commented 6 years ago

There are a few things that need to be done to handle this. When it's diagonal noise, it's just the noise Jacobian of the vector function, so yay that works out well! Then it's just g*J. But for non-diagonal noise the Jacobian of g is a 3-dimensional object. Basically, the derivative of the vector function of each column of g is a Jacobian, so gprime[i,j,k] is the Jacobian of g[:,j] by u_k. Then it's ggprime = sum(g[:,k]*gprime[:,:,k]).

There's two levels we want to take to implement this. First we want to setup numerical and autodiff for calculating gprime. Then we can setup the loop to calculate ggprime. Then allow the user to define the Jacobian of g and use it instead of calculating it. Then on top of that we allow the user to define ggprime to skip this whole handling. This would then go into the performance overloads as how Jacobians for g are handled.

I wonder if we should be calling it gjac and ggjac to match that we're calling other performance overloads jac. I didn't notice that before but maybe it's more consistent.

onoderat commented 6 years ago

This is going to sound controversial @ChrisRackauckas , but I realized that the formula in the textbook/paper is wrong for the ggprime while I was doing the testing for the non-diagonal case. I think based on mathematical intuition that that term is related to the Stratonovich to Ito conversion since when we set eta = 0.5, we are "solving" the problem from the midpoint of the center of the interval like in Stratonovich.

If that interpretation were to be taken, the ggprime should be codecogseqn-2

where is the noise vector due to the j'th noise channel and $L^{j}$ is the usual differential operators that appear in stochastic taylor expansions.

Using the definition in the textbook gave the wrong ggprime of $4 \sigma^2 u$ while this proposed one gave $ 2 \sigma^2 u$.

What do you think @ChrisRackauckas ? I personally am not 100% sure, so I am going to ask some SDE professors at Stanford to confirm. That being I believe that the code is fine for now - it just asks for some ggprime - I think the docs and automatic ggprime handling can wait until this proposed form has been confirmed.

ChrisRackauckas commented 6 years ago

That makes sense: it should line up with the 1st order term in the stochastic Taylor expansion since it should be the term that's added to the Milstein method. That said, I don't see how your definition is different from the one in the book. Is that not the same double sum?

onoderat commented 6 years ago

The definition in the textbook has a triple sum over $i$,$ j_1$ and $j_2$. The one that I am proposing only has i and j.

ChrisRackauckas commented 6 years ago

In matrix notation, what does yours correspond to? Trying to get a better sense of the difference.

onoderat commented 6 years ago

In matrix notation this will be, codecogseqn-3

where $gj$ is the j'th column vector of the diffusion matrix $g{ij}$.

ChrisRackauckas commented 6 years ago

I can't see how to write this. g*gprime[:,:,k] is wrong because it gives a vector out. Then sum that vector? That's what I wrote above. Did I accidentally write down your summation definition?

onoderat commented 6 years ago

ggprime(u) is a vector indexed by k. It has to be a vector because it is subtracted against the drift term $f$ which is a vector.

I actually am not familiar enough with the gprime tensor to parse whether what you have is the same definition as mine. Could you define the gprime object in terms of the jacobian explicitly?

ChrisRackauckas commented 6 years ago

That's what I defined as:

Basically, the derivative of the vector function of each column of g is a Jacobian, so gprime[i,j,k] is the Jacobian of g[:,j] by u_k.

Since here we define each column of g to be the separate g_j so that it multiplies a dW vector for the summation.

onoderat commented 6 years ago

Erm i see there is still one thing I need to pin down, is the index of i and k. is it codecogseqn-4

or is it codecogseqn-5

ChrisRackauckas commented 6 years ago

I think that's interacting weird with my plugin:

https://chrome.google.com/webstore/detail/github-with-mathjax/ioemnmodlmafdkllaclgeombjnmnbima

onoderat commented 6 years ago

Just updated comment with a gif.