Allow control over discretisation (and number of observations) of DEs within regression formulation?

To be able to relate the approach to SDEs, we may need to control the discretisation more carefully.

Basic proposal:

First

Use basis functions to expand all state vars, observed or not.
Each state variable can then be evaluated at any desired point via its spline representation

Second

Collocating a differential equation corresponds to taking 'observations' of a DE
However, do not use spline derivatives to expand derivatives
Instead, essentially use (or enable use of) finite differences of evaluations of the state vars (via spline functions) for all derivatives
Can then carefully implement appropriate SDE discretisation e.g. Euler-Maruyama.

Example: r_i = (Dy)_i - f_i + sigma*(dw/dt)_i where (Dy)i = (y{i+1}-y_i)/dt f_i = f(y_i) dt = T/m,

all y_i etc terms are evaluated using spline expansions and

sigma(dw/dt)_i = sigma(1/sqrt(dt))e_i, e_i ~ N(0,1) i.e. sigma(dw/dt)_i ~ N(0,sigma^2/dt), so dw_i = [(dw/dt)_i]dt ~ N(0,dtsigma^2)

Note that (dw/dt)_i = (1/sqrt(dt))*e_i, e_i ~ N(0,1), i.e. (dw/dt)_i ~ N(0,1/dt) is essentially a formal representation of the fact that the derivative of brownian motion is a white noise process with delta function covariance...So we can embed in standard regression framework as

r_i = (Dy)_i - f_i + E_i where (Dy)i = (y{i+1}-y_i)/dt f_i = f(y_i) E_i ~ N(0,sigma^2/dt)

and r_i is 'measured' as 0. Which is a form of nonlinear regression

r_i = F_i(y) + E_i

Alternatively, in differential form:

R_i = r_idt = (Dy)_idt - f_idt + E_idt where (Dy)i*dt = (y{i+1}-y_i) f_i = f(y_i)dt E_idt ~ N(0,dt*sigma^2)

etc. Same again - nonlinear regression. Both are fine for finite dt = T/m, though obviously the first has E_i -> infinity as dt -> 0.

Key: As long as in form of nonlinear regression where we can evaluate at any observation locations we want, and as long as we can treat derivatives as finite difference observations (etc) we should be fine.

Other notes, unrelated to implementation. We still control trade-off with data via sigma^2 = sigma_obs^2/lambda.

Also: data assimilation book reformulates observation/data equation in 'accumulation' form too so looks similar to process DE. E.g.

z_{j+1} = z_j + y_j*T/N

May or may not be useful for helping choose lambda in a way independent of sample size. Alternative is to just define lambda via something like sigma^2 = N*sigma_obs^2/lambda for N number of data observations.

dwu402 / pypei

Allow control over discretisation (and number of observations) of DEs within regression formulation? #6