SciML / StochasticDiffEq.jl

Solvers for stochastic differential equations which connect with the scientific machine learning (SciML) ecosystem

What does diagonal noise mean exactly? #261

rvignolo commented 4 years ago

Hi @ChrisRackauckas,

I just wanted to know what diagonal noise means exactly in the context of DifferentialEquations.jl.

For example, Peter E. Kloeden and Eckhard Platen in "Numerical Solution of Stochastic Differential Equations" refer to diagonal noise as:

"... each component X^k of the Ito process X is disturbed only by the corresponding component W^k of the Wiener process W and the diagonal diffusion coefficient b^{k,k} depends only on x^k."

This means two things: the diffusion matrix is diagonal, and each element k,k on the diagonal depends only on the process x^k, so that its partial derivatives with respect to the other processes are 0.
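
Written out component-wise (keeping the drift general, as in the quote), that definition amounts to

$$
dX^k_t = a^k(t, X_t)\,dt + b^{k,k}\!\left(t, X^k_t\right)\,dW^k_t,
\qquad b^{k,j} \equiv 0 \ \text{for } j \neq k,
$$

so in particular $\partial b^{k,k}/\partial x^j = 0$ whenever $j \neq k$.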

Both conditions greatly reduce the number of terms that appear in each SDE-solving scheme (or algorithm). However, I am not sure whether DifferentialEquations.jl takes the second condition into account, i.e. that each diagonal element k,k depends only on the process x^k.

For example, here it seems that diagonal noise refers ONLY to a diagonal matrix. If this is the case, I need to be sure that diagonal noise algorithms don't neglect the terms with partial derivatives.
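
To make the question concrete, this is the kind of problem I have in mind (the coefficients are made up just for illustration). The noise is "diagonal" in the DifferentialEquations.jl sense, because `g!` fills a vector of the same size as `u`, yet `du[1]` depends on `u[2]`, which violates the stricter Kloeden and Platen condition:

```julia
using StochasticDiffEq

# drift: a simple linear drift, values chosen only for illustration
f(du, u, p, t) = (du .= 0.1 .* u)

# "diagonal noise" as declared to DifferentialEquations.jl: one Wiener process
# per state and a diagonal diffusion matrix, encoded by g! writing a vector of
# the same size as u. Note that du[1] depends on u[2], which this declaration
# allows but the stricter Kloeden-Platen definition does not.
function g!(du, u, p, t)
    du[1] = 0.2 * u[1] * u[2]
    du[2] = 0.3 * u[2]
end

prob = SDEProblem(f, g!, [1.0, 1.0], (0.0, 1.0))
sol = solve(prob, SRIW1())   # SRIW1 is one of the diagonal-noise order-1.5 methods
```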

Thanks!

ChrisRackauckas commented 4 years ago

I'll have to double check the requirements for Order 1.5 by Rossler.

rvignolo commented 4 years ago

Are you referring to this paper? If this is the case, I am afraid Rossler used the same definition for diagonal noise as Kloeden and Platen (check out Section 6.4).

Would this mean that many examples, including the one presented in the showcase (the second part), are not fully correct?

Thanks Chris!

ChrisRackauckas commented 4 years ago

I see. So the example isn't incorrect per se, but there's probably order loss with the method. It would be good to try putting together a convergence test to see what the order loss is like. It probably only gets strong order 0.5 on examples like that. Could you run an analyticless convergence test on some other system to find out? (Some other system, because a convergence test on a chaotic system is hard unless the time span is short.)

https://github.com/SciML/StochasticDiffEq.jl/blob/master/test/noncommutative_tests.jl#L79-L82
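
Something along these lines should do it. The toy problem below is just illustrative, and the `analyticless_test_convergence` call mirrors the pattern in the linked test (with `test_dt` setting the step size of the self-generated reference solution and `trajectories` the number of Monte Carlo paths), so treat the exact call as a sketch rather than the definitive incantation:

```julia
using StochasticDiffEq, DiffEqDevTools

# non-chaotic toy system with a diagonal diffusion matrix whose entries
# depend on both states (the "middle ground" case); coefficients are placeholders
f(du, u, p, t) = (du .= 0.1 .* u)
function g!(du, u, p, t)
    du[1] = 0.2 * u[1] * u[2]
    du[2] = 0.3 * u[1]
end
prob = SDEProblem(f, g!, [1.0, 1.0], (0.0, 1.0))

dts = (1 / 2) .^ (7:-1:4)   # step sizes for the convergence study
test_dt = (1 / 2)^10        # fine step for the reference solution

# estimate the strong order without an analytic solution; if the cross-dependence
# really costs accuracy, the estimate should come out near 0.5 rather than 1.5
sim = analyticless_test_convergence(dts, prob, SRIW1(), test_dt; trajectories = 100)
@show sim.𝒪est[:final]
```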

rvignolo commented 4 years ago

Exactly, what is going to happen is that there will be order loss!

I don't know why the theory always jumps from diagonal noise with such restrictive conditions straight to commutative noise. In between those two cases there is the one where the diffusion matrix is diagonal but each diagonal element can depend on all of the processes.

What I have done in the past is add back the terms of the discretization scheme that do not vanish when the matrix is diagonal but each element can depend on any process. Maybe we can achieve that for some discretization schemes.

I am sorry for my delay! I will test all these cases once I get time, following your suggestions!

ChrisRackauckas commented 4 years ago

> I don't know why the theory always jumps from diagonal noise with such restrictive conditions straight to commutative noise. In between those two cases there is the one where the diffusion matrix is diagonal but each diagonal element can depend on all of the processes.

I think no one has specialized a method on it. The common case here is always multiplicative noise, so that's handled. Commutative noise is natural because that's the case where the iterated integrals drop out of the scheme. But indeed, if you can find a way to exploit the middle ground, that might be interesting.
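
For reference, "commutative" here means the Kloeden and Platen condition

$$
\sum_i b^{i,j_1}\,\frac{\partial b^{k,j_2}}{\partial x^i}
\;=\;
\sum_i b^{i,j_2}\,\frac{\partial b^{k,j_1}}{\partial x^i}
\quad \text{for all } j_1, j_2, k,
$$

under which the iterated integrals only enter through the symmetric combinations $I_{(j_1,j_2)} + I_{(j_2,j_1)} = \Delta W^{j_1}\,\Delta W^{j_2} - \delta_{j_1 j_2}\,\Delta t$, so no Lévy areas need to be simulated.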

rvignolo commented 4 years ago

No one has specialized methods on it, that is correct. I think what I did was take the commutative-noise scheme and remove some terms, or take the nonautonomous general multi-dimensional version of the discretization and remove some terms. In any case, I have some documentation and it is already coded. For example, this is one of six discretizations I have, and it might be worth incorporating (if it has not already been included) into DifferentialEquations.jl:

```c
// Explicit, Order 1.5 Strong
int kosmos_compute_ExplicitPlatenWagner_step(process_t *processes, process_t *process, int n, int i, double *xt) {
  int p, j, k;
  double dt;
  double *t;
  double sqdt;
  double x_prev;
  double **phi_plus;
  double **phi_minus;
  double **upsilon_plus;
  double **upsilon_minus;
  double mu, mu_exp, sigma;
  double kappa_exp, kappa_imp, kappa_dot;
  double theta_exp, theta_imp, theta_dot;
  double sigma_exp, sigma_imp, sigma_dot;
  double alpha_exp, alpha_imp, alpha_dot;
  double beta_exp, beta_imp, beta_dot;
  equity_t *equity;
  process_t *process_j;
  process_t *process_k;
  short_rate_t *short_rate;

  // handlers
  t = kosmos.mc_solver.t;
  dt = kosmos.mc_solver.dt;
  sqdt = kosmos.mc_solver.sqdt;
  x_prev = process->xt[n][i - 1];
  phi_plus = kosmos.mc_solver.phi_plus;
  phi_minus = kosmos.mc_solver.phi_minus;
  upsilon_plus = kosmos.mc_solver.upsilon_plus;
  upsilon_minus = kosmos.mc_solver.upsilon_minus;

  // specialize on the process type because we can avoid using numerical derivatives
  switch (process->type) {

    case General:
      equity = (equity_t *) process->params;

      // set the wasora time equal to the previous time step
      wasora_var(wasora_special_var(time)) = t[i - 1];

      // update all processes holder variables to their previous time step values
      wasora_call(kosmos_set_processes_holder_variables_to_specific_mc_time_step_values(processes, n, i - 1));

      // compute upsilon vectors (this way reduces the number of coefficient evaluations)
      for (k = 0, process_k = processes; process_k != NULL; process_k = process_k->hh.next, k++) {
        wasora_call(kosmos_compute_coefficients(process_k, &mu, &sigma));
        for (j = 0, process_j = processes; process_j != NULL; process_j = process_j->hh.next, j++) {
          upsilon_plus[j][k] = upsilon_minus[j][k] = process_k->xt[n][i - 1] + 1.0 / kosmos.mc_solver.n_processes * mu * dt;
          if (j == k) {
            upsilon_plus[j][k] += sigma * sqdt;
            upsilon_minus[j][k] -= sigma * sqdt;
          }
        }

        // store useful index and evaluations
        if (process_k == process) {
          p = k;
          mu_exp = mu;
          sigma_exp = sigma;
        }
      }

      // first terms
      *xt = x_prev + sigma_exp * sqdt * process->Z1[n][i - 1];

      // summation terms
      for (j = 0; j < kosmos.mc_solver.n_processes; j++) {
        wasora_call(kosmos_set_processes_holder_variables_to_auxiliary_array_values(processes, upsilon_plus[j]));
        wasora_call(kosmos_compute_coefficients(process, &mu, NULL));
        *xt += mu * dt / 4.0 * (1.0 + process->Z2[n][i - 1]);

        wasora_call(kosmos_set_processes_holder_variables_to_auxiliary_array_values(processes, upsilon_minus[j]));
        wasora_call(kosmos_compute_coefficients(process, &mu, NULL));
        *xt += mu * dt / 4.0 * (1.0 - process->Z2[n][i - 1]);
      }

      // a term removed from the summation
      *xt -= mu_exp * dt / 2.0 * (kosmos.mc_solver.n_processes - 2);

      // another term
      *xt -= sigma_exp * sqdt * (process->Z1[n][i - 1] - process->Z2[n][i - 1] / 2.0);

      wasora_call(kosmos_set_processes_holder_variables_to_auxiliary_array_values(processes, upsilon_plus[p]));
      wasora_call(kosmos_compute_coefficients(process, NULL, &sigma));
      *xt -= sigma * sqdt / 4.0 * (1.0 - 3.0 * process->Z1[n][i - 1] - pow(process->Z1[n][i - 1], 2.0) + pow(process->Z1[n][i - 1], 3.0) / 3.0 + process->Z2[n][i - 1]);

      // compute phi vector using the current value of sigma
      for (k = 0; k < kosmos.mc_solver.n_processes; k++) {
        phi_plus[p][k] = phi_minus[p][k] = upsilon_plus[p][k];
        if (p == k) {
          phi_plus[p][k] += sigma * sqdt;
          phi_minus[p][k] -= sigma * sqdt;
        }
      }

      wasora_call(kosmos_set_processes_holder_variables_to_auxiliary_array_values(processes, upsilon_minus[p]));
      wasora_call(kosmos_compute_coefficients(process, NULL, &sigma));
      *xt += sigma * sqdt / 4.0 * (1.0 + process->Z1[n][i - 1] - pow(process->Z1[n][i - 1], 2.0) + pow(process->Z1[n][i - 1], 3.0) / 3.0 - process->Z2[n][i - 1]);

      wasora_call(kosmos_set_processes_holder_variables_to_auxiliary_array_values(processes, phi_plus[p]));
      wasora_call(kosmos_compute_coefficients(process, NULL, &sigma));
      *xt -= sigma * sqdt / 4.0 * process->Z1[n][i - 1] * (1.0 - pow(process->Z1[n][i - 1], 2.0) / 3.0);

      wasora_call(kosmos_set_processes_holder_variables_to_auxiliary_array_values(processes, phi_minus[p]));
      wasora_call(kosmos_compute_coefficients(process, NULL, &sigma));
      *xt += sigma * sqdt / 4.0 * process->Z1[n][i - 1] * (1.0 - pow(process->Z1[n][i - 1], 2.0) / 3.0);

      // set the wasora time equal to the current time step
      wasora_var(wasora_special_var(time)) = t[i];

      // update all processes holder variables to their previous time step values
      wasora_call(kosmos_set_processes_holder_variables_to_specific_mc_time_step_values(processes, n, i - 1));
      wasora_call(kosmos_compute_coefficients(process, &mu, &sigma));

      // \frac{1}{2} \frac{\partial \mu}{\partial t} \Delta^2
      *xt += (mu - mu_exp) / 2.0 * dt;

      // \frac{\partial \sigma}{\partial t} \left( \Delta W \Delta - \Delta Z \right)
      *xt += (sigma - sigma_exp) * sqdt * (process->Z1[n][i - 1] - process->Z2[n][i - 1] / 2.0);

      break;

    case ShortRate:
      short_rate = (short_rate_t *) process->params;

      // set the wasora time equal to the previous time step
      wasora_var(wasora_special_var(time)) = t[i - 1];

      // evaluate using the previous time
      wasora_call(kosmos_evaluate_short_rate_time_dependent_expressions(short_rate));

      // handlers
      kappa_exp = short_rate->kappa;
      theta_exp = short_rate->theta;
      sigma_exp = short_rate->sigma;
      alpha_exp = short_rate->alpha;
      beta_exp = short_rate->beta;

      // term by term as in the theory
      *xt = x_prev;
      *xt += kappa_exp * (theta_exp - x_prev) * dt;
      *xt += sigma_exp * sqrt(alpha_exp + beta_exp * x_prev) * sqdt * process->Z1[n][i - 1];
      *xt += pow(kappa_exp * dt, 2.0) * (x_prev - theta_exp) / 2.0;
      *xt -= kappa_exp / 2.0 * sigma_exp * sqrt(alpha_exp + beta_exp * x_prev) * dt * sqdt * process->Z2[n][i - 1];
      *xt -= beta_exp * pow(sigma_exp, 2.0) / 4.0 * dt * (1.0 - pow(process->Z1[n][i - 1], 2.0));
      *xt -= beta_exp * sigma_exp / 8.0 / sqrt(alpha_exp + beta_exp * x_prev) * (beta_exp * pow(sigma_exp, 2.0) - 4.0 * kappa_exp * (theta_exp - x_prev)) * dt * sqdt * (process->Z1[n][i - 1] - process->Z2[n][i - 1] / 2.0);

      // set the wasora time equal to the current time step
      wasora_var(wasora_special_var(time)) = t[i];

      // evaluate using the current time step
      wasora_call(kosmos_evaluate_short_rate_time_dependent_expressions(short_rate));

      // handlers
      kappa_imp = short_rate->kappa;
      theta_imp = short_rate->theta;
      sigma_imp = short_rate->sigma;
      alpha_imp = short_rate->alpha;
      beta_imp = short_rate->beta;

      // the missing terms are the ones that include the non-autonomous part of mu and sigma
      kappa_dot = (kappa_imp - kappa_exp) / dt;
      theta_dot = (theta_imp - theta_exp) / dt;
      sigma_dot = (sigma_imp - sigma_exp) / dt;
      alpha_dot = (alpha_imp - alpha_exp) / dt;
      beta_dot = (beta_imp - beta_exp) / dt;

      // \frac{1}{2} \frac{\partial \mu}{\partial t} \Delta^2
      *xt += pow(dt, 2.0) / 2.0 * (kappa_exp * theta_dot + (theta_exp - x_prev) * kappa_dot);

      // \frac{\partial \sigma}{\partial t} \left( \Delta W \Delta - \Delta Z \right)
      *xt += 1.0 / 2.0 / sqrt(alpha_exp + beta_exp * x_prev) * (sigma_exp * (alpha_dot + beta_dot * x_prev) + 2.0 * sigma_dot * (alpha_exp + beta_exp * x_prev)) * dt * sqdt * (process->Z1[n][i - 1] - process->Z2[n][i - 1] / 2.0);

      break;
  }

  return WASORA_RUNTIME_OK;
}
```

Multiplicative noise is certainly the more common case. However, in finance many fundamental processes fall under neither multiplicative noise nor strict diagonal noise, such as the Heston model:

https://github.com/SciML/DiffEqFinancial.jl/blob/8d1b8766cf9dbd9eee7f3372ac642d44c1e97d43/src/problems.jl#L16
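
For reference, the Heston dynamics are

$$
\begin{aligned}
dS_t &= \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW^1_t,\\
dv_t &= \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW^2_t,
\end{aligned}
$$

with $dW^1_t\,dW^2_t = \rho\,dt$. Even ignoring the correlation, the diffusion coefficient of $S$ depends on $v$, so the strict diagonal condition above does not hold, and with $\rho \neq 0$ the noise is not diagonal at all.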

Thanks!