SciML / DifferentialEquations.jl

Multi-language suite for high-performance solvers of differential equations and scientific machine learning (SciML) components. Ordinary differential equations (ODEs), stochastic differential equations (SDEs), delay differential equations (DDEs), differential-algebraic equations (DAEs), and more in Julia.
https://docs.sciml.ai/DiffEqDocs/stable/

Do the autodiff options have any meaning when the Jacobian matrix is provided? #773

Open gstein3m opened 3 years ago

gstein3m commented 3 years ago

When I provide the Jacobian

f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)

why do I get different results (number of time steps,...) when I use different options for autodiff:

   sol = solve(prob_ode,Rodas4P2(),

or

   sol = solve(prob_ode,Rodas4P2(autodiff=false,diff_type=Val{:central}),

or

   sol = solve(prob_ode,Rodas4P2(autodiff=false,diff_type=Val{:forward}),

In my opinion, these options should not play any role in that case.

Moreover, using a simple numerical f_jac

function f_jac(J,y,P,t)
    del = sqrt(eps(1.0)); n = length(y)
    f0 = similar(y); f1 = similar(y)
    fcn(f0,y,P,t)
    for i = 1:n
        del_1 = max(abs(y[i])*del, del)
        if y[i] < 0.0
            del_1 = -del_1
        end
        y1 = copy(y); y1[i] = y1[i] + del_1
        fcn(f1,y1,P,t)
        J[:,i] = (f1 - f0)/del_1
    end
end

leads to 864 accepted steps, compared to 4813 steps without providing the Jacobian and with the autodiff=false,diff_type=Val{:forward} option. The diff_type=Val{:central} option gives a similar result with 860 steps.

ChrisRackauckas commented 3 years ago

Rosenbrock methods don't just need the Jacobian, they also need a gradient w.r.t. time.
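One way to take these options out of the picture entirely is to also supply an analytic time gradient via the tgrad keyword of ODEFunction. A minimal sketch, assuming the in-place tgrad(dT,u,p,t) signature; fcn_tgrad is a hypothetical helper written here for the 0 = y^2 - 1/t^4 test problem that appears later in this thread:

function fcn_tgrad(dT,u,p,t)
    #-- hypothetical analytic time gradient of fcn: d/dt (u^2 - 1/t^4) = 4/t^5
    dT[1] = 4/t^5
end

f = ODEFunction(fcn,mass_matrix=M,jac=f_jac,tgrad=fcn_tgrad)
prob_ode = ODEProblem(f,[-1.0],(1.0, 1.0e3))
sol = solve(prob_ode,Rodas4P2())   # autodiff/diff_type should no longer matter here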

gstein3m commented 3 years ago

Yes, thanks a lot. I did not notice that these options also influence the time derivative. Then there is still the second question: even for the choice del_1 = del*max(abs(y[i]),del), the simple numerical Jacobian performs much better than autodiff=false,diff_type=Val{:forward}. Where is the difference?

ChrisRackauckas commented 3 years ago

What do you mean? Do you have an example?

gstein3m commented 3 years ago

My application is rather complex, but I will try to construct an example.

gstein3m commented 3 years ago

This is my example: 0 = y^2 - 1/t^4, y(1) = -1, with solution y(t) = -1/t^2.

For f(y) = y^2 and h > 0, the forward-difference approximation f'(y) ≈ (f(y+h) - f(y))/h = 2y + h leads to a singular Jacobian at the latest at time t = sqrt(2/h), since the solution reaches y = -h/2 there.
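A quick numerical illustration of that degeneration (a sketch added for clarity, not part of the original code):

h = sqrt(eps(1.0))
fd(y) = ((y + h)^2 - y^2)/h   #-- forward difference of f(y) = y^2, equals 2y + h
y_sing = -h/2                 #-- value reached by y(t) = -1/t^2 at t = sqrt(2/h)
println(fd(-1.0))             # about -2: fine at the initial condition
println(fd(y_sing))           # about  0: the 1x1 Jacobian becomes singular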

1) I do not understand the results for diff_type=Val{:forward}. Looking at the implementation of "finite_difference_derivative" and "compute_epsilon", the choice absstep = relstep should lead to the same results as my Jacobian with y_thres = 1.0 (see the step-size sketch after this list).

2) When increasing the time interval, y_thres should be lowered. In many practical computations y_thres = 1.0e-5 has been a good choice.

3) Maybe users should be allowed to adjust y_thres within the range [sqrt(eps), 1.0]?
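For reference, the two step-size rules under discussion can be compared roughly as follows (a simplified sketch; the actual compute_epsilon in FiniteDiff.jl handles more cases):

del = sqrt(eps(1.0))
step_fjac(y,y_thres) = del*max(abs(y),y_thres)   #-- rule used in f_jac in the script below
step_forward(y) = max(del*abs(y),del)            #-- forward rule with absstep = relstep = del,
                                                 #   identical to step_fjac(y,1.0)
y = -1.0e-3
println(step_fjac(y,1.0))      # ≈ 1.5e-8
println(step_fjac(y,1.0e-5))   # ≈ 1.5e-11: much smaller step when |y| is small
println(step_forward(y))       # ≈ 1.5e-8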

using DifferentialEquations

function fcn(du,u,p,t)
    du[1] = u[1]^2 - 1/t^4;
end

function f_jac(J,y,P,t)
#-- numerical Jac by FD
 y_thres = P;
 del = sqrt(eps(1.0)); n = length(y);
 f0 = similar(y); f1 = similar(y); fcn(f0,y,P,t);
 for i=1:n
     del_1 = del*max(abs(y[i]),y_thres);
     y1 = copy(y); y1[i] = y1[i] + del_1;
     fcn(f1,y1,P,t)
     J[:,i] = (f1-f0)/del_1;
 end
end

tspan = (1.0, 1.0e3); M = zeros(1,1);   #-- zero mass matrix: the equation is purely algebraic

println("--- AD ---")

f = ODEFunction(fcn,mass_matrix=M)
problem = ODEProblem(f,[-1.0], tspan);

sol = solve(problem,Rodas4P2(),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD central ---")

sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:central}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward ---")

sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward, y_thres = 1 ---")

y_thres = 1.0;
f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)
problem = ODEProblem(f,[-1.0], tspan,y_thres);
sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward, y_thres = 1.0e-5 ---")

y_thres = 1.0e-5;
f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)
problem = ODEProblem(f,[-1.0], tspan,y_thres);
sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward, y_thres = sqrt(eps) ---")

y_thres = sqrt(eps(1.0));
f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)
problem = ODEProblem(f,[-1.0], tspan,y_thres);
sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

Output

--- AD --- Number of accepted steps: 5275

--- FD central --- Number of accepted steps: 5276

--- FD forward --- Number of accepted steps: 1393517

--- FD forward, y_thres = 1 --- Number of accepted steps: 6022

--- FD forward, y_thres = 1.0e-5 --- Number of accepted steps: 5231

--- FD forward, y_thres = sqrt(eps) --- Number of accepted steps: 5231

ChrisRackauckas commented 3 years ago

Yeah it looks like a more optimal eps could be chosen here.

gstein3m commented 2 years ago

The cause of the failure with (autodiff=false,diff_type=Val{:forward}) lies in the function calc_rosenbrock_differentiation! (see derivative_utils.jl). Here linsolve_tmp changes its value, but should not. The following is a workaround, but surely not the final solution:

function calc_rosenbrock_differentiation!(integrator, cache, dtd1, dtgamma, repeat_step, W_transform)
    calc_tderivative!(integrator, cache, dtd1, repeat_step)
    nlsolver = nothing
    # we need to skip calculating W when a step is repeated
    new_W = false
    if !repeat_step
        # println("a) Cache:", cache.linsolve_tmp)
        help = cache.linsolve_tmp[:]
        new_W = calc_W!(cache.W, integrator, nlsolver, cache, dtgamma, repeat_step, W_transform)
        cache.linsolve_tmp[:] = help
        # println("b) Cache:", cache.linsolve_tmp)
    end
    return new_W
end

Best regards, Gerd

ChrisRackauckas commented 2 years ago

Isn't it only used for the temporary inside that calculation? Is there a reason that buffer needs to remain unchanged later? If so, we probably need to add another cache vector, or use a different one somewhere.

gstein3m commented 2 years ago

cache.linsolve_tmp holds the result from calc_tderivative! and is required in rosenbrock_perform_step.

ChrisRackauckas commented 2 years ago

So the tderivative! just needs to be moved to after the W?

gstein3m commented 2 years ago

Yes, that's the simplest possibility.
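A minimal sketch of that reordering, based on the workaround above (illustrative only, not necessarily the final patch):

function calc_rosenbrock_differentiation!(integrator, cache, dtd1, dtgamma, repeat_step, W_transform)
    nlsolver = nothing
    # we need to skip calculating W when a step is repeated
    new_W = false
    if !repeat_step
        new_W = calc_W!(cache.W, integrator, nlsolver, cache, dtgamma, repeat_step, W_transform)
    end
    # computing the time derivative after W means calc_W! can no longer overwrite
    # the cache.linsolve_tmp value needed later in rosenbrock_perform_step
    calc_tderivative!(integrator, cache, dtd1, repeat_step)
    return new_W
end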