jbcaillau opened 3 months ago
@amontoison can you please point towards a worked-out example defining an NLPModel? According to the API, I guess that the stuff listed here is required (Jacobian / Hessian and associated products...)
Do you want a ManualNLPModel or still use an ADNLPModel?
-> https://jso.dev/ADNLPModels.jl/previews/PR209/sparsepattern/
Thanks @amontoison for the example. I suppose that building manually the whole thing (= sparse derivatives, using AD only where it's needed) gives the best result, but what is your advice?
@jbcaillau If you directly provide the sparsity pattern of the Jacobians / Hessians in an ADNLPModel / ADNLSModel, you will have a speed-up for sure.
If you build the whole thing manually, the coloring / AD / decompression will also be replaced by your own functions jac_coord and hess_coord, and it highly depends on your implementation of these functions.
If you exploit the structure and redundant patterns, you will have an advantage, but you also need to find some parallelism.
(It's a good application for Tapenade with source-to-source AD.)
With AD we can evaluate multiple directional derivatives at the same time to compute the compressed Jacobian / Hessian.
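To make the compressed Jacobian idea concrete, here is a small illustrative sketch (not the ADNLPModels.jl internals; the toy constraint and the hand-picked coloring are assumptions): structurally orthogonal columns share one seed direction, so two AD sweeps recover a 3x4 Jacobian.

# Illustrative only: compress a Jacobian with AD by seeding several
# structurally orthogonal columns at once (toy constraint, hand-picked coloring).
using ForwardDiff

c(x) = [x[1]^2 + x[2], x[2]^2 + x[3], x[3]^2 + x[4]]

x = ones(4)

# Columns {1, 3} and {2, 4} never share a row of the Jacobian, so each group
# gets a single seed direction (one column of V per color).
V = [1.0 0.0; 0.0 1.0; 1.0 0.0; 0.0 1.0]

# One AD sweep over 2 directions instead of 4 gives the compressed Jacobian
# J(x) * V; the individual entries are then recovered from the sparsity pattern.
JV = ForwardDiff.jacobian(t -> c(x + V * t), zeros(2))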
thanks @amontoison let's try the first option as a start. looking forward to https://github.com/JuliaSmoothOptimizers/ADNLPModels.jl/pull/286
@amontoison actually, there is a worked out example of building an NLPModel by hand in MadNLP docs: https://madnlp.github.io/MadNLP.jl/dev/quickstart/#Specifying-the-derivatives-by-hand:-NLPModels.jl
I already gave you a tutorial on a ManualNLPModel, is it not what you wanted?
https://jso.dev/tutorials/create-a-manual-model/
In the same spirit as the MadNLP example, we have a few problems defined by hand to test our various AbstractNLPModels in the JSO ecosystem: https://github.com/JuliaSmoothOptimizers/NLPModelsTest.jl/tree/main/src/nlp/problems
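For reference, here is a condensed sketch in the style of the MadNLP quickstart linked above and the ManualNLPModel tutorial (the toy problem, the MyNLP name, and the hard-coded derivatives are illustrative; counter increments and variable bounds are omitted for brevity):

using NLPModels

struct MyNLP{T, S} <: AbstractNLPModel{T, S}
  meta::NLPModelMeta{T, S}
  counters::Counters
end

# min (x1 - 1)^2 + 100 (x2 - x1^2)^2   s.t.   x1 + 2 x2 = 1
function MyNLP()
  meta = NLPModelMeta(
    2;                 # number of variables
    ncon = 1,
    x0 = zeros(2),
    lcon = [1.0],
    ucon = [1.0],
    nnzj = 2,          # nonzeros of the constraint Jacobian
    nnzh = 3,          # nonzeros of the lower triangle of the Lagrangian Hessian
    name = "manual-example",
  )
  return MyNLP(meta, Counters())
end

NLPModels.obj(nlp::MyNLP, x::AbstractVector) = (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2

function NLPModels.grad!(nlp::MyNLP, x::AbstractVector, g::AbstractVector)
  g[1] = 2 * (x[1] - 1) - 400 * x[1] * (x[2] - x[1]^2)
  g[2] = 200 * (x[2] - x[1]^2)
  return g
end

function NLPModels.cons!(nlp::MyNLP, x::AbstractVector, c::AbstractVector)
  c[1] = x[1] + 2 * x[2]
  return c
end

function NLPModels.jac_structure!(nlp::MyNLP, rows::AbstractVector, cols::AbstractVector)
  rows .= [1, 1]
  cols .= [1, 2]
  return rows, cols
end

function NLPModels.jac_coord!(nlp::MyNLP, x::AbstractVector, vals::AbstractVector)
  vals .= [1.0, 2.0]   # the constraint is linear, so its Jacobian is constant
  return vals
end

function NLPModels.hess_structure!(nlp::MyNLP, rows::AbstractVector, cols::AbstractVector)
  rows .= [1, 2, 2]
  cols .= [1, 1, 2]    # lower triangle: (1,1), (2,1), (2,2)
  return rows, cols
end

function NLPModels.hess_coord!(nlp::MyNLP, x::AbstractVector, y::AbstractVector,
                               vals::AbstractVector; obj_weight = 1.0)
  # The linear constraint contributes nothing, so only the objective appears.
  vals[1] = obj_weight * (2 - 400 * x[2] + 1200 * x[1]^2)
  vals[2] = obj_weight * (-400 * x[1])
  vals[3] = obj_weight * 200
  return vals
end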
@amontoison yes, exactly what we need. thanks again.
We can easily provide the sparsity pattern of the Jacobian / Hessian with the release 0.8.5 of ADNLPModels.jl.
I added an example in the documentation: https://jso.dev/ADNLPModels.jl/dev/sparse/
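For concreteness, a rough sketch of what this could look like (the toy problem is made up, and the keyword names jacobian_pattern / hessian_pattern are an assumption based on the linked page, so please check the documentation for the exact API):

using ADNLPModels, SparseArrays

f(x) = (x[1] - 1)^2 + (x[2] - 2)^2
c!(cx, x) = (cx[1] = x[1] * x[2]; cx)
x0 = zeros(2)
lcon = ucon = [1.0]

# Known sparsity patterns (Boolean sparse matrices): a dense 1x2 Jacobian and
# the lower triangle of the Lagrangian Hessian.
J = sparse([1, 1], [1, 2], [true, true], 1, 2)
H = sparse([1, 2, 2], [1, 1, 2], [true, true, true], 2, 2)

# Assumed keyword names; see https://jso.dev/ADNLPModels.jl/dev/sparse/ for the exact API.
nlp = ADNLPModel!(f, x0, c!, lcon, ucon, jacobian_pattern = J, hessian_pattern = H)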
nice, @amontoison. actually, this seems to be OK for our needs for direct transcription into an NLP. BTW, given that, what would be the benefit of using NLPModels (instead of ADNLPModels)? the expected improvement being on sparsity patterns (that we do know in advance), I guess that ADNLPModels is enough. Am I missing sth?
@jbcaillau It depends on whether you have more information than what ADNLPModels.jl can exploit.
For example, if you have redundant terms in your objective or constraints, your Hessian or Jacobian might contain the same block matrices multiple times (e.g., J1, J2, J3).
J = [
J2 J3 0 0
J1 J2 J3 0
0 J1 J2 J3
0 0 J1 J2
0 0 0 J1
]
In this case, you can optimize computations by only calculating the blocks J1, J2, and J3 once.
This approach goes beyond merely knowing the sparsity pattern of J; it also leverages additional information about the relationships between the non-zero elements.
The blocks J1, J2, J3 could also depend on x or a subset of x, and in that case you just need to implement functions J1(y), J2(y), J3(y), where y is a vector.
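A minimal sketch of the idea (the block generators g1, g2, g3, the block size, and the use of a single evaluation point y are assumptions for illustration; in a discretized problem each block row would typically use its own slice of x):

using SparseArrays, ForwardDiff

n = 2                           # size of each square block
g1(y) = [y[1]^2, y[1] * y[2]]   # toy block generators; J_k = dg_k/dy
g2(y) = [sin(y[1]), y[2]]
g3(y) = [y[1] + y[2], y[2]^2]

function assemble_J(y)
  # Each block is differentiated only once per evaluation point...
  J1 = ForwardDiff.jacobian(g1, y)
  J2 = ForwardDiff.jacobian(g2, y)
  J3 = ForwardDiff.jacobian(g3, y)
  Z = zeros(n, n)
  # ...and reused in the banded layout above instead of differentiating
  # the full constraint function.
  return sparse([
    J2 J3 Z  Z
    J1 J2 J3 Z
    Z  J1 J2 J3
    Z  Z  J1 J2
    Z  Z  Z  J1
  ])
end

J = assemble_J(ones(n))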
@amontoison perfectly clear, many thanks. having such common blocks is not expected a priori. the next step in the case of the direct approach (ocp -> nlp vs. using Pontrjagin then shooting or so) is rather exploiting SIMD features.
Hi. What about the dynamics and path constraints along the time steps? I thought they would be this kind of common block, did I miss something?
Currently, the constraints evaluation looks like this (the double arguments and update parts are a specific optimization for the trapeze method and would not be present for another discretization scheme). Not included is the final part with the boundary and variable constraints, but setStateEquation and setPathConstraints would be your J_i?
args_i = ArgsAtTimeStep(xu, docp, 0, v)
args_ip1 = ArgsAtTimeStep(xu, docp, 1, v)
index = 1 # counter for the constraints
for i in 0:docp.dim_NLP_steps-1
    # state equation
    index = setStateEquation!(docp, c, index, (args_i, args_ip1))
    # path constraints
    index = setPathConstraints!(docp, c, index, args_i, v)
    # smart update for next iteration
    if i < docp.dim_NLP_steps-1
        args_i = args_ip1
        args_ip1 = ArgsAtTimeStep(xu, docp, i+2, v)
    end
end
@PierreMartinon common generators (= functions to evaluate) for every time step, but different evals so different blocks. Maybe a few common things in the particular case of the trapezoidal scheme (see your smart update - nice)? I might be missing sth, though.
See https://github.com/0Yassine0/COTS.jl/issues/17#issuecomment-2241187824