Fix preconditioning in `cg` when optional `radius` argument is positive

JuliaSmoothOptimizers / Krylov.jl

A Julia Basket of Hand-Picked Krylov Methods

Other

338 stars 51 forks source link

Fix preconditioning in `cg` when optional `radius` argument is positive #868

Closed mpf closed 1 month ago

mpf commented 3 months ago

As highlighted by @amontoison in this issue, the step-to-the-boundary computation in to_boundary needs to be updated so that it solves the equation

‖x + σd‖_M = radius

where ‖v‖_M^2 = v'Mv. This pull request implements the required changes to to_boundary.

Note:

cg is the only calling routine that's been correspondingly modified. Presumably all other solvers will also need to be suitably modified.
the modification to to_boundary implements two products with the preconditioner in order to compute Md and Mx. It should be possible to avoid these products by instead accummulating the products in the calling cg routine.

See also this JSOSolvers.jl issue.

amontoison commented 3 months ago

Thanks @mpf! Can you update these tests: https://github.com/JuliaSmoothOptimizers/Krylov.jl/blob/main/test/test_aux.jl#L104-L116 ? It should fix the errors.

github-actions[bot] commented 3 months ago

Package name	latest	stable
CaNNOLeS.jl
DCISolver.jl
FletcherPenaltySolver.jl
JSOSolvers.jl
LLSModels.jl
LinearSolve.jl
Percival.jl
RipQP.jl

amontoison commented 1 month ago

@mpf Do you confirm that if we use a trust region in CG (and other Krylov solvers like CR, CRLS, CGLS, LSQR, LSMR), we need both ldiv! and mul! methods for the preconditioner?

I don't remember why the norm of the trust region is defined by an elliptic norm induced by the inverse of the preconditioner.

I need to update the docstings to specify that after your PR.

dpo commented 1 month ago

I don't remember why the norm of the trust region is defined by an elliptic norm induced by the inverse of the preconditioner.

@amontoison say the trust-region constraint is s’Ms ≤ Δ². Then change variable: t = M^{½} s.

amontoison commented 1 month ago

@dpo Thanks! Can you confirm that we must provide both functions ldiv! and mul! if we use a trust-region? We only use M \ v or M^{-1} * v for the Lanczos process and not M * v.

dpo commented 1 month ago

The role of to_boundary() is to solve the quadratic equation $\Vert x + \alpha d \Vert_M^2 = \Delta^2$ for $\alpha$, i.e., $$\alpha^2 \Vert d \Vert_M^2 + 2 \alpha x^T M d + (\Vert x \Vert_M^2 - \Delta^2) = 0.$$

Currently, the user supplies $\Vert x \Vert_2$ and $\Vert d \Vert_2$, and we compute $x^T d$. But it would be just as easy to ask the user to supply $\Vert x \Vert_M$, $\Vert d \Vert_M$ and $x^T M d$, so that we would not require mul!() with $M$.

mpf commented 1 month ago

Currently, the user supplies ‖x‖2 and ‖d‖2, and we compute xTd. But it would be just as easy to ask the user to supply ‖x‖M, ‖d‖M and xTMd, so that we would not require mul!() with M.

That would indeed simplify to_boundary. But in that case, the preconditioned Krylov solver that calls to_boundary would need to compute these quantities, and thus require both mul!() and ldiv!().

Gould, Lucidi, Roma, and Toint [SIAM Opt, 1999, p.514] describe an approach that tracks $\Vert x+\alpha d\Vert_M$ iteratively. Perhaps that approach would avoid both mul! and ldiv! for some Krylov solvers?

amontoison commented 1 month ago

@mpf Can you merge this PR ou your fork? https://github.com/mpf/Krylov.jl/pull/1