Closed fph closed 11 months ago
Hi, the super short answer:
trust_regions
: That is a known instability problem for some costs on some manifolds (Rayleigh on the sphere is one such combination) and there is the project!=
keyword is there for helpquasi_Newton
is probably still fine the error you see super small, since
julia> 1-1.0000000000000002
-2.220446049250313e-16
so it might even just be the dimension of the sphere causing an error in the computation of the norm, since this is of order of machine precision.
There is actually 2 things interacting here
1) What do we do when we get an q=exp(M, p, X)
exponential map (or similarly a retraction, but for the sphere its exp) with a non-tangent vector X
– so here if X is not orthogonal to p
?
We do not want to error, that would be too strict – we could try to fix the X
but that might be costly and on some manifolds not easy / uniquely to do. So we leave that behaviour undefined. In practice – for the sphere – using non-tangent X
simply yields that q
is no longer on the sphere. This might happen already for slight deviations dot(p,X)
(maybe of size 1e-10
?). if we now have exp(M, q, Y)
for non-sphere q
we have the same problem. What should we do? We leave it as is and do not check inputs here (since also that is costly on high-dimensional spheres or other manifolds). What happens then is, that we actually stay on the sphere of the radius of q
. But keep in mind that exp
actually is not defined for such q
originally, so there is not even a right or wrong.
2) trust region has a subsolver, that solves the trust region sub problem (at least approximately). The result might – in a few cases – be slightly inaccurate even to the extend that the result is not a tangent vector. Sometimes also just of order e1-10
. Together with point 1) on the sphere these sadly accumulate.
The fix could be to always project the sub solver result (back) one the tangent space. The disadvantage is, that that might be costly on some manifolds or sometimes even not be implemented or defined.
One example is the fixed rank matrices, where UMTVector
has 3 fields and for a projection we would first have to embed this in the set of matrices,....for high-dimensional manifolds – super costly. If we would do that always – some might (rightfully) complain, that everything is slow.
But we have therfore the optional project!
keyword, which by default is a copyto!
(i.e. it copies the result from the subsolver back to the main solver). setting that to project!
activates projection (inplace, so with as less as possible allocations)
Hm I could not directly run your example since with your packages (at least on my Julia) sprandn
was not defined, but maybe you forgot to add SparseArrays
?
using Random, Manifolds, Manopt, Zygote, SparseArrays
# we set up a simple function to keep the optimizer busy.
n = 100
Random.seed!(0)
A = sprandn(n, n, 0.2); A = A + A'
M = Manifolds.Sphere(n-1, ℂ)
x0 = project(Manifolds.Sphere(size(A,1) - 1, ℂ), randn(Complex{eltype(A)}, size(A, 1)))
# this has no minimum since A is not posdef, but we don't care.
f(M, x) = real(x'*A*x)
g(M, x) = project(M, x, gradient(x->f(M,x), x)[1]) # you could also check out riemannian_gradient
x_old = trust_regions(M, f, g, x0)
x = trust_regions(M, f, g, x0; (project!)=project!)
then we have
julia> norm(x)
1.0000000000000027
so that is in the area of 1e-15
which is again close to machine precision
and we even see
julia> f(M,x_old)
-13.29339027908081
julia> norm(M, x_old, g(M,x_old))
0.183160719637751
julia> f(M,x)
-13.225587693798204
julia> norm(M, x, g(M,x))
2.921455811428008e-13
Not only is the new cost lower (ok the old point is also not in the domain of the “shortened version of Rayleigh” (no division by the norm); but also that the new one does have a small gradient.
I started reworking trust regions to finally allow other sub solvers – but that is still a bit of work (and needs the next breaking version of ManifoldsBase.jl, 0.15) see https://github.com/JuliaManifolds/Manopt.jl/pull/294, since we discussed that earlier. So this might still take some time, but will be available some time.
Here, with projection the algorithm is also much faster (ok because it stops with a small gradient not due to max iterations ;))
julia> @btime x = trust_regions($M, $f, $g, $x0; (project!)=$project!);
5.454 ms (12737 allocations: 41.69 MiB)
julia> @btime x_old = trust_regions($M, $f, $g, $x0);
102.932 ms (219532 allocations: 636.69 MiB)
Thanks for the very detailed answer! I missed that option in the docs, sorry.
I confirm also my 'true' function of interest looks a bit like a Rayleigh quotient.
And I confirm I forgot to include SparseArrays
in my example, sorry.
Great to here that that solved it :)
I agree that by default not having the project (and hence these cases) might be a point of discussion, but it is a choice between speed and availability on more manifolds (like fixed rank) versus stability/inaccuracy for a few others.
If you feel we could improve the docs here, let me know.
If you feel we could improve the docs here, let me know.
The project!
parameter was already there, so it's on me for not seeing it. The docs are fine, but since you are explicitly asking for suggestions for improvement I can put in one. You write
That is a known instability problem for some costs on some manifolds (Rayleigh on the sphere is one such combination) and there is the
project!=
keyword is there for help
and this problem seems to be specific to the trust_regions
solver. So I can suggest adding a remark that this might be a concern. The page https://manoptjl.org/stable/solvers/trust_regions/ already has a "Remarks" section, a note would fit well in there.
I could submit a PR myself if you wish, but of course I am far from being an expert since I did not even know about this issue until last week. :)
Many thanks for your work on this package!
I am rework ing trust regions here https://github.com/JuliaManifolds/Manopt.jl/pull/294 anyways, to have a really exchangeable subsolver (the original code is one of the first solvers we had before we had the sub solver-framework developed), for that I have to wait for ManifoldsBase.jl 0.15 (to have a generic tangent space available).
But I will add such a comment for sure, no problem :)
Thanks for the feedback!
After running sufficiently many iterations of
trust_regions
on the complex sphere, the result no longer has norm 1. Here is a minimal example:Returns
On more complicated examples, the norm gets even farther away from 1.