Closed salbert83 closed 8 months ago
Hi,
I extracted the following code
using LinearAlgebra, Manifolds, ManifoldDiff, Manopt, Random
m, n = 2, 8
H = randn(ComplexF64, m , n)
HtH = H'*H
P = randn(ComplexF64, n, m)
f(M, P) = real(logdet(I(m) + P' * HtH * P))
euclidean_āf(M, P) = 2HtH*P / (I(m) + P' * HtH * P)
manif = Manifolds.ArraySphere(n, m, field = ā)
āf(M, P) = ManifoldDiff.riemannian_gradient(M, P, euclidean_āf(get_embedding(M), embed(M, P)))
# Works fine for a positive memory size, does not otherwise
#s = m*m
s = -1
res = @time Manopt.quasi_Newton(manif, f, āf, P, memory_size = s, stopping_criterion = StopAfterIteration(2000) | StopWhenGradientNormLess(1e-6))
and can confirm, that this crashes if the memory is set to -1 but does not otherwise. The reason is quite simple.
For a nonnegative memory_size
we store a set of tangent vectors (named the same as in Nocedal & Wright) $y_k$, $s_k$ where $k$ runs from 1 to memory_size
and from these the compute the descent direction.
For most problems I would recommend using this method, if you feel it is not precise enough, choose a high value (up to manifold dimension)
If you set memory_size
to -1
we do not use the aforementioned method but store the (iteratively updated) Hessian approximation in a matrix. On manifolds this has the small disadvantage, that matrices that describe operations in the tangent space are always depending on a basis of the tangent space. Indeed that is the same in the Euclidean case, but the set of unit vectors is a very default choice we often do not think about.
But here we need a basis, and to be precise, this line
does exactly that, it transforms the tangent vector into coordinates, steps into the matrix multiplication (or solving a linear system in other cases) and then reconstructs the vector.
Now the main problem is that on the manifold you want to work with, no one has yet implemented the DefaultOrthonormalBasis()
we set as a default basis (unless the manifold declares a favourite default itself).
One reason might be that the ArraySphere
just came along as āAh, we could do that as well, let's implement all main functionsā but no one implemented the bases. In other cases it might also be that certain functions (on other manifolds) do not have a nice/efficient implementation.
So this is a problem for Manifolds.jl. I am a bit surprised, since I would have thought that this function
would be the fit and also be dispatched to, so it might also be a bug in that functions dispatch indeed. Please open an issue over there.
edit: Ah, I see why the linked function does not hit, that one is currently only implemented for the real (abstract) spheres. Yours is complex. On complex spheres one has the small challenge every now and then whether to take a complex basis with real coefficients (twice as many basis vectors, but nice real coefficients) or real basis with complex coefficients (less basis vectors but also complex coefficients).
If you have the formula for real coordinate basis of tangent spaces to the complex sphere, I can add it to Manifolds.jl. That shouldn't be too hard to derive, you can probably just start from the function written by Ronny and split X
to real and imaginary parts.
Yes, probably adapting that and getting to a complex-basis-with-real-coefficients vector for complex abstract spheres would be the way to go. For now I just do not have a good idea how to compute that.
BTW, I'd suggest trying L-BFGS with limited memory on Manopt 0.4.46 (will be released very soon), with Hager-Zhang linesearch from LineSearches.jl. After fixes it has similar performance to Optim.jl on a couple of test problems.
Note that the L here means memory_size > 0
. A negative number would also do (inverse) BFGS as well, but storing a full matrix.
But yeah we are currently working on even getting these algorithms faster (ok, the āweā is definitely mainly Mateusz).
I started a first sketch, but I have to still figure out the details, mainly in allocation (along the way=) of the actual coefficient vector. If you have a nicer formula, feel free to contribute.
And āĀ thanks āĀ this was a nice thing to ponder about during my hike today, and also just about 30 minutes to heck this sketch. I'll leave this open though we now have a place over on Manifolds.jl for this, in case there is general feedback on the current comments here.
BTW, I'd suggest trying L-BFGS with limited memory on Manopt 0.4.46 (will be released very soon), with Hager-Zhang linesearch from LineSearches.jl. After fixes it has similar performance to Optim.jl on a couple of test problems.
Yes, it does work quite well. I just raised the issue because, based on the documentation, I wasn't expecting a crash. Thanks to everyone for addressing so quickly and thoughtfully!
Well, usually one should not expect things to crash ;)
But we did start a section with each solver called āTechnical Detailsā that lists which functions one should implement for a manifold for this solver to work. This is meant for example when you implement your own manifold, that you know the minimal requirements. In your case it could also have helped to see āah for the matrix variants I need a basisā. I will check whether that is in there and correctly phrased then.
Concerning the open PR on doing the basis you are missing āĀ there are a few minor technical difficulties still, so it might take a while until that is covered, we are but only a very small team working on these packages involved here (originally 3, currently more realistically 2).
I will close this for now. My idea to compute an ONB for your case was not successful, so to be honest I just have no clue how to provide a good ONB in your case for the manifold you have.
I checked the docs and under Technical Derails (as mentioned in my last post) the get_coordinates
requirement is mentioned as the last point https://manoptjl.org/stable/solvers/quasi_Newton/#sec-qn-technical-details. This function is exactly what a complex sphere (any of them, also the array spherre) are missing, but that is an issue for Manifolds.jl and not here.
If you feel the docs could be improved, feel free to comment on this issue and reopening it.
The attached notebook fails when I set memory_size to -1. I am running Julia 1.9.4 with the following package versions:
IJulia v1.24.2 ManifoldDiff v0.3.10 Manifolds v0.9.11 Manopt v0.4.45 Zygote v0.6.68
I had to save the notebook as a .txt file so Github would allow me to attach.
QuasiNewtonQuestion.txt