Closed qrebjock closed 3 years ago
Hi,
thank you for the kind words and great that you are interested.
Usually the GradientOptions would “know” the type P, since the constructors are similarly typed (i.e. it would automatically be SVDMPoint in your case).
Can you post your current approach or a small example where FixedRank fails? It's often easier to start from an MWE. I can also look for an easy problem on FixedRankMatrices, but if you have one, that's even easier. Also, I could see which errors you are getting.
I can check for an easy starting example with FixedRank later today.
Concerning your points:
- AD: there is some support in Manifolds.jl, but the conversion from a Euclidean gradient/Hessian (or more generally from an embedding) I have not yet started on.
- Recording: in principle the recording can record anything you can compute from the options, but I have no ad-hoc idea... wait. You could do the following: write a functor for the cost that counts itself how many times it was called. This you can then access from the record options and store. Have you worked with functors before? I can surely provide a short example of the easiest way to do this.
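A minimal sketch of such a counting functor (the name CountingCost is illustrative, not part of Manopt.jl):

```julia
# A hypothetical counting-cost functor (sketch): it wraps any cost function
# and counts how many times it has been evaluated.
mutable struct CountingCost{F}
    cost::F    # the wrapped cost function
    count::Int # number of evaluations so far
end
CountingCost(f) = CountingCost(f, 0)

# Making the object callable means it can be passed wherever a cost
# function is expected; each call evaluates the cost and bumps the counter.
function (c::CountingCost)(M, p)
    c.count += 1
    return c.cost(M, p)
end
```

Since Julia treats any callable object the same as a function, such a wrapper can be handed to the solver in place of the plain cost, and the counter can later be read off (e.g. from within a record action).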
Thanks again for your interest and kind regards from Trondheim to Lausanne.
Hm, a quick look around and I mostly saw gradients where the Euclidean gradient was implemented and a conversion used. As I said, this conversion is not yet available, but we could also work out how to do that together, if you like. If I approach it, I would use the machinery of the EmbeddedManifold, because what the conversion basically does is convert from one metric (in the embedding) to another (of the manifold) and from one representation (for example the rank-r matrix) to another (the reduced SVD). So I would like to define it for a manifold and its embedding in full generality.
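For orientation, here is a sketch in formulas of what such a conversion does in the simplest case (this covers only the isometrically embedded setting, not the representation change):

```latex
% Let \bar f be a smooth extension of f to the embedding.
% If M carries the metric induced by the embedding, the Riemannian
% gradient is the tangent-space projection of the Euclidean gradient:
\operatorname{grad} f(p) = \operatorname{proj}_{T_p M}\bigl(\nabla \bar f(p)\bigr).
% For FixedRankMatrices one would additionally change representation,
% expressing the projected matrix in the compact tangent-vector factors.
```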
Here are a few tips for your error:
- the cost function should map (M, p) to a real number, where p would be an SVDMPoint in your case.
- the gradient should map (M, p) to an UMVTVector.
Hi,
Thank you very much for your answers!
Here is the code I have been trying:
using LinearAlgebra
using Manopt, Manifolds
using Random
import Manopt: random_point, random_tangent
m = 5
n = 5
k = 2
function random_point(M::FixedRankMatrices{m, n, k}) where {m, n, k}
    U = random_point(Stiefel(m, k))
    S = sort(rand(k), rev=true)
    V = random_point(Stiefel(n, k))
    return SVDMPoint(U, S, Array(V'))
end

function random_tangent(M::FixedRankMatrices{m, n, k}, p::SVDMPoint) where {m, n, k}
    Up = randn(m, k)
    Vp = randn(n, k)
    A = randn(k, k)
    return UMVTVector(Up - p.U * p.U' * Up, A, Vp' - Vp' * p.Vt' * p.Vt)
end
F(M, p) = norm(p.U * diagm(p.S) * p.Vt - A)^2 / 2
# Riemannian gradient
function gradF(M, p::SVDMPoint)
    dS = diagm(p.S)
    return UMVTVector(
        -(I - p.U * p.U') * A * p.Vt',
        dS - p.U' * A * p.Vt',
        -p.U' * A * (I - p.Vt' * p.Vt),
    )
end
A = randn(m, n);
M = FixedRankMatrices(m, n, k)
X0 = random_point(M)
X = gradient_descent(M, F, gradF, X0)
It's simply trying to find the best rank-k approximation of some matrix. The error it raises is
MethodError: Cannot `convert` an object of type UMVTVector{Array{Float64,2},Array{Float64,2},Array{Float64,2}} to an object of type SVDMPoint{Array{Float64,2},Array{Float64,1},Array{Float64,2}}
From what I understand, it's happening because in the struct GradientDescentOptions the point on the manifold and the gradient are supposed to have the same type ::P. But they don't, because one is represented with SVDMPoint and the other with UMVTVector.
Thanks for your remarks regarding AD and recordings! This will be helpful.
Regarding the conversion of the gradient in Manopt.m and PyManopt they have these egrad2rgrad
functions that project the Euclidean gradient on the tangent space. I guess the same thing could be done here? I would actually be interested to look into it but I guess that's secondary at this point.
Kind regards, Quentin
Thank you for the example, and coming back to your initial comment: no, you are not missing something. You maybe over-typed a little (types that are “clear from context” can be left out; they are inferred automatically).
And yes, this is a bug. At first glance, I might have to change the GradientOptions constructor signature for that (since I need a Manifold to infer the type of a tangent vector). I will try to fix it soon, thanks for noticing.
And sure, that happened because all other manifolds so far work directly on matrix representations, where both then have the same type.
I just linked a PR that should fix your problem, with just two further remarks:
1) The way you called gradient descent, it would use the exponential map, which is not available for fixed-rank matrices. That is actually one point where Manopt.jl, or more precisely Manifolds.jl, differs from Manopt and pymanopt: the handling of retractions and inverse retractions (similarly for vector transports). Retractions always have an AbstractRetractionMethod assigned, where the default (ExponentialRetraction()) just falls back to calling exp.
But your code, with the last line written as
X = gradient_descent(M, F, gradF, X0; retraction_method=PolarRetraction())
works just fine.
2) You implemented random_point and random_tangent for fixed-rank matrices. Would you like to contribute them to the code base? Surely the random functions here are merely a start, and there is a longer project to provide distributions on manifolds (see JuliaManifolds/Manifolds.jl#57), but until that is available I will provide the poor man's variants random_point and random_tangent for certain types here. I would be happy if you provided your randoms as a PR; try to adopt the documentation from the others and write a test for each. And if you are stuck somewhere, let's discuss that in the PR.
Let me know whether the PR JuliaManifolds/Manifolds.jl#62 fixes your problem.
Manifolds.jl does actually have some basic support for Riemannian gradients: https://juliamanifolds.github.io/Manifolds.jl/stable/features/differentiation.html#Manifolds.gradient-Tuple{Manifold,Any,Any,AbstractRiemannianDiffBackend} . It's not quite polished yet, though, nor integrated with Manopt.jl, and Riemannian Hessians are currently still only planned: https://github.com/JuliaManifolds/ManifoldDiff.jl/issues/29 .
Here's a small example of how to write your own Recording Action, just not very spectacular in its results yet: https://manoptjl.org/previews/PR63/tutorials/AdvancedRecord.html
Hi,
Sorry for the late answer, I was making sure that I understand everything before replying.
Thanks for the PR, it's working now! (with the PolarRetraction object)
Thanks also for the links for AD and recordings!
I will open a pull request to add the two functions that you mentioned.
But before doing this, I want to clarify something.
The UMVTVector class represents a tangent vector in compact form. I am confused by the documentation here https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/fixedrankmatrices.html because it says that the tangent vector is U M V^T. I think that a UMVTVector(U_p, M, V_p) object should actually represent the vector U M V^T + U_p V^T + U V_p^T, where X = U S V^T is the current point. Is that right?
If you confirm this, then I think there is a problem in the function project!(M::FixedRankMatrices, Y::UMVTVector, p::SVDMPoint, X::UMVTVector).
I think there might also be an issue with the retraction retract!(::FixedRankMatrices{M,N,k}, q::SVDMPoint, p::SVDMPoint, X::UMVTVector, ::PolarRetraction). I am not totally sure about it yet; I am still looking into it.
I can make the required changes but I would need you to confirm that I understand the representation of the tangent vectors correctly.
Once all of this is set up I will be able to try trust regions on FixedRankMatrices
.
Thanks for the PR, it's working now! (with the PolarRetraction object)
Great!
I will open a pull request to add the two functions that you mentioned.
Nice, looking forward to seeing that.
But before doing this, I want to clarify something. The UMVTVector class represents a tangent vector in compact form. I am confused by the documentation here https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/fixedrankmatrices.html because it says that the tangent vector is U M V^T. I think that a UMVTVector(U_p, M, V_p) object should actually represent the vector U M V^T + U_p V^T + U V_p^T, where X = U S V^T is the current point. Is that right?
Yes, that's right, my mistake when I wrote the documentation. Feel free to also include that in the PR: my UMVTVector documentation is wrong there, and the one written in the docstring of FixedRankMatrices at https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/fixedrankmatrices.html#Manifolds.FixedRankMatrices is the right one.
If you confirm this, then I think there is a problem in the function project!(M::FixedRankMatrices, Y::UMVTVector, p::SVDMPoint, X::UMVTVector).
Can you provide more detail or include the fix? Because I think I implemented that based on Bart's paper (see the reference at the bottom of that page).
I think there might also be an issue with the retraction retract!(::FixedRankMatrices{M,N,k}, q::SVDMPoint, p::SVDMPoint, X::UMVTVector, ::PolarRetraction). I am not totally sure about it yet; I am still looking into it.
Same comment as for the projection – I am not saying the current one is right, especially since I never used it; I just had the time and transferred the formulae (and it runs correctly in test cases, which for sure does not make it correct).
I can make the required changes but I would need you to confirm that I understand the representation of the tangent vectors correctly.
I confirm your understanding is correct; see again the docs of FixedRankMatrices, where the tangent vector form/representation is stated correctly.
Thanks for the clarifications!
Regarding project!, I think that project!(::FixedRankMatrices, Y::UMVTVector, p::SVDMPoint, A::AbstractMatrix) is correct. But the function project!(M::FixedRankMatrices, Y::UMVTVector, p::SVDMPoint, X::UMVTVector) reconstructs the tangent vector with the formula U M V^T, independently of the point to which it is attached. So I think this function should take one argument p, which is the point onto whose tangent space we want to project, and another one q, which is the point to which the tangent vector we want to project is attached. Does that make sense? It would have the signature project!(M::FixedRankMatrices, Y::UMVTVector, p::SVDMPoint, q::SVDMPoint, X::UMVTVector) in the end.
Here is the code I used for these tests:
using LinearAlgebra
using Manopt, Manifolds
using Random
import Manopt: random_point, random_tangent
using Plots
m = 10
n = 10
k = 5
function random_point(M::FixedRankMatrices{m, n, k}) where {m, n, k}
    U = random_point(Stiefel(m, k))
    S = sort(rand(k), rev=true)
    V = random_point(Stiefel(n, k))
    return SVDMPoint(U, S, Array(V'))
end

function random_tangent(M::FixedRankMatrices{m, n, k}, p::SVDMPoint) where {m, n, k}
    Up = randn(m, k)
    Vp = randn(n, k)
    A = randn(k, k)
    return UMVTVector(Up - p.U * p.U' * Up, A, Vp' - Vp' * p.Vt' * p.Vt)
end

function to_ambient(X::SVDMPoint, T::UMVTVector)
    return X.U * T.M * X.Vt + T.U * X.Vt + X.U * T.Vt
end

function to_ambient(X::SVDMPoint)
    return X.U * diagm(X.S) * X.Vt
end

F(M, p) = norm(p.U * diagm(p.S) * p.Vt - A)^2 / 2

function gradF(M, p::SVDMPoint)
    dS = diagm(p.S)
    return UMVTVector(
        -(I - p.U * p.U') * A * p.Vt',
        dS - p.U' * A * p.Vt',
        -p.U' * A * (I - p.Vt' * p.Vt),
    )
end

function retr(M, X, W)
    QU, RU = qr([X.U W.U])
    QV, RV = qr([X.Vt' W.Vt'])
    T = svd(RU * [diagm(X.S) + W.M I; I zeros(k, k)] * RV')
    return SVDMPoint(QU * T.U[:, 1:k], T.S[1:k], T.Vt[1:k, :] * QV')
end

function hessF(M, X, H)
    dSinv = diagm(1 ./ X.S)
    return UMVTVector(
        (I - X.U * X.U') * (X.U * H.M + H.U - A * H.Vt' * dSinv),
        H.M,
        (H.M * X.Vt + H.Vt - dSinv * H.U' * A) * (I - X.Vt' * X.Vt),
    )
end
A = randn(m, 50) * randn(50, n)
M = FixedRankMatrices(m, n, k)
X = random_point(M)
W = random_tangent(M, X)
G = gradF(M, X)
H = hessF(M, X, W)
ts = 10 .^ (range(-8, 0, length=100))
r = []
for t = ts
    append!(
        r,
        abs(
            F(M, retr(M, X, t * W))
            #F(M, retract(M, X, t * W, PolarRetraction()))
            - F(M, X)
            - t * dot(to_ambient(X, W), to_ambient(X, G))
            - t * t / 2 * dot(to_ambient(X, W), to_ambient(X, H))
        )
    )
end
plot(ts, r, xaxis=:log, yaxis=:log)
plot!(ts, ts .^ 3, xaxis=:log, yaxis=:log)
To compare with the polar retraction you can just comment out F(M, retr(M, X, t * W)) and uncomment F(M, retract(M, X, t * W, PolarRetraction())).
- [...] But the function project!(M::FixedRankMatrices, Y::UMVTVector, p::SVDMPoint, X::UMVTVector) reconstructs the tangent vector with the formula U M V^T, independently of the point to which it is attached.
That would not fit the current generic signature of project!. But yes, the current form is wrong; I actually confused myself when documenting UMVTVector. The only solution, I think, is to remove this variant.
The variant with another point q for X could be given as a vector_transport_to using the ProjectionTransport type. That would be rigorous and correct here. It would (roughly described) use q and X to reconstruct the full vector and then call the first project variant with that matrix.
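Roughly, that reconstruct-then-project idea could be sketched as follows. To keep the snippet self-contained it uses named tuples with fields mirroring SVDMPoint/UMVTVector, and the function names are illustrative; this is not the actual Manifolds.jl implementation:

```julia
using LinearAlgebra

# Reconstruct the full m×n matrix of a tangent vector X attached at q,
# i.e. U M V^T + U_p V^T + U V_p^T with (U, Vt) taken from q.
ambient(q, X) = q.U * X.M * q.Vt + X.U * q.Vt + q.U * X.Vt

# Project a full matrix A onto the tangent space at p, returning it in
# compact (U_p, M, V_p^T) form; the formulae follow the tangent-space
# definition of FixedRankMatrices.
function project_tangent(p, A)
    M  = p.U' * A * p.Vt'
    Up = A * p.Vt' - p.U * M   # (I - U U') A V
    Vt = p.U' * A - M * p.Vt   # U' A (I - V V')
    return (U = Up, M = M, Vt = Vt)
end

# Transport X from T_q to T_p by reconstructing and re-projecting.
transport_by_projection(p, X, q) = project_tangent(p, ambient(q, X))
```

The result is again in compact form and lies in the tangent space at p by construction, which is exactly what a ProjectionTransport is supposed to guarantee.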
- Here is why I think the retraction might not work. I checked the gradient and Hessian of a simple function with the technique described in sections 4.8 and 6.7 of Nicolas Boumal's book http://sma.epfl.ch/~nboumal/book/IntroOptimManifolds_Boumal_2020.pdf. These tests fail. At first I thought that the gradient and Hessian were wrong. But then I implemented the retraction described in section 7.5, equation 7.48, and with that one the tests pass. I can provide code if needed. Do you have a reference where the polar retraction for fixed-rank matrices is specified, so I can compare?
I had hoped I implemented Algorithm 6 by Vandereycken, but at first glance it does not look like that. I will check Nicolas' book for the tests; providing them here too might be a good idea. So the goal is to have Algorithm 6 from Vandereycken for this retraction. If you have the time, it would be great if you could fix that, too :) Especially
function retr(M, X, W)
    QU, RU = qr([X.U W.U])
    QV, RV = qr([X.Vt' W.Vt'])
    T = svd(RU * [diagm(X.S) + W.M I; I zeros(k, k)] * RV')
    return SVDMPoint(QU * T.U[:, 1:k], T.S[1:k], T.Vt[1:k, :] * QV')
end
Is this what you propose as the projection retraction? Documenting and adding this would be great, too.
I can surely help with a PR; just a PR of my own is only possible in a few weeks, since I started a new position (in a new country) at the beginning of March. So – great that you work so much with Julia, I am happy to see this. If you have the time to fix/contribute – even better! Thanks for that.
(sorry I did not mean to close this issue)
Regarding the conversion of the gradient: in Manopt.m and PyManopt they have these egrad2rgrad functions that project the Euclidean gradient on the tangent space. I guess the same thing could be done here? I would actually be interested to look into it but I guess that's secondary at this point.
Just an addendum to this: we have project (and project!) available even on Manifolds.jl for a lot of (embedded) manifolds, where the metric is the one from the embedding (e.g. the sphere). So with that you can easily define a gradient by projection, though we do not have an egrad2rgrad function. I would also prefer to stick with the project name, also when doing the projection for fixed rank, for example, where the representation changes from the embedding of the matrices to the three factors.
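For instance, on the unit sphere (where the metric is inherited from the embedding), the gradient-by-projection recipe looks like this. The snippet hand-codes the tangent projection rather than calling Manifolds.jl's project, to keep it self-contained; the cost function is just an arbitrary example:

```julia
using LinearAlgebra

# Example cost f(p) = <a, p> on the unit sphere; its Euclidean gradient
# (as a function on the ambient space) is simply a.
a = [1.0, 2.0, 3.0]
egrad(p) = a

# Tangent-space projection at p on the sphere: X ↦ X - <X, p> p.
# Applying it to the Euclidean gradient yields the Riemannian gradient,
# exactly the role egrad2rgrad plays in Manopt/PyManopt.
rgrad(p) = egrad(p) - dot(egrad(p), p) * p

p = normalize([1.0, 1.0, 1.0])
X = rgrad(p)  # X is tangent at p, i.e. <X, p> = 0
```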
This is further discussion and solved in JuliaManifolds/Manifolds.jl#340.
Hello,
Thanks for this great Toolbox! I have been using PyManopt so far but I wanted to give a try to Julia. The API is quite different so it raised some questions for me.
I am trying to optimize over the manifold of fixed rank matrices with a simple gradient descent algorithm (and trust regions later). From what I understand, in the gradient plan GradientDescentOptions the point on the manifold and the gradient need to have the same type (annotated with ::P). But for fixed rank matrices the points are represented with the class SVDMPoint and the tangent vectors with the class UMVTVector. So I end up getting errors when I try to run gradient_descent with a simple cost function, and I don't really get what I am doing wrong here. I am pretty new to Julia, so I might be missing something obvious.

More generally, I was wondering if you had examples with other manifolds than the Sphere, and in particular with the FixedRankMatrices manifold?
Two unrelated questions (I can open other issues if needed):
Thank you very much!