JuliaManifolds / Manopt.jl

🏔️ Manopt.jl – Optimization on Manifolds in Julia
http://manoptjl.org

JuMP interface #264

Closed blegat closed 1 year ago

blegat commented 1 year ago

This is a proof of concept implementation of MathOptInterface so that this package can be used from JuMP. Here are a few things that are missing:

Here are a few questions:

Note: this requires https://github.com/jump-dev/JuMP.jl/pull/3106 for nonlinear objectives, because NLPBlock is incompatible with constrained variables (see https://github.com/jump-dev/MathOptInterface.jl/blob/32dbf6056b0b5fb9d44dc583ecc8249f6fd703ea/src/Utilities/copy.jl#L485-L500), but it should already be usable from MOI.

kellertuer commented 1 year ago

Wow! Super nice idea! Let me go through the points and try to answer.

How do I get the vectorization of a Manifold? In JuMP, every set is represented as Vector{Float64}. For PSD matrices, for instance, it's the vectorization of the upper triangle.

The short answer is: In general you don't.

The long answer: Sure, in some way we always end up representing points on the manifolds with numbers, most often even as a single matrix. However, already in your example of the symmetric positive definite matrices (semi-definite we do not have as a manifold – yet), you still have the constraint that the eigenvalues have to be positive. So while you could represent a point as a vector, not all vectors are valid points on the manifold. For fixed-rank matrices, we even represent every point by two points on different Stiefel manifolds and a Diagonal matrix. That cannot be vectorised, I fear.
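
For illustration, a minimal sketch of that fixed-rank representation (assuming Manifolds.jl's rand support for this manifold; field names as in the SVDMPoint docs):

using Manifolds

M = FixedRankMatrices(3, 2, 2)  # 3×2 matrices of rank 2
p = rand(M)                     # an SVDMPoint, not a plain matrix
# p.U and p.Vt are the Stiefel factors, p.S holds the singular values,
# so there is no single flat Vector{Float64} representing p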

How do I get the gradient given the ambient gradient? For a Riemannian submanifold, it should be the orthogonal projection onto the tangent space; how do I compute that?

This is also a bit more complicated. For an isometrically embedded manifold (e.g. the 2-sphere in R3) it is really just the projection onto the tangent space. In many other cases you still have, after the projection, a different metric to adapt to, see https://manoptjl.org/stable/tutorials/AutomaticDifferentiation/#.-Conversion-of-a-Euclidean-Gradient-in-the-Embedding-to-a-Riemannian-Gradient-of-a-(not-Necessarily-Isometrically)-Embedded-Manifold – especially change_representer. We have the function riemannian_gradient for that conversion (project and, if necessary, change the Riesz representer). We are in the process of formalising that in https://juliamanifolds.github.io/ManifoldDiff.jl/stable/, but the best current summary is the tutorial already linked.
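
As a minimal sketch of that conversion (the riemannian_gradient signature matches the one in the stack traces later in this thread; the concrete numbers are just for illustration):

using Manifolds, ManifoldDiff

M = Sphere(2)
p = [0.0, 0.0, 1.0]   # a point on the 2-sphere
Y = [1.0, 2.0, 3.0]   # Euclidean gradient of f at p, living in the embedding R^3
X = ManifoldDiff.riemannian_gradient(M, p, Y)  # project and, if needed, change the representer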

Add support for choosing the optimization algorithm (not just the hard-coded gradient descent as of the current state of the PR)

One could pass an AbstractManoptSolverState somewhere; that is at least how solvers are “identified” – and they always have a constructor that just requires the manifold (everything else then gets reasonable defaults).
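
A rough sketch (the positional GradientDescentState(M, p) constructor is the one that also appears in the stack traces later in this thread):

using Manifolds, Manopt

M = Sphere(2)
p = [0.0, 0.0, 1.0]
# the state identifies the solver; beyond the manifold (and here a start
# point) everything gets reasonable defaults
state = GradientDescentState(M, p)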

Add support for MOI.RawOptimizerAttributes and MOI.Silent
RawStatusString
Run against MOI.Test.runtests
How do I get TerminationStatus and PrimalStatus?

I am not sure what these are. There are a few (for now internal) functions used for the show methods of the solver states, most prominently whether a stopping criterion indicates convergence, see https://github.com/JuliaManifolds/Manopt.jl/blob/a67453af146f7586d54e364e9fd21e5205e82e2a/src/plans/stopping_criterion.jl#L18, but all the nice display of these things is something I only started this year, now that the interface seems to settle into something stable and we have a reasonable set of solvers.

kellertuer commented 1 year ago

I think MOI should be slim enough to be thought of as a hard dependency? Then a hard dependency here would be ok.

At least for JuMP I would prefer that as an extension. But I can help with that when this starts to work :)

blegat commented 1 year ago

It doesn't have to be a free parametrization; it's ok if not all elements that can be parametrized belong to the manifold. So representation_size looks good.

kellertuer commented 1 year ago

It doesn't have to be a free parametrization; it's ok if not all elements that can be parametrized belong to the manifold. So representation_size looks good.

Cool. That should solve the problem for a lot of manifolds; for those that are not represented as one array, I am still not sure what to do, but off the top of my head that might also only affect 2 or 3 manifolds currently (or certain representation models on 2 manifolds).
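
For the single-array manifolds this covers, a quick sketch of what representation_size reports:

using Manifolds

representation_size(SymmetricPositiveDefinite(3))  # (3, 3) – one 3×3 array per point
representation_size(Sphere(2))                     # (3,)  – one vector of length 3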

codecov[bot] commented 1 year ago

Codecov Report

Merging #264 (01d2277) into master (ee354f9) will increase coverage by 0.00%. The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           master     #264    +/-   ##
========================================
  Coverage   99.73%   99.74%            
========================================
  Files          78       79     +1     
  Lines        7206     7353   +147     
========================================
+ Hits         7187     7334   +147     
  Misses         19       19            
Files Coverage Δ
ext/ManoptJuMPExt.jl 100.00% <100.00%> (ø)
src/Manopt.jl 88.88% <100.00%> (+1.38%) :arrow_up:


blegat commented 1 year ago

I added a few questions in the PR description

kellertuer commented 1 year ago

What is the difference between a decorated state/problem and the non-decorated one? Is it ok if I use the decorated one to return the result to the user?

The idea is a decorator pattern: a state can be “wrapped” into a debug state, which runs a slightly different step_solver! function, namely one that issues prints before and after (these are implemented as actions). The same holds for recordings. There is also a very lightweight state that just indicates that the state should be returned in the very end instead of just the minimiser. For any access to fields of the state, the accessors – e.g. get_iterate – should be used instead of state.x, since get_iterate “unwraps” the decorators when “looking for” the iterate. The same for the objective – you can “wrap” a cache around the objective, which changes how get_cost or get_gradient behave.
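
A rough sketch of the decorator idea (assuming the DebugSolverState/DebugIterate constructors of the version discussed here; the accessors work the same on decorated and plain states):

using Manifolds, Manopt

M = Sphere(2)
p = [0.0, 0.0, 1.0]
state = GradientDescentState(M, p)
# wrap the state so that its step additionally prints the iterate
dstate = DebugSolverState(state, DebugIterate())
# prefer the accessor over field access like state.x – it unwraps decorators
q = get_iterate(dstate)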

How do I get the objective value from the problem and state?

get_solver_result(state) (see https://manoptjl.org/stable/solvers/#Manopt.get_solver_result), which also works on any other combination the solver might return, currently e.g. the tuple (objective, state). The reason for this is that the objective might be wrapped in a cache (or in a counting wrapper). For MOI, the first variant with just the state should be enough.

How do I get the primal status from the problem and state? https://jump.dev/MathOptInterface.jl/dev/reference/models/#MathOptInterface.ResultStatusCode

I am not sure this applies here, since the result is always on the manifold, so it is always feasible? To be careful one could check is_point(M, p) for the result p, but since that might require a tolerance, this might also be misleading.

How do I get the termination status from the problem and state? https://jump.dev/MathOptInterface.jl/dev/reference/models/#MathOptInterface.TerminationStatusCode

Again probably a bit complicated to match. We started with an indicates_convergence(stopping_criterion), which returns true/false depending on whether the active criteria (those that said the solver should stop) indicate that it has converged (a small gradient, for example, does indicate that; max iterations does not), but more fine-grained access would still be something to work on. I also just noticed that this function is not yet properly documented. Will do that soon, since I do not like undocumented functions. So if this is true, the status would be OPTIMAL, I think – on most manifolds this would only be locally optimal. If false, the details are missing (it could be max time, it could be max iter, if these are part of what the user specified).

We should probably also get an access function for the stopping criterion. Since that was not accessed from outside, this was not yet done. So currently get_state(state) “throws away” all wrappers and returns the inner one – and for now the stopping criterion can then be reached as get_state(state).stop (but an accessor would be nicer).

blegat commented 1 year ago

Looking at the docstring for indicates_convergence, if the solver stops either when the iterations hit a limit or when the gradient is zero, then indicates_convergence will always be false. Is there a way to know if the gradient was actually zero? I see there is get_reason, which might be a better candidate for RawStatusString than status_summary since it's one line. However, it's not so easy to convert it to an MOI termination status code.

kellertuer commented 1 year ago

Yes, indicates_convergence is more like a static evaluation of whether a stopping criterion could, in general, indicate convergence. The reasons would be more informative, but they are strings, and with the combinations it might be more complicated to find the right one (the & and | operators combine criteria).
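
A small sketch of how that plays out (the criteria are the ones that also appear in the stack traces below; the combination operators are the ones just mentioned):

using Manopt

sc = StopAfterIteration(200) | StopWhenGradientNormLess(1e-8)
# static evaluation: false, since stopping could also be due to the iteration limit
indicates_convergence(sc)
# a single "converging" criterion on its own returns true
indicates_convergence(StopWhenGradientNormLess(1e-8))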

Yes, get_reason is the best fit for the RawStatusString when that should just reflect the reason the algorithm stopped. The termination status codes might still be too complicated to derive from that. One could write a termination_code(stopping_criterion) function for that, though I am not 100% sure what to do when the and/or combinations indicate several different codes.

blegat commented 1 year ago

I guess you should define some kind of priority. If the solver stops having reached both the iteration limit and a zero gradient, you should still say the status is LOCALLY_SOLVED.
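
A hypothetical sketch of such a priority rule (pick_status and the ordering are made up for illustration; only the status codes are MOI's):

using MathOptInterface
const MOI = MathOptInterface

# "coolest" status first, as discussed
const PRIORITY = [MOI.LOCALLY_SOLVED, MOI.TIME_LIMIT, MOI.ITERATION_LIMIT]

# fired: the statuses of all stopping criteria that triggered the stop
pick_status(fired) = first(s for s in PRIORITY if s in fired)

pick_status([MOI.ITERATION_LIMIT, MOI.LOCALLY_SOLVED])  # MOI.LOCALLY_SOLVED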

kellertuer commented 1 year ago

Yes, but for that I would also have to see whether the MOI status types have a natural ordering themselves – kind of from “not cool” to “coolest” – to do that for all stopping criteria. Then the decision is of course easiest; but yes, I would propose something like that, probably on this branch, since the MOI dependency is introduced here?

Or would it be fine to just have the RawStatus for now here and do the other one later?

blegat commented 1 year ago

Or would it be fine to just have the RawStatus for now here and do the other one later?

Yes, we can do it later and leave LOCALLY_SOLVED for now.

Kind of from “not cool” to “coolest” – to do that for all stopping criteria.

We don't have any explicit rule like that, but I'd say LOCALLY_SOLVED is definitely cooler than ITER_LIMIT :laughing: For choosing between two limits, e.g., time limit and iteration limit, we leave it free for the solver to choose. It's usually the limit you hit first in your code: since you set this status when you leave the loop, you usually know why you're leaving the loop. But in your case you don't store it, so it's fine to just return any of the limits, I'd say.

kellertuer commented 1 year ago

Yes, we can do it later and leave LOCALLY_SOLVED for now.

That sounds fair. I can think about a better version later, maybe even one directly tailored to the existing status enums.

Kind of from “not cool” to “coolest” – to do that for all stopping criteria.

We don't have any explicit rule like that, but I'd say LOCALLY_SOLVED is definitely cooler than ITER_LIMIT 😆 For choosing between two limits, e.g., time limit and iteration limit, we leave it free for the solver to choose. It's usually the limit you hit first in your code: since you set this status when you leave the loop, you usually know why you're leaving the loop. But in your case you don't store it, so it's fine to just return any of the limits, I'd say.

Yes, sure, locally solved is cooler. The difference here would be that the criteria are quite modular, and a criterion that is the “and” of a few criteria would have to decide on the status based on its children. Then sure, if by chance the small gradient (locally solved) and max iter (iter limit) both hit at the same time, I would take locally solved, but I would have to think about that for other combinations as well, I think.

mateuszbaran commented 1 year ago

Wow, that must've been quite a lot of work. Thank you, it's a good idea! :heart:

How do I get the vectorization of a Manifold? In JuMP, every set is represented as Vector{Float64}. For PSD matrices, for instance, it's the vectorization of the upper triangle.

The short answer is: In general you don't.

The long answer: Sure, in some way we always end up representing points on the manifolds with numbers, most often even as a single matrix. However, already in your example of the symmetric positive definite matrices (semi-definite we do not have as a manifold – yet), you still have the constraint that the eigenvalues have to be positive. So while you could represent a point as a vector, not all vectors are valid points on the manifold. For fixed-rank matrices, we even represent every point by two points on different Stiefel manifolds and a Diagonal matrix. That cannot be vectorised, I fear.

We kind of can vectorize points using charts. Not all manifolds have charts yet, though. You can take a look at how it works here: https://github.com/JuliaManifolds/Manifolds.jl/blob/master/tutorials/working-in-charts.jl
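
A sketch of that chart-based parametrization on the sphere (function and atlas names as in the linked tutorial):

using Manifolds

M = Sphere(2)
A = RetractionAtlas()
p = [0.0, 0.0, 1.0]
i = get_chart_index(M, A, p)    # pick a chart containing p
x = get_parameters(M, A, i, p)  # coordinates of p in that chart: a plain vector
q = get_point(M, A, i, x)       # back to the manifold, q ≈ p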

kellertuer commented 1 year ago

In principle, all of our representations can be seen as charts locally. So if we wrote a vectorisation for all manifold point (and tangent vector) representations, that should also be fine (without the detour via charts). That would be something e.g. the SPDMPoints and the UMVTVector would need in order to be used in MOI/JuMP, I think. Though the name vectorisation might be misleading, since vectorising tangent vectors sounds strange. Maybe serialisation would fit as well? However, I was thinking this could be a second step after this PR, for the non-single-array types we have.

mateuszbaran commented 1 year ago

Vectorization already has too many meanings, yes :sweat_smile:. If we are going for a different name, why not “parametrization”? It's already in use in Manifolds.jl in this context.

blegat commented 1 year ago

I tried making it work for a matrix manifold by reshaping inside the gradient callback and reshaping the starting value, but I'm getting a weird error:

julia> include("test/MOI_wrapper.jl");
JuMP tests: Error During Test at /home/blegat/.julia/dev/Manopt/test/MOI_wrapper.jl:63
  Got exception outside of a @test
  MethodError: injectivity_radius(::AbstractManifold) is ambiguous.

  Candidates:
    injectivity_radius(M::AbstractPowerManifold)
      @ ManifoldsBase ~/.julia/packages/ManifoldsBase/74WVY/src/PowerManifold.jl:729
    injectivity_radius(::Grassmann)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/Grassmann.jl:125
    injectivity_radius(::Manifolds.GeneralUnitaryMatrices{n, ℝ}) where n
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/GeneralUnitaryMatrices.jl:528
    injectivity_radius(::Manifolds.GeneralUnitaryMatrices{n, ℂ, Manifolds.DeterminantOneMatrices}) where {n, ℂ}
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/GeneralUnitaryMatrices.jl:510
    injectivity_radius(M::ProbabilitySimplex)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/ProbabilitySimplex.jl:235
    injectivity_radius(::Euclidean)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/Euclidean.jl:343
    injectivity_radius(::SymmetricPositiveDefinite)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/SymmetricPositiveDefinite.jl:239
    injectivity_radius(::Circle)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/Circle.jl:242
    injectivity_radius(M::TangentBundle{𝔽} where 𝔽)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/VectorBundle.jl:617
    injectivity_radius(::ManifoldsBase.DefaultManifold)
      @ ManifoldsBase ~/.julia/packages/ManifoldsBase/74WVY/src/DefaultManifold.jl:106
    injectivity_radius(::Flag)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/Flag.jl:114
    injectivity_radius(M::MetricManifold)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/MetricManifold.jl:363
    injectivity_radius(M::ValidationManifold)
      @ ManifoldsBase ~/.julia/packages/ManifoldsBase/74WVY/src/ValidationManifold.jl:244
    injectivity_radius(M::ProductManifold)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/ProductManifold.jl:790
    injectivity_radius(::AbstractSphere)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/Sphere.jl:277
    injectivity_radius(::AbstractProjectiveSpace)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/ProjectiveSpace.jl:241
    injectivity_radius(::Hyperbolic)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/Hyperbolic.jl:246
    injectivity_radius(::TangentSpaceAtPoint{𝔽} where 𝔽)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/VectorBundle.jl:608
    injectivity_radius(::HeisenbergGroup)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/groups/heisenberg.jl:229
    injectivity_radius(::GeneralizedGrassmann)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/GeneralizedGrassmann.jl:179
    injectivity_radius(::UnitaryMatrices{1, ℍ})
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/Unitary.jl:67
    injectivity_radius(::Manifolds.GeneralUnitaryMatrices)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/GeneralUnitaryMatrices.jl:498
    injectivity_radius(M::AbstractDecoratorManifold)
      @ ManifoldsBase ~/.julia/packages/ManifoldsBase/74WVY/src/nested_trait.jl:305
    injectivity_radius(::PositiveNumbers)
      @ Manifolds ~/.julia/packages/Manifolds/edNnV/src/manifolds/PositiveNumbers.jl:184

  Stacktrace:
    [1] injectivity_radius(::ManifoldsBase.EmptyTrait, M::FixedRankMatrices{3, 2, 2, ℝ})
      @ ManifoldsBase ~/.julia/packages/ManifoldsBase/74WVY/src/nested_trait.jl:346
    [2] injectivity_radius
      @ ~/.julia/packages/ManifoldsBase/74WVY/src/nested_trait.jl:313 [inlined]
    [3] injectivity_radius
      @ ~/.julia/packages/ManifoldsBase/74WVY/src/nested_trait.jl:306 [inlined]
    [4] max_stepsize
      @ ~/.julia/dev/Manopt/src/plans/stepsize.jl:39 [inlined]
    [5] #default_stepsize#637
      @ ~/.julia/dev/Manopt/src/solvers/gradient_descent.jl:88 [inlined]
    [6] default_stepsize
      @ ~/.julia/dev/Manopt/src/solvers/gradient_descent.jl:82 [inlined]
    [7] GradientDescentState(M::FixedRankMatrices{3, 2, 2, ℝ}, p::Matrix{Float64})
      @ Manopt ~/.julia/dev/Manopt/src/solvers/gradient_descent.jl:63
    [8] optimize!(model::Manopt.Optimizer)
      @ Manopt ~/.julia/dev/Manopt/src/MOI_wrapper.jl:212

Any hint?

kellertuer commented 1 year ago

I tried making it work for a matrix manifold by reshaping inside the gradient callback and reshaping the starting value, but I'm getting a weird error:

julia> include("test/MOI_wrapper.jl");
JuMP tests: Error During Test at /home/blegat/.julia/dev/Manopt/test/MOI_wrapper.jl:63
  Got exception outside of a @test
  MethodError: injectivity_radius(::AbstractManifold) is ambiguous.
  [...]

Any hint?

This happens when a method is not implemented on one of the slightly more advanced manifolds, since FixedRank uses other manifolds to “pass tasks on”. So – for FixedRankMatrices the injectivity_radius is not defined/implemented. Most probably because we are not aware of what the radius is – and, a bit less likely, just because we never needed said radius until now.

In this case it happens because the default step size (I think that should be Armijo) tries to determine a reasonable default initial step size that does not “break stuff” (a step size beyond 2π on the sphere, for example, would be a bad idea).

blegat commented 1 year ago

So to fix it, we should set the step size?

kellertuer commented 1 year ago

For fixed rank you have to set (parts of) the step size yourself, yes. Here it is the stop_when_stepsize_exceeds safeguard: since (like on the sphere) too-large step sizes really break stuff (on manifolds), this should be set to a reasonable upper bound. So e.g.

using Manifolds, Manopt
M = FixedRankMatrices(3,3,2)
ArmijoLinesearch(M) # errors
ArmijoLinesearch(M; stop_when_stepsize_exceeds=3.0) # works just fine

I am, however, not sure how step sizes are handled in MOI/JuMP, nor do I have a good/better idea for our default, since the injectivity_radius is a really good upper bound; but if we are not aware what that is on a manifold, there is not much we can do.

You can of course also use any other step size rule available and fitting.

blegat commented 1 year ago

Getting closer. Now I am getting:

julia> include("test/MOI_wrapper.jl")
JuMP tests: Error During Test at /home/blegat/.julia/dev/Manopt/test/MOI_wrapper.jl:64
  Got exception outside of a @test
  MethodError: project!(::AbstractManifold, ::Matrix{Float64}, ::Matrix{Float64}, ::Matrix{Float64}) is ambiguous.

...

  Stacktrace:
    [1] project!(::ManifoldsBase.EmptyTrait, M::FixedRankMatrices{3, 2, 2, ℝ}, Y::Matrix{Float64}, p::Matrix{Float64}, X::Matrix{Float64})
      @ ManifoldsBase ~/.julia/packages/ManifoldsBase/74WVY/src/nested_trait.jl:346
    [2] project!
      @ ~/.julia/packages/ManifoldsBase/74WVY/src/nested_trait.jl:313 [inlined]
    [3] project!
      @ ~/.julia/packages/ManifoldsBase/74WVY/src/nested_trait.jl:306 [inlined]
    [4] riemannian_gradient!(M::FixedRankMatrices{3, 2, 2, ℝ}, X::Matrix{Float64}, p::Matrix{Float64}, Y::Matrix{Float64}; embedding_metric::EuclideanMetric)
      @ ManifoldDiff ~/.julia/packages/ManifoldDiff/iL1ys/src/riemannian_diff.jl:334
    [5] riemannian_gradient(M::FixedRankMatrices{3, 2, 2, ℝ}, p::Matrix{Float64}, Y::Matrix{Float64}; embedding_metric::EuclideanMetric)
      @ ManifoldDiff ~/.julia/packages/ManifoldDiff/iL1ys/src/riemannian_diff.jl:323
    [6] riemannian_gradient(M::FixedRankMatrices{3, 2, 2, ℝ}, p::Matrix{Float64}, Y::Matrix{Float64})
      @ ManifoldDiff ~/.julia/packages/ManifoldDiff/iL1ys/src/riemannian_diff.jl:316
    [7] (::Manopt.var"#eval_grad_f_cb#896"{Manopt.Optimizer})(M::FixedRankMatrices{3, 2, 2, ℝ}, X::Matrix{Float64})
      @ Manopt ~/.julia/dev/Manopt/src/MOI_wrapper.jl:230
    [8] get_gradient!(M::FixedRankMatrices{3, 2, 2, ℝ}, X::Matrix{Float64}, mgo::ManifoldGradientObjective{AllocatingEvaluation, Manopt.var"#eval_f_cb#895"{Manopt.Optimizer}, Manopt.var"#eval_grad_f_cb#896"{Manopt.Optimizer}}, p::Matrix{Float64})
      @ Manopt ~/.julia/dev/Manopt/src/plans/gradient_plan.jl:185
    [9] get_gradient!
      @ ~/.julia/dev/Manopt/src/plans/gradient_plan.jl:138 [inlined]
   [10] initialize_solver!
      @ ~/.julia/dev/Manopt/src/solvers/gradient_descent.jl:263 [inlined]
   [11] solve!(p::DefaultManoptProblem{FixedRankMatrices{3, 2, 2, ℝ}, ManifoldGradientObjective{AllocatingEvaluation, Manopt.var"#eval_f_cb#895"{Manopt.Optimizer}, Manopt.var"#eval_grad_f_cb#896"{Manopt.Optimizer}}}, s::GradientDescentState{Matrix{Float64}, Matrix{Float64}, StopWhenAny{Tuple{StopAfterIteration, StopWhenGradientNormLess}}, ConstantStepsize{Int64}, IdentityUpdateRule, PolarRetraction})
      @ Manopt ~/.julia/dev/Manopt/src/solvers/solver.jl:118
   [12] optimize!(model::Manopt.Optimizer)
      @ Manopt ~/.julia/dev/Manopt/src/MOI_wrapper.jl:245

I need to call this riemannian_gradient function since JuMP has the Euclidean gradient, but it seems it's not implemented for fixed-rank matrices; what's the catch?

kellertuer commented 1 year ago

Puh, you really directly try to use all the features on one of the most complicated manifolds. Yes, riemannian_gradient changes the Riesz representer of the gradient, converting the Euclidean to the Riemannian gradient.

This requires that you can project a tangent vector from the embedding onto the tangent space (again, easier to imagine for the sphere – any vector in R3 can be projected onto the plane orthogonal to a point p on the sphere, which is its tangent space), as well as the change_representer function, which does the metric conversion mentioned above.
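
For the sphere example, a minimal sketch of that projection:

using Manifolds

M = Sphere(2)
p = [0.0, 0.0, 1.0]
Y = [1.0, 2.0, 3.0]   # an arbitrary vector in the embedding R^3
# remove the component along p: X = Y - ⟨Y, p⟩p, here [1.0, 2.0, 0.0]
X = project(M, p, Y)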

For both, Manifolds.jl does not provide an implementation; if I remember correctly (I implemented fixed rank), this is because I am not aware whether these formulae exist. If you find them, let me know, I'll add them.

But why does JuMP have to only work with the Euclidean gradient? Usually the Euclidean gradient is a reasonable fallback (that is, it is then converted to the Riemannian one) besides providing the Riemannian gradient directly (see here for details), but this conversion also costs a bit, so it's better to provide the Riemannian one (or at least be able to).

So in short: probably neither of these formulae for project or change_representer is known, but I can check the literature again (Vandereycken 2013, also linked in the docs).

edit: Ah, at least it seems the Riesz representer conversion is the identity, but I am not sure about the projection onto the tangent space.

edit2: Now I am confused. That method does exist. It's formula (2.5) from the Vandereycken paper, and it is implemented.

kellertuer commented 1 year ago

Ah, now I see it. FixedRankMatrices is among those manifolds that, as I discussed over on Discourse, might be a problem with JuMP.

If we stored just the matrix, we would compute the SVD many, many (many many many) times. So we store a point as an SVDMPoint (with an internally cached SVD), and similarly it is far more beneficial to store tangent vectors in the format proposed by Vandereycken, namely what we then called the UMVTVector, since it stores 3 matrices internally. But your project! tries to store the result in a matrix (second argument). We do not provide fixed-rank matrices in matrix representation. Not only would that double all implementations, it would also never be efficient (compared to the cached variant).

So we have to find a way to declare points as SVDMPoints (or arbitrary subtypes of AbstractManifoldPoint) and similarly tangent vectors (“directions”, e.g. gradients) as subtypes of TVector.

blegat commented 1 year ago

The user could specify the Riemannian gradient as an MOI model attribute, say RiemannianGradient <: MOI.AbstractModelAttribute, but at the moment, since the model only provides the algebraic expression of the objective function, JuMP's AD will give the Euclidean one. Manopt.Optimizer would then check whether the Riemannian gradient is set; if not, it would fall back to transforming the Euclidean one computed by JuMP's AD with riemannian_gradient. Maybe we can leave that to a follow-up PR.
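
A hypothetical sketch of that attribute (not part of this PR; the name is only the proposal above):

using MathOptInterface
const MOI = MathOptInterface

# a model attribute under which a user could store a Riemannian-gradient
# callback; if unset, the optimizer would fall back to converting the
# Euclidean gradient from JuMP's AD via riemannian_gradient
struct RiemannianGradient <: MOI.AbstractModelAttribute end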

In the meantime, do you have another manifold I could use in the tests, for which representation_size is not a vector size and for which riemannian_gradient is implemented?

mateuszbaran commented 1 year ago

In the meantime, do you have another manifold I could use in the tests, for which representation_size is not a vector size and for which riemannian_gradient is implemented?

I think GeneralizedGrassmann with non-identity B would work?

kellertuer commented 1 year ago

In the meantime, do you have another manifold I could use in the tests, for which representation_size is not a vector size and for which riemannian_gradient is implemented?

It is implemented for FixedRankMatrices – you just have to be able to pass in SVDMPoints and UMVTVectors instead of matrices. The main limitation, it seems to me, is that this PR might be restricted to “pure-single-array-represented” manifolds. That's probably fine for a first step if properly documented.

But sure, Mateusz's choice works, and so do Sphere, Hyperbolic (as long as you are in the “hyperboloid model”), Stiefel, and Grassmann; the generalised Grassmann even has a change in the representer (besides the projection), so that might be more interesting to check.

blegat commented 1 year ago

Stiefel worked like a charm. I think there are two possible follow-ups: https://github.com/JuliaManifolds/Manopt.jl/issues/273 and then https://github.com/JuliaManifolds/Manopt.jl/issues/274, but let's discuss those there and keep the scope of this one on array-represented manifolds with a Euclidean-specified objective function. I added some docs as well; it now looks good to go from my side, let me know if you have any more review comments :)

kellertuer commented 1 year ago

Thanks, I expected that since Stiefel is nearly as nice as the sphere to use here.

Concerning the code – I will check it carefully sometime soon (I will be travelling the next few days, until Tuesday at least – but maybe it's boring at the airport as well ;) ). At first glance – a dependency on MOI looks fine to me, since an interface is expected to be lightweight. A dependency on JuMP I would prefer – I think – as an extension, with Requires.jl as a fallback. Would that be possible?

Besides that, I think I also have to check the code coverage, since it seems not to be running for PRs from forks. edit: Ah no, code coverage is coming through; we should check to keep that up as well.

kellertuer commented 1 year ago

Hi, sure, currently it's conference and vacation time, but are there any plans on when to continue with this PR? Something I can help with?

blegat commented 1 year ago

Yes, I'm waiting for https://github.com/jump-dev/MathOptInterface.jl/pull/2235 to be merged and released. Then we'll be able to get rid of the src/qp_block_data.jl file.

kellertuer commented 1 year ago

Great! The answer that it's vacation time or that you're otherwise busy would have been ok as well, for sure. Looking forward to the next steps :)

blegat commented 1 year ago

We're about to release the new nonlinear interface; once JuMP v1.15 is out, I'll revisit – it will be much simpler.

kellertuer commented 1 year ago

That sounds great! Looking forward to new updates :)

blegat commented 1 year ago

What's the status of this PR? Are you waiting for me to make some changes, or is the ball on your side?

kellertuer commented 1 year ago

Oh, I was not sure what your status here is and was waiting for you, since I just saw a few commits last week but no word on how finished this is.

Should I review something? Could you maybe provide an example of how to use Manopt and JuMP together?

kellertuer commented 1 year ago

Oh, sorry! ;) That was me not being precise enough. We have

https://manoptjl.org/stable/extensions/

a page about all extensions, where that example would fit perfectly. Thanks for providing one :)

kellertuer commented 1 year ago

Currently the 1.6 tests fail because they cannot find Manopt.Optimizer – are you sure you included it correctly? Also – is that necessary? I feel that Manopt.Optimizer is a bit of a pollution of the Manopt namespace, since it is not the generic struct for an optimiser in Manopt but more the “boilerplate struct” towards JuMP?

I do acknowledge that Model(Manopt.Optimizer) looks nice, just that every non-JuMP person might be confused by Manopt.Optimizer (and sure, for now it also only exists if one loads JuMP).

For the rest of the code I try to find time soon, but this week I am also travelling and giving a seminar talk in Oslo (while still teaching).

blegat commented 1 year ago

Documenter doesn't seem to find the new things defined in the extension. Any idea what's missing?

kellertuer commented 1 year ago

I am not sure an MOI.<something> reference will work in the docs?

kellertuer commented 1 year ago

Ah no, I think I know what it is – extensions are not meant to define their own structs (I am travelling, so I cannot directly track down the thread/discussion we had about that on Discourse).

mateuszbaran commented 1 year ago

Working with types defined in package extensions is really annoying. They are fine for internal use but to my knowledge there is no reasonable way to reference such types from outside of package extensions. There is Base.get_extension but I'm not sure if it works with Documenter.jl.

blegat commented 1 year ago

I see; maybe we could do something like JuMP_Optimizer and define a shortcut in Manopt so that it can be referenced?

kellertuer commented 1 year ago

If you are ok with that name, I am :) I am not 100% sure how to do the shortcut thing, though. For functions I usually define just a docstring in the main package, but for structs this might be more complicated.

mateuszbaran commented 1 year ago

Maybe we should ask the Documenter.jl people what the recommended way to include these docstrings is? It looks like randomly trying different ways doesn't work that well.

blegat commented 1 year ago

I couldn't find a way to include the docstrings for the new structs, but at least we have the docstrings for the new methods.

blegat commented 1 year ago

One last attempt for structs in dfb00c3c1772e8d627361014b6b1f8132f655bc3

blegat commented 1 year ago

It seems the docstring for Optimizer gets included without error because Manopt.JuMP_Optimizer is in the docs, but I still get:

┌ Error: 1 docstring not included in the manual:
│ 
│     ManoptJuMPExt.Optimizer
│ 
│ These are docstrings in the checked modules (configured with the modules keyword)
│ that are not included in @docs or @autodocs blocks.
└ @ Documenter.DocChecks ~/.julia/packages/Documenter/bYYzK/src/Utilities/Utilities.jl:32

Maybe we can just remove the check that all docstrings are included?

kellertuer commented 1 year ago

I would prefer to keep that check, since I think it is super important that everything is documented. In its current form I cannot yet see why the error appears, since it reads like JuMP_Optimizer is correctly documented (otherwise that docs block would error?). So maybe we could also just not use two names for the struct, but call the Optimizer JuMP_Optimizer within the extension as well.

Those are the same currently anyway, just that you use the hack with the set_global to introduce the second name.

blegat commented 1 year ago

Let's try that

kellertuer commented 1 year ago

Hm, I think the only thing left to fix this documentation issue is to do something like

"""
<<Short docs that mention that this is a constructor for a struct when using JuMP>>
"""
function JuMP_Optimizer end

in the main namespace somewhere? So basically, document an empty constructor there to get the name into the Manopt namespace – that avoids the set_global hacks and fixes the documentation problems?