Closed kellertuer closed 2 months ago
This is conceptually a quite interesting PR. I just spent a bit of time sketching the new idea of a vectorial objective. I think it would reduce code duplication (between inequality constraint access, equality constraint access and all of Levenberg–Marquardt) quite a bit.
Since I am not 100% familiar with LM, a bit of feedback on the ideas sketched here would be nice @mateuszbaran: https://manoptjl.org/previews/PR386/plans/objective/#Manopt.VectorialGradientObjective This is of course only sketched and, for example, in the rendering I messed up the 1./2./3. in the 3 types of representations for the single-cost-function implementation. But I hope those 3 (1., 1., and 1. ;)) capture the existing ones and the new one the issue refers to?
If this sounds good, I could start implementing all that. Note that the basis in LM is now even stored in the type, so it is only there if the type requires it (the CoefficientType)
This is a nice direction of improvement. I don't quite get the difference between ComponentVectorialType and PowerManifoldVectorialType – the second type essentially covers the first one as a special case?
Also, you should be careful to note that VectorialGradientObjective doesn't represent multi-objective optimization but optimization of $g(f(p))$ for some $g\colon \mathbb{R}^n \to \mathbb{R}$, where we might want to encode g in VectorialGradientObjective. For example it is sum of squares for LM, mean/sum for stochastic gradient descent or some other function for robustified nonlinear least squares (see #332). It could also represent multi-objective optimization but we don't currently have any such solvers in Manopt.
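To make the composition concrete, here is a minimal plain-Julia sketch (all names are made up for illustration; no Manopt types involved) of one vector-valued f being scalarized by different outer functions g:

```julia
# Hypothetical illustration: a single vector-valued f mapping into R^2,
# combined with different outer functions g, giving g(f(p)) as the cost.
f(p) = [p[1] - 1.0, p[2] + 2.0]

g_lm(v) = sum(abs2, v)            # sum of squares, as in Levenberg–Marquardt
g_mean(v) = sum(v) / length(v)    # mean, as in stochastic-gradient-style costs

cost_lm(p) = g_lm(f(p))           # g(f(p)) as described above
cost_mean(p) = g_mean(f(p))
```

The vectorial part (f and its component gradients) stays the same in all cases; only the outer g differs between solvers.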
Note that the basis in LM is now even stored in the type, so it is only there if the type requires it (the CoefficientType)
That sounds like a good idea.
Component is the old one, so basically nested vectors; I kept that because I think the power manifold one might (more often) require an actual power manifold while the old vector variant did not.
And sure, one idea would be to have this also inside a VectorOptimisation problem later, but maybe we should then rethink the name, if you feel that might be confusing.
I would maybe think that a MultiObjective would combine the idea here with the function g, that is, store both internally.
Component is the old one, so basically nested vectors; I kept that because I think the power manifold one might (more often) require an actual power manifold while the old vector variant did not.
I'm not sure if keeping that separation is actually useful. NestedPowerRepresentation is the same thing. We could have some specializations to avoid explicit construction of the power manifold if needed.
I would maybe think that a MultiObjective would combine the idea here with the function g, that is, store both internally.
Multi-objective optimization doesn't have the function g, it's the single-objective optimization that needs it. VectorialGradientObjective doesn't currently specify a unique optimization problem due to g being unknown.
I never want to specify said g in this PR since the goal is really to only represent elements that map into $\mathbb{R}^n$, like the equality or inequality constraints or the vectorial function in LM. At least the first two never have said g.
But wrapping this in a new objective that provides g would be the way to go, I think.
Ok, doing just the power manifold thing should be fine as well and we can omit the component one.
I never want to specify said g in this PR since the goal is really to only represent elements that map into $\mathbb{R}^n$, like the equality or inequality constraints or the vectorial function in LM. At least the first two never have said g.
But wrapping this in a new objective that provides g would be the way to go, I think.
That's fine but then maybe let's use a name without objective in it?
Interesting idea. For a bit of background: this started when a student of mine said it might be nice to have a Hessian with the cost in the constrained objective.
So I encapsulated that, and instead of saving f and `grad_f` in the objective, it now (in this PR) stores an objective.
So the constrained objective is now an objective plus constraints.
So I would be fine with the idea that the constraints g and h are objectives (though vectorial) themselves. I also do not have enough experience in vectorial optimization to know whether they sometimes would really just have a vectorial cost. Maybe the g you used to get a number is something that is just parametrised in an objective and not a concrete function? Then VectorialObjective would indeed be fine and describe well what we have – a function (and derivative information) that maps into a vector space.
Well, but I am also fine giving it another name, just that I struggle a bit with a good name for now. Do you have ideas for a name?
Maybe just VectorialGradientFunction and VectorialFunction?
I also do not have enough experience in vectorial optimization to know whether they sometimes would really just have a vectorial cost.
Yes, that's what multi-objective optimization deals with. The goal is to explore the Pareto front. It is, in a way, equivalent to exploring the impact of g on the result of single-objective optimization of the composite function $g(f(p))$.
maybe the g you used to get a number is something that is just parametrised in an objective and not a concrete function?
I don't understand, why wouldn't it be a concrete function?
I don't understand, why wouldn't it be a concrete function?
Maybe some vector optimisation area I do not know? I do not know much.
But then the vectorial objective we have here is fine for vector optimisation as well, just that a vector objective needs a vectorial objective plus g, like the constrained objective needs an objective and one or two vectorial objectives.
So I still neither see what would be wrong with the vectorial objective nor do I have any other good name here.
Maybe some vector optimisation area I do not know? I do not know much.
Yes, it is fine for multi-objective/vector-valued optimization but to me using it directly for anything else is confusing. VectorialGradientObjective sounds like something I'd only (or primarily) be using for multi-objective optimization. For single-valued optimization, VectorialGradientObjective is not a complete objective.
But then the vectorial objective we have here is fine for vector optimisation as well, just that a vector objective needs a vectorial objective plus g
Using both names (vector objective and vectorial objective) for different things sounds confusing. Maybe one of these things could be named SplitObjective for example?
Though not yet used, once we go for vector optimisation I want to keep VectorObjective for that, I feel.
Since we already discussed this is in most cases (even for LM) just a part of the objective, we could call the type here VectorFunction? The only thing I do not like in this name is that it actually also contains the vector function's gradient ;)
edit: SplitObjective sounds too vague for me.
OK, then what about VectorGradientFunction? It would be fine I think.
Sounds good. Will work on that tomorrow then. Thanks for the feedback and the discussions :)
I did the renaming and will now start to write the access functions (which will simplify the 3 existing access functions they replace quite a bit).
I noticed that I am now not sure whether VectorGradientFunction should be <:AbstractManifoldObjective or not. It behaves in many aspects like such a type, but usually requires one argument more (to access the entries) or returns a vector of things instead of just a thing. So maybe it should not even be an objective in type?
A final thing to maybe consider: for the power manifold approach one sometimes needs the power manifold to access the elements of the (power manifold's) tangent vector.
For now I just added the power representation type to the new PowerManifoldVectorialType. That would, however, mean one would often generate the power manifold just to access a component. I do not have a better idea; storing the power manifold in the objective would be against the idea of splitting the objective (or here, part of the objective) and the manifold.
I noticed that I am now not sure whether VectorGradientFunction should be <:AbstractManifoldObjective or not. It behaves in many aspects like such a type, but usually requires one argument more (to access the entries) or returns a vector of things instead of just a thing. So maybe it should not even be an objective in type?
It could be an objective for a multi-objective optimization problem but I don't think we want to design that feature in this PR, so maybe let's not make it <:AbstractManifoldObjective for now.
For now I just added the power representation type to the new PowerManifoldVectorialType. That would, however, mean one would often generate the power manifold just to access a component. I do not have a better idea; storing the power manifold in the objective would be against the idea of splitting the objective (or here, part of the objective) and the manifold.
I will think about it.
I agree on the first.
For the second I have 2 ideas, one of them being PowerManifold(M, vgf.dimension); this keeps the distinction but always generates a power manifold. I am more tending to the first case, since I think the memory/time spent on creating that is not too bad – and the second would break quite a bit with the current model ideas.
I illustrated my problem a bit with the get_gradients and get_gradients! functions.
I think the single gradient access functions are a bit easier; the Jacobian function might again need to create the power manifold (to access vector elements in order to get their coordinates).
I see, I will try to fix it today or tomorrow.
I think I just found out what my main confusion was.
First of all: sure, if you see that wrapping the vector of functions makes it more type stable, then let‘s do that.
The main problem I had, and why I did not get it to work, is that we now basically store the range of our functions, especially grad_g, in our type.
But there is also the type the user might expect/want, and that‘s what confused me in get_gradients: I could not specify which type it returns.
A solution, for sure, is that we have to carefully revise some of the code to be more agnostic to which representation on the power manifold is used. But that also means that the power manifold has to be available somewhere to be agnostic here (that is, to call X[N,1] in get_gradients, where N is the power manifold). Then the question is where/whether to store that.
I would prefer to not store it in the objective, since until now the objective is meant to be independent of the manifold (though defined using it).
So in short (my breakfast thoughts): since we now have different power manifolds appearing, where to store that without breaking the current model of Manopt.jl? That is, I do not want to store it in the objective.
I think having a few places with N = PowerManifold(...) when needed would be fine.
I have a solution. I can sketch it briefly in the following but will provide more details when implementing and documenting it in the following days.
Origin of the idea: the domain (M) is stored in the problem. The range is not stored but implicitly assumed.
So: store the range for the gradients in the problem as well (a new type of problem), use the same assumption as before for the DefaultProblem, and make the range of the gradient a(n optional) positional argument of the corresponding functions.
Sure, one can hence provide a “wrong range”, but one can do the same with the manifold for an objective as well.
I think I like this new idea of a more precise problem (and nice fallbacks for the DefaultProblem to the old forms).
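A rough sketch of how such an optional positional range argument could look; the signature, the `dimension` field, and the default are assumptions for illustration, not the final API:

```julia
# Hypothetical sketch: the range is an optional positional argument after p,
# defaulting to the nested representation; a problem type storing a range
# would pass it, the DefaultProblem would rely on the default.
function get_gradients(M::AbstractManifold, vgf::VectorGradientFunction, p,
        range=NestedPowerRepresentation())
    n = vgf.dimension                 # assumed field: number of components
    N = PowerManifold(M, range, n)    # the range decides the representation
    # ... evaluate all component gradients as a tangent vector on N ...
end
```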
But overall I like the idea I had today and will continue to rework the code to that.
Hi! Sorry for a delay, I will try to find some time tomorrow or the day after to work on it.
I think I have a solution I can work through. I would just need some feedback whether that approach is useful and sounds good.
Here is more or less the interface I'd imagine for ALM:
function (
    LG::AugmentedLagrangianGrad{
        <:ConstrainedManifoldObjective{InplaceEvaluation,<:VectorConstraint}
    }
)(
    M::AbstractManifold, X, p
)
    m = length(LG.co.g)
    n = length(LG.co.h)
    get_gradient!(M, X, LG.co, p)
    MPm = PowerManifold(M, m)
    YPm = zero_vector(MPm, p)
    gps = get_inequality_constraint(M, LG.co, p, :)
    needed_indices = gps .+ LG.μ ./ LG.ρ .> 0
    get_grad_inequality_constraint!(MPm, YPm, LG.co, p, needed_indices)
    for i in 1:m
        # evaluate in place
        if needed_indices[i]
            X .+= (gps[i] * LG.ρ + LG.μ[i]) .* YPm[MPm, i]
        end
    end
    hps = get_equality_constraint(M, LG.co, p, :)
    MPn = PowerManifold(M, n)
    YPn = zero_vector(MPn, p)
    get_grad_equality_constraint!(MPn, YPn, LG.co, p, :)
    for j in 1:n
        # evaluate in place
        X .+= (hps[j] * LG.ρ + LG.λ[j]) * YPn[MPn, j]
    end
    return X
end
Note that YPm, YPn, gps, and hps would be stored in AugmentedLagrangianGrad to avoid unnecessary allocations, and ConstrainedManifoldObjective would have to store the array representation type.
Currently the test example for ALM in the tests is a weird corner case where the gradient is constant and, moreover, it appears to be the Euclidean gradient instead of the Riemannian one?
Determining how to most efficiently evaluate a bunch of gradients would be deferred to get_grad_inequality_constraint!(MPm, YPm, LG.co, p, needed_indices), which gets the info about what is needed through needed_indices. The user would then write something like
function my_grad_inequality_constraint!(MPm, YPm, p, needed_indices)
    if needed_indices[1]
        YPm[MPm, 1] = some_value
    end
    if needed_indices[2]
        YPm[MPm, 2] = some_other_value
    end
    return YPm
end
Note that if those gradients are very cheap to compute (like in the case of nonnegative PCA) it may be even slower to evaluate them selectively instead of all of them due to branch prediction issues, CPU cache architecture and vector instructions.
Note that YPm, YPn, gps, and hps would be stored in AugmentedLagrangianGrad to avoid unnecessary allocations, and ConstrainedManifoldObjective would have to store the array representation type.
But then you could never change that representation, and you implicitly assume that get_equality_constraint( [...], :) (which currently has its own name with an s at the end) always returns array power manifold tangent vectors.
That would (a) be breaking and (b) restrict usage to only exactly one representation where points can be represented in a single array (fixed rank would, for example, be excluded).
I agree that for both cases – (a) a single function for all gradients and (b) a function for every gradient of a component – there are surely cases where either of them is (far) more efficient than the other. That is also why I want to support both. But I also want to support the nested case further and not remove it.
I do like the : idea; that could deprecate the constraints function.
Ah, and most of the hassle I went through in the last rewrite (and all thinking last week) was to avoid having to regenerate the power manifold on every function call, hence there are now the range= parameters, which by themselves generate them when you do not pass an existing one. They also provide the exact difference that both nested and array power manifolds (or their tangent spaces, to be precise) are possible.
But then you could never change that representation, and you implicitly assume that get_equality_constraint( [...], :) (which currently has its own name with an s at the end) always returns array power manifold tangent vectors. That would (a) be breaking and (b) restrict usage to only exactly one representation where points can be represented in a single array (fixed rank would, for example, be excluded).
No, it doesn't have to be array power representation:
function (
    LG::AugmentedLagrangianGrad{
        <:ConstrainedManifoldObjective{InplaceEvaluation,<:VectorConstraint}
    }
)(
    M::AbstractManifold, X, p
)
    m = length(LG.co.g)
    n = length(LG.co.h)
    get_gradient!(M, X, LG.co, p)
    MPm = PowerManifold(M, LG.power_representation, m)
    YPm = zero_vector(MPm, p)
    gps = get_inequality_constraint(M, LG.co, p, :)
    needed_indices = gps .+ LG.μ ./ LG.ρ .> 0
    get_grad_inequality_constraint!(MPm, YPm, LG.co, p, needed_indices)
    for i in 1:m
        # evaluate in place
        if needed_indices[i]
            X .+= (gps[i] * LG.ρ + LG.μ[i]) .* YPm[MPm, i]
        end
    end
    hps = get_equality_constraint(M, LG.co, p, :)
    MPn = PowerManifold(M, LG.power_representation, n)
    YPn = zero_vector(MPn, p)
    get_grad_equality_constraint!(MPn, YPn, LG.co, p, :)
    for j in 1:n
        # evaluate in place
        X .+= (hps[j] * LG.ρ + LG.λ[j]) * YPn[MPn, j]
    end
    return X
end
or you can use LG.something.range_something instead of MPm and MPn.
Ah, and most of the hassle I went through in the last rewrite (and all thinking last week) was to avoid having to regenerate the power manifold on every function call, hence there are now the range= parameters, which by themselves generate them when you do not pass an existing one. They also provide the exact difference that both nested and array power manifolds (or their tangent spaces, to be precise) are possible.
OK, I just didn't see how to get them in ALM so I made them on the spot. The main parts of my idea are using get_inequality_constraint(M, LG.co, p, :), get_grad_inequality_constraint!(MPm, YPm, LG.co, p, needed_indices), and letting the user specify multiple constraints in a single function. If the range is somewhere inside AugmentedLagrangianGrad, it can surely be just extracted from there.
Hm, that would still allocate a power manifold in every call? Sure, storing just the power representation is maybe ok, but I would prefer (similar to the manifold not being part of the objective) that this is also not part of the objective.
The trick would be that vector functions have a range argument (positional and optional) after p; the new ConstrainedProblem would set that, the DefaultProblem would not, and the default would be the old power manifold used (nested).
Sure, : and needed_indices sound like a good extension of the current one. We could even deprecate the _constraints functions (and for now call the : variant for that until we remove it). That sounds very reasonable.
Hm, that would still allocate a power manifold in every call? Sure, storing just the power representation is maybe ok, but I would prefer (similar to the manifold not being part of the objective) that this is also not part of the objective.
PowerManifold is an immutable struct so it would be on the stack (=fast to create, not counted towards allocations). Just storing the power representation would be enough.
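A small sketch of that idea (assuming Manifolds.jl is loaded; the choice of `Sphere(2)` and the variable names are just illustrative):

```julia
# Sketch: only the representation is stored; the power manifold itself is
# rebuilt on demand. Since PowerManifold is an immutable struct, this
# construction lands on the stack and does not count towards allocations.
using Manifolds

M = Sphere(2)
repr = NestedPowerRepresentation()  # what the objective/problem would store
n = 3                               # number of constraints
N = PowerManifold(M, repr, n)       # cheap to recreate in every call
```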
The trick would be that vector functions have a range argument (positional and optional) after p; the new ConstrainedProblem would set that, the DefaultProblem would not, and the default would be the old power manifold used (nested).
Nested by default is OK for me as long as there is a reasonable way to override it.
Sure, : and needed_indices sound like a good extension of the current one. We could even deprecate the _constraints functions (and for now call the : variant for that until we remove it). That sounds very reasonable.
:+1:
Nested by default is OK for me as long as there is a reasonable way to override it.
We could at some point discuss the default, for now that would be necessary to stay nonbreaking. We could discuss that when the next breaking change is due.
For now the idea would be: you could do a ConstrainedProblem(M, obj, PowerManifold...); and sure, in there we could store just the representation, I would not mind – maybe that is even better / more flexible.
Compared to that, a DefaultProblem(M, obj) would trigger the default.
And for the high-level interface I was thinking of a keyword argument for that.
The idea would be similar to the domain (M) not being stored in the objective; the range should not be either. For M, my idea in this is that if the objective is agnostic enough of the manifold, one could just exchange the manifold to run the optimization on another manifold – maybe also just the same manifold with another metric.
The same I would like to keep for the range as well – hence storing it also in the problem; this is also nicer since we would only store it once and not (for example) also in the cost or such. Storing it multiple times might only lead to inconsistencies.
I see, I think it would be best then to just store the representation type instead of the complete power manifold.
Nice, thanks for that idea. So I will improve that and work on the rest then, also on the new idea with an index range and the :.
I indeed have a short question on your code idea above. For
needed_indices = gps .+ LG.μ ./ LG.ρ .> 0
get_grad_inequality_constraint!(MPm, YPm, LG.co, p, needed_indices)
to work, I am a bit lost about which functions I have to implement. I first thought I just need
That would mean 18 or more further functions. So I wanted to check before I get into a dispatchageddon here...
I think AbstractVector{Bool} dispatch is enough for get_grad_inequality_constraint. You can always get a single constraint by setting only one element of the vector to true, or all of them by setting all elements to true. So other dispatches would only be an optimization, and quite likely they are not worth it.
Colon would be nice for get_grad_equality_constraint.
get a single constraint by setting only one element
that sounds easy in description, complicated in practice, since for the allocating thingy that means different things to allocate.
Colon would be nice for get_grad_equality_constraint.
formerly I distinguished gradientS for all gradients and gradient with an additional index for one – but sure, with Colon these can nicely be combined.
I refactored that partly already, and also the range is no longer the power manifold but just its representation – which is anyway necessary if we now have different “sizes” to return depending on I.
So I checked and I am currently not sure how many different cases that would mean to implement. I currently fear it is really 6 function dispatches for
And what each of these would need in allocations – that one I really have not yet figured out. If you hand me a bit array with a single 1 – do you want a single tangent vector or a vector of tangent vectors with just one element? I would maybe even tend to the second, since that seems more consistent. Otherwise the place calling this function would have to do a lot of ifs.
For getting a single tangent vector, use an index j (these already exist).
But I will think about that. For now I can at least also continue (though more likely tomorrow) with adopting single access and full access in the other areas of Manopt (“higher up” and in the algorithms).
I rewrote the access and in get_gradient(M, vgf, p, j, range) the j can now be

- a BitVector of the same length as the number of constraints
- an AbstractVector{<:Integer}
- a UnitRange{<:Integer}
- a Colon
- an Integer

It was a bit tricky to allocate (vectors of) tangent vector(s) here, but I think I managed to solve this. Of course only the last returns a single tangent vector, all others a (possibly length-1) vector of tangent vectors. This way this is nice and consistent.
This should now be consistent with power manifolds and the j in X[pM, j].
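For reference, a compact sketch of what that dispatch surface could look like (illustrative signatures only, not the exact implementation in the PR):

```julia
# Hypothetical dispatch layout for the index argument j (names illustrative):
get_gradient(M, vgf, p, i::Integer)                    # one tangent vector
get_gradient(M, vgf, p, I::BitVector)                  # vector of tangent vectors
get_gradient(M, vgf, p, I::AbstractVector{<:Integer})  # vector of tangent vectors
get_gradient(M, vgf, p, I::UnitRange{<:Integer})       # vector of tangent vectors
get_gradient(M, vgf, p, ::Colon)                       # all gradients
```

This mirrors the indexing behaviour of X[pM, j] on power manifolds, so only the Integer case is special.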
I currently cannot render the docs, since we need the next version of ManifoldsBase (with fill) first. But I will now slowly continue to rewrite the existing code to internally use the new vector gradient function for constraints – this should halve the amount of code in quite a few places.
Uff, this is quite some rework, but I like both the code reduction from having the new vectorial function as well as the reduction we will see with the : notation.
But from the vector_objective functions I now managed to rewrite the getters and setters for the constrained objectives. So the main next step is to fix the sub_objectives of ALM and EPM as well as these methods themselves.
edit: A main headache is now to check that : still returns an array and that this works consistently, e.g. for embedded objectives, but for today I'll stop (I have to wait for ManifoldsBase to test here anyway).
Ah, that went faster than expected. The only errors left in the tests are from the following:
In the old scheme, if you provided a point for the single-function-thingy, I could check the length of the vector. This is maybe a bit more complicated in the array-representation case, but if we can get that back, tests would already pass again (and only checking/reworking LevenbergMarquardt is left in the rework)
that sounds easy in description, complicated in practice, since for the allocating thingy that means different things to allocate.
I wouldn't spend too much time optimizing the allocating variant.
And what each of these would need in allocations – that one I really have not yet figured out. If you hand me a bit array with a single 1 – do you want a single tangent vector or a vector of tangent vectors with just one element? I would maybe even tend to the second, since that seems more consistent. Otherwise the place calling this function would have to do a lot of ifs. For getting a single tangent vector, use an index j (these already exist).
Yes, I think special-casing the single-constraint allocating variant doesn't make much sense.
So I checked and I am currently not sure how many different cases that would mean to implement. I currently fear it is really 6 function dispatches for
Let's start from what ALM and EPM need. It's either all (:) or something BitVector-ish. So we don't need other access patterns, I think.
In the old scheme, if you provided a point for the single-function-thingy, I could check the length of the vector. This is maybe a bit more complicated in the array-representation case, but if we can get that back, tests would already pass again (and only checking/reworking LevenbergMarquardt is left in the rework)
I will have to check that carefully.
By now both : and BitVectors should work fine; I hope I found a good way to realise them. And an array of integers works as well. All of these return vectors of tangent vectors, or equivalently an element of the tangent space of the corresponding power manifold otherwise.
I am just not yet sure the allocations are all optimal. But sure, reworking EPM/ALM to these nicer access methods as you sketched is the next step
I just fixed the first few things in ALM/EPM so that they can be used as before. But this is all super tricky and there seem to be a million failing tests left. As I wrote already, I fear this might really take some time to get right and working. Ok, it's about 100 tests failing, but to fix just 3(!) took me 2 hours. Easy to extrapolate.
There are for example a lot of ConstraintObjective constructor calls where the automatic “How many constraints are there?” does not yet work as automatically as before.
So the one thing I am not so sure about by now is: “just add another representation here” is a lot, a lot, a lot of work – is that worth it?
Hm, I will take a look.
Don‘t get me wrong, I think the general idea is good and if it works it allows for quite some flexibility. But I feel the code is now more clever than me, so for every error it takes me a long time to narrow it down. So something might still be off in this idea.
I've fixed all EPM tests and some ALM tests. The current ALM failure is due to a different constructor signature – we probably need a convenience constructor for ConstrainedManifoldObjective to fix that?
I already tried to add the “guess the number of constraints” thing, but sure, if you have even more ideas for convenience, that would be great :)
OK, I've fixed ALM then.
By the way, is this intentional:
function get_inequality_constraint(
    M::AbstractManifold, co::ConstrainedManifoldObjective, p, j
)
    return get_cost(M, co.inequality_constraints, p, j)
end
?
Yes, that is the case, since co.inequality_constraints is a VectorGradientFunction vgf, which has a cost (vector) and a gradient (vector of them). So evaluating one of the constraints is evaluating the cost of said vgf.
edit: One large advantage of this is that we implement all the access to this only once (not once for the equality and once for the inequality constraints – in the end maybe even the same for the Jacobian).
This is a start to rework constraints, make them a bit more flexible, and address/resolve #185.
All 3 parts, the function $f$, the inequality constraints $g$ and the equality constraints $h$, are now stored in their own objectives internally. That way, they can be provided more flexibly – for example with a Hessian for one of them.
Besides that, there are more details available on the co-domain of these; especially in the case of a single function, the gradient can now also be specified to map into the tangent space of the power manifold.
One open point still to check is how the internal functions can be adapted to this in a hopefully non-breaking way. A main challenge (or where I stopped today) is, when a function that returns gradients (or a gradient of functions) is stored in an objective – and it hence might be cached – how to best model element access. This might also be a nice improvement for (or a unification with) the LevenbergMarquardt type objective. One could unify that to an objective of type VectorObjective maybe. Comments on this idea welcome.
🛣️ Roadmap

- 👨💻 unification with the LeastSquaresObjective (VectorObjective?)
- range, e.g. for the ALM and EPM gradients
- DefaultManifoldProblem passes in a range of nothing to trigger the default
- get_gradients to get_gradient with Colon
- ConstrainedManifoldObjective such that g and grad_g are internally stored as the new VectorialGradientFunction, and h and grad_h are internally stored as a new VectorialGradientFunction