kellertuer closed this 3 months ago
Now the main parts left seem to be the interaction with the embedded objective and caching; the rest are internal tests of the objective itself that we have to adapt, it seems, and a bit of Aqua that I have not yet looked at.
Yes, that is the case, since `co.inequality_constraints` is a `VectorGradientFunction` `vgf`, which has a cost (vector) and a gradient (vector of them). So evaluating one of the constraints is evaluating the cost of said `vgf`.

edit: One large advantage of this is that we implement all the access to this only once (not once for equality and once for inequality constraints; in the end maybe even the same for the Jacobian).
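A rough self-contained sketch of the idea (field and function names here are illustrative, not the actual Manopt.jl types):

```julia
# Illustrative sketch: one container for a vector of constraint
# functions and their gradients (not the actual Manopt.jl type).
struct ToyVGF{F,G}
    value!!::F          # p -> vector of constraint values
    jacobian!!::G       # p -> vector of gradient vectors
    range_dimension::Int
end

# evaluating a constraint means evaluating the "cost" of the vgf
get_value(vgf::ToyVGF, p) = vgf.value!!(p)
get_value(vgf::ToyVGF, p, i) = vgf.value!!(p)[i]

# two inequality constraints g(p) = [p[1] - 1, p[2] - 2] on R^2
vgf = ToyVGF(p -> [p[1] - 1, p[2] - 2], p -> [[1.0, 0.0], [0.0, 1.0]], 2)
get_value(vgf, [3.0, 3.0])      # [2.0, 1.0]
get_value(vgf, [3.0, 3.0], 2)   # 1.0
```

With this shape, the access code for equality and inequality constraints really is the same, since both are just such a container.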
Then maybe it would be better to call it `get_value` instead of `get_cost`? After all, inequality constraints are not costs.
Sure, I do not mind. Maybe it is even better to emphasise that it is indeed not a cost, since we would expect costs to be real-valued.
I changed the name. The main things still missing from the current errors (and that I am stuck with):

- `BitVector`s: it then complains that one can not access an integer with a boolean, and I do not understand why.
- `iterate(:Colon)`: I do not understand why this is called.
- `iterate(get_constraints)`: even worse, I do not understand this at all, and running that line locally in the REPL works.

The larger issues after these are Cache and Count, which currently still error a bit, but on these three I am totally lost and have spent 1.5 hours without any progress. For cache/count: I think I do understand why both interactions fail and how to fix them; they probably just need a bit of internal restructuring.
I will try to fix those `iterate` issues.
Thanks! I will then try the caches next (tomorrow, hopefully); they are a bit more technical than the counting, but I think I know what to do.
It looks like I've fixed everything except passthrough and caches:
```
Test Summary:                                      | Pass  Fail  Error  Total   Time
Constrained Plan                                   |  180     2     11    193  11.6s
  Partial Constructors                             |   10                  10   0.7s
  ConstrainedManifoldObjective{AllocatingEvaluation} | 23                  23   0.1s
  ConstrainedManifoldObjective{InplaceEvaluation}  |   23                  23   0.3s
  ConstrainedManifoldObjective{AllocatingEvaluation} | 23                  23   0.2s
  ConstrainedManifoldObjective{InplaceEvaluation}  |   23                  23   0.2s
  Augmented Lagrangian Cost & Grad                 |   12                  12   2.2s
  Exact Penalties Cost & Grad                      |   16                  16   2.3s
  Objective Decorator passthrough                  |    4                   4   1.8s
  Count Objective                                  |   34                  34   0.7s
  Cache Objective                                  |   12     2      7     21   2.5s
```
Interesting that it took just so few lines to fix. From what I read in the errors, I think it would have taken me a few hundred. I also understand exactly nothing of the new function, so thanks for providing that. The code is indeed now more clever than I am. Can you maybe document that a bit?
Julia's indexing interface is incredibly expressive but also quite difficult to understand beyond basic use cases. So there are sometimes magic solutions like this one, and actually the only reason I could do it so quickly is because I've already spent dozens of hours with other indexing code :sweat_smile:.
I will try to add some documentation.
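To give a flavour of the kind of Base magic involved (a plain-Base sketch, not the actual Manopt.jl code): `to_indices` normalizes a `Colon`, a range, a `BitVector` mask, or an integer vector into one iterable of integer indices, given an array of the shape being indexed.

```julia
# Base.to_indices converts any supported index type into concrete,
# iterable indices, given a dummy array of the shape being indexed.
# (Plain-Base sketch, not the actual Manopt.jl helper.)
normalize_indices(v, I) = [i for i in to_indices(v, (I,))[1]]

v = zeros(5)
normalize_indices(v, :)                                 # [1, 2, 3, 4, 5]
normalize_indices(v, 2:4)                               # [2, 3, 4]
normalize_indices(v, [true, false, true, false, true])  # [1, 3, 5]
normalize_indices(v, [3, 5])                            # [3, 5]
```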
Perfect; good that I said I am stuck and let you resolve it then. The only error I do not yet fully understand is the "no iterate `get_constraints`" thing. But I will check what is best to do there and work on the caches as soon as I find time.
> The only error I do not yet fully understand is the "no iterate `get_constraints`" thing.
How can I reproduce that? I don't see it among the currently failing tests.
Well, I just ran the tests on Julia 1.10 and wanted to show that here, but the tests here currently seem to be weird, claiming we do not import `Base.length` though we definitely do.
Here is the full error I get
```
Cache Objective: Error During Test at /Users/ronnber/Repositories/Julia/Manopt.jl/test/plans/test_constrained_plan.jl:322
  Test threw exception
  Expression: get_constraints(M, cofa, p) == get_constraints(M, cccofa, p)
  MethodError: no method matching iterate(::typeof(get_constraints))
  Closest candidates are:
    iterate(::Base.MethodSpecializations)
     @ Base reflection.jl:1148
    iterate(::Base.MethodSpecializations, ::Nothing)
     @ Base reflection.jl:1154
    iterate(::Base.MethodSpecializations, ::Int64)
     @ Base reflection.jl:1155
    ...
  Stacktrace:
    [1] in(x::Symbol, itr::Function)
      @ Base ./operators.jl:1292
    [2] firstcaller(bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}, funcsyms::Function)
      @ Base ./deprecated.jl:165
    [3] macro expansion
      @ ./deprecated.jl:132 [inlined]
    [4] macro expansion
      @ ./logging.jl:388 [inlined]
    [5] depwarn(msg::String, funcsym::Function; force::Bool)
      @ Base ./deprecated.jl:127
    [6] depwarn(msg::String, funcsym::Function)
      @ Base ./deprecated.jl:121
    [7] get_constraints(M::ManifoldsBase.DefaultManifold{ℝ, Tuple{Int64}}, co::ConstrainedManifoldObjective{AllocatingEvaluation, ManifoldGradientObjective{AllocatingEvaluation, var"#f#117", var"#grad_f#118"}, Manopt.VectorGradientFunction{AllocatingEvaluation, Manopt.FunctionVectorialType, Manopt.FunctionVectorialType, var"#h#129", var"#grad_h#131", Int64}, Manopt.VectorGradientFunction{AllocatingEvaluation, Manopt.FunctionVectorialType, Manopt.FunctionVectorialType, var"#g#120", var"#grad_g#121", Int64}}, p::Vector{Float64})
      @ Manopt ~/Repositories/Julia/Manopt.jl/src/plans/constrained_plan.jl:268
    [8] macro expansion
      @ ~/.julia/juliaup/julia-1.10.3+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Test/src/Test.jl:669 [inlined]
    [9] macro expansion
      @ ~/Repositories/Julia/Manopt.jl/test/plans/test_constrained_plan.jl:322 [inlined]
   [10] macro expansion
      @ ~/.julia/juliaup/julia-1.10.3+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [11] macro expansion
      @ ~/Repositories/Julia/Manopt.jl/test/plans/test_constrained_plan.jl:308 [inlined]
   [12] macro expansion
      @ ~/.julia/juliaup/julia-1.10.3+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [13] top-level scope
      @ ~/Repositories/Julia/Manopt.jl/test/plans/test_constrained_plan.jl:6
```
But on the current commit I also have 3 failures and 9 errors instead of your 2 failures and 7 errors left on the cache objective. So while tests should be reproducible, we currently seem to have 3 different variants here. Very strange.
Hm, but I can reproduce the error on the Julia 1.9 we have on CI here as well. Though line 11 specifically has `length` in the things to import, Julia 1.9 complains that `length` was not imported.
I can locally reproduce that `Base.length` error on 1.9 but not your `iterate` one. Both look weird.
Let's see if CI can reproduce your problem.
Yes, this time it is your machine having something magic, since you have fewer errors; maybe you fixed them but did not commit that?
It looks like I've been running tests without depwarns, while you and CI had them turned on. Anyway, I've fixed the issue.
Ah, I see, I probably misunderstood how to use `depwarn` then. But good to know that that was the issue; I have to change those lines anyway, since that method shall be deprecated.
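For the record, the stack trace above hints at the cause: `Base.depwarn` expects a `Symbol` naming the deprecated function as its second argument, and passing the function object itself is what produced the `iterate(::typeof(get_constraints))` error. A minimal sketch (the function body here is just a placeholder):

```julia
# Base.depwarn expects a Symbol as its second argument; passing the
# function object itself caused the iterate(...) MethodError above.
function get_constraints_deprecated(x)
    # wrong: Base.depwarn("...", get_constraints_deprecated)
    Base.depwarn(
        "`get_constraints` is deprecated; use the new accessors instead.",
        :get_constraints,  # a Symbol, not the function
    )
    return x  # placeholder body for this sketch
end

get_constraints_deprecated(42)  # 42 (plus a warning when depwarns are on)
```

This also explains why results differed between machines: the broken path is only hit when deprecation warnings are enabled (e.g. `--depwarn=yes`).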
I've checked `NonlinearLeastSquaresObjective` and I don't actually see much benefit in unification with `VectorGradientFunction`. There is a bit of common logic and some potential for support of additional Jacobian/gradient types, but it doesn't look particularly important to me.
> I've checked `NonlinearLeastSquaresObjective` and I don't actually see much benefit in unification with `VectorGradientFunction`. There is a bit of common logic and some potential for support of additional Jacobian/gradient types, but it doesn't look particularly important to me.
Semantically, the gradient on the power manifold and the Jacobian should be really similar. I thought your initial idea even came from that: that `get_vector` on the power manifold of the Jacobian would yield the new gradient representation we now have. Semantically I see a lot of similarities, but I have not yet checked how useful that is for the code.

If it is not useful, we can also leave out the `CoefficientRepresentation` for now, since that was meant to be used for the Jacobian.
> Semantically, the gradient on the power manifold and the Jacobian should be really similar. I thought your initial idea even came from that: that `get_vector` on the power manifold of the Jacobian would yield the new gradient representation we now have. Semantically I see a lot of similarities, but I have not yet checked how useful that is for the code. If it is not useful, we can also leave out the `CoefficientRepresentation` for now, since that was meant to be used for the Jacobian.
`CoefficientRepresentation` is still needed for constraints, and that was my primary point. It might be used for least squares as well, but least squares has it built in already, and there are probably only about 10 lines of logic in the middle of an existing function that are duplicated. We could remove that duplication, but it would be of no real benefit to least squares without additional work being done there. Maybe it would make sense to restructure least squares while resolving #332.
Ok, then let's postpone LM until we/someone works on #332.
Hm, I thought about this for about an hour now, but I am conceptually stuck on the caches, mainly conceptually and related to indices again. Here are my two problems, which seem more clever than myself again:

1. The old system had two caches: `:GradEqualityConstraints` and `:GradEqualityConstraint`. The first one caches what is now the `j::Colon` scenario, the other the single evaluations (`j::Integer`). Would this distinction still be useful? I think so, since the function evaluation variant would benefit quite a bit from the first case. So this is really just a question of usefulness. Also, storing the single `j::Integer` for the array-representation power manifold is maybe a bit tricky to do, I am not yet so sure, but my main problem is the design decision here.
2. My main problem are again all the other `j` addressings. Those could check either of the caches above to access cached values (first the full one and return a subset, otherwise collect the single ones?); but for collecting the single ones I fear I need some trick like `_to_iterable_indices`, with the only problem that, from the current objective, the array I would need is `cache.objective.equality_constraints.jacobian!!`, and to safely know that that exists, the dispatch types would take several lines. So how would I get from any `j` to such an iterable here? I am not able to do that.

Sorry for so many questions; I fear the originally simple idea I had to realise this turned into something super technical, and with the new indices tricks into something I am not able to handle.
Hm, how do you remember in the old scheme which single constraint has been cached?
> Sorry for so many questions; I fear the originally simple idea I had to realise this turned into something super technical, and with the new indices tricks into something I am not able to handle.
No problem, I think this improvement is worth the work put into it. I will try to help with the caches.
For the first point, I think it is fine to keep that distinction; it might be beneficial for the function one.
I fixed all tests, but caching does not yet work for the fancy new indexing, as described in open point 2 for now (I added function dummies with errors). And the same problem (and testing) is missing for (single-gradient-eval) counting.
> Hm, how do you remember in the old scheme which single constraint has been cached?
The old scheme was easy; I will describe it directly in the new notation:

- `get_grad_equality_constraint(M, p, i::Integer)` caches with key `(p, i)` into the single-gradient cache (n repetitions, hence the `i` in the key).
- `get_grad_equality_constraint(M, p, I::Colon)` does cache with key `(p,)` but caches the whole power-manifold thingy (array representation not yet tested).

Those two now work exactly that way too, though this might lead to a double-cached tangent vector basically (but it potentially did that before as well).
An idea might exist for any range `i` (or `[I]`), but on that one my main problem is that I can access the number of constraints, yet I can not get a `_to_iterable_indices` thing there, since I can not access the `AbstractArray` you use some magic on (as mentioned, it is an `a.b.c.d.e.f`-style sub-sub-substructure thingy at that point, basically).
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.77%. Comparing base (`2a7cc0d`) to head (`d7f1205`).
`_to_iterable_indices` can actually be called on any vector of the right length (as many entries as the number of constraints).
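In other words (a sketch with an illustrative stand-in for `_to_iterable_indices`), any dummy vector of length n works as the first argument, so the deeply nested storage array never needs to be reached:

```julia
# Illustrative stand-in: normalize an index expression against a
# dummy vector of the right length (the real storage array is not needed).
my_to_iterable_indices(a::AbstractVector, I) = to_indices(a, (I,))[1]

n = 10                       # number of constraints, however it is obtained
dummy = 1:n                  # any length-n vector works
collect(my_to_iterable_indices(dummy, [3, 5, 10]))   # [3, 5, 10]
collect(my_to_iterable_indices(dummy, 2:4))          # [2, 3, 4]
```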
By the way, I have a small question right now. If I want to, for example, evaluate the gradients of constraints 3, 5 and 10 in place, should they be saved at indices 3, 5 and 10 of the given array, or at 1, 2, 3? I would say 3, 5, 10 is more consistent, but it looks like you attempt in some places to do 1, 2, 3? Sure, it might use less memory, but actually obtaining that benefit would be a large follow-up project, so it might not be worth making our lives harder right now.
> By the way, I have a small question right now. If I want to, for example, evaluate the gradients of constraints 3, 5 and 10 in place, should they be saved at indices 3, 5 and 10 of the given array, or at 1, 2, 3? I would say 3, 5, 10 is more consistent, but it looks like you attempt in some places to do 1, 2, 3? Sure, it might use less memory, but actually obtaining that benefit would be a large follow-up project, so it might not be worth making our lives harder right now.
They should be saved at 3, 5, 10, and for single-index access they are (as they were before).
If you would do `I = [3,5,10]`, that is exactly the magic we are missing. I am not sure how to get from any arbitrary indexing thingy (not sure what that is, for me it is all just magic) to the iterable. It could also be a `BitArray` or whatever.

I currently attempt nothing on those; they all have error messages in there, since I have no clue how to do this. In the best of cases, as I said:

1. get the length of `[3,5,10]` and an iterable.
2. check whether the `:` cache exists, and return `that_cache[3,5,10]` (maybe with a copy for the in-place variant).
3. if `:` has no cache for that `p`, check which single indices have cache entries. Assume that is `[3,10]`; then take those from the cache.
4. for the remaining ones (in the example `[5]`), call the getter with this indexing array and store them in the result we started in step 3.
5. cache all results from step 4 at the corresponding indices.
6. done

The only steps I roughly would know how to do are 2 and 6.
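To make the steps concrete, here is a `Dict`-based sketch of the lookup logic (illustrative only; the real Manopt.jl caches are LRU caches on objectives, and `cached_values`/`g` are made-up names):

```julia
# Dict-based sketch of the six lookup steps above (illustrative only;
# the real Manopt.jl caches are LRU caches stored in a cache objective).
full_cache   = Dict{Float64,Vector{Float64}}()      # the `:` cache: p => all values
single_cache = Dict{Tuple{Float64,Int},Float64}()   # (p, j) => single value

function cached_values(compute, p, I::Vector{Int})
    # step 1: length(I) and I itself give us the size and an iterable
    # step 2: if the full evaluation is cached, just return a subset
    haskey(full_cache, p) && return full_cache[p][I]
    result = Vector{Float64}(undef, length(I))
    missing_pos = Int[]
    # step 3: fill in whatever single evaluations are cached
    for (k, j) in enumerate(I)
        if haskey(single_cache, (p, j))
            result[k] = single_cache[(p, j)]
        else
            push!(missing_pos, k)
        end
    end
    # step 4: compute the remaining ones with one indexed call
    if !isempty(missing_pos)
        computed = compute(p, I[missing_pos])
        for (c, k) in enumerate(missing_pos)
            result[k] = computed[c]
            single_cache[(p, I[k])] = computed[c]   # step 5: cache them
        end
    end
    return result   # step 6: done
end

# usage with a toy "constraint" g_j(p) = j * p
g(p, J) = [j * p for j in J]
single_cache[(2.0, 3)] = 6.0        # pretend constraint 3 was cached earlier
cached_values(g, 2.0, [3, 5, 10])   # [6.0, 10.0, 20.0]; 5 and 10 now cached too
```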
> They should be saved at 3, 5, 10, and for single-index access they are (as they were before).
Then why is there `n = _vgf_index_to_length(i, vgf.range_dimension)` instead of `n = vgf.range_dimension`?
> I am not sure how to get from any arbitrary indexing thingy (not sure what that is, for me it is all just magic) to the iterable. It could also be a `BitArray` or whatever.
`_to_iterable_indices` does exactly that, but you need to tell it what is going to be indexed, or give it something with the same shape.
> Then why is there `n = _vgf_index_to_length(i, vgf.range_dimension)` instead of `n = vgf.range_dimension`?
Where is "there"? I basically rewrote all of the cache functions in the last 2 hours. The only ones that work are single index and `:`; both should not use that. The others have error calls in them, maybe first ideas, but I got lost in that, so maybe there is a fragment of that in there, sure. But I think I have now written often enough that I have no clue what to do here.

edit: To be more precise, the `_vgf_index` thingy is indeed the `1,2,3`, but I need that when returning the result (computed without the cache or filled in by the cache), since if the gradients get called with `[3,5,10]`, the result is just a size-3 power manifold, not a size-n one. So the result has these indices `1,2,3`; the cache should of course store at `3,5,10`, everything else would be total chaos.
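A tiny sketch of the two index systems (my guess at the `_vgf_index_to_length` semantics based on this discussion; the helper here is an illustrative stand-in):

```julia
# Result size for an index expression: a subset like [3, 5, 10] yields a
# size-3 result, while `:` yields all n (guessed _vgf_index_to_length semantics).
result_length(I::AbstractVector, n) = length(I)
result_length(::Colon, n) = n

result_length([3, 5, 10], 12)   # 3: the result is a size-3 power manifold
result_length(:, 12)            # 12

# position k in the compact result corresponds to constraint index j:
collect(enumerate([3, 5, 10]))  # [(1, 3), (2, 5), (3, 10)]
```

So the result is addressed at `k = 1, 2, 3` while the cache keys keep the original `j = 3, 5, 10`.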
> `_to_iterable_indices` does exactly that, but you need to tell it what is going to be indexed, or give it something with the same shape.
Yes, and you call it with `vgf.jacobian!!` (which might even be a function). When I am in the cache, this is `get_objective(cache_obj, true).inequality_constraints.jacobian!!`, but also only if the user does not invent a better vectorial function, so I would have to super-type-dispatch-constrain that function to be such that I can reach that Jacobian. I would not know what else to take instead.
The newest commit is how far I got with the 6 steps above.

My main problem is that when I check the single-constraints cache, I have to iterate over the `j in i` (line 449), but at that point I do not have any array to pass as first argument to the `_to_iterable_indices` function. I also have no idea where to get such an array from. If we manage that one function, I think I can adapt all the others in that file and even adapt the counting accordingly.
I will check it during the weekend.
I'm trying to fix it right now... is there a function that tells how many equality constraints a given objective has?
Cool. For now there is not (yet), but we could surely implement one (it would pass through to the vector gradient function's `length()` call). What would be a good name?
Maybe `num_equality_constraints`? I think I need that to fix the issue with caching.
That was also my idea, though I do not like the abbreviation `num` up front. But as long as we do not have another idea, that is what we take :)
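Whatever name we settle on, the accessor itself would presumably be a one-line forwarder; a sketch with placeholder types (not the actual Manopt.jl ones):

```julia
# Placeholder types, only to illustrate the forwarding accessor.
struct ToyVectorGradientFunction
    range_dimension::Int
end
Base.length(vgf::ToyVectorGradientFunction) = vgf.range_dimension

struct ToyConstrainedObjective
    equality_constraints::ToyVectorGradientFunction
end

# the accessor under discussion (name still open), forwarding to length()
equality_constraints_length(co::ToyConstrainedObjective) =
    length(co.equality_constraints)

co = ToyConstrainedObjective(ToyVectorGradientFunction(3))
equality_constraints_length(co)  # 3
```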
I think equality constraints should more or less work right now after my latest commit, up to my one TODO. Let me know if anything is broken.
That looks good, and should even also work for `:Colon` (which currently comes in a form before). I will think about a name later today and continue. As I said, I hope that is the main thing missing besides testing and documentation.
I could not omit the `:Colon` case, since these are the only ones that can actually cache the full constraint evaluation (the others, that are not an integer, can only read it).
I think I now implemented all cases. What is missing is testing and maybe a bit of documentation. We are closing in on this (and, as so often, it took a bit longer than I thought).
First thing I noticed starting with testing: the new version leads to an index out of bounds in the tutorial (not yet in testing, but testing also does not yet include ranges so much).
edit 1: I think I found it. One other thing I noticed: I still often use `X[pM, j]` to write to `X`, while you use `_write`; should I switch to that as well?

edit 2: Yay, found it! The problem was that we have 2 indexing things: if you ask to access constraints `[1,3,4]` as a range, we still only return a tangent vector on the 3-power manifold (not the 4 one), so the single indices thereon are 1, 2, 3. I adapted that in all necessary places, I hope. Still, if you could check the `X[pM,1]` things, that would be great (I am not so sure about the differences and effects).
> edit 1: I think I found it. One other thing I noticed: I still often use `X[pM, j]` to write to `X`, while you use `_write`; should I switch to that as well?
>
> edit 2: Yay, found it! The problem was that we have 2 indexing things: if you ask to access constraints `[1,3,4]` as a range, we still only return a tangent vector on the 3-power manifold (not the 4 one), so the single indices thereon are 1, 2, 3. I adapted that in all necessary places, I hope. Still, if you could check the `X[pM,1]` things, that would be great (I am not so sure about the differences and effects).
I think we should switch to `_write` everywhere. It's more generic.
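To illustrate the design choice with a self-contained sketch (the representation types and `write_grad!` here are illustrative, not the actual Manopt.jl internals): `X[pM, j]` hard-codes one storage layout, while a `_write`-style indirection can dispatch on the representation:

```julia
# Illustrative representation types; the real Manopt.jl `_write` is more
# involved, this only shows why the indirection is more generic.
abstract type GradientRepresentation end
struct NestedRep <: GradientRepresentation end   # vector of gradient vectors
struct ArrayRep  <: GradientRepresentation end   # one flat matrix, column per j

write_grad!(::NestedRep, X::Vector{Vector{Float64}}, j, Y) = (X[j] .= Y; X)
write_grad!(::ArrayRep,  X::Matrix{Float64},         j, Y) = (X[:, j] .= Y; X)

Y = [1.0, 2.0]
Xn = [zeros(2), zeros(2)]
write_grad!(NestedRep(), Xn, 2, Y)   # Xn[2] is now [1.0, 2.0]

Xa = zeros(2, 2)
write_grad!(ArrayRep(), Xa, 2, Y)    # Xa[:, 2] is now [1.0, 2.0]
```

The calling code stays the same for both layouts; only the representation type changes.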
Regarding access like `[1, 3, 4]`, I think basically all places that do that should do it in place and provide a sufficiently large array, so that the output is indeed written at indices `[1, 3, 4]` and not `[1, 2, 3]`. For consistency, the allocating variant should do the same IMO.
> I think we should switch to `_write` everywhere. It's more generic.
I have no clue what the actual differences are, but I can check and change that.
> Regarding access like `[1, 3, 4]`, I think basically all places that do that should do it in-place and provide sufficiently
I disagree. That would lead to much more memory usage and basically empty entries, which I feel is much too error-prone. Even more, if you call `get_inequality_constraint(M, ...., 1:3)`, you would for consistency expect a vector of length 3, not of length n.
> For consistency the allocating variant should do the same IMO.
But then this is fully different from a vector subsample `v[2:3]`, which I would consider both variants to be equal to. `v[2:3]` gives you a vector of length 2. We should do the same for such an indexing, and we actually do now.
> I disagree. That would lead to much more memory usage and basically empty entries, which I feel is much too error-prone. Even more, if you call `get_inequality_constraint(M, ...., 1:3)`, you would for consistency expect a vector of length 3, not of length n.
>
> But then this is fully different from a vector subsample `v[2:3]`, which I would consider both variants to be equal to. `v[2:3]` gives you a vector of length 2. We should do the same for such an indexing, and we actually do now.
The allocating variant should never be used in practice, actually, especially when memory usage is of concern, and the in-place variant needs maximum capacity anyway. This is going to be error-prone either way, because using the allocating variant requires compacting indices while the in-place one does not. But sure, at least your approach mirrors array indexing, so I can see how it looks consistent.
```julia
julia> X = rand(5)
5-element Vector{Float64}:
 0.768801628457022
 0.2125959599454933
 0.22309372971963248
 0.4219945781972464
 0.253065406198804

julia> Y = similar(X)
5-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0
 0.0

julia> Y[2:3] .= X[2:3]
2-element view(::Vector{Float64}, 2:3) with eltype Float64:
 0.2125959599454933
 0.22309372971963248

julia> Y
5-element Vector{Float64}:
 0.0
 0.2125959599454933
 0.22309372971963248
 0.0
 0.0

julia> X[2:3]
2-element Vector{Float64}:
 0.2125959599454933
 0.22309372971963248
```
Hm, I had hoped we would slowly come to some convergence here. This seems not to be the case then. I totally disagree: the allocating variant should be totally fine and work the same way as the in-place variant.

I understand `get_grad_inequality_constraint!(M, X, co, p, 1:2)` as the full right-hand side of

`X .= [gradient j evaluated at p for j = 1:2]`

The right-hand side is a vector of length 2. If you just ask for `1:2`, I feel it is super strange to eventually get a thousand constraints back, or to have to pass memory for a thousand gradients as `X`. Also, the integer variants (which should act the same as the `[j]` variants, basically) work on single gradient vectors.

And yes, the idea is that this is the same as array indexing. We use an array-indexing index `j` by now and have a vector of gradients. So if we would do something else than vector-indexing behaviour, I do not have a good argument as to why, especially if the alternative might have a super huge memory overhead, for example with a million constraints where we have an algorithm that at every point in time only accesses and has to store two gradients.
This is a start to rework constraints, make them a bit more flexible, and address/resolve #185.

All 3 parts, the function $f$, the inequality constraints $g$, and the equality constraints $h$, are now stored in their own objectives internally. That way, they can be provided more flexibly; for example with a Hessian for one of them.

Besides that, more detail is available on the co-domain of these; especially in the case of a single function, the gradient can now also be specified to map into the tangent space of the power manifold.

One open point still to check is how the internal functions can be adapted to this in a hopefully non-breaking way. A main challenge (or where I stopped today) is, when a function that returns gradients (or a gradient of functions) is stored in an objective, and it hence might be cached, how to best model element access. This might also be a nice improvement for (or a unification with) the Levenberg-Marquardt-type objective. One could maybe unify that to an objective of type `VectorObjective`. Comments on this idea are welcome.
Roadmap

- unification with the `LeastSquaresObjective` (`VectorObjective`?)
- `range`, e.g. ALM and EPM grad.
- `DefaultManifoldProblem` passes in a range
- `nothing` to trigger the default
- `get_gradients` to `get_gradient` with `Colon`
- `ConstrainedManifoldObjective`: `g` and `grad_g` are internally stored as the new `VectorialGradientFunction`
- `h` and `grad_h` are internally stored as a new `VectorialGradientFunction`