Closed kellertuer closed 3 days ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.90%. Comparing base (
c8564b8
) to head (949457a
). Report is 1 commits behind head on master.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
The ::Number
ambiguities with types points are all removed.
TCG has some strange ambiguities, that I still have to look at. but all other high-level interfaces do not have ambiguities any longer. The trick was to slightly redefine the methods that “generated” array-functions for number type data.
Small update: For now any single of the remaining 6 ambiguities, when trying to fix them, the number only goes further up. I am a bit los on this by now, since trying to reduce ambiguities only increases them (with my limited knowledge).
Some are also due to a deprecated signature so we can maybe leave these out in the end.
Maybe the reasonable solution is to make a breaking release that sets the Hessian to be mandatory, then the user would have to pass the approxHessian themselves.
The code with mandatory Hessian is so much cleaner. So this will be part of 0.5.0 – so nothing urgent and I will wait for a few other issues to be resolved and check whether we have other breaking changes we should consider / include then.
Oh tests might be a bit tricky, because we first have to bump version on ManoptExamples.jl it seems for the tests.
This is mainly waiting for https://github.com/JuliaManifolds/ManoptExamples.jl/pull/16 and is (again) not urgent. So we can also think a bit about what else “to break” to improve.
Oh ManoptExamples only works with Julia 1.8, so I would either have to raise the Julia version here as well or run the tests with manoptExamples only on Julia 1.8 but not on 1.6
Maybe we should make ManoptExamples.jl compatible with Julia 1.6? What's the issue?
That would maybe also be an idea, there was something with range.
Yes, you just need to replace
range(interval[1], interval[2], n)
with
range(interval[1], interval[2]; length=n)
Ah will check; probably the best way to loosen the constraints to 1.6 again once I fixed that (maybe I was a bit too eager to finish the weak deps). I am travelling tomorrow so I might only find time on Thursday for that then.
Short status update here: With ManoptExamples 0.1.8 tests should be fine again. I also removed all deprecated keywords, but it seems I missed something on reworking the interfaces somewhere since now the cache (some StorageRef
?) Fails since it suddenly gets Floats
where it expects Vector{Float}
s. No clue why and I thought I had that resolved (so I hope it is not due to removing deprecated stuff).
But well. This might hence still take quite a while I feel. but we also can wait until we have more breaking stuff.
I think I at least narrowed down the error: now the p
we store internally is always mutating (that is on circle and positive numbers it is internally also a [p]
if p
is a number, but the point_storage
win the storage/cache gets initialised to p
somethow, so that is numbers – and then conversion fails.
Since this is now consistent in all high level interfaces, we might have to check a few places where internally this is accidentally not the case (that points are always vectors in structures of Manopt)
Since this is a breaking change then maybe it would be a good idea to also tackle #400 here? Especially implementation of the factory pattern.
Then this becomes a monster-PR again, and I was hoping to tackle that even in a non-breaking way, but sure, we can also do this here :)
If so I would also check for other constructors that might have the same like a step-size or so.
The adjustments necessary for step sizes don't look that complicated to me, especially if we don't have to make them non-breaking.
No, also the other ones for the gradient rules should not be that complicated and doable nonbreaking I think. That was why I thought we could to that after this PR, but I can check (when I have finished the points still open) how much we can address that here as well, sure.
I looked at it and I agree – in a breaking way it should be easy to do and also much easier than trying to do it non-breaking. Will add it here, sure. Just that this way the PR might take a bit longer, given that my semester starts next week (and I have resit-exams to administer this week).
Great! I think it is fine to delay a breaking release to resolve that issue with line searches.
I needed a small trick – the tutorials have to render for this version also on the dev version from the folder, otherwise they would run on 0.4; we could maybe return to normal after the breaking release.
I fell into a small rabbit hole.
Now we have for states with sulbsolvers the constructors
State(M, sub_problem_closed; evaluation=[...], kwargs...)
that wraps the evaluation and callsState(M, sub_problem, sub_state; kwargs...)
which now has a single type allowed for the state.To make this easier, p=
was moved to a keyword. So then I did the same with all other states.
But with this, the constructors are very clear now, though it was a bit of a refactoring. I am thinking about not exporting the states to declare them a bit more internal.
This finishes the main intention of this PR. so only the factory-idea is left here (and code coverage).
I just worked on a first sketch for the factory idea. I would like some feedback on whether this is what you want @bvdmitri, and from @mateuszbaran whether this is still nice memory wise (I hope so).
The idea is as follows. The factory just wraps the direction update T
all args
and kwargs
; I even added a manifold if one wants to, see
(oh the docs are still outdated; the actual type is only a parameter no longer a field)
Then the actual call is either just factory()
(works if the internal manifold from before is not nothing) or factory(M)
that then passes all args/kwargs to the actual type.
There is also a produce_rule
that accepts both rules and factories but makes sure the result is a rule
With that both
gradient_descent(SomeManifold(), f, g, p; direction = MomentumGradient())
gradient_descent(SomeManifold(), f, g, p; direction = MomentumGradientRule(M, p))
work, but since the factory also accepts the p
as a positional argument also the old version
gradient_descent(SomeManifold(), f, g, p; direction = MomentumGradient(M,p))
works and even
gradient_descent(SomeManifold(), f, g, p; direction = MomentumGradient(p))
(so you can fill any args/kwargs. The last one I like because if you have memory/a point you want to pass you do not longer have to think of the manifold either.
besides that, I feel since p
can always be filled with rand(M)
I might move that even to a keyword argument in the (more internal) XRule
types. The current X
types will then always create such a factory.
Let me know
a) what you think (before I rework all other rules and b) wherelse we could provide such a factory pattern
If the factory is ok in its form, it might even work for nearly any other “thing” in Manopt, but my first idea would be
Stepsize
sStoppingCrtierion
maybe also Debug/Record things, but since they can already usually addressed with symbols (cf. https://manoptjl.org/stable/plans/debug/#Manopt.DebugActionFactory-Tuple{Symbol}), maybe they are already fine.
edit: Since all try to follow the same idea “delay the use of a manifold to fill defaults in args and kwargs” one could even cover that with the factory above – and maybe name it a bit more general DelayedManifoldDefaultsFactory
or so.
Nice work, I’m currently on vacation, but will be able to provide some feedback next week when return to the office :)
Cool, enjoy your vacation!
Since the fix was slightly breaking, I also added the fix for the remaining 2 points in #382 here. Now both the full matrix and the limited memory variant are equivalent in how they perform their updates.
Sure, I was just asking about feedback on the factory, the rest is quite a bit of rework because we noticed Aqua was not too happy with a few ambiguities. The reason both are on this PR is, since that will be the next (breaking) version that. So no worries, I was not asking you to review the rest – just the idea of the factory.
Nice. that you like it, so I will continue introducing that to the other areas as well and also use that to review the docs.
Concerning hidden bugs, the Average gradient had one, where in some cases the history was not filled correctly and hence it was much closer to gradient descent than averaging. This already has been fixed.
For the rest, sure I'll let you know all the refactoring to the new factory has been finished; probably the main work is even the docs and not the code itself.
@bvdmitri thanks for the feedback, I addressed the structural things already; I generalised the factory also in name to a ManifoldDefaultsFactory
so it can be used with other things as well.
The docs are not yet perfect, but I also have to work through the remaining direction rules, stopping Criteria, and Stepsizes. and will do that during that as well.
An idea is that there is usually an internal type (like the XRule
s now) that documents fields and internal parts and is also not exported. Compared to that the new X
(exported) will document the mathematical part and have the new Note to the internal one with the factory.
Since this involves a few types to rework, this will still take a while, but I plan to maybe refactor a type per day (besides teaching that started today again).
Great amount work @kellertuer ! nicely done, the documentation is indeed important, but the implementation is solid IMO
Thanks. As I said I will then work through the changes for some time, but I feel it improves usability quite a bit :)
Small update, all direction rules should now work fine with the factory, that is one can now always write
conjugate_gradient_descent(M, f, grad_f; coefficient=FletcherReevesCoefficient())
but one can still overwrite any argument of keyword argument when calling these (still without using a manifold) for example the threshold here can be set factory = ConjugateGradientBealeRestart(; threshold=0.2)
– but its vector transport it would need is still only filled with a default later when one passes a manifold to the factory(M)
this call creates. This way one can easily still overwrite keywords and even arguments to coefficients or also for the
AverageGradient(; n=20)
this still works fine for all keywords of any of these rules. And test coverage should also by now work.
So the only thing on this PR left is to check whether it is reasonable to do this for step sizes (probably) and stopping criteria (not 100% sure, not many need a manifold I think), too.
So that we have Armijo()
as a factory for ArmijoLinesearch
and such; the factory is already generic enough for this, it's mainly a matter of slight adaption of code and a bit larger adaption of documentation.
This now maybe also closes #406, #407, #408 (all Manifolds bumps for the package, docs and tutorials).
@mateuszbaran I do not seem able here to get RecursiveArrayTools (now a weak dependency here as well) to work fine with Statistics. Any ideas?
Today was rally not my day. My progress over several hours can bee strictly upper bounded by zero; and locally all works fine.
edit: and now I do not know why it works now. Well,...
Ah, I see you've figured it out already.
My main problem is, I neither know what I actually did wrong nor how I fixed it (in basically total confusion). But it works now and for this PR I am even only 2 stepsize rules aways from finishing all rework.
I think this is good to go (was a bit more work than I thought) – and turned out quite nice I think
@bvdmitri could you maybe check whether the new docs that provide the new interface are understandable? These are
XStepsize
structs below, see https://manoptjl.org/previews/PR392/plans/stepsize/#Manopt.AdaptiveWNGradientStepsize and followingIf you have time, maybe give this new version a test run – if not, that is of course also fine
@mateuszbaran if you have time – the high level interfaces have now been reworked, so there are no warnings from Aqua any longer, especially no ambiguities, start points p
are set to rand
as default last positional argument whenever reasonable. Throughout I hope this is now consistent – and at least Aqua is happy :)
I would like to merge this maybe towards the end of the week, so if you have time to comment on these changes, that would be great – I think this still works as intended; I hope I have documented all breaking changes. Overall this breaking change is mainly unification of documentation and keyword names and improvements on Usability, since M
is less often needed.
I noticed some bad renders in the docs, but otherwise seems good to me :) I also noticed that math doesn't render nicely very often in many docstrings, not sure why is this the case...
Here the English is broken IMO
How about
This function generates a `ManifoldDefaultsFactory`for `ConstantStepsize`.
The manifold specification is optional; if not provided, the manifold will be automatically determined from the optimization procedure. This affects all arguments and keyword arguments with defaults that depend on the manifold, unless explicitly specified here.
Also sometimes its inconsistent as here
(also in other places, e.g. Polyak
and PolyakStepsize
)
But just let me reiterate that this is a great amount of work you did and its quite impressive!
Ah, the info box is a function (with that type as a variable, exactly because I want to just fix the English in one place ;) and maybe I set that in copy. paste not correctly. Will check – also your other findings; thanks for checking!
edit: And thanks again for the nice feedback – that does still not excuse me being unconcentrated when doing all this copy-pase-work – I should probably read the docs one more time myself as well. The typos you found are already fixed :)
I might consider in a later PR to also introduce the factory at least to some stopping criteria (StopWhenChangeLess
or StopWhenGradientChangeLess
or such), but that would also require StopWhenAny
and StopWhenAll
to be able to handle those, so it is maybe a bit more work – should be doable nonbreaking still I think.
thanks again for all the ideas and contributions!
This is an approach to fix #381
🛣️ Roadmap
get/set_manopt_paramater
to justget/set_parameter
since it is no longer exported – and it makes the code a bit more readablestaibilize=
keyword in Quasi newton to aproject!=
keyword as in the other solvers.update_stopping_criterion!
set_parameter
, since the dispatch by the first type should be good enough.AverageGradient()
) and put the details like fields of the structs as well as a reference to the interface into theAverageGradientState
docs.StopWhenChangeLess
StopWhenGradientChangeLess
StopWhenAny
as factories and “producing” them before is a bit too tough, so I will leave this as is for now.inequality_constrains
andequality_constrains
keyword arguments toinequality_constraints
andequality_constraints
respectively. This ensures they are correctly passed to theConstrainedManifoldObjective
constructor