Closed mateuszbaran closed 1 year ago
That sounds like a lot of rewriting, since it affects all retractions - but it also seems reasonable since we have the same for exp
/exp!
and those should be probably checked as well then.
So I think it is a very good idea to do that, just also one that is a lot of work.
It doesn't look to me like that much of work, just a bit tedious. And it would be nice to make Manopt more competitive in terms of performance :slightly_smiling_face: .
Sure, since I feel the interfaces are quite mature by now, of problem and state but also all sub-stuff – Making it more performant is definetly something very cool and might be a small paper somewhere when we not only compare manifold functions but man opt functionality.
I just discussed something with a student when I noticed
that is -step
which is a number so the step size, is actually what you propose here,
or: We already have the fusion defined, though currently by default it does the same as with exp, that is scale and call the normal one
so what we would have to do is to get to the level3 with this function in Base and only then do the default fuse (unless someone defines this.
Yes, we already have the scaled variant in the API.
so what we would have to do is to get to the level3 with this function in Base and only then do the default fuse (unless someone defines this.
I don't quite understand what you mean here. What we need is to have
function retract!(
M::AbstractManifold,
q,
p,
X,
method::AbstractRetractionMethod = default_retraction_method(M, typeof(p)),
)
return retract!(M, q, p, X, one(X), method)
end
instead of
function retract!(
M::AbstractManifold,
q,
p,
X,
t::Real,
method::AbstractRetractionMethod = default_retraction_method(M, typeof(p)),
)
return retract!(M, q, p, t * X, method)
end
And then for each manifold implement the variant with t
instead of the one without t
.
We probably also need to t::Number
instead of t::Real
to allow complex numbers and quaternions.
Yes, I like your result, but it might be a bit to complicated to enforce anyone to implement the fused-one (the not with t
).
So my idea would be (in ManifoldsBase
) to introduce the functions as you say (with t::Number
sure).
But then we should make sure, that it is still fine to just implement the non-fused one (maybe to get started, but also to be backwards compatible), that is maybe a bit of boilerplate code, but I think necessary. We just have to check for a good order probably
You currently propose to switch order of preference (currently, the fused-one falls back to the non-fused one, you want to switch that).
But then you want to change level 2, that is the the _retract
(_retract!
) functions to all have a t
? Or do you want to have both on this level? (maybe the one with t
is enough and keeps it simple).
But then on level 3 (that is the retract_caley
/ retract_caley!
) ones should fuse-by-default (that is the variant with t
should fall back to the one without? Otherwise we loose backwards compatibility.
since Level 2 is always with t
this should be fine?
To summarise, the new workflow is (just for in-place, the other ones when to allocate and step over should stay where they are
retract!(M, q, p, X, m)
is “on highest level” and defaults to a one(eltype(X))
(probably?) into step 2retract!(M, q, p, X, t, m)
passes to _retract!(M, q, p, X, t, m)
_retract!
functions are now with t
retract_caley!(M, q, p, X, t, m)
are now the defaults we overwrite, but if not overwritten theyretract_caley!(M, q, p, t*X, m)
which is mainly for backwards-compatibility and ease of start for new users implementing a retractionIs that dispatch-workflow what you have in mind?
Hm, I haven't thought that much about interactions with decorators and levels. Your new workflow looks fine with one caveat. If we were to do the same change for exp
, this would result in an infinite cycle that would give awkward-looking errors (exp(NewManifold(), p, X)
-> exp(NewManifold(), p, X, 1.0)
-> exp(NewManifold(), p, 1.0 * X)
).
\o/ \o/ (party crowd) for infinite cycles!
I have not yet thought about exp
in this scenario, my main idea was to have the fused-one first but falling back to the non-fused one which is currently usually implemented.
So exp
currently has the upper two levels in my workflow switched? Maybe we want to enable the lower-level-fused exp variant as well and move exp around as well?
exp
and log
currently mostly ignore the levels unfortunately.
We until now kept them a little special, yes; that does not mean we can switch the order of the non-fused and the fused one for exp
.
I have a small design issue with storage, which is (after patching fused retractions in) about half of the time spent running the example from first post. Ideally, previous iterate and gradient storage would be allocated once, stored in a type-stable field and modified by copyto!
. This is currently not possible without major changes, either to the storage structure or places that use storage. Which of these two things would you prefer to change to make this faster?
Uff, in that generality without understanding (a) which major changes and (b) its implications, I can not answer this.
but you are right, for best of cases, the stored iterate and gradient (or any other stored value) should only be allocated once and be stored in a type-stable field.
I had hoped that my storage already is able to do that – but as I maybe wrote a few times – this is a design from a very early stage of me learning Julia (probably back in 2016) – so maybe that is not the case. So can you elaborate a bit on which changes are necessary? Since you mentioned somewhen earlier that my storage might not be that efficient, I maybe have a slight tendency to improve the storage structure and keep the places that use the storage as is?
So can you elaborate a bit on which changes are necessary? Since you mentioned somewhen earlier that my storage might not be that efficient, I maybe have a slight tendency to improve the storage structure and keep the places that use the storage as is?
OK, so the first point is that it is not possible to construct for example efficient FletcherReevesCoefficient
without passing to it at least types of objects in which point and tangent vector are stored. Ideally a point and a tangent vector would be available as a template for allocation but it can be reasonably worked around. That's what I attempted at first but then I noticed that it results in a very complex design that can be massively simplified by just having internal storage in FletcherReevesCoefficient
or a custom StoreStateAction
, which would be basically equally easy to use. Maybe a new AbstractStateAction
would be nicer.
Then there is a matter of constructing FletcherReevesCoefficient
(or its storage). Ideally I'd like to have a constructor of the form FletcherReevesCoefficient(M, p0)
to construct storage. Would that be OK?
For your first part – sorry, I do not follow. I just don't understand what you mean. But for the question on whether to store the point/vector explicitly in FRC – I would prefer not to do that but really to use a storage. For example I have no idea what you propose with a new AbstractStateAction, since I am just not able to follow the arguments before. It seems you mean something very specific but write it in an abstract way that I can not follow.
Sure a storage should be able to get an initial point/vector to set up (and initialise) the storage, but I think that should either be doable or is even done already to some extend? I can check in the evening.
For the constructor – that one I though about a bit already and there is both a way to define that now and a way I want to move to.
To avoid that you have to always set the manifold (e.g. when it is used for the default retraction or something) I wanted to move to
f(M::AbstractManifold=DefaultManifold(2))
as first arguments especially in these inside functions like the coefficients.
Then, a lot of these functions already have a initial_vector=rand(M; vecetor_at=p)
or similar keyword argument, we could use initial_point=rand(M)
similarly as a keyword. And sure both of these keywords can be passed to the storage to create their internal structures.
Oh, maybe the init function I had in mind does not yet exist – but sure a constructor for the storage that initialises some fields would be very useful, indeed.
Manopt is currently full of expressions like
retract!(M, xNew, x, s * η, retr)
. For quickly evaluating objective functions the fact we have to allocates * η
can be a significant slowdown. I've just tried optimizing Rosenbrock using CG:and about half of the time spent (after optimizing storage, but that's a separate story) is in doing these scaled retractions. My suggestion is rather simple and nearly non-breaking: let's move to defining
retract!(M::AbstractManifold, q, p, X, t::Real)
for each manifold instead ofretract!(M::AbstractManifold, q, p, X)
by default to fuse this scaling. After that, Manopt would just change toretract!(M, xNew, x, η, s, retr)
.What do you think @kellertuer ?