Closed mateuszbaran closed 1 year ago
Looking forward to seeing how that turns out. I hope we had good ideas to both improve performance and keeping it nonbreaking :)
I've worked a bit on this today and retract_pade(M::AbstractManifold, p, X, ::Number)
makes me confused due to the two numbers at the end. Or one. It's hard to notice that it actually is well-defined. I think it would be much more clear if we didn't extract number from PadeRetraction
there and just passed the PadeRetraction
object.
It is just one number, since the pade-retraction has an approximation order n
that we extract, when we move from level 2 (dispatch on retraction type) to level 3 (retraction-type functions like retract_pade
.
But sure if you want to go for fusing stuff, then it would have two arguments that are numbers in the end n::Int, t::Real
.
Note that currently retract_caley
defaults to n=1
.
Can you elaborate on the argument why that would not be well-defined? Is it ok if the fix n
to be integer?
My argument for the current form ist that retract_pade
that has a PadeRetraction{n}()
as its last (or with fusion second to last) argument doubles the information that it is a pade-retraction. It also makes the interplay with Caley more complicated. Also the type really just has an integer in its parameters.
If possible I would like to keep the n
there as a last (non-fused) or second-to-last (fused) argument.
But sure if you want to go for fusing stuff, then it would have two arguments that are numbers in the end
n::Int, t::Real
. Note that currentlyretract_caley
defaults ton=1
.
Why that order? Right now we have t
right after X
in exp
and retract
, before retraction method.
Can you elaborate on the argument why that would not be well-defined? Is it ok if the fix
n
to be integer?
As long as n
doesn't have a default value it is well-defined. If it did have a default value for n
, we couldn't be sure if the user wanted to specify n
or t
by an integer after X
. We have retraction methods as optional arguments so expecting n
to maybe have a default value is something I had done until I realized it couldn't work with optional t
as well.
My argument for the current form ist that
retract_pade
that has aPadeRetraction{n}()
as its last (or with fusion second to last) argument doubles the information that it is a pade-retraction. It also makes the interplay with Caley more complicated. Also the type really just has an integer in its parameters. If possible I would like to keep then
there as a last (non-fused) or second-to-last (fused) argument.
Note that we have different expectations of what retract_pade(M, p, X, 2, 3)
is supposed to mean. retract_pade(M, p, X, 2, PadeRetraction(3))
(or retract_pade(M, p, X, PadeRetraction(3), 2)
) would make the intention clear. We can use a different type instead of PadeRetraction
if you prefer (like Val(3)
), just something that is not <:Number
.
But sure if you want to go for fusing stuff, then it would have two arguments that are numbers in the end
n::Int, t::Real
. Note that currentlyretract_caley
defaults ton=1
.Why that order? Right now we have
t
right afterX
inexp
andretract
, before retraction method.
I do not have a strong preference here, I was just thinking: It's added at the end – but you are right it's “at the end but before the retraction” and since n
specifies the retraction your idea of doing t, n
is also very reasonable. Let's go with your choice then, since mine was really only to add t
at the end (which is not what we do). Or maybe not, see next point.
Can you elaborate on the argument why that would not be well-defined? Is it ok if the fix
n
to be integer?As long as
n
doesn't have a default value it is well-defined. If it did have a default value forn
, we couldn't be sure if the user wanted to specifyn
ort
by an integer afterX
. We have retraction methods as optional arguments so expectingn
to maybe have a default value is something I had done until I realized it couldn't work with optionalt
as well.
This might actually be an argument to putting t
last, since I do not expect n
to have a default value, it is a mandatory argument of the retract_pade
and only t
has a default value. It is mandatory, since it is always a given parameter of PadeRetraction
.
My argument for the current form ist that
retract_pade
that has aPadeRetraction{n}()
as its last (or with fusion second to last) argument doubles the information that it is a pade-retraction. It also makes the interplay with Caley more complicated. Also the type really just has an integer in its parameters. If possible I would like to keep then
there as a last (non-fused) or second-to-last (fused) argument.Note that we have different expectations of what
retract_pade(M, p, X, 2, 3)
is supposed to mean.retract_pade(M, p, X, 2, PadeRetraction(3))
(orretract_pade(M, p, X, PadeRetraction(3), 2)
) would make the intention clear. We can use a different type instead ofPadeRetraction
if you prefer (likeVal(3)
), just something that is not<:Number
.
For me, the order is clearer after reading this. the n
should be the 2 is your first notation, since the t
is last in the retract_typeofretraction()
functions in level 3. the n
for me is a mandatory parameter to I am still for doing
retract_pade(M, p, X, n, t=1.0)
where only the t
is optional. t
can be real or integer, n
is always integer. And I think with a proper documentation there is no need to clarify this in the documentation, so I do not see the need to overcomplicate things by introducing types here. edit: We also do not have types for p
or X
at this point, but their intention is well-documented. In that sense we can also handle the n
which is mandatory.
Sorry but after reading your response I'm only more confident that we do need a type wrapper for n
there -- it's just too easy to make a mistake by switching the order of t
and n
, and you suggest an order that is inconsistent with retract
.
Hm, I am still not that convinced, in principle Pole and Schield do the same with the number of steps (and even the NLSolve, though only for the inverse).
My main argument is that we also do not type p and X and in the same manner for retract_pade
the n is a mandatory integer variable added before the t
.
For example the embedded retraction actually needs certain type, there we have that, but for n
I only see that we can restrict it to integer. Still, since it is mandatory, I think there is no confusion (same as the thing that we can not confuse X (on the circle for example) with t
. And sure this is only for retract_pade
on level 3, I am not speaking about Level 1 or 2 where we keep the type for sure.
to be precise, the signatures we have are (in my model)
pade_retract(M, p, X, n, t=1.0)
, so the first integer (after X
with also might be an integer) it n
followed by the optional t
retract_pade!(M, q, p, X, n, t=1.0)
same herein practice these are not exported and not called by a user directly, so the only point where this has an effect is implementing these. There I would prefer to keep it in this simplicity maybe? On the other hand with your argument we also would have to type q,p,X because they could be confused in order?
Hm, I am still not that convinced, in principle Pole and Schield do the same with the number of steps (and even the NLSolve, though only for the inverse).
Pole and Schild are for vector transport IIRC, not retractions?
in practice these are not exported and not called by a user directly, so the only point where this has an effect is implementing these. There I would prefer to keep it in this simplicity maybe? On the other hand with your argument we also would have to type q,p,X because they could be confused in order?
Our entire interface is designed around not constraining q
, p
and X
so the possibility of confusing them is something we just have to accept. I've made the error multiple times, so this design choice isn't something I particularly like. Until IIRC Julia 1.5 it was also motivated by performance but that's no longer an issue.
If you really hate my suggestion then I won't push on this more but I won't like this interface either.
Hm, I am still not that convinced, in principle Pole and Schield do the same with the number of steps (and even the NLSolve, though only for the inverse).
Pole and Schild are for vector transport IIRC, not retractions?
A yes mixed them up, we called them pole_ladder
instead of vector_transport_to_pole
or something but basically they have the same thing.
in practice these are not exported and not called by a user directly, so the only point where this has an effect is implementing these. There I would prefer to keep it in this simplicity maybe? On the other hand with your argument we also would have to type q,p,X because they could be confused in order?
Our entire interface is designed around not constraining
q
,p
andX
so the possibility of confusing them is something we just have to accept. I've made the error multiple times, so this design choice isn't something I particularly like. Until IIRC Julia 1.5 it was also motivated by performance but that's no longer an issue.If you really hate my suggestion then I won't push on this more but I won't like this interface either.
First of all I like the untyped p,q,X
since it allows for arbitrary things to be used with the packages, for example Arrays or StaticArrays without having to introduce wrapper types like MPointArray
or something. I feel that might lead to overtyping and too many end users having to think about introducing the right types instead of things “just working” (if used in right order). Sure this is not a black-white situation but a spectrum, but I feel the current state is a good mix. Don't you?
And since we have a very consistent order of arguments, I think the confusion should not be so large. And I think the same about the n
.
Though in the end, I mean for this PR, we just have to decide on one thing. I do not have a strong opinion here, so if you feel that a type for the n
or passing down the retraction is really necessary, I would prefer passing down the retraction type, since that avoids introducing yet-another-type.
For me personally n
here as the oder of pade has a quite concrete meaning so just keeping it an integer would be my preferred way.
IN short: My preferred way is to not type and keep the n
as in the signatures I wrote above, but if you feel a type is necessary since two numbers as arguments might be confusing (I very much think that is not the case), then let's pass down the PadeRetraction
.
Sure this is not a black-white situation but a spectrum, but I feel the current state is a good mix. Don't you? And since we have a very consistent order of arguments, I think the confusion should not be so large. And I think the same about the
n
.
I generally like our interface, I just wanted to point out that there are some drawbacks of not wrapping things. There are also drawbacks of wrapping things but a single layer of wrapping is rarely an issue in my experience.
Part of the issue with retract_pade
is that its order or arguments is inconsistent with retract
. Of course order of arguments for two different functions doesn't have to be consistent but they are somewhat related.
IN short: My preferred way is to not type and keep the
n
as in the signatures I wrote above, but if you feel a type is necessary since two numbers as arguments might be confusing (I very much think that is not the case), then let's pass down thePadeRetraction
.
OK, I think I will pass down PadeRetraction
then.
Sure this is not a black-white situation but a spectrum, but I feel the current state is a good mix. Don't you? And since we have a very consistent order of arguments, I think the confusion should not be so large. And I think the same about the
n
.I generally like our interface, I just wanted to point out that there are some drawbacks of not wrapping things. There are also drawbacks of wrapping things but a single layer of wrapping is rarely an issue in my experience.
Ok, I feel wrapping might be less usable for an end-user who just has arrays and wants to use those. So that is not a computational drawback but a usability one, that is might feel more complicated.
Part of the issue with
retract_pade
is that its order or arguments is inconsistent withretract
. Of course order of arguments for two different functions doesn't have to be consistent but they are somewhat related.
Could you elaborate a bit on that since I feel if “retraction goes last if it is there” and “t is before that if it is there” are the assumptions we do, then this is consistent.
IN short: My preferred way is to not type and keep the
n
as in the signatures I wrote above, but if you feel a type is necessary since two numbers as arguments might be confusing (I very much think that is not the case), then let's pass down thePadeRetraction
.OK, I think I will pass down
PadeRetraction
then.
Ok, as I said not my favourite one, but one I am ok with, we have the embedded_retraction which does that as well because it needs a retraction there, so that is not really inconsistent. And actually one could dispatch on PadeRetraction{1}
if needed for Caley? Also nice.
Sure this is not a black-white situation but a spectrum, but I feel the current state is a good mix. Don't you? And since we have a very consistent order of arguments, I think the confusion should not be so large. And I think the same about the
n
.I generally like our interface, I just wanted to point out that there are some drawbacks of not wrapping things. There are also drawbacks of wrapping things but a single layer of wrapping is rarely an issue in my experience.
Ok, I feel wrapping might be less usable for an end-user who just has arrays and wants to use those. So that is not a computational drawback but a usability one, that is might feel more complicated.
Sure, I think we should change that part of our interface.
Part of the issue with
retract_pade
is that its order or arguments is inconsistent withretract
. Of course order of arguments for two different functions doesn't have to be consistent but they are somewhat related.Could you elaborate a bit on that since I feel if “retraction goes last if it is there” and “t is before that if it is there” are the assumptions we do, then this is consistent.
Yes, these are the assumptions I have in mind but in your suggestion retraction-related information (n
) goes before t
.
And actually one could dispatch on
PadeRetraction{1}
if needed for Caley? Also nice.
Yes, begin able to dispatch on n
is a small benefit of not unwrapping, so we don't even need retract_caley
then (which has a typo BTW...)
Sure this is not a black-white situation but a spectrum, but I feel the current state is a good mix. Don't you? And since we have a very consistent order of arguments, I think the confusion should not be so large. And I think the same about the
n
.I generally like our interface, I just wanted to point out that there are some drawbacks of not wrapping things. There are also drawbacks of wrapping things but a single layer of wrapping is rarely an issue in my experience.
Ok, I feel wrapping might be less usable for an end-user who just has arrays and wants to use those. So that is not a computational drawback but a usability one, that is might feel more complicated.
Sure, I think we should change that part of our interface.
Part of the issue with
retract_pade
is that its order or arguments is inconsistent withretract
. Of course order of arguments for two different functions doesn't have to be consistent but they are somewhat related.Could you elaborate a bit on that since I feel if “retraction goes last if it is there” and “t is before that if it is there” are the assumptions we do, then this is consistent.
Yes, these are the assumptions I have in mind but in your suggestion retraction-related information (
n
) goes beforet
.
Retraction (type) goes last, not retraction information – but well, as I said let's keep the retraction type also for the dispatch argument.
And actually one could dispatch on
PadeRetraction{1}
if needed for Caley? Also nice.Yes, begin able to dispatch on
n
is a small benefit of not unwrapping, so we don't even needretract_caley
then (which has a typo BTW...)
Oh! In deed, let's deprecate the typo one then. I think retract_caley
should be kept, though merely for a user who has caley
in mind, wants to overwrite that without knowing/thinking about that it is a special case of pade. So, keeping caley
is mainly for the users convenience again (that it is another name for pade(1)).
I've changed DefaultManifold
to store size in a field to make it do less compilation. Actually very little can be optimized by statically knowing the dimension.
By the way, I'm up to about Stiefel in updating Manifolds.jl to these changes. I will make a PR to Manifolds.jl when tests pass locally.
Merging #146 (aef5a0a) into master (65e9263) will decrease coverage by
0.04%
. The diff coverage is100.00%
.:exclamation: Current head aef5a0a differs from pull request most recent head d60387a. Consider uploading reports for the commit d60387a to get more accurate results
@@ Coverage Diff @@
## master #146 +/- ##
==========================================
- Coverage 99.87% 99.84% -0.04%
==========================================
Files 19 19
Lines 2464 2522 +58
==========================================
+ Hits 2461 2518 +57
- Misses 3 4 +1
Impacted Files | Coverage Δ | |
---|---|---|
src/DefaultManifold.jl | 100.00% <100.00%> (ø) |
|
src/ManifoldsBase.jl | 98.75% <100.00%> (-0.80%) |
:arrow_down: |
src/PowerManifold.jl | 99.81% <100.00%> (-0.19%) |
:arrow_down: |
src/decorator_trait.jl | 100.00% <100.00%> (ø) |
|
src/exp_log_geo.jl | 100.00% <100.00%> (ø) |
|
src/nested_trait.jl | 100.00% <100.00%> (ø) |
|
src/point_vector_fallbacks.jl | 100.00% <100.00%> (+0.74%) |
:arrow_up: |
src/retractions.jl | 100.00% <100.00%> (ø) |
|
... and 3 more |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
If you say that thats easier to handle, sure the DefaultManifold can be changed.
Nice, I assume it is a bit of work to get the new versions fast?
Nice, I assume it is a bit of work to get the new versions fast?
Most of the time it's either easy or there doesn't seem to be much room for improvement over just multiplying the vector by t
. At the moment the biggest problem is that I'm not sure how to multiply StiefelFactorization
by t
. Yes, it's defined as an MPoint
but it's also used to represent tangent vectors.
I can check tomorrow, but I thinking's mainly some caching (maybe for Cholesky?).
Oh yes that is quite unfortunate that the StiefelFactorization
is an MPoint
but also used as a tangent vector, it should probably just be struct? At first sight, I do not see direct how the multiplication with t works here and would have to check the ZimmermannHüper paper more closely.
Yes, I'd prefer StiefelFactorization
to just be a struct. I'd expect the scaling to be just multiplication of the Z
part by t
but I'm not fully sure.
Then lets just make it a struct and as far as I see it might indeed just be the Z, yes.
I just discovered that #134 made isapprox
quite slow -- something like 35 microseconds when actual isapprox
on arrays takes about 60 ns. My last commit got it down to about 1.2 us but to make it acceptable I need to change the interface of function check_approx(M::AbstractManifold, p, q; kwargs...)
a bit -- we can't have it generate a string when all we need is a yes/no response. Do you have a preferred way to handle this @kellertuer ?
BTW, I discovered the issue when I was investigating why statistics.jl tests in Manifolds.jl take so long to finish (about 20 minutes on my computer). A large part of that was calling isapprox
in median calculation.
I just discovered that #134 made
isapprox
quite slow -- something like 35 microseconds when actualisapprox
on arrays takes about 60 ns. My last commit got it down to about 1.2 us but to make it acceptable I need to change the interface offunction check_approx(M::AbstractManifold, p, q; kwargs...)
a bit -- we can't have it generate a string when all we need is a yes/no response. Do you have a preferred way to handle this @kellertuer ?
I am often amazed what you find :) Since that is an internal function sure we can change it. But to my (though untrained) eye, the new shortcut-return should already the point that no string is generated?
I do not have a preferred way of handling this, no, if you have a good idea, feel free to propose it here.
I am often amazed what you find :)
Thanks. I can show you how I do it if you want :). It's not that complicated.
Since that is an internal function sure we can change it. But to my (though untrained) eye, the new shortcut-return should already the point that no string is generated?
This shortcut skips creating one string, but check_approx(M::AbstractManifold, p, q; kwargs...)
still makes one.
I'm not sure what the best solution here is. Currently manifolds implement isapprox
ignoring the new interface. We probably need the fast yes/no implementations to just coexist next to the ones providing errors. The logic is going to be doubled but I feel that the cost of not doubling it and providing full functionality is too large.
is_point
/check_point
functions may have a similar issue but they are less likely to occur in a performance-critical code.
I thought we even introduced the check_
functions to split something already.
I am not so much a fan of doubling, but sure if that is necessary for performance, then let‘s do that. If it is not too much work, I would prefer doing the same for point/vector as well, because that would be consistent.
There wouldn't be that much doubling in isapprox
because it has few methods and they are relatively simple. Doing something similar for is_point
/is_vector
would be much more doubling though, and the speed gains would be less important. So I'm not sure it's a good idea at this time.
That‘s why I phrased it carefully, then the inconsistency is ok and we just adapt isapprox, of course.
Manifolds.jl tests pass for me locally now with these changes, so I think you can start looking at this :slightly_smiling_face: . I will still investigate a bit more if these are all the changes we need at the moment, since it's breaking, and will make a short list of changes.
Will check the fusion soon, for now just 2 comments on the change in isapprox.
I've added a changelog, what do you think?
I've added a changelog, what do you think?
Looks good, thanks for starting this. We should do this in the other packages as well, at least for breaking changes/releases. And the changelog contains the one criticism I see in the nicer, faster variants – the user has to think of this and implement a “slightly more complicated version”. Should we adapt the tutorial to that? Se https://juliamanifolds.github.io/ManifoldsBase.jl/stable/example.html#manifold-tutorial-fn and maybe even the technical note before?
We should do this in the other packages as well, at least for breaking changes/releases.
Yes, that would be nice. At least this is a start :slightly_smiling_face: .
And the changelog contains the one criticism I see in the nicer, faster variants – the user has to think of this and implement a “slightly more complicated version”. Should we adapt the tutorial to that? Se https://juliamanifolds.github.io/ManifoldsBase.jl/stable/example.html#manifold-tutorial-fn and maybe even the technical note before?
Sure, I forgot about it. I will modify the tutorial.
I think everything looks fine now and this can be merged :slightly_smiling_face: .
Here I'm working on https://github.com/JuliaManifolds/Manifolds.jl/issues/569 .