JuliaManifolds / Manifolds.jl

Manifolds.jl provides a library of manifolds aiming for an easy-to-use and fast implementation.
https://juliamanifolds.github.io/Manifolds.jl
MIT License
368 stars 53 forks source link

Implement `translate_diff` and `inv_diff` for all groups at once? #679

Open olivierverdier opened 10 months ago

olivierverdier commented 10 months ago

Here is a suggestion to decrease the amount of code, and improve the user interface of Manifolds.jl.


I suggest the basic method to be implemented to be the left and right adjoint action. Something like

adjoint_action!(G::MyGroup, Y, p, X, ::LeftAction) = ... # essentialy, pXp^{-1}
adjoint_action!(G::MyGroup, Y, p, X, ::RightAction) = ... # essentially p^{-1}Xp

This has to be implemented for every group (if only the left adjoint action is implemented, there is an easy fallback (can also be made as a test) such as adjoint_action!(G::MyGroup, Y, p, X, ::RightAction) = adjoint_action!(G, Y, inv(G,p), X, LeftAction())).


Now, the rest is automatic.

1) To state the obvious, implement a general adjoint_action(G, p, X) by allocation and using adjoin_action!, something like

function adjoint_action(G, p, X, conv)
    Y = allocate_result(G, adjoint_action, p, X, conv)
    return adjoint_action!(G, Y, p, X, conv)
end

2) Then also, for backward compatibility:

adjoint_action(G, p, X) = adjoint_action(G, p, X, LeftAction()) # backward compatibility

3) Now with the left invariance convention for storing tangent vectors, one has:

translate_diff!(G, Y, ::Any, ::Any, X, ::LeftForwardAction,) = copyto!(G, Y, X)
translate_diff!(G, Y, ::Any, ::Any, X, ::RightForwardAction,) = copyto!(G, Y, X)
translate_diff!(G, Y, p, ::Any, X, ::LeftBackwardAction,) = adjoint_action!(G, Y, p, X, LeftAction())
translate_diff!(G, Y, p, ::Any, X, ::RightBackwardAction,) = adjoint_action!(G, Y, p, X, RightAction())

Then, remove all the specific instances of translate_diff that are implemented (except those with a different convention for storing tangent vectors). That is less code to manage and test. 4) and also

inv_diff(G, p, X) = -adjoint_action(G, p, X, LeftAction())

(and then all the apply_diff_group and apply_diff for group operation actions come for free, since inv_diff is always defined; it is already implemented this way in group.jl).


Benefits: 1) in the documentation for creating a new group, mention that adjoint_action is one of the main method that has to be implemented (with the other main ones, obviously, such as compose, lie_bracket, etc.)

2) People can still overwrite any of the other methods such as translate_diff etc. if they want to store tangent vectors in a different way

3) If the left invariance seems arbitrary, one can later add a switch (by storing the side of the storage in the Group struct). Nothing urgent though, in my opinion.

mateuszbaran commented 10 months ago

I think this is a good idea. Right now our definitions are somewhat messy. I will keep this in mind for the next round of group cleanup.

kellertuer commented 10 months ago

To me that reads fine as well, we should just be careful, that this is nonbreaking.

olivierverdier commented 10 months ago

This change is definitely non-breaking.

However, if a group stores tangent vectors differently, and the various translate_diff are not implemented (yes, I'm looking at you SemiDirectProductGroup), although this change is definitely non breaking, code that previously threw a MethodError will now run but give the wrong result. It is thus imperative to

kellertuer commented 10 months ago

I would be against the second, since manual error messages have the disadvantage that (compared to a method error) no hints are given in case of a typo. We should then maybe be more careful to minimise the wrong results – for best of cases carefully take into account where that might appear and fix it there? I do not understand enough of the theory behind to see where they appear, but maybe that can be fixed to some extend generically as well?

mateuszbaran commented 10 months ago

The problem here is providing default that work for groups with left-invariant tangent vector representation (basically all our groups except semidirect products), while not giving wrong results for the groups that don't use the left-invariant convention. The alternative to error is not MethodError, it's wrong result or having no default, or maybe constructing an elaborate trait solution which would make the code even less understandable.

olivierverdier commented 10 months ago

There are four solutions, I emphasize the effort to be put and the backward compatibility breaking:

1) The error solution above, Easy, Non-breaking. 2) Switching SemiDirectProduct to left-invariant storage. Easy, Mildly breaking, unless you consider that tangent vector storage is not the users concern. 3) Implementing the three missing methods for SemiDirectProduct. Medium (?), Non-breaking. 4) Adding some trait mechanism that makes sure that no confusion is ever possible. Best solution hands down, but maybe complicated to implement. Hard, Non-breaking.


In summary:

Solution 1 is the best in the short run. (It is totally ok, as these methods are already missing (and shouldn't), so no one ever uses them anyway. Also: translate_diff can be considered an "internal" method, I doubt many users ever use it directly, so even a MethodError is rather confusing.)

But yes, this is not entirely satisfactory, so one can implement either solutions 2 (very easy) or, solution 3, if one has some extra time to put on this.

Solution 4 (or similar) is unavoidable on the very long term, in my opinion: it should be clear and explicit how the tangent vectors are stored. But that is not at all urgent, especially if one chooses solution 2.

Does that make sense?

olivierverdier commented 10 months ago

(In the original post, I forgot about all the various inv_diff methods that could now be removed as well)

olivierverdier commented 10 months ago

More simplifications: when storing tangent vectors as left invariant vector fields, you actually also have the identities, which allow to define log and exp from log_lie and exp_lie:

log(G, p, q) = log_lie(G, compose(G, inv(G,p), q))

and

exp(G, p, X) = compose(G, p, exp_lie(G, X))

(In fact if you look closely at, for instance, the implementation of log in GeneralUnitaryMatrics.jl, it is just an implementation of log_lie.)

So the specific implementations of exp and log methods in each individual group can also be removed.

kellertuer commented 10 months ago

This sounds fine to me as well.

Could you maybe start implementing this (maybe even implement the idea completely?) and do a PR? We can surely help with checking tests and documentation then as well.

olivierverdier commented 10 months ago

Sure, I'll give it a shot.

mateuszbaran commented 10 months ago

There are four solutions, I emphasize the effort to be put and the backward compatibility breaking:

I think this is a good summary of options. In Manifolds.jl we do consider point and vector storage format as part of the API so (2) would require a breaking release. We just had a breaking release recently so even if we decide to change the representation here, I think in the short term it's better to explore other options. I think (1) is perfectly fine for now.

I won't have much time to work on Manifolds.jl for a couple of weeks but I could review your PR.

kellertuer commented 10 months ago

the “not having so much time” is also my point here; reviewing a PR is something I can manage – for the rest I am happy if I continue slowly on the PRs I already have started for now.

olivierverdier commented 10 months ago

I guess another exception to the left-invariance storage is CircleGroup, right?

mateuszbaran commented 10 months ago

I checked to be sure and it looks like we ended up in an inconsistent state somehow. Manifold structure is implemented without left-invariance, but at least part of group code (exp_lie, log_lie) is implemented using left-invariance. I need to fix it. @kellertuer I think this isn't documented anywhere but I assume left-invariant tangent representation wasn't supposed to be used for circle group?

kellertuer commented 10 months ago

No until now that was not supposed to be the case.

and sure manifold structures are implemented without left invariance in mind, because without a group structure, left invariance is not defined. And I think the reason we split of exp_lie from exp is exactly the left invariance.

mateuszbaran commented 10 months ago

OK. Using a different tangent vector representation for exp_lie doesn't seem to make sense though? exp_lie is different from exp because it is conceptually a different operation.

kellertuer commented 10 months ago

Then I am maybe confusing something. As you might have noticed I am not that well-educated yet on Lie groups it seems – I conclude that from the proposal to define exp for groups (see above) – if the are conceptually different, then we maybe should not define exp as proposed above.

mateuszbaran commented 10 months ago

Sure, I was just asking if you maybe remember something about CircleGroup. Changing implementation of exp to use exp_lie would be very tricky because exp needs to be available on the manifold without group structure, which doesn't have exp_lie. I'm not opposed to it but I think we have more important things to work on than removing a bit of duplication.

kellertuer commented 10 months ago

No, I do not remember when or what we changed about CircleGroup – and I agree that I prefer the code duplication in favour of also having the manifold without group structure if someone prefers to have that.

olivierverdier commented 10 months ago

I checked to be sure and it looks like we ended up in an inconsistent state somehow. Manifold structure is implemented without left-invariance, but at least part of group code (exp_lie, log_lie) is implemented using left-invariance. I need to fix it.

Hang on, exp_lie and log_lie should only work on the tangent space at identity and therefore be independent of the way tangent vectors are stored? (although they depend on how tangent vectors at identity are stored, but that is unavoidable) What is inconsistent, exactly?

olivierverdier commented 10 months ago

And I think the reason we split of exp_lie from exp is exactly the left invariance.

To me, exp and exp_lie are two fundamentally (but related) operations.

The relation between the two is that if you choose any of the Cartan–Schouten connections, then exp can be implemented from exp_lie as above.


and sure manifold structures are implemented without left invariance in mind, because without a group structure, left invariance is not defined.

More generally, the left-invariant storage is exploiting the fact that the Cartan–Schouten connection (the one you call CartanSchoutenMinus), is curvature-free.

This "storage trick" would in principle work on any manifold equipped with a curvature-free connection: a single vector space is isomorphic to all the tangent spaces, allowing for this efficient storage.

mateuszbaran commented 10 months ago

Hang on, exp_lie and log_lie should only work on the tangent space at identity and therefore be independent of the way tangent vectors are stored? (although they depend on how tangent vectors at identity are stored, but that is unavoidable) What is inconsistent, exactly?

You're right, this is consistent, I forgot exp_lie and log_lie don't depend on the storage.

kellertuer commented 10 months ago

More generally, the left-invariant storage is exploiting the fact that the Cartan–Schouten connection (the one you call CartanSchoutenMinus), is curvature-free.

The main problem here is, that we do have a trait (decorator) system, and we have a few manifolds where the connection is specified – for this to work consistently and correctly, we would first have to be more rigorous in using a connection trait on all manifolds (and for now we even do not do that completely for the metric). Otherwise this would at least be imprecise behaviour.

olivierverdier commented 10 months ago

More generally, the left-invariant storage is exploiting the fact that the Cartan–Schouten connection (the one you call CartanSchoutenMinus), is curvature-free.

The main problem here is, that we do have a trait (decorator) system, and we have a few manifolds where the connection is specified – for this to work consistently and correctly, we would first have to be more rigorous in using a connection trait on all manifolds (and for now we even do not do that completely for the metric). Otherwise this would at least be imprecise behaviour.

In my opinion, the biggest obstacle is that most connection are not curvature free. This is really a special feature of the Lie groups (and that only works when using either of the two Cartan–Schouten connections, not even a combination of them).

kellertuer commented 10 months ago

That is the theoretical one, for sure, but in practice, nearly none of our manifolds has their (default) connection specified, so there it is programatically even worse since you can not check whether you have a metric that fits.

olivierverdier commented 10 months ago

With regards to solution 4 above, I think it would make sense to set the CartanSchoutenMinus as a default connection to all the Lie Groups (except the ones with a different storing of tangent vectors), and then restrict the new implementation of translate_diff and inv_diff to manifolds with the connection CartanSchoutenMinus.

kellertuer commented 10 months ago

Yes, that would be the most precise (and in my opinion best) way to go :)

mateuszbaran commented 10 months ago

Using CartanSchoutenMinus as the default connection for Lie groups would be a significant breaking change but it does make sense.