I've tried to figure it out, but you have quite a weird example here. First, it would be a good idea to indicate which manifold/tangent space each quantity belongs to, and whether the operations performed actually make sense. This: `diff_fnc(p, embed(M, p, v))` indicates that `diff_fnc` expects a vector from the Euclidean space as the second argument (that's what `embed` does), but in the difficult problem, `diff_fnc(arg, v) = -(Y - exp(S, p, arg)) .* differential_exp_argument(S, p, arg, v)`, it is used as a tangent vector on `S` (so `embed` is not a good idea). Note that `M = TangentSpace(S, p)` inherits its representation from `S`, so we can just use `S` here. Next, I think `df0` doesn't do what's in Section 4.8 of that book. I'd change `sum(something .* something)` into an appropriate call to `inner`. They are usually the same, but definitely not always. Also, subtracting two points is not a valid notion; you can use `log`, though. All in all, this is how I managed to (partially) clean it up:
```julia
function check_diff(M::AbstractManifold, cost_fnc, diff_fnc; p = random_point(M), v = random_tangent(M, p),
        log_space_range = range(-8, 0, length = 51))
    f0 = cost_fnc(p)
    df0 = inner(M, p, v, diff_fnc(p))
    log_space = 10 .^ log_space_range
    # Based on the Taylor expansion of the cost function around the point x:
    # f(R_x(t * v)) = f(x) + t * <grad f(x), v> + O(t^2)
    lhs = [cost_fnc(exp(M, p, v, t)) for t in log_space]
    rhs = [f0 + t * df0 for t in log_space]
    err = abs.(rhs - lhs)
    plt = scatter(log10.(log_space), log10.(err), legend = false)
    Plots.abline!(plt, 1, 0)
    Plots.abline!(plt, 2, 0)
    return plt
end
```
```julia
function difficult_problem(S)
    q = random_point(S)
    p = random_point(S)
    v = random_tangent(S, p)
    dir = random_tangent(S, p)
    cost_fnc(arg) = 0.5 * distance(S, q, exp(S, p, arg))^2
    diff_fnc(arg) = adjoint_differential_exp_argument(S, p, arg, -log(S, q, exp(S, p, arg)))
    return check_diff(S, cost_fnc, diff_fnc, p = dir, v = v)
end
```
```julia
function easy_problem(S)
    p = random_point(S)
    cost_fnc(arg) = 0.5 * distance(S, p, arg)^2
    diff_fnc(arg) = -log(S, arg, p)
    return check_diff(S, cost_fnc, diff_fnc)
end
```
`easy_problem` seems to work completely, but `difficult_problem` does not. I'm not exactly sure how to use differentials from Manopt.jl, so @kellertuer, could you help me fix that gradient?
I just took a five-minute look and was surprised that the `sum` by @const-ae (in the cost function) would have a minus – I would expect a distance on the manifold.
Then I am not sure why you use an adjoint differential for the differential? I would chain the differential of the distance (hm, basically a log) with the differential of the exp. With the adjoint you already go for the gradient (then `diff_fnc` is misleading).
So maybe

`diff_fnc(arg) = differential_exp_argument(S, p, arg, -log(S, q, exp(S, p, arg)))`

(if I do not mix up the order, but I think the order switches, so dist/log is the inner differential and exp/its differential is the outer one). But I do not have the time to check that right now, so this is just a first guess. Ah no, compare (4.21) in Nicolas Boumal's book – hm, sorry, but I first have to prepare my lecture material for next week.
I found at least a short moment to check the easy one.
I adapted `check_diff` slightly, but that is mainly because cost/diff/grad functions have the manifold as the first argument (by convention) in Manopt:
```julia
function check_diff(M::AbstractManifold, cost_fnc, diff_fnc; p = random_point(M), v = random_tangent(M, p),
        log_space_range = range(-8, 0, length = 51))
    f0 = cost_fnc(M, p)
    df0 = diff_fnc(M, p, v)
    log_space = 10 .^ log_space_range
    # Based on the Taylor expansion of the cost function around the point x:
    # f(R_x(t * v)) = f(x) + t * <grad f(x), v> + O(t^2)
    lhs = [cost_fnc(M, geodesic(M, p, v, t)) for t in log_space]
    rhs = [f0 + t * df0 for t in log_space]
    err = abs.(rhs - lhs)
    plt = scatter(log10.(log_space), log10.(err), legend = false)
    Plots.abline!(plt, 1, 0)
    Plots.abline!(plt, 2, 0)
    return plt
end
```
The easy problem:

```julia
function easy_problem(S)
    q = random_point(S)
    cost_fnc(M, p) = 1//2 * distance(S, q, p)^2
    diff_fnc(M, p, v) = inner(M, p, -log(M, p, q), v)
    return check_diff(S, cost_fnc, diff_fnc)
end
```
Note: you could check https://manoptjl.org/stable/functions/gradients.html#Manopt.grad_distance for the gradient to see that it is `-log` (and then the differential is just the inner product of `v` with the gradient).
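A quick sanity check of that statement (my sketch, using `grad_distance` from Manopt.jl and the `random_point` helper used elsewhere in this thread):

```julia
# Sketch: the gradient of f(p) = 0.5 * distance(M, q, p)^2 should be -log(M, p, q).
using Manifolds, Manopt

M = Sphere(2)
p = random_point(M)
q = random_point(M)
isapprox(M, p, grad_distance(M, q, p), -log(M, p, q))  # expected: true
```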
Calling `easy_problem(Rotations(3))`, the check produces the expected plot.
That the previous version seemed fine is mainly because, for the manifolds you checked, either minus-and-norm coincided with the distance, or in the embedding it was close enough.
I will check similar ideas for the more involved one later – but I hope this already gives a good idea of how to do these :)
For the complicated problem I think (but the check still disagrees) it should be something like
```julia
function difficult_problem(S)
    p = random_point(S)
    q = random_point(S)
    M = TangentSpace(S, p)
    cost_fnc(M, X) = 0.5 * distance(S, q, exp(S, p, X))^2
    diff_fnc(M, X, Y) = inner(S, exp(S, p, X), -log(S, exp(S, p, X), q), differential_exp_argument(S, p, X, Y))
    X = random_tangent(S, p)
    Y = random_tangent(S, p)
    return check_diff(M, cost_fnc, diff_fnc, p = X, v = Y)
end
```
where I might still confuse the chain rule – I will try to check with pen and paper later (as well as provide an explanation for the formula there).

edit: I found a few bugs in my code, but this still does not seem to work correctly, so here is also my derivation of why I think this is correct – and the next thing to check is maybe the ONB? Because for some values the error looks okay-ish.
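Since the derivation itself was attached as an image, here is a short reconstruction of the chain-rule argument it should contain (my sketch, matching the `diff_fnc` above): with $g(y) = \frac{1}{2}\operatorname{dist}(q, y)^2$ and $\operatorname{grad} g(y) = -\log_y q$, the chain rule gives
$$Df(X)[Y] = \langle \operatorname{grad} g(\exp_p(X)),\, \mathrm{D}\exp_p(X)[Y] \rangle = \langle -\log_{\exp_p(X)} q,\, \mathrm{D}\exp_p(X)[Y] \rangle,$$
which is exactly what `diff_fnc` computes via `inner` and `differential_exp_argument`.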
I've fixed the problems with the diagonalizing ONB, although the code is a bit ugly now:
```julia
function get_basis(M::Rotations{3}, p, B::DiagonalizingOrthonormalBasis{ℝ})
    if manifold_dimension(M) != 3
        error("get_basis is only defined for Rotations(3).")
    end
    decomp = schur(B.frame_direction)
    U = decomp.vectors
    e1, e2, e3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    e32 = (e3 * e2' - e2 * e3')
    e21 = (e2 * e1' - e1 * e2')
    e31 = (e3 * e1' - e1 * e3')
    println("Schur: ", decomp.Schur)
    beta_e32 = (e32 .* decomp.Schur)[3, 2]
    beta_e31 = (e31 .* decomp.Schur)[3, 1]
    beta_e21 = (e21 .* decomp.Schur)[2, 1]
    beta_e32_abs = abs(beta_e32)
    beta_e31_abs = abs(beta_e31)
    beta_e21_abs = abs(beta_e21)
    if beta_e32_abs > beta_e31_abs && beta_e32_abs > beta_e21_abs
        beta = beta_e32
        e_fd = e32
        e_1 = e21
        e_2 = e31
    elseif beta_e31_abs > beta_e21_abs
        beta = beta_e31
        e_fd = e31
        e_1 = e21
        e_2 = e32
    else
        beta = beta_e21
        e_fd = e21
        e_1 = e32
        e_2 = e31
    end
    println(beta_e32, " || ", beta_e21, " || ", beta_e31)
    println("Beta: ", beta)
    println("decomposition error: ", norm(B.frame_direction - U * beta * e_fd * U'))
    println("Vectors: ", decomp.vectors)
    println(beta)
    # eigenvectors
    Ξ = [U * e_fd * U' ./ sqrt(2),
         U * e_1 * U' ./ sqrt(2),
         U * e_2 * U' ./ sqrt(2)]
    println("Frame direction: ", B.frame_direction)
    println(Ξ)
    # corresponding eigenvalues
    κ = [0, beta^2 / 4, beta^2 / 4]
    return CachedBasis(B, κ, Ξ)
end
```
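A quick usage sketch (my addition; it assumes the method above is in scope and mirrors how the generalized version is called further down):

```julia
# Build the diagonalizing basis for a random skew-symmetric frame direction
# on Rotations(3); A plays the role of B.frame_direction.
using LinearAlgebra, Manifolds

A = (X -> (X - X') / 2)(randn(3, 3))
B = get_basis(Rotations(3), nothing, DiagonalizingOrthonormalBasis(A))
```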
The two problems were:
Now the plot looks as expected.
Wow! Nicest team work!
edit: the first check is not necessary, since we dispatch on `Rotations(3)` already.
Wow, that is great. Thank you both so much for figuring this out. I will test it tomorrow :)
For reference, here are some Jacobians worked out: https://vision.in.tum.de/_media/members/demmeln/nurlanov2021so3log.pdf but I'm not currently sure how they relate to differentials of log and exp on SO(3).
Hi, quick update. I took @mateuszbaran's code and generalized it to rotations of arbitrary dimension 🥳. The code is not particularly appealing (I haven't found an elegant way to code the pattern in which the eigenvectors occur, so it's rather brute-force'ish), but it works and is surprisingly efficient (median 30 ms for `get_basis()` in 50 dimensions).
```julia
using LinearAlgebra
using Manifolds, Manopt
import Manifolds.get_basis
using Plots

function check_diff(M::AbstractManifold, cost_fnc, diff_fnc; p = random_point(M), v = random_tangent(M, p),
        log_space_range = range(-8, 0, length = 51))
    f0 = cost_fnc(M, p)
    df0 = diff_fnc(M, p, v)
    log_space = 10 .^ log_space_range
    # Based on the Taylor expansion of the cost function around the point x:
    # f(R_x(t * v)) = f(x) + t * <grad f(x), v> + O(t^2)
    lhs = [cost_fnc(M, geodesic(M, p, v, t)) for t in log_space]
    rhs = [f0 + t * df0 for t in log_space]
    err = abs.(rhs - lhs)
    plt = scatter(log10.(log_space), log10.(err), legend = false)
    Plots.abline!(plt, 1, 0)
    Plots.abline!(plt, 2, 0)
    return plt
end

#### NEW CODE ####
function get_tridiagonal_elements(trian)
    N = size(trian, 1)
    res = zeros(N)
    down = true
    for i in 1:N
        if i == N && down
            elem = 0
        else
            elem = trian[i + (down ? +1 : -1), i]
        end
        if elem ≈ 0
            res[i] = 0
        else
            res[i] = elem
            down = !down
        end
    end
    return res
end

function ev_diagonal(tridiagonal_elements, unitary; i)
    a, b = [unitary[:, i] for i in [i, i + 1]]
    evec = [[-b a] * [a b]' ./ sqrt(2)]
    evals = [0.0]
    return (values = evals, vectors = evec)
end

function ev_offdiagonal(tridiagonal_elements, unitary; i, j)
    a, b, c, d = [unitary[:, i] for i in [i, i + 1, j, j + 1]]
    ref = transpose(view(unitary, :, [i, i + 1, j, j + 1]))
    evec = [
        [-c -d a b] * ref ./ sqrt(4),
        [-c d a -b] * ref ./ sqrt(4),
        [-d -c b a] * ref ./ sqrt(4),
        [ d -c b -a] * ref ./ sqrt(4),
    ]
    evals = [(tridiagonal_elements[i] - tridiagonal_elements[j])^2 / 4,
             (tridiagonal_elements[i] + tridiagonal_elements[j])^2 / 4,
             (tridiagonal_elements[i] + tridiagonal_elements[j])^2 / 4,
             (tridiagonal_elements[i] - tridiagonal_elements[j])^2 / 4]
    return (values = evals, vectors = evec)
end

function ev_zero(tridiagonal_elements, unitary; i)
    evec = Vector{Array{Float64,2}}()
    evals = Vector{Float64}()
    ref = unitary[:, i]
    for idx in 1:size(unitary, 1)
        if idx != i
            push!(evec, (ref * unitary[:, idx]' - unitary[:, idx] * ref') ./ sqrt(2))
            push!(evals, tridiagonal_elements[idx]^2 / 4)
        end
    end
    return (values = evals, vectors = evec)
end

function get_basis(M::Rotations{N}, p, B::DiagonalizingOrthonormalBasis{ℝ}) where N
    trian, unitary, ev = schur(B.frame_direction)
    trian_elem = get_tridiagonal_elements(trian)
    evec = Vector{Array{Float64,2}}()
    evals = Vector{Float64}()
    i = 1
    while i <= N
        if trian_elem[i] == 0
            evs = ev_zero(trian_elem, unitary, i = i)
            push!(evec, evs[:vectors]...)
            push!(evals, evs[:values]...)
            i += 1
        else
            evs = ev_diagonal(trian_elem, unitary, i = i)
            push!(evec, evs[:vectors]...)
            push!(evals, evs[:values]...)
            j = 1
            while j < i
                if trian_elem[j] == 0
                    j += 1
                else
                    evs = ev_offdiagonal(trian_elem, unitary, i = i, j = j)
                    push!(evec, evs[:vectors]...)
                    push!(evals, evs[:values]...)
                    j += 2
                end
            end
            i += 2
        end
    end
    return CachedBasis(B, evals, evec)
end

jacobi_op(A, X) = 1 / 4 * (-A * A * X + 2 * A * X * A - X * A * A)
skew(M) = (M - M') / 2

N = 50
A = skew(randn(N, N));
B = get_basis(Rotations(N), nothing, DiagonalizingOrthonormalBasis(A));
evals = B.data.eigenvalues
evec = B.data.vectors

# Check the eigen equation
all([isapprox(jacobi_op(A, vec), lambda * vec, atol = 1e-12) for (lambda, vec) in zip(evals, evec)])
# Check that all eigenvectors are orthogonal
all([isapprox(sum(ev1 .* ev2), 0, atol = 1e-10) for ev1 = evec, ev2 = evec if ev1 != ev2])
# Check that all eigenvectors are normalized
all([sum(ev .* ev) ≈ 1 for ev in evec])

function difficult_problem(S)
    p = random_point(S)
    q = random_point(S)
    M = TangentSpace(S, p)
    cost_fnc(M, X) = 0.5 * distance(S, q, exp(S, p, X))^2
    diff_fnc(M, X, Y) = inner(S, exp(S, p, X), -log(S, exp(S, p, X), q), differential_exp_argument(S, p, X, Y))
    X = random_tangent(S, p)
    Y = random_tangent(S, p)
    return check_diff(M, cost_fnc, diff_fnc, p = X, v = Y)
end
difficult_problem(Rotations(N))
```
Nice!
I am sure Mateusz can find some ways to optimise it still – but a PR of this would be neat! For that I would just ask for more documentation, and for the basis itself even with math formulae and a reference.
That's great! I'll try optimizing it a bit during the weekend.
This is a bit faster, though maybe a little less nice. I've mostly worked on reducing the number of allocations.
```julia
function get_tridiagonal_elements(trian)
    N = size(trian, 1)
    res = zeros(N)
    down = true
    for i in 1:N
        if i == N && down
            elem = 0
        else
            elem = trian[i + (down ? +1 : -1), i]
        end
        if elem ≈ 0
            res[i] = 0
        else
            res[i] = elem
            down = !down
        end
    end
    return res
end

function ev_diagonal(tridiagonal_elements, unitary, evec, evals, fill_at; i)
    a = unitary[:, i]
    b = unitary[:, i + 1]
    evec[fill_at.x] = [-b a] * [a b]' ./ sqrt(2)
    evals[fill_at.x] = 0
    fill_at.x += 1
end

function ev_offdiagonal(tridiagonal_elements, unitary, evec, evals, fill_at; i, j)
    a = unitary[:, i]
    b = unitary[:, i + 1]
    c = unitary[:, j]
    d = unitary[:, j + 1]
    ref = hcat(a, b, c, d)' ./ 2
    evec[fill_at.x] = [-c -d a b] * ref
    evals[fill_at.x] = (tridiagonal_elements[i] - tridiagonal_elements[j])^2 / 4
    fill_at.x += 1
    evec[fill_at.x] = [-c d a -b] * ref
    evals[fill_at.x] = (tridiagonal_elements[i] + tridiagonal_elements[j])^2 / 4
    fill_at.x += 1
    evec[fill_at.x] = [-d -c b a] * ref
    evals[fill_at.x] = (tridiagonal_elements[i] + tridiagonal_elements[j])^2 / 4
    fill_at.x += 1
    evec[fill_at.x] = [ d -c b -a] * ref
    evals[fill_at.x] = (tridiagonal_elements[i] - tridiagonal_elements[j])^2 / 4
    fill_at.x += 1
end

function ev_zero(tridiagonal_elements, unitary, evec, evals, fill_at; i)
    N = size(unitary, 1)
    ref = unitary[:, i]
    for idx in 1:N
        if idx != i
            rup = ref * unitary[:, idx]'
            evec[fill_at.x] = (rup - rup') ./ sqrt(2)
            evals[fill_at.x] = tridiagonal_elements[idx]^2 / 4
            fill_at.x += 1
        end
    end
    return (values = evals, vectors = evec)
end

function get_basis(M::Rotations{N}, p, B::DiagonalizingOrthonormalBasis{ℝ}) where N
    decomp = schur(B.frame_direction)
    trian_elem = get_tridiagonal_elements(decomp.T)
    unitary = decomp.Z
    evec = Vector{typeof(B.frame_direction)}(undef, manifold_dimension(M))
    evals = Vector{eltype(B.frame_direction)}(undef, manifold_dimension(M))
    i = 1
    fill_at = Ref(1)
    while i <= N
        if trian_elem[i] == 0
            ev_zero(trian_elem, unitary, evec, evals, fill_at; i = i)
            i += 1
        else
            ev_diagonal(trian_elem, unitary, evec, evals, fill_at, i = i)
            j = 1
            while j < i
                if trian_elem[j] == 0
                    j += 1
                else
                    ev_offdiagonal(trian_elem, unitary, evec, evals, fill_at, i = i, j = j)
                    j += 2
                end
            end
            i += 2
        end
    end
    return CachedBasis(B, evals, evec)
end
```
For small `N` the main cost is the Schur decomposition (roughly 70% of the remaining time), and for large `N` it is dominated by the multiplications in `ev_offdiagonal`.
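A small benchmark sketch for reproducing such numbers (my addition; it assumes BenchmarkTools.jl is installed, and absolute timings of course depend on hardware):

```julia
# Time get_basis for a random skew-symmetric frame direction in 50 dimensions.
using BenchmarkTools, LinearAlgebra, Manifolds

N = 50
A = (X -> (X - X') / 2)(randn(N, N))
@btime get_basis(Rotations($N), nothing, DiagonalizingOrthonormalBasis($A));
```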
Thanks @mateuszbaran for improving the code :)
@kellertuer:

> a PR of this would be neat! For that I would just ask for more documentation, and for the basis itself even with math formulae and a reference.

Okay, I will get on it.
I have another question related to all this, about switching between manifolds.

To explain where I am coming from, a bit of context on the project that motivated me to learn more about differential geometry and geodesic regression: I have a number of data points $Y$ in Euclidean space that I want to fit (i.e., minimize the Euclidean distance between $Y$ and $\hat{Y}$). I assume that the data points are generated by some rotation whose degree depends on some known covariate $x$. I now want to find the rotation $R_0$ that corresponds to $x_i = 0$ and the direction of the rotation $V \in T_{R_0}M$.

I am trying to formulate my model as a minimization of $\frac{1}{2} \|Y - \hat{Y}\|_2^2$, where $\hat{Y}_i = \exp(R_0, x_i V)\, Z_i$.

Note that this is an extension of the orthogonal Procrustes problem.
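In code, the cost could look roughly like this (a minimal sketch under the assumption that `Y` and `Z` are g×n matrices and `x` a length-n covariate vector; all names here are illustrative, not from any package):

```julia
# Sketch: sum of squared Euclidean residuals for Ŷᵢ = exp(R₀, xᵢ V) * Zᵢ,
# with R₀ a rotation matrix and V a tangent vector at R₀.
using LinearAlgebra, Manifolds

function model_cost(S::Rotations, Y, Z, x, R0, V)
    return sum(
        0.5 * norm(Y[:, i] - exp(S, R0, x[i] * V) * Z[:, i])^2
        for i in axes(Y, 2)
    )
end
```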
My problem now is that, for the rotation manifold, the derivatives are affected by the transition from the rotation manifold to the Euclidean one. The following example illustrates what I mean:
```julia
function manifold_distance_problem(S)
    p = random_point(S)
    q = random_point(S)
    v1 = random_tangent(S, p)
    v2 = random_tangent(S, p)
    cost_fnc(M, Vec1) = 0.5 * distance(S, q, exp(S, p, Vec1))^2
    diff_fnc(M, Vec1, Vec2) = inner(S, exp(S, p, Vec1), -log(S, exp(S, p, Vec1), q), differential_exp_argument(S, p, Vec1, Vec2))
    check_diff(TangentSpace(S, p), cost_fnc, diff_fnc, p = v1, v = v2)
end
manifold_distance_problem(Sphere(6))
manifold_distance_problem(Rotations(3))

function euclid_distance_problem(S)
    p = random_point(S)
    q = random_point(S)
    v1 = random_tangent(S, p)
    v2 = random_tangent(S, p)
    cost_fnc(M, Vec1) = 0.5 * sum((q - exp(S, p, Vec1)).^2)
    diff_fnc(M, Vec1, Vec2) = sum(-(q - exp(S, p, Vec1)) .* differential_exp_argument(S, p, Vec1, Vec2))
    check_diff(TangentSpace(S, p), cost_fnc, diff_fnc, p = v1, v = v2)
end
euclid_distance_problem(Sphere(6))
euclid_distance_problem(Rotations(3))
```
This works for all but the last call, `euclid_distance_problem(Rotations(3))`.

I think the reason why `euclid_distance_problem()` works for spheres is that, for small distances, moving along the sphere is the same as moving in Euclidean space:

$$\lim_{t \to 0} \frac{\log(\exp(p, x + t v), \exp(p, x))}{t} = \lim_{t \to 0} \frac{\exp(p, x + t v) - \exp(p, x)}{t}.$$

However, for the rotation manifold this does not hold, because

$$\lim_{t \to 0} \frac{\log(\exp(p, x + t v), \exp(p, x))}{t} = \lim_{t \to 0} \frac{\exp(p, x + t v)^\top \exp(p, x) - \exp(p, x)^\top \exp(p, x + t v)}{2 t} \neq \lim_{t \to 0} \frac{\exp(p, x + t v) - \exp(p, x)}{t}.$$

My question is: what is happening during the transition when I go from one manifold to Euclidean space? Why doesn't the chain rule (as described in Section 4.7 of Boumal's book) apply for rotation manifolds? Or am I missing something obvious here?
I think `euclid_distance_problem(Rotations(3))` most likely doesn't work because our `Rotations` manifold represents tangent vectors at the group identity, while `Sphere` uses the natural representation in the embedding. Anyway, if `manifold_distance_problem` works, then it doesn't seem to be a problem 🙂.
Thanks, I was considering that explanation. However, I am unclear at what point I would need to call `embed` to make `euclid_distance_problem(Rotations(3))` work.

> Anyway, if manifold_distance_problem works then it doesn't seem to be a problem 🙂 .

Well, I wish. However, for my specific project I rather need something that looks like `euclid_distance_problem` 😅
Oh... @mateuszbaran, you are right. Calling `embed()` resolves the issue:
```julia
function euclid_distance_problem_mod(S)
    p = random_point(S)
    q = random_point(S)
    v1 = random_tangent(S, p)
    v2 = random_tangent(S, p)
    cost_fnc(M, Vec1) = 0.5 * sum((q - exp(S, p, Vec1)).^2)
    diff_fnc(M, Vec1, Vec2) = sum(-(q - exp(S, p, Vec1)) .* embed(S, exp(S, p, Vec1), differential_exp_argument(S, p, Vec1, Vec2)))
    check_diff(TangentSpace(S, p), cost_fnc, diff_fnc, p = v1, v = v2)
end
euclid_distance_problem_mod(Sphere(6))
euclid_distance_problem_mod(Rotations(12))
```
The problem was that I always tried `embed(S, p, differential_exp_argument(S, p, Vec1, Vec2))` instead of `embed(S, exp(S, p, Vec1), differential_exp_argument(S, p, Vec1, Vec2))` 🙈
OK, cool, nice that you managed to solve the problem 🙂.

Concerning the original problem in the research you are invested in: I am not a fan of using Euclidean distances for manifolds (i.e., something in the embedding), especially because if it works, then only very locally, and if you do not have an isometric embedding it probably will not work in general. So if possible, maybe use the `manifold_distance_problem` approach instead of the Euclidean variant you posted – well, at least if you want to do manifold stuff; the Euclidean problem you could also solve with any Euclidean solver.
Hey, thank you for the comment, I appreciate the input :)

If I may ask, what would you suggest if I have a mixed problem, where data points in Euclidean space are rotated (like in the orthogonal Procrustes problem) and I want to find the rotation that is in some sense optimal? Is there some alternative way to phrase this problem so that I could use the `manifold_distance_problem` approach?
Yes, there is an alternative approach via a rotation action: https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/group.html#Rotation-action. You probably need to define your own action based on the ones we have, so there likely would not be that much work. We also have differentials of `apply`: https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/group.html#Manifolds.apply_diff-Tuple{AbstractGroupAction,%20Any,%20Any,%20Any} and https://juliamanifolds.github.io/ManifoldDiffEq.jl/stable/lie_group_solvers.html#ManifoldDiffEq.apply_diff_group-Tuple{AbstractGroupAction,%20Any,%20Any,%20Any}.
There is also https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/group.html#Manifolds.center_of_orbit, which does some alignment, but probably not the one you need.
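To give a rough idea of the API (my sketch; the names `RotationAction`, `apply`, and `apply_diff` are from the docs linked above, but treat the exact signatures as something to double-check):

```julia
# Sketch: let SO(3) act on R^3 by matrix-vector multiplication.
using LinearAlgebra, Manifolds

G = SpecialOrthogonal(3)    # the rotation group SO(3)
E = Euclidean(3)            # the space being acted on
A = RotationAction(E, G)    # the action of SO(3) on R^3

a = Matrix(1.0I, 3, 3)      # a group element (here simply the identity rotation)
p = randn(3)                # a point in R^3
q = apply(A, a, p)          # rotate p by a
X = apply_diff(A, a, p, p)  # differential of the action at p, applied to a tangent vector
```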
Do you maybe have a reference with more mathematical details?

If you are looking for the best rotation, I would first think of writing it with a cost function `f(R)` where `R` is from SO(n). Maybe this Matlab example is what you are looking for? https://www.manopt.org/reference/examples/generalized_procrustes.html

I can check whether one of my students has already done that in Julia as well.
One point for the rotation action approach is that it allows for relatively easy generalization to rotations + translations (special Euclidean group), or some other group action that you might need.
@mateuszbaran, the Rotation action things look great. Thank you so much for the links. I hadn't heard about this, but I will take a look :)
@kellertuer:

> https://www.manopt.org/reference/examples/generalized_procrustes.html

Thanks, my problem is an extension of the Procrustes problem, because I want the rotation to depend on some known covariate.
> Do you maybe have a reference with more mathematical details?

Nothing properly written up; this is more something that my supervisor and I came up with during a brainstorming session. The gist of the idea is that we have $n$ measured data points $Y \in \mathbb{R}^{g \times n}$ and some known representation $Z \in \mathbb{R}^{g \times n}$. In addition, for each data point we know some covariate $t_i$. My cost function is
$$\mathrm{cost}(p, V) = \sum_{i=1}^n \| Y_i - \exp(p, t_i V)\, Z_i \|^2$$
and I try to find $p \in SO(g)$ and $V \in T_p SO(g)$ that minimize $\mathrm{cost}(p, V)$.
Ok, I am a little stuck without more details about the real numbers (?) $t_i$ you call covariate(s). And I have to think a little bit about what the gradient here might look like if you consider this on the tangent bundle of SO(g) (i.e., all tuples $(p, V)$ where $V \in T_p SO(g)$ for $p \in SO(g)$).
> without more details about the real numbers (?) t_i you call covariate

The approach is inspired by regular linear regression. In my specific case, $t$ could for example be a drug concentration, or a 0/1 vector to assign observations to a treatment group.
> I have to think a little bit what the gradient here might look if you consider this on the Tangent bundle of SO(g) (i.e. all tuples (p,V) where V is in TpSO(g) for p in SO(g)).

Yes, similar to the example in "Do geodesic regression", I consider the tuple $(p, V)$ on the tangent bundle of SO(g). I will try to provide some code soon to illustrate what I am doing :)
Cool, looking forward to that, it sounds really interesting.
@const-ae, would you mind me adding the diagonalizing basis on SO(n) developed here to Manifolds.jl? It would be quite useful.
No, go ahead, I am happy to see it being integrated :)
Great, thanks!
I think this issue is then resolved?
I think it is, in any case we can reopen it or make a new one if something more needs to be done.
I am working on a problem where I would like to implement geodesic regression on rotation manifolds (SO(n)). Following the geodesic regression tutorial in Manopt.jl, I dove into learning more about differentials of matrix exponentials and read the paper by Rentmeesters (2011), who explains how to form the orthonormal basis diagonalizing the Jacobi operator for SO(3). However, trying to implement his approach, I don't seem to be able to get the differential right.

In the above code example I am using a `check_diff` function that tests whether the Taylor expansion error decreases quadratically, following Section 4.8 of Boumal (2021).
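For reference, the check is based on the expansion stated in the `check_diff` comment (my restatement):
$$f(R_x(t v)) = f(x) + t \langle \operatorname{grad} f(x), v \rangle + O(t^2),$$
so when the supplied differential is correct, the error $|f(R_x(t v)) - f(x) - t\,\mathrm{D}f(x)[v]|$ should decay with slope 2 in the log-log plot; the two reference lines drawn via `Plots.abline!` have slopes 1 and 2.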