JuliaManifolds / Manifolds.jl

Manifolds.jl provides a library of manifolds aiming for an easy-to-use and fast implementation.
https://juliamanifolds.github.io/Manifolds.jl
MIT License

`mean` runtime-dispatch and allocations on Lie Groups - SE(2/3) #661

Open Affie opened 11 months ago

Affie commented 11 months ago

We use `mean` in a few performance-critical places and I've noticed it is one of the bottlenecks. For example:

julia> @time mean(SpecialEuclidean(2), [ArrayPartition(SA[-1.,0], SA[1 0; 0 1.]), ArrayPartition(SA[1.,0], SA[1 0; 0 1.])])
  0.000968 seconds (5.03 k allocations: 184.000 KiB)
([-5.551115123125783e-17, 0.0], [1.0 0.0; 0.0 1.0])

`mean` also makes NelderMead really slow.

Is this to be expected?

I found GeodesicInterpolation() and it makes things a lot faster, but I'm not familiar enough with the theory to know whether it's okay to use on Lie groups in general. Also, if GeodesicInterpolation() is indeed the correct way forward, should NelderMead have a parameter (similar to the retraction method) to set the mean estimation method?

julia> @time mean(SpecialEuclidean(2), [ArrayPartition(SA[-1.,0], SA[1 0; 0 1.]), ArrayPartition(SA[1.,0], SA[1 0; 0 1.])], GeodesicInterpolation())
  0.000038 seconds (30 allocations: 1.375 KiB)
([0.0, 0.0], [1.0 0.0; 0.0 1.0])

Medium term we would like to support any Lie Group, but short term the performance of SE(2/3) is most important.

Affie commented 11 months ago

Also, even with GeodesicInterpolation I'm still not getting the results I would expect, see the @profview flame graph:

[flame graph screenshot]

kellertuer commented 11 months ago

There are different approaches to approximating the mean (note that on most manifolds there is no closed-form solution like the nice one on Euclidean()), and we use a good default method to compute it; see e.g. https://juliamanifolds.github.io/Manifolds.jl/latest/features/statistics.html#Manifolds.GeodesicInterpolation. There is also, for example, https://juliamanifolds.github.io/Manifolds.jl/latest/features/statistics.html#Manifolds.GradientDescentEstimation. All of these use different methods to approximate the mean.

It heavily depends on the manifold which one works better; we could maybe think about a default_estimation_method (if we do not have that yet) here as well.
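To make the choice concrete, here is a small sketch of how the different estimators mentioned above can be passed to `mean` (assuming Manifolds.jl, StaticArrays.jl, and RecursiveArrayTools.jl are available; the relative accuracy and cost of each method depend on the manifold and the data):

```julia
using Manifolds, StaticArrays, RecursiveArrayTools

M = SpecialEuclidean(2)
pts = [
    ArrayPartition(SA[-1.0, 0.0], SA[1.0 0.0; 0.0 1.0]),
    ArrayPartition(SA[1.0, 0.0], SA[1.0 0.0; 0.0 1.0]),
]

# Default estimator, chosen per manifold by Manifolds.jl:
m1 = mean(M, pts)

# Single-pass geodesic interpolation: fast, inductive estimate:
m2 = mean(M, pts, GeodesicInterpolation())

# Iterative gradient-descent estimator: typically more accurate, but slower:
m3 = mean(M, pts, GradientDescentEstimation())
```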

Uff, Nelder-Mead itself is one of the slowest algorithms I am aware of. I would maybe recommend either sitting down and computing the gradient, or using some AD tools to compute it, and then using quasi-Newton (if your problem is smooth enough, that is).

But concerning your second post, and considering that Nelder-Mead is not far from randomly wandering over the manifold looking for a nice cost value: what would you expect? Without gradient information I would not expect much.
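As a hedged sketch of the quasi-Newton suggestion: for a smooth cost with a known gradient, Manopt.jl's quasi_Newton solver can replace NelderMead. The cost below (a Riemannian mean objective on a sphere, with its known closed-form gradient) is purely illustrative and not specific to this issue; it assumes a recent Manopt.jl where cost and gradient take `(M, p)`:

```julia
using Manifolds, Manopt

M = Sphere(2)
data = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# Riemannian mean objective: average squared distance to the data points.
f(M, p) = sum(distance(M, p, q)^2 for q in data) / (2 * length(data))

# Its Riemannian gradient: minus the average of the logarithmic maps.
grad_f(M, p) = -sum(log(M, p, q) for q in data) / length(data)

p0 = [0.0, 0.0, 1.0]
p_opt = quasi_Newton(M, f, grad_f, p0)
```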

mateuszbaran commented 11 months ago

I will take a look at whether performance can be improved. Also, perhaps NelderMead just shouldn't run the mean computation to convergence. SE(2) may also have a closed-form formula for the mean.

Affie commented 11 months ago

Thanks for the explanation.

Uff, Nelder-Mead itself is one of the slowest algorithms I am aware of. I would maybe recommend either sitting down and computing the gradient, or using some AD tools to compute it, and then using quasi-Newton (if your problem is smooth enough, that is).

One of our algorithms currently uses NelderMead from Optim.jl and I wanted to swap it out for the Manopt version. The next step would be to use analytic gradients where applicable, or fall back to AD for smooth problems.

But concerning your second post, and considering that Nelder-Mead is not far from randomly wandering over the manifold looking for a nice cost value: what would you expect? Without gradient information I would not expect much.

That part of the flame graph is where we have to calculate the variance of about 100 points.

NelderMead has so far proven more robust than the other methods I tried, and I suspect it will continue to have a place as a fallback; for example, some of our cost functions are Flux models.

kellertuer commented 11 months ago

One of our algorithms currently uses NelderMead from Optim.jl and I wanted to swap it out for the Manopt version. The next step would be to use analytic gradients where applicable, or fall back to AD for smooth problems.

That sounds reasonable.

That part of the flame graph is where we have to calculate the variance of about 100 points.

Ah ok, so is that computation inside Nelder-Mead, or outside of it?

NelderMead has so far proven more robust than the other methods I tried, and I suspect it will continue to have a place as a fallback; for example, some of our cost functions are Flux models.

Sure, it is robust. Maybe similar to a tractor: it gets forward and does the job, but does not win in Formula One ;) But sure, sometimes there are restrictions, e.g. that the gradient is too expensive (then you could still try forward differences, if function evaluations are not too expensive), such that the tractor is the only choice.

mateuszbaran commented 11 months ago

@kellertuer what do you think about moving AbstractEstimationMethod to ManifoldsBase.jl so that Manopt.jl's NelderMead could have a swappable mean estimation method?
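A purely hypothetical sketch of what that could look like; none of these names (`MyNelderMeadState`, `my_default_estimation_method`) are existing Manopt.jl or ManifoldsBase.jl API, and the state is heavily simplified:

```julia
using Manifolds, Statistics

# Hypothetical, simplified solver state carrying the estimator as a field.
struct MyNelderMeadState{TE}
    population::Vector{Any}  # simplex points on the manifold
    estimation_method::TE    # e.g. GeodesicInterpolation()
end

# A manifold could declare its preferred estimator (hypothetical function):
my_default_estimation_method(::SpecialEuclidean) = GeodesicInterpolation()

# The solver step would then compute the simplex mean via something like
#   m = mean(M, state.population, state.estimation_method)
```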

kellertuer commented 11 months ago

Sounds nice, but then really together with a default_estimation_method, which we would then also use consistently for some manifolds here.

mateuszbaran commented 11 months ago

I will take a look at that after the 0.15 update is finished.