JuliaManifolds / Manopt.jl

🏔️Manopt. jl – Optimization on Manifolds in Julia
http://manoptjl.org
Other
314 stars 40 forks source link

Improving performance of L-BFGS #333

Closed mateuszbaran closed 8 months ago

mateuszbaran commented 9 months ago

I'm currently working on some benchmarks aimed at identifying Manopt.jl's performance bottlenecks.

Bottlenecks for quick-to-evaluate functions:

  1. https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/plans/stepsize.jl#L795 should be in-place
  2. get_gradient should be turned into in-place: https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/plans/stepsize.jl#L812
  3. [resolved in #334] https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/plans/quasi_newton_plan.jl#L495 is slow
  4. https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/plans/quasi_newton_plan.jl#L474 should be in-place
  5. https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/plans/stepsize.jl#L828 should be in-place
  6. https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/plans/stepsize.jl#L368 should be in-place
  7. https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/solvers/quasi_Newton.jl#L339 could we somehow multiply η by α in-place?

Maybe more but these look like the main ones

Bottlenecks for more expensive functions:

The code:


using Optim, Manopt
using Manifolds

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
f_manopt(::Euclidean, x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

optimize(f, [0.0, 0.0], Optim.NelderMead())

function g!(storage, x)
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    storage[2] = 200.0 * (x[2] - x[1]^2)
end

optimize(f, g!, [0.0, 0.0], LBFGS())

function g_manopt!(::Euclidean, storage, x)
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    storage[2] = 200.0 * (x[2] - x[1]^2)
    if isnan(x[1])
        error("nan")
    end
    return storage
end

M = Euclidean(2)
Manopt.NelderMead(M, f_manopt)

qn_opts = quasi_Newton(M, f_manopt, g_manopt!, [0.0, 0.0]; evaluation=InplaceEvaluation(), return_state=true)

using Profile
using ProfileView
using BenchmarkTools

test_f() = quasi_Newton(M, f_manopt, g_manopt!, [0.0, 0.0]; evaluation=InplaceEvaluation(), return_state=true)

@profile for i in 1:10000; test_f(); end
ProfileView.view()
kellertuer commented 9 months ago

I now worked 1.5 hours on point 6, I think I have something that might work, and it is even non-breaking, just for the Circle and the Positive Numbers and such none of your inlace ideas would work actually.

kellertuer commented 9 months ago

I think (4) can not be improved, without risking a lot of side effects.

The r is the direction we compute and return. Sure it would be easy to have a storage for that in the accompanying struct – but since r is also the value we return The only trick I could see, but that would probably be a bit of work, having direction updates that work in place so currently we have

https://github.com/JuliaManifolds/Manopt.jl/blob/3d38cc559bf29e030ab5b24877c06131e679622a/src/solvers/quasi_Newton.jl#L336

which we could transform into a

qns.direction_update(η, mp, qns)

when each of the direction update functors would have both the current (allocating) and an inplace variant. That would be a bit of work but probably worth it. What do you think?

edit: It seems there are only 2 direction update rules affected if I see that correctly, that is of course not so much to do (the other one even ends in a get_vector so that is also easy to adapt). I might try that later today on the PR I already work on.

Then only (7) is left, where for now I have no clue how one could do that, unless ManifoldsBase.jl would provide an inplace-scaling I think? Then one could switch the scaling and the retract and overwrite the eta probably.

kellertuer commented 9 months ago

I saw an easy way to realise what I proposed, but it indeed seems that now we need a copy somewhere still, since now we have a side effect that makes more tests fail.

But in theory (just not yet working in practice) all 7 points should be resolved.

The problem is now that a few quasi newton tests return nan and I do not yet see why, though I spent quite some time on that by now. If you see what that might be, let me know :)

mateuszbaran commented 8 months ago

Cool to see this completed :tada: . I will re-check performance and improve my benchmark.

kellertuer commented 8 months ago

Nice! In the long run I will start a benchmark CI as well.