fslaborg / FSharp.Stats

statistical testing, linear algebra, machine learning, fitting and signal processing in F#
https://fslab.org/FSharp.Stats/

[Feature Request] Levenberg–Marquardt algorithm: Different increase and decrease for damping parameter λ #145

Open caroott opened 3 years ago

caroott commented 3 years ago

The current implementation of the Levenberg–Marquardt algorithm takes the damping parameter (lambda) and a single factor that raises or lowers it (lambdaFactor) as parameters. For some problems it can be beneficial to use different values for increasing and decreasing the damping parameter.

In this scheme, if a step is accepted, then λ is decreased by a fixed factor, say 10. If a step is rejected then λ is appropriately raised by a factor of 10. (...) We have much greater success using a factor of 2 or 3 on most problems. Additionally, we find that lowering λ by a larger factor than it is raised also produces more beneficial results. For many moderate sized problems decreasing by a factor of 3 and raising by a factor of 2 is adequate. For larger problems, decreasing by a factor of 5 and raising by a factor of 1.5 is better. -- Transtrum, Mark K; Sethna, James P (2012)
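To make the quoted scheme concrete, here is a minimal, self-contained sketch (not FSharp.Stats code; all names and the toy model are illustrative) of a scalar Levenberg–Marquardt loop for fitting y = exp(a·x), showing where separate increase and decrease factors enter the step control:

```fsharp
// Hypothetical sketch: LM step control for a single parameter a in y = exp(a * x).
let lmScalar (xs: float[]) (ys: float[]) a0 lambdaInit lambdaIncrease lambdaDecrease maxIter =
    let rss a =
        Array.fold2 (fun acc x y -> let r = y - exp (a * x) in acc + r * r) 0.0 xs ys
    let rec loop a lambda iter =
        if iter >= maxIter then a
        else
            // gradient and Gauss-Newton "Hessian" of the RSS w.r.t. a
            let g = Array.fold2 (fun acc x y -> acc - 2.0 * (y - exp (a * x)) * x * exp (a * x)) 0.0 xs ys
            let h = Array.fold2 (fun acc x _ -> acc + 2.0 * (x * exp (a * x)) ** 2.0) 0.0 xs ys
            // damped step: (H + lambda) * delta = -g
            let delta = -g / (h + lambda)
            let aNew = a + delta
            if rss aNew < rss a then
                loop aNew (lambda / lambdaDecrease) (iter + 1) // step accepted: relax damping
            else
                loop a (lambda * lambdaIncrease) (iter + 1)    // step rejected: raise damping

    loop a0 lambdaInit 0

// Transtrum & Sethna's suggestion of raising by 2 and lowering by 3 would then be:
// lmScalar xs ys 0.1 1.0 2.0 3.0 100
```

The current single-factor behavior is recovered by passing the same value for both factors, which is exactly what the proposed obsolete shim would do.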

That is why I'd like to propose splitting the parameter "lambdaFactor" into "lambdaFactorIncrease" and "lambdaFactorDecrease". This change would break the function for everyone currently using it. My suggestion for handling this would be to rename the LevenbergMarquardt modules, abbreviating them as LM. The old module can then remain (marked as obsolete) and contain a version of the function that takes a single lambdaFactor and calls the new version internally, passing that factor twice. This way the existing functions would still work, and the new version would be available under a shorter module name.

open System

module LM =
    let estimateParams (...) lambdaInit lambdaFactorIncrease lambdaFactorDecrease = 
        (...)
    let estimateParamsVerbose (...) lambdaInit lambdaFactorIncrease lambdaFactorDecrease = 
        (...)

[<Obsolete("Use module LM instead")>]
module LevenbergMarquardt = 

    [<Obsolete("Use LM.estimateParams with additional lambdaFactorDecrease instead")>]
    let estimateParams (...) lambdaInit lambdaFactor = 
        LM.estimateParams (...) lambdaInit lambdaFactor lambdaFactor 

    [<Obsolete("Use LM.estimateParams with additional lambdaFactorDecrease instead")>]
    let estimateParamsVerbose (...) lambdaInit lambdaFactor =
        LM.estimateParamsVerbose (...) lambdaInit lambdaFactor lambdaFactor 
bvenn commented 3 years ago

I like your suggestion and think this modification could boost the efficiency and accuracy of the LM optimization. Renaming LevenbergMarquardt to LM is in line with conventions in other languages (already discussed in #94 for OLS). I do not think this will add a hurdle for the user, but rather simplify access.

I wonder how the constrained version of LM should be treated. Would it be beneficial to add it to LM or leave it as a separate module? Or do you have any other thoughts @Joott @ZimmerD?

Suggestion:

module LM = (...)

    module Constrained = (...)
bvenn commented 3 years ago

In any case, we should update the Fitting and Growth curve documentation after this change.

ZimmerD commented 3 years ago

I am also in favor of renaming the module!

I think the constrained and unconstrained implementations share a lot of code, and the unconstrained version seems to be just a special case of the constrained one (with no bounds). So I guess it would make sense to replace the unconstrained version with a call to the constrained one with infinite bounds?
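That delegation could look roughly like this (hypothetical signatures in the thread's own sketch style; the actual parameter lists of the FSharp.Stats functions are elided):

```fsharp
module LM =

    module Constrained =
        // full implementation with lower/upper bound vectors
        let estimateParams (...) (lowerBounds: float[]) (upperBounds: float[]) =
            (...)

    // unconstrained = constrained with infinite bounds
    let estimateParams (...) (paramCount: int) =
        let lower = Array.create paramCount System.Double.NegativeInfinity
        let upper = Array.create paramCount System.Double.PositiveInfinity
        Constrained.estimateParams (...) lower upper
```

This keeps a single optimization loop to maintain, at the cost of a few comparisons against infinite bounds per iteration in the unconstrained case.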

When it comes to the rescaling of lambda, I think we should reevaluate how many different "standard" ways of manipulating lambda exist, and maybe also replace this rescaling with a function that depends on different quantities of an iteration, e.g. the Jacobian, the RSS, and so on. This way we would have a generic way of manipulating lambda that would not require a rework of the module at a later stage.

caroott commented 3 years ago

As far as I can see, the Hessian matrix, and in a wider sense the Jacobian matrix, are influenced by lambda. So passing those two, together with the RSS, to the function manipulating lambda would make sense. Since the usage of such a function may be unintuitive, I'd propose the following solution: we include two versions of the function, "estimateParamsVerboseDetailed" and "estimateParamsVerbose". The detailed version takes two functions, one for increasing and one for decreasing lambda. The simple version takes two float values and calls the detailed version, passing two internally constructed functions that return the given scaling factors. This could then look like this:

module LM =

    let estimateParamsVerboseDetailed (...) lambdaInit (lambdaFactorIncrease: Matrix -> Matrix -> float -> float) (lambdaFactorDecrease: Matrix -> Matrix -> float -> float) = 
        (...)

    let estimateParamsVerbose (...) lambdaInit (lambdaFactorIncrease: float) (lambdaFactorDecrease: float) = 
        // wrap the constant factors as functions that ignore the iteration state
        let lambdaFactorIncreaseFunc = fun _jacobian _hessian _rss -> lambdaFactorIncrease
        let lambdaFactorDecreaseFunc = fun _jacobian _hessian _rss -> lambdaFactorDecrease
        estimateParamsVerboseDetailed (...) lambdaInit lambdaFactorIncreaseFunc lambdaFactorDecreaseFunc

I'm not sure if this is the best approach, so I'm open to other suggestions. Maybe it would also make sense to include more parameters, or even all of them, for maximum flexibility.
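As one data point for this design discussion: a function-valued interface would admit schedules beyond constant scaling, e.g. a Nielsen-style scheme where the raise factor doubles on each consecutive rejection. A hedged sketch against the proposed (hypothetical) `Matrix -> Matrix -> float -> float` signature, using a closure over mutable state:

```fsharp
// Illustrative only: factor functions matching the proposed signature.
// A raise factor that doubles on every call (i.e. every rejected step):
let makeDoublingIncrease () =
    let mutable nu = 2.0
    fun (_jacobian: Matrix) (_hessian: Matrix) (_rss: float) ->
        let factor = nu
        nu <- nu * 2.0
        factor

// A constant decrease factor expressed in the same interface:
let constantDecrease factor =
    fun (_jacobian: Matrix) (_hessian: Matrix) (_rss: float) -> factor
```

Note that the full Nielsen scheme also resets the raise factor after an accepted step, which the two independent functions above cannot express; that might be an argument for passing more iteration state (or a shared state record) to the factor functions, as the last paragraph suggests.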