andreabonvini / pointprocess

A C++ Library for Point Process Analysis
https://pointprocess.readthedocs.io/en/latest/
MIT License
7 stars 2 forks source link

Refactor and optimize `optimizeNewton` routine. #4

Open andreabonvini opened 2 years ago

andreabonvini commented 2 years ago

There are surely better ways to find the local minimum of a non-convex multidimensional function.

andreabonvini commented 2 years ago

In src/pointprocess/optimizers/BaseOptimizers the method optimizeNewton is responsible for finding the best set of parameters x based on the currently observed dataset. The optimization process consists of finding the local minimum of a non-convex likelihood function in the fastest way possible, possibly through the Newton's method. While the computation of the gradient vector and hessian matrix w.r.t. the likelihood function for each available optimizer is reliable (and well tested), the current implementation of the optimization routine could be surely enhanced, since in certain situations it isn't able to reach convergence or it takes too much time (measured in number of iterations) before doing so.

The current pseudocode is presented below:

x_new = SOME_INITIAL_POINT
ftol = 1e-5 //  or gtol = 1e-5
fdiff = + INF // or gnorm = + INF

while (fdiff > ftol) // or (gnorm > gtol)

    x_old = x_new
    // Compute current gradient vector.
    g = gradient(x_old)
    // Compute current hessian matrix.
    H = hessian(x_old)

    // See "Useful Links" for the rationale behind this if statement.
    if isPositiveDefinite(H)
        x_new = computeNewtonRaphsonUpdateWithLineSearch(x_old,H,g)
    else
        x_new = computeGradientDescentUpdateWithLineSearch(x_old,g)

    fdiff = abs(f(x_new) - f(x_old)) // or gnorm = computeNorm(g)

Useful links:

One idea could be to slightly change the hessian matrix if it's not positive definite to adjust the update direction, instead of relying on gradient descent, as explained here.

Since this is probably the most important function of the library at the moment, I'm going to develop an internal tool/executable to easily benchmark a proposed solution.