In `src/pointprocess/optimizers/BaseOptimizers`, the method `optimizeNewton` is responsible for finding the best set of parameters `x` given the currently observed dataset. The optimization process consists of finding a local minimum of a non-convex likelihood function as quickly as possible, ideally through Newton's method. While the computation of the gradient vector and Hessian matrix of the likelihood function is reliable (and well tested) for each available optimizer, the current implementation of the optimization routine could surely be enhanced: in certain situations it fails to reach convergence, or it takes too many iterations to do so.
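For context, each iteration performs a (damped) Newton step

$$
x_{k+1} = x_k - \alpha_k \, H(x_k)^{-1} \, g(x_k),
$$

where $g$ and $H$ are the gradient and Hessian of the objective and $\alpha_k$ is a step size chosen by line search; when $H(x_k)$ is not positive definite, the routine currently falls back to a plain gradient step $x_{k+1} = x_k - \alpha_k \, g(x_k)$, as the pseudocode below shows.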
The current implementation, in pseudocode, is presented below:
```
x_new = SOME_INITIAL_POINT
ftol  = 1e-5   // or gtol = 1e-5
fdiff = +INF   // or gnorm = +INF

while (fdiff > ftol)   // or (gnorm > gtol)
    x_old = x_new
    // Compute current gradient vector.
    g = gradient(x_old)
    // Compute current Hessian matrix.
    H = hessian(x_old)
    // See "Useful links" for the rationale behind this if statement.
    if isPositiveDefinite(H)
        x_new = computeNewtonRaphsonUpdateWithLineSearch(x_old, H, g)
    else
        x_new = computeGradientDescentUpdateWithLineSearch(x_old, g)
    fdiff = abs(f(x_new) - f(x_old))   // or gnorm = computeNorm(g)
```
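Both `computeNewtonRaphsonUpdateWithLineSearch` and `computeGradientDescentUpdateWithLineSearch` presumably perform some form of backtracking line search along their respective directions. For reference, here is a minimal C++/Eigen sketch of a backtracking search with the Armijo (sufficient decrease) condition; the function name, signature, and constants are illustrative and not the library's actual API.

```cpp
#include <Eigen/Dense>
#include <functional>

// Backtracking line search with the Armijo (sufficient decrease) condition.
// Given the current point x, a descent direction p (e.g. -H^-1 * g or -g),
// the objective f and its gradient g at x, shrink the step size alpha until
// f(x + alpha * p) <= f(x) + c1 * alpha * g.dot(p).
// Illustrative sketch only, not the library's API.
double backtrackingLineSearch(
    const std::function<double(const Eigen::VectorXd&)>& f,
    const Eigen::VectorXd& x,
    const Eigen::VectorXd& g,
    const Eigen::VectorXd& p,
    double alpha0 = 1.0,     // initial step (1.0 is the natural Newton step)
    double c1 = 1e-4,        // sufficient-decrease constant
    double rho = 0.5,        // shrink factor
    int maxBacktracks = 50)
{
    const double fx = f(x);
    const double slope = g.dot(p);  // negative for a descent direction
    double alpha = alpha0;
    for (int i = 0; i < maxBacktracks; ++i) {
        if (f(x + alpha * p) <= fx + c1 * alpha * slope) {
            return alpha;  // Armijo condition satisfied
        }
        alpha *= rho;
    }
    return alpha;  // fall back to the smallest step tried
}
```

The constants `c1 = 1e-4` and `rho = 0.5` are the usual textbook defaults; a Wolfe curvature check could be layered on top if stronger guarantees are needed.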
Useful links:
One idea could be to slightly modify the Hessian matrix when it is not positive definite, so as to fix the update direction, instead of falling back to gradient descent, as explained here.
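As a sketch of that idea, the snippet below (C++/Eigen, a hypothetical helper, not the library's implementation) keeps adding a multiple of the identity to the Hessian until the Cholesky factorization succeeds, then uses the regularized matrix to compute the search direction; this is the classic "Cholesky with added multiple of the identity" strategy.

```cpp
#include <Eigen/Dense>
#include <algorithm>

// Hypothetical helper (not the library's implementation): compute a descent
// direction by regularizing the Hessian with tau * I until it is positive
// definite ("Cholesky with added multiple of the identity").
Eigen::VectorXd modifiedNewtonDirection(const Eigen::MatrixXd& H,
                                        const Eigen::VectorXd& g)
{
    const double beta = 1e-3;
    const auto n = H.rows();
    const Eigen::MatrixXd I = Eigen::MatrixXd::Identity(n, n);

    // Start with no shift if the diagonal is already positive; otherwise
    // shift by enough to make the smallest diagonal entry positive.
    const double minDiag = H.diagonal().minCoeff();
    double tau = (minDiag > 0.0) ? 0.0 : (-minDiag + beta);

    for (int attempt = 0; attempt < 60; ++attempt) {
        Eigen::LLT<Eigen::MatrixXd> llt(H + tau * I);
        if (llt.info() == Eigen::Success) {
            // (H + tau * I) is positive definite: solve (H + tau * I) p = -g.
            return llt.solve(-g);
        }
        // Factorization failed: increase the shift and try again.
        tau = std::max(2.0 * tau, beta);
    }
    // Extremely ill-conditioned case: fall back to the gradient direction,
    // which matches the optimizer's current behaviour.
    return -g;
}
```

The resulting direction could then be fed to the same line search used today, so only the branch that currently switches to gradient descent would need to change.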
Since this is probably the most important function of the library at the moment, I'm going to develop an internal tool/executable to easily benchmark a proposed solution.
There are surely better ways to find a local minimum of a non-convex multidimensional function.