In `src/pointprocess/optimizers/BaseOptimizers`, the method `optimizeNewton` is responsible for finding the best set of parameters `x` given the currently observed dataset. The optimization process consists of finding a local minimum of a non-convex likelihood function as quickly as possible, ideally through Newton's method. While the computation of the gradient vector and Hessian matrix of the likelihood function is reliable (and well tested) for each available optimizer, the current implementation of the optimization routine could surely be enhanced: in certain situations it fails to reach convergence, or it takes too many iterations to do so.
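For context, each iteration performs a (damped) Newton step

$$
x_{k+1} = x_k - \alpha_k \, H(x_k)^{-1} \, g(x_k),
$$

where $g$ and $H$ are the gradient and Hessian of the objective and $\alpha_k$ is a step size chosen by line search; when $H(x_k)$ is not positive definite, the routine currently falls back to a plain gradient step $x_{k+1} = x_k - \alpha_k \, g(x_k)$, as the pseudocode below shows.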
The current implementation, in pseudocode, is presented below:
```
x_new = SOME_INITIAL_POINT
ftol  = 1e-5   // or gtol = 1e-5
fdiff = +INF   // or gnorm = +INF

while (fdiff > ftol)   // or (gnorm > gtol)
    x_old = x_new
    // Compute current gradient vector.
    g = gradient(x_old)
    // Compute current Hessian matrix.
    H = hessian(x_old)
    // See "Useful links" for the rationale behind this if statement.
    if isPositiveDefinite(H)
        x_new = computeNewtonRaphsonUpdateWithLineSearch(x_old, H, g)
    else
        x_new = computeGradientDescentUpdateWithLineSearch(x_old, g)
    fdiff = abs(f(x_new) - f(x_old))   // or gnorm = computeNorm(g)
```
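Both `computeNewtonRaphsonUpdateWithLineSearch` and `computeGradientDescentUpdateWithLineSearch` presumably perform some form of backtracking line search along their respective directions. For reference, here is a minimal C++/Eigen sketch of a backtracking search with the Armijo (sufficient decrease) condition; the function name, signature, and constants are illustrative and not the library's actual API.

```cpp
#include <Eigen/Dense>
#include <functional>

// Backtracking line search with the Armijo (sufficient decrease) condition.
// Given the current point x, a descent direction p (e.g. -H^-1 * g or -g),
// the objective f and its gradient g at x, shrink the step size alpha until
// f(x + alpha * p) <= f(x) + c1 * alpha * g.dot(p).
// Illustrative sketch only, not the library's API.
double backtrackingLineSearch(
    const std::function<double(const Eigen::VectorXd&)>& f,
    const Eigen::VectorXd& x,
    const Eigen::VectorXd& g,
    const Eigen::VectorXd& p,
    double alpha0 = 1.0,     // initial step (1.0 is the natural Newton step)
    double c1 = 1e-4,        // sufficient-decrease constant
    double rho = 0.5,        // shrink factor
    int maxBacktracks = 50)
{
    const double fx = f(x);
    const double slope = g.dot(p);  // negative for a descent direction
    double alpha = alpha0;
    for (int i = 0; i < maxBacktracks; ++i) {
        if (f(x + alpha * p) <= fx + c1 * alpha * slope) {
            return alpha;  // Armijo condition satisfied
        }
        alpha *= rho;
    }
    return alpha;  // fall back to the smallest step tried
}
```

The constants `c1 = 1e-4` and `rho = 0.5` are the usual textbook defaults; a Wolfe curvature check could be layered on top if stronger guarantees are needed.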
Useful links:
One idea could be to slightly modify the Hessian matrix when it is not positive definite, so as to fix the update direction, instead of falling back to gradient descent, as explained here.
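As a sketch of that idea, the snippet below (C++/Eigen, a hypothetical helper, not the library's implementation) keeps adding a multiple of the identity to the Hessian until the Cholesky factorization succeeds, then uses the regularized matrix to compute the search direction; this is the classic "Cholesky with added multiple of the identity" strategy.

```cpp
#include <Eigen/Dense>
#include <algorithm>

// Hypothetical helper (not the library's implementation): compute a descent
// direction by regularizing the Hessian with tau * I until it is positive
// definite ("Cholesky with added multiple of the identity").
Eigen::VectorXd modifiedNewtonDirection(const Eigen::MatrixXd& H,
                                        const Eigen::VectorXd& g)
{
    const double beta = 1e-3;
    const auto n = H.rows();
    const Eigen::MatrixXd I = Eigen::MatrixXd::Identity(n, n);

    // Start with no shift if the diagonal is already positive; otherwise
    // shift by enough to make the smallest diagonal entry positive.
    const double minDiag = H.diagonal().minCoeff();
    double tau = (minDiag > 0.0) ? 0.0 : (-minDiag + beta);

    for (int attempt = 0; attempt < 60; ++attempt) {
        Eigen::LLT<Eigen::MatrixXd> llt(H + tau * I);
        if (llt.info() == Eigen::Success) {
            // (H + tau * I) is positive definite: solve (H + tau * I) p = -g.
            return llt.solve(-g);
        }
        // Factorization failed: increase the shift and try again.
        tau = std::max(2.0 * tau, beta);
    }
    // Extremely ill-conditioned case: fall back to the gradient direction,
    // which matches the optimizer's current behaviour.
    return -g;
}
```

The resulting direction could then be fed to the same line search used today, so only the branch that currently switches to gradient descent would need to change.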
Since this is probably the most important function of the library at the moment, I'm going to develop an internal tool/executable to easily benchmark a proposed solution.
There are surely better ways to find a local minimum of a non-convex multidimensional function.