JuliaNLSolvers / Optim.jl

Optimization functions for Julia

Preallocate caches in Fminbox to avoid unnecessary value_gradient!! calls #704

Closed. pkofod closed this issue 4 years ago.

pkofod commented 5 years ago

see https://stackoverflow.com/questions/54271116/optim-jl-makes-many-redundant-function-calls-per-iteration?noredirect=1#comment96906868_54271116

nfernandez-arias commented 5 years ago

In order to make this easier to engage with, here is a self-contained example of this issue.

using Optim

function f(x)
    # Log every evaluation so redundant calls become visible
    # (the "figures" directory must already exist).
    open("figures/Optim_Bug_Example_Log.txt", "a") do log_file
        write(log_file, "Iteration: x1 = $(x[1]); x2 = $(x[2])\n")
    end
    return (x[1] - x[2])^2 + x[1]
end

initial_x = [0.0, 0.0]
lower = [-0.5, -0.5]
upper = [1.0, 1.0]

inner_optimizer = LBFGS()

results = optimize(f, lower, upper, initial_x, Fminbox(inner_optimizer),
    Optim.Options(iterations = 0, store_trace = true, show_trace = true))

The trace is

######## fminbox ########
Initial mu = 0.00016666666666666666
#### Fminbox #1: Calling optimizer with mu = 0.00016666666666666666 ####
Iter     Function value   Gradient norm
   0     2.310491e-04     9.998333e-01

The log produced is

Iteration: x1 = 6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = -6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = 0.0; x2 = 6.0554544523933395e-6
Iteration: x1 = 0.0; x2 = -6.0554544523933395e-6
Iteration: x1 = 6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = -6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = 0.0; x2 = 6.0554544523933395e-6
Iteration: x1 = 0.0; x2 = -6.0554544523933395e-6
Iteration: x1 = 0.0; x2 = 0.0
Iteration: x1 = 6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = -6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = 0.0; x2 = 6.0554544523933395e-6
Iteration: x1 = 0.0; x2 = -6.0554544523933395e-6
Iteration: x1 = 0.0; x2 = 0.0
Iteration: x1 = 0.0; x2 = 0.0

As you can see, for any x such that the algorithm evaluates f(x), it evaluates it three times. In cases (such as my application) where evaluating the objective function is the costly step, this means that the speed could be improved by a factor of 3.

The same happens if I replace LBFGS() with GradientDescent() or ConjugateGradient().

However, the problem goes away if I do not use Fminbox but instead do unconstrained optimization. So the problem, as pointed out by pkofod above, has to do with Fminbox.

I would try to fix it myself, but I don't have a good understanding of how the source code of Optim.jl works. I hope this helps, let me know if any additional information would be helpful. Thank you.
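For reference, evaluations can also be counted without file I/O by wrapping the objective in a counting closure. This is a generic sketch in plain Julia (not part of Optim's API); the same wrapper can then be passed to `optimize`:

```julia
# Wrap an objective in a closure that increments a shared counter,
# so evaluations can be counted without writing to a log file.
function make_counted(f)
    calls = Ref(0)                        # counter captured by the closure
    counted = x -> (calls[] += 1; f(x))
    return counted, calls
end

g, ncalls = make_counted(x -> (x[1] - x[2])^2 + x[1])
g([0.0, 0.0])
g([1.0, 0.5])
ncalls[]   # 2 evaluations so far
```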

pkofod commented 5 years ago

Yeah, I know exactly what the problem is, I just haven’t gotten around to it due to work commitments. Sorry, I’ll try to get it done

nfernandez-arias commented 5 years ago

Ok that's great news. Thanks for your help.

nfernandez-arias commented 5 years ago

Now, attempting to minimize a function of 9 variables in a box (basically, like my original Stack Overflow question, but with more variables), I get the following output:

Final Error: (g, 3.2881657562044975e-7; L_RD, 6.258150524879325e-8; w, 2.30316828858379e-5)
17899.430937 seconds (167.52 G allocations: 13.761 TiB, 5.65% gc time)
Results of Optimization Algorithm
 * Algorithm: Fminbox with L-BFGS
 * Starting Point: [3.0,1.2,0.75,1.08,0.05,0.4,0.2,0.1,0.05]
 * Minimizer: [2.996530305989923,1.0000000000000002, ...]
 * Minimum: 1.285687e+00
 * Iterations: 8
 * Convergence: false
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 2.19e-09 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = 4.69e-06 |f(x)|
   * |g(x)| ≤ 1.0e-08: false 
     |g(x)| = 6.67e-01 
   * Stopped by an increasing objective: true
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 5740
 * Gradient Calls: 5740
Score: 1.2856864993762607

It seems to have called the objective and gradient 5740 times, while it only did 8 iterations. Am I misreading this? I would expect more than 8 iterations to be necessary for calculating the optimum in a 9-dimensional space. Is it miscalculating the number of iterations?

pkofod commented 5 years ago

You are misreading it, so we should improve the printing here. Those are not L-BFGS iterations; they are the number of times the barrier term was updated. So it was probably doing something like... I don't know, (5740/8)/4, roughly 180 iterations per barrier term update.

The way Fminbox works is that it incorporates how close you are to the bounds in the objective function. As the "major iterations" (those you see in your output) progress, there's a term that gives that closeness to bounds less and less weight, allowing you to go to the bounds if the actual solution is at the bounds.
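The barrier idea described above can be sketched as minimizing f(x) + μ·b(x), where b(x) blows up as x approaches the box bounds and μ shrinks across the outer "major" iterations. A toy log-barrier sketch in plain Julia, not Fminbox's actual implementation:

```julia
# Log-barrier for box constraints: grows without bound as any coordinate
# of x approaches its lower or upper bound.
barrier(x, lower, upper) = -sum(log.(x .- lower) .+ log.(upper .- x))

f(x) = (x[1] - x[2])^2 + x[1]

# Barrier-penalized objective; mu controls the weight of the bound term.
penalized(x, mu, lower, upper) = f(x) + mu * barrier(x, lower, upper)

lower, upper = [-0.5, -0.5], [1.0, 1.0]
x = [0.0, 0.0]

# As mu shrinks across outer iterations, the penalized objective
# approaches the true objective, letting iterates reach the bounds.
penalized(x, 1e-1, lower, upper) - f(x)   # large barrier contribution
penalized(x, 1e-4, lower, upper) - f(x)   # nearly zero
```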

I'm very curious about your code, by the way. Those are enormous amounts of allocations being made. I know it's running for a long time, but I'm curious to have a look at what you're doing. Feel free to show it here, or reach out.

nfernandez-arias commented 5 years ago

Thanks for the thoughtful reply.

I just added you as a collaborator on the private github page for my project, if you feel like taking a look.

I am calibrating an economic model. For each evaluation of the objective function, I am solving a fixed point problem with three nested iterations. The code is in code/julia/modules/scripts.

CalibrationModuleScript contains the code for the calibration, including the calls to Optim. It refers to ModelSolverScript, which contains the two outer fixed point iterations. This in turn refers to HJBModuleScript, which solves a differential equation using another iterative method. All of them refer variously to the other modules; let me know if it's not clear.

In order to solve the model, one needs an initial guess. CalibrationModuleScript gains a lot of speed by updating the guess for the next function call using the solution to the model obtained at the last function call (I do this with mutable structs, but I think the same can be done with global variables, not sure which is faster).
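The warm-start pattern described here can be sketched as follows. The names and the toy "solver" are purely illustrative, not taken from the project code:

```julia
# Keep the last solution in a mutable struct and reuse it as the
# initial guess for the next solve (warm starting).
mutable struct WarmStart
    guess::Vector{Float64}
end

function solve_with_warm_start!(ws::WarmStart, solve)
    sol = solve(ws.guess)   # inner solver starts from the previous solution
    ws.guess = sol          # remember it for the next call
    return sol
end

ws = WarmStart([0.0])
# Toy "solver": one damped fixed-point step toward 1.0.
step(g) = g .+ 0.5 .* (1.0 .- g)
solve_with_warm_start!(ws, step)   # guess becomes [0.5]
solve_with_warm_start!(ws, step)   # guess becomes [0.75]
```

Compared with a global variable, the struct keeps the state explicit and passable; either works for this pattern.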

As far as allocations go, at one point I had a significantly higher % gc time, and I went through and replaced some declarations of new structs with updates of mutable structs. According to the output above, the model takes about 3 seconds on average to solve. This is surprising to me, since it looks faster, so I am now testing whether the function calls are being counted properly.

nfernandez-arias commented 5 years ago

I just ran my code with the following call to Optim:

calibrationResults = optimize(f,lower,upper,initial_x,Fminbox(inner_optimizer),Optim.Options(iterations = 1, store_trace = true, show_trace = true))

Note: iterations = 1.

The function f is defined by

    function f(x::Array{Float64})

        # Unpack x vector into modelPar struct for inputting into model solver

        #modelPar.ρ = x[1]
        modelPar.χI = x[1]
        modelPar.χS = x[2]
        modelPar.χE = x[3] * x[2]
        modelPar.λ = x[4]
        modelPar.ν = x[5]
        modelPar.ζ = x[6]
        modelPar.κ = x[7]
        modelPar.spinoutsFromSpinouts = x[8]
        modelPar.spinoutsFromEntrants = x[9]
        modelPar.θ = x[10]

        if modelPar.CNC == true
            log_file = open("./figures/CalibrationLog_CNC.txt","a")
        else
            log_file = open("./figures/CalibrationLog_noCNC.txt","a")
        end

        write(log_file,"Iteration: χI = $(x[1]); χS = $(x[2]); χE = $(x[2] * x[3]); λ = $(x[4]); ν = $(x[5]); ζ = $(x[6]); κ = $(x[7]); sFromS = $(x[8]); sFromE = $(x[9]); θ = $(x[10])\n")

        close(log_file)

        output = computeScore(algoPar,modelPar,guess,calibPar,targets,weights,incumbentSolution)

    end

So every time f is called, a line is added to the log file.

The results were:

399.939920 seconds (2.64 G allocations: 315.799 GiB, 8.63% gc time)
Results of Optimization Algorithm
 * Algorithm: Fminbox with L-BFGS
 * Starting Point: [3.0,1.2, ...]
 * Minimizer: [2.99432731690038,1.1986798485334498, ...]
 * Minimum: 1.482581e+00
 * Iterations: 7
 * Convergence: true
   * |x - x'| ≤ 0.0e+00: true 
     |x - x'| = 0.00e+00 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = 2.97e-13 |f(x)|
   * |g(x)| ≤ 1.0e-08: false 
     |g(x)| = 5.91e+00 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: true
 * Objective Calls: 444
 * Gradient Calls: 444
Score: 1.4825796583515451

Minimizer: 

χI = 2.99432731690038
χS = 1.1986798485334498
χE = 0.2573406362038258
λ = 1.0378621655060316
ν = 0.028359793558622268
θ = 0.2032806102418321
ζ = 4.338874463157289e-15
κ = 0.009487341832867547
spinoutsFromSpinouts = 0.10016194408431676
spinoutsFromEntrants = 0.058902079753627566

Moments: ModelMoments(-0.0014250539013785622, 0.6091698180053811, 0.03353930594906293, 0.21239041928571387, 0.01518320976388458, 0.053527766095185496, 0.7689737697373885, 0.3552559230770765)

Format : (target, model)

R&D Intensity: (0.11 , -0.0014250539013785622)
Internal innovation share: (0.5 , 0.6091698180053811)
Spinout Entry Rate: (0.03 , 0.03353930594906293)
Spinout Fractions of entry: (0.3 , 0.21239041928571387)
growth rate: (0.015 , 0.01518320976388458)
R&D Labor allocation: (0.1 , 0.053527766095185496)
Wage ratio (R&D to production): (0.7 , 0.7689737697373885)
Spinouts NC Share: (0.5 , 0.3552559230770765)

1. Why does it say 7 iterations when I selected the option of 1 iteration?
2. When I inspect the log created, it shows that the function was called 2184 times, rather than 444 times.

Am I still missing something?

pkofod commented 5 years ago

You control the number of outer iterations with outer_iterations. As I said, the printing could be better to make this more clear.

Those 444 evaluations of the gradient include several calls to your f, since you're using finite differencing. I'm not sure if it's 100% correctly reported, but it should be (though there could be a bug, I cannot rule that out).
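The multiplier from finite differencing can be made concrete: a central-difference gradient of an n-dimensional function costs 2n objective evaluations, which matches the 8 + 8 × 10 × 2 count worked out below. A minimal sketch in plain Julia (not Optim's actual differencing code):

```julia
# Central-difference gradient: 2 objective evaluations per dimension.
function fd_gradient(f, x; h = sqrt(eps()))
    n = length(x)
    grad = similar(x)
    for i in 1:n
        e = zeros(n)
        e[i] = h
        grad[i] = (f(x .+ e) - f(x .- e)) / (2h)
    end
    return grad
end

calls = Ref(0)
f(x) = (calls[] += 1; (x[1] - x[2])^2 + x[1])

grad = fd_gradient(f, [0.0, 0.0])
calls[]   # 4 objective calls for one gradient of a 2-dimensional function
```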

nfernandez-arias commented 5 years ago

Ok that makes sense -- now running

calibrationResults = optimize(f,lower,upper,initial_x,Fminbox(inner_optimizer),Optim.Options(outer_iterations = 1, iterations = 1, store_trace = true, show_trace = true))

I get results

 59.186004 seconds (309.74 M allocations: 31.421 GiB, 7.39% gc time)
Results of Optimization Algorithm
 * Algorithm: Fminbox with L-BFGS
 * Starting Point: [3.0,1.2, ...]
 * Minimizer: [2.9952401944040745,1.1994136441676886, ...]
 * Minimum: 3.090231e+00
 * Iterations: 1
 * Convergence: false
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 8.56e-03 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = 4.51e-02 |f(x)|
   * |g(x)| ≤ 1.0e-08: false 
     |g(x)| = 3.61e+02 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: true
 * Objective Calls: 8
 * Gradient Calls: 8
Score: 3.0902313874635894

Minimizer: 

χI = 2.9952401944040745
χS = 1.1994136441676886
χE = 0.25466020956320806
λ = 1.074794143253538
ν = 0.01001595496580562
θ = 0.2012386514547373
ζ = 0.009946948983710827
κ = 0.0080390799580063
spinoutsFromSpinouts = 0.10020391698479574
spinoutsFromEntrants = 0.059517854034775976

Moments: ModelMoments(0.0028895284227138654, 0.5515161113005287, 0.0079424579098265, 0.055350930205381774, 0.02374490570601744, 0.053164099742304405, 0.9345610126064329, 0.24785542406671826)

Format : (target, model)

R&D Intensity: (0.11 , 0.0028895284227138654)
Internal innovation share: (0.5 , 0.5515161113005287)
Spinout Entry Rate: (0.03 , 0.0079424579098265)
Spinout Fractions of entry: (0.3 , 0.055350930205381774)
growth rate: (0.015 , 0.02374490570601744)
R&D Labor allocation: (0.1 , 0.053164099742304405)
Wage ratio (R&D to production): (0.7 , 0.9345610126064329)
Spinouts NC Share: (0.5 , 0.24785542406671826)

I still don't quite understand why it is calling the function more than once when every iteration option is set to 1 -- appreciate any clarification.

The log file has 168 lines. This is exactly equal to 8 + 8 × 10 × 2: 8 function calls, plus 8 gradient calls, each costing 2 function calls for each of the 10 dimensions. So that's consistent.

Inspecting the log file, it actually looks like Optim re-evaluates only the first few points multiple times before moving on: out of 168 lines, 126 of them are unique. So maybe the original minimal working example exaggerates the problem with Fminbox.

pkofod commented 5 years ago

I still don't quite understand why it is calling the function more than once when every iteration option is set to 1 -- appreciate any clarification.

With one iteration you have the initial evaluation, a line search, and a Hessian approximation update. All of those require function and gradient evaluations.
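The line-search cost in particular is easy to see: even a simple backtracking search may try several step lengths, each costing one objective call. A generic sketch in plain Julia, not Optim's actual line search:

```julia
# Backtracking line search along the negative gradient direction:
# shrink the step until a sufficient-decrease (Armijo) condition holds,
# counting every objective evaluation along the way.
function backtrack(f, x, g; alpha = 1.0, shrink = 0.5, c = 1e-4)
    nevals = Ref(0)
    fc = y -> (nevals[] += 1; f(y))      # counted objective
    fx = fc(x)
    while fc(x .- alpha .* g) > fx - c * alpha * sum(abs2, g)
        alpha *= shrink                  # step too long: halve and retry
    end
    return alpha, nevals[]
end

alpha, n = backtrack(x -> sum(abs2, x), [2.0], [4.0])
# tried alpha = 1.0, then accepted alpha = 0.5: 3 objective calls total
```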

pkofod commented 5 years ago

Inspecting the log file, it actually looks like Optim is evaluating only the first function calls multiple times before moving on -- out of 168 lines, 126 of them are unique. So maybe the original minimal working example exaggerates the problem with Fminbox.

Okay, but it's still something we should avoid :)

nfernandez-arias commented 5 years ago

Great, I think this makes sense now. Thanks again for your help!

pkofod commented 5 years ago

Great, I think this makes sense now. Thanks again for your help!

No worries. I hope I can find some time to look at your code. It's great that you got GC time down, since high GC time is usually a sign of problems, but you're still allocating quite a lot. That is not always a problem, but sometimes it is.

pkofod commented 4 years ago

I think I have a fix ready now, I know it's a year later... but... :)

Iteration: x1 = 6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = -6.0554544523933395e-6; x2 = 0.0
Iteration: x1 = 0.0; x2 = 6.0554544523933395e-6
Iteration: x1 = 0.0; x2 = -6.0554544523933395e-6
Iteration: x1 = 0.0; x2 = 0.0
Iteration: x1 = 0.0; x2 = 0.0

is the output from your example above https://github.com/JuliaNLSolvers/Optim.jl/issues/704#issuecomment-512968901

I'm not sure why it's evaluating at (0,0) twice though... but many of the redundant calls are gone :)

nfernandez-arias commented 4 years ago

Thanks for your help!

pkofod commented 4 years ago

Thanks for your help!

Thanks for commenting on the issue! I will post a PR as soon as I'm sure everything passes as it should.