fprimeau / OCIM-BGC

Coupled offline biogeochemistry model for the cycling of P, C, Si, and O2
4 stars 3 forks source link

Confusion about ResetPar: Does resetting parameters interfere with fminunc? #17

Open MeganRSullivan opened 1 month ago

MeganRSullivan commented 1 month ago

Description

I am confused about how the ResetPar function affects the fminunc search algorithm. During an optimization where I switched the C2P parameterization to be a global constant optimizable parameter, the optimization ended at what it considers to be a local minimum. However, the objective function is significantly larger than it was at the initial iteration. I am wondering if this might be related to the ResetPar function.

Details

Below is a subset of the log file from the optimization run:

Tunable parameters 
-----------------------------------------------
current sigP    is   3.50e-01 
current Q10P    is   2.28e+00 
current kdP     is   1.86e-08 
current bP_T    is   3.19e-01 
current bP      is   7.70e-01 
current alpha   is   2.45e-08 
current beta    is   4.50e-01 
current sigC    is   9.00e-02 
current kru     is   5.74e-12 
current krd     is   2.99e-12 
current etau    is   9.80e-01 
current bC_T    is   9.57e-01 
current bC      is   6.50e-01 
current d       is   4.55e+03 
current Q10C    is   1.05e+00 
current kdC     is   5.42e-09 
current R_Si    is   1.00e-01 
current rR      is   2.34e-02 
current dd      is   9.43e-03 
current rO2C    is   1.77e+00

current objective function value is 1.947e+00

                               Norm of      First-order 
 Iteration        f(x)          step          optimality   CG-iterations
     0            1.94732                           406                

current fb is 2.500e+00 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ... 6 ... 10 ... 11 ... 12 ... 13 ... 15 ... 16 ... 17 ... 18 ...
current objective function value is 1.756e+01 
     1            1.94732             10            406           0

current fb is 3.125e+00 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
current objective function value is 1.440e+00 
     2            1.44006            2.5           3.03           0

current fb is 3.906e+00 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
current objective function value is 3.749e+00 
     3            1.44006          0.625           3.03           0

current fb is 4.883e+00 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
current objective function value is 1.139e+00 
     4            1.13943        0.15625            113           0

current fb is 6.104e+00 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
current objective function value is 1.149e+00 
     5            1.13943         0.3125            113           0

current fb is 7.629e+00 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
current objective function value is 1.043e+00 
     6            1.04269       0.078125           3.59           0

current fb is 9.537e+00 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
reset bP_T    is   2.97e-01 
reset bP      is   8.35e-01
current objective function value is 1.022e+00 
     7            1.02161        0.15625           78.9           0

current fb is 1.192e+01 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
reset bP_T    is   2.94e-01 
reset bP      is   8.66e-01
current objective function value is 9.531e-01 
     8           0.953051      0.0390625            110           0

current fb is 1.490e+01 
solver suggested unrealistic parameter values. Reset x for pindx = 4 ... 5 ...
reset bP_T    is   2.84e-01 
reset bP      is   8.94e-01
current objective function value is 8.818e-01 
     9           0.881795       0.078125            269           0

No parameters reset.
current bP_T    is  -1.00e+00 
current bP      is   4.97e+00
current objective function value is 4.507e+02 
    10           0.881795      0.0195312            269           0

No parameters reset.
current bP_T    is  -9.93e-01 
current bP      is   4.92e+00
current objective function value is 4.476e+02 
    11           0.881795     0.00488281            269           0

...
...
...

current bP_T    is  -9.93e-01 
current bP      is   4.91e+00
current objective function value is 3.700e+02 
    25           0.881795    1.35761e-11            269           0

Local minimum possible.

Notice how the ResetPar function repeatedly resets the bP and bP_T parameters (pindx 4 & 5) in the first 9 iterations. For these iterations, the objective function value using the reset parameters is reasonably small. Then, from the 10th iteration onwards, the parameters do not get reset (because ResetPar is only executed in neglogpost if iter < 10). In iteration 10, the two parameters that had been reset previously make large jumps. (bP_T jumps from 0.284 to -1.0 and bP jumps from 0.894 to 4.97).

My first guess is that fminunc makes its next guess based on the values it initially suggested, rather than the reset values. When the objective function value decreases after resetting some of the parameter values, fminunc might proceed along its initial search direction, thinking that the parameter values it tried moved the function closer to a minimum, despite the model not using those unrealistic parameters to calculate the objective function, gradient, and Hessian.

Potential Solutions

If ResetPar is a problem for fminunc, we might want to consider removing ResetPar to avoid this issue. Or, if there are potentially bad values that could cause the model to crash, another option would be to impose a very large objective function value when ResetPar finds bad values, instead of running the model with reset values.

MeganRSullivan commented 1 month ago

When I tried skipping ResetPar entirely, the model crashed in the first iteration (after the initial iteration). The oxygen model never converged, and then the model crashed in neglogpost because: Output argument Ox (and maybe others) not assigned during call to eqOcycle_v2.

See log file below:

...
...
...
current objective function value is 1.947e+00 

saving data to ../output/optPCO_constC2P_v2_CTL_He_PCO_DOC1_DOP0.mat  ...
saving xhat to ../output/optPCO_constC2P_v2_CTL_He_PCO_DOC1_DOP0_xhat.mat  ...

                                Norm of      First-order 
 Iteration        f(x)          step          optimality   CG-iterations
     0            1.94732                           406                
Tunable parameters 
-----------------------------------------------
current sigP    is   2.97e-03 
current Q10P    is   1.65e+00 
current kdP     is   2.40e-08 
current bP_T    is  -4.47e+00 
current bP      is   5.47e+02 
current alpha   is   7.79e-08 
current beta    is   4.36e-01 
current sigC    is   3.59e-01 
current kru     is   3.64e-12 
current krd     is   4.90e-13 
current etau    is   1.03e+00 
current bC_T    is   1.93e+00 
current bC      is   1.74e+00 
current d       is   4.11e+03 
current Q10C    is   4.63e-01 
current kdC     is   1.18e-08 
current R_Si    is  -7.34e-01 
current rR      is   1.65e-02 
current dd      is   1.38e-02 
current rO2C    is   1.24e+00 
saving xhat & x0 to: ../output/optPCO_constC2P_v2_CTL_He_PCO_DOC1_DOP0_xhat.mat  ...
current iteration is 1 
Solving P model ...
factoring Jacobian matrix ...
Elapsed time is 340.027612 seconds.
solve for P-cycle model state ...
Elapsed time is 4.436780 seconds.
Compute the gradient of the solution wrt the parameters...
Elapsed time is 23.585154 seconds.
 Compute the hessian of the solution wrt the parameters...
Elapsed time is 97.753342 seconds.
Solving C model ...
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000014 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000006 seconds.
 Newton-Armijo solver 
Iteration  ||F(x_k)||    Steps   
   0               1.2193e-02        0  
factorize Jacobian matrix ...
Elapsed time is 918.252195 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000033 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000009 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000015 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000007 seconds.
         1          1.1695e-02       0  
factorize Jacobian matrix ...
Elapsed time is 914.276151 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000041 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000011 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000015 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000006 seconds.
         2          1.9138e-03       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000015 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000003 seconds.
         3          1.3465e-03       0  
factorize Jacobian matrix ...
Elapsed time is 897.176839 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000027 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000011 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000017 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000007 seconds.
         4          5.9894e-04       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000013 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000006 seconds.
         5          3.5496e-04       0  
factorize Jacobian matrix ...
Elapsed time is 912.043570 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000035 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000010 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000016 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000006 seconds.
         6          5.5135e-05       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000017 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000006 seconds.
         7          1.7507e-05       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000010 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000003 seconds.
         8          5.1336e-06       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000008 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000003 seconds.
         9          1.3863e-06       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000011 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000003 seconds.
        10          3.5546e-07       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000014 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000003 seconds.
        11          8.8835e-08       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000011 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000002 seconds.
        12          2.2076e-08       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000010 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000002 seconds.
        13          5.5251e-09       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000016 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000006 seconds.
        14          1.3994e-09       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000013 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000005 seconds.
        15          3.5805e-10       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000017 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000004 seconds.
        16          9.2164e-11       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000013 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000005 seconds.
        17          2.3773e-11       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000013 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000004 seconds.
        18          6.0893e-12       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000014 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000004 seconds.
        19          1.5877e-12       0  
Compute the gradient of the solution wrt parameters ...
Elapsed time is 0.000016 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 0.000005 seconds.
        20          4.7245e-13       0  
reset the global variable for the next call eqCycle. 
factorize Jacobian matrix ...
Elapsed time is 863.131733 seconds.
Compute the gradient of the solution wrt parameters ...
Elapsed time is 184.140662 seconds.
Compute the hessian of the solution wrt the parameters ...
Elapsed time is 1363.570053 seconds.
Solving O model ...
 Newton-Armijo solver 
Iteration  ||F(x_k)||    Steps   
   0               3.9132e-03        0  
         1          3.3859e-03       1  
         2          1.8399e-03       0  
         3          1.1473e-03       1  
         4          8.9825e-04       1  
         5          7.5800e-04       2  
         6          6.9120e-04       2  
         7          6.4705e-04       2  
         8          6.1991e-04       2  
         9          5.8800e-04       2  
        10          5.6486e-04       2  
        11          5.5868e-04       2  
        12          5.3384e-04       2  
        13          5.1417e-04       2  
        14          4.8599e-04       2  
        15          4.6316e-04       2  
        16          4.4701e-04       2  
        17          4.2052e-04       2  
        18          4.0394e-04       2  
        19          3.9327e-04       2  
        20          3.7435e-04       2  
        21          3.5951e-04       2  
        22          3.4437e-04       2  
        23          3.3220e-04       2  
        24          3.1809e-04       2  
        25          3.0658e-04       2  
        26          2.9275e-04       2  
        27          2.7859e-04       2  
        28          2.6459e-04       2  
        29          2.5105e-04       2  
        30          2.3833e-04       2  
        31          2.2835e-04       2  
        32          2.1677e-04       2  
        33          2.0579e-04       2  
        34          1.9647e-04       2  
        35          1.8732e-04       2  
        36          1.7795e-04       2  
        37          1.6905e-04       2  
        38          1.6063e-04       2  
        39          1.5257e-04       2  
        40          1.4492e-04       2  
O2model did not converge.
Using the time stepping method for initial O2 ...
dt = 3600.000000 nsteps = 100 
....................................................................................................
dt = 86400.000000 nsteps = 100 
....................................................................................................
dt = 2628000.000000 nsteps = 100 
....................................................................................................
dt = 31536000.000000 nsteps = 200 
........................................................................................................................................................................................................
dt = 126144000.000000 nsteps = 200 
........................................................................................................................................................................................................
dt = 315360000.000000 nsteps = 300 
............................................................................................................................................................................................................................................................................................................
Elapsed time is 599.841313 seconds.
Used O2(:,:,:,250) for initial O2 field of nsnew ...
 Newton-Armijo solver 
Iteration  ||F(x_k)||    Steps   
   0               2.2592e-03        0  
         1          2.2355e-03       0  
         2          1.3188e-03       1  
         3          9.4577e-04       1  
         4          7.7328e-04       2  
         5          6.9115e-04       2  
         6          6.6921e-04       2  
         7          6.1935e-04       2  
         8          5.9021e-04       2  
         9          5.7575e-04       2  
        10          5.5707e-04       2  
        11          5.3474e-04       2  
        12          5.1639e-04       2  
        13          4.8471e-04       2  
        14          4.6280e-04       2  
        15          4.4260e-04       2  
        16          4.1745e-04       2  
        17          4.0091e-04       2  
        18          3.8147e-04       2  
        19          3.6714e-04       2  
        20          3.4868e-04       2  
        21          3.3505e-04       2  
        22          3.1929e-04       2  
        23          3.0486e-04       2  
        24          2.9464e-04       2  
        25          2.8441e-04       2  
        26          2.7442e-04       2  
        27          2.6043e-04       2  
        28          2.4744e-04       2  
        29          2.3584e-04       2  
        30          2.2391e-04       2  
        31          2.1260e-04       2  
        32          2.0224e-04       2  
        33          1.9220e-04       2  
        34          1.8253e-04       2  
        35          1.7353e-04       2  
        36          1.6482e-04       2  
        37          1.5655e-04       2  
        38          1.4869e-04       2  
        39          1.3791e-04       2  
        40          1.2685e-04       2  
Used O2(:,:,:,500) for initial O2 field of nsnew ...
 Newton-Armijo solver 
Iteration  ||F(x_k)||    Steps   
   0               3.5050e-03        0  
         1          1.9476e-03       0  
         2          1.2651e-03       1  
         3          9.3798e-04       1  
         4          7.8671e-04       2  
         5          7.0825e-04       2  
         6          6.7379e-04       2  
         7          6.4210e-04       2  
         8          6.0551e-04       2  
         9          5.8483e-04       2  
        10          5.5254e-04       2  
        11          5.4513e-04       2  
        12          5.1025e-04       2  
        13          4.8790e-04       2  
        14          4.6729e-04       2  
        15          4.4317e-04       2  
        16          4.2681e-04       2  
        17          4.0225e-04       2  
        18          3.8569e-04       2  
        19          3.6707e-04       2  
        20          3.4910e-04       2  
        21          3.3664e-04       2  
        22          3.2939e-04       2  
        23          3.1318e-04       2  
        24          3.0365e-04       2  
        25          2.8906e-04       2  
        26          2.7498e-04       2  
        27          2.6127e-04       2  
        28          2.4876e-04       2  
        29          2.3619e-04       2  
        30          2.2423e-04       2  
        31          2.1292e-04       2  
        32          2.0206e-04       2  
        33          1.9187e-04       2  
        34          1.8222e-04       2  
        35          1.7309e-04       2  
        36          1.6154e-04       2  
        37          1.5005e-04       2  
        38          1.3399e-04       2  
        39          1.2443e-04       2  
        40          1.1430e-04       2  
Used O2(:,:,:,750) for initial O2 field of nsnew ...
 Newton-Armijo solver 
Iteration  ||F(x_k)||    Steps   
   0               4.7789e-03        0  
         1          4.5251e-03       0  
         2          2.0197e-03       0  
         3          1.0170e-03       1  
         4          8.5852e-04       1  
         5          7.3254e-04       2  
         6          6.7287e-04       2  
         7          6.4611e-04       2  
         8          6.0489e-04       2  
         9          5.7984e-04       2  
        10          5.5540e-04       2  
        11          5.4647e-04       2  
        12          5.0215e-04       2  
        13          4.9016e-04       3  
        14          4.8660e-04       2  
        15          4.6117e-04       2  
        16          4.4350e-04       2  
        17          4.2098e-04       2  
        18          4.0121e-04       2  
        19          3.8375e-04       2  
        20          3.7087e-04       2  
        21          3.5040e-04       2  
        22          3.3469e-04       2  
        23          3.2391e-04       2  
        24          3.1180e-04       2  
        25          2.9759e-04       2  
        26          2.8314e-04       2  
        27          2.8173e-04       3  
        28          2.8031e-04       3  
        29          2.7891e-04       3  
        30          2.7869e-04       4  
        31          2.7115e-04       2  
        32          2.5981e-04       2  
        33          2.5182e-04       2  
        34          2.3921e-04       2  
        35          2.2743e-04       2  
        36          2.2403e-04       3  
        37          2.1267e-04       2  
        38          2.0194e-04       2  
        39          1.9181e-04       2  
        40          1.8221e-04       2  
Used O2(:,:,:,1000) for initial O2 field of nsnew ...
 Newton-Armijo solver 
Iteration  ||F(x_k)||    Steps   
   0               4.7789e-03        0  
         1          4.5251e-03       0  
         2          2.0197e-03       0  
         3          1.0170e-03       1  
         4          8.5852e-04       1  
         5          7.3254e-04       2  
         6          6.7287e-04       2  
         7          6.4611e-04       2  
         8          6.0489e-04       2  
         9          5.7984e-04       2  
        10          5.5540e-04       2  
        11          5.4647e-04       2  
        12          5.0215e-04       2  
        13          4.9016e-04       3  
        14          4.8660e-04       2  
        15          4.6117e-04       2  
        16          4.4350e-04       2  
        17          4.2098e-04       2  
        18          4.0121e-04       2  
        19          3.8375e-04       2  
        20          3.7087e-04       2  
        21          3.5040e-04       2  
        22          3.3469e-04       2  
        23          3.2391e-04       2  
        24          3.1180e-04       2  
        25          2.9759e-04       2  
        26          2.8314e-04       2  
        27          2.8173e-04       3  
        28          2.8031e-04       3  
        29          2.7891e-04       3  
        30          2.7869e-04       4  
        31          2.7115e-04       2  
        32          2.5981e-04       2  
        33          2.5182e-04       2  
        34          2.3921e-04       2  
        35          2.2743e-04       2  
        36          2.2403e-04       3  
        37          2.1267e-04       2  
        38          2.0194e-04       2  
        39          1.9181e-04       2  
        40          1.8221e-04       2  
Warning: O2model did not converge even after the time stepping method.
>>