Closed jeffpollock9 closed 5 years ago
Looks like the default tolerance https://github.com/tensorflow/probability/blob/46b0b89821921b1b5bef163d3dba355c67bc8209/tensorflow_probability/python/optimizer/bfgs.py#L75 is 1e-8, and your logs show it is only within about 1e-6.
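For context, BFGS-style optimizers typically declare convergence when the supremum norm of the gradient drops below the tolerance. A minimal sketch of that check (the function name and shape here are illustrative, not TFP internals):

```python
# Sketch of the convergence test BFGS-style optimizers typically use:
# converged when max(|grad_i|) <= tolerance. Names are illustrative only.

def has_converged(grad, tolerance=1e-8):
    """Return True if the gradient's supremum norm is within tolerance."""
    return max(abs(g) for g in grad) <= tolerance

# A gradient of ~1e-6 (as in the logs above) fails the default 1e-8 check,
# but passes a looser tolerance.
print(has_converged([1e-6, -5e-7]))        # default tolerance of 1e-8
print(has_converged([1e-6, -5e-7], 1e-5))  # looser tolerance
```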
On Tue, Mar 26, 2019 at 11:13 AM Jeff notifications@github.com wrote:
In the following example bfgs_minimize reports a failure to converge when it is very close to the solution:
```python
import tensorflow as tf
import tensorflow_probability as tfp

A = tf.constant(3.0)
B = tf.constant(100.0)

def rosenbrock(x):
    x0 = x[0]
    x1 = x[1]
    first = tf.math.squared_difference(A, x0)
    second = B * tf.math.squared_difference(x1, tf.square(x0))
    return first + second

def value_and_gradients(x):
    return tfp.math.value_and_gradient(rosenbrock, x)

@tf.function
def hessian(opt):
    x = opt.position
    value = rosenbrock(x)
    hess = tf.hessians(value, x)[0]
    return hess

def approx_hessian(opt):
    inverse_hess = opt.inverse_hessian_estimate
    hess = tf.linalg.inv(inverse_hess)
    return hess

tf.random.set_seed(42)

init = tf.random.normal([2])
opt = tfp.optimizer.bfgs_minimize(value_and_gradients, init)
print(f"initial position: {init}")
print(f"true solution: [{A}, {A*A}]")
print(f"found solution: {opt.position}")
print(f"converged: {opt.converged}")
print(f"num_iterations: {opt.num_iterations}")
print(f"hessian at solution:\n{hessian(opt)}")
print(f"approx hessian at solution:\n{approx_hessian(opt)}")
```
(@tf.function is only used here to make calculating the hessian easier).
which outputs:
```
(tf2) $ python bfgs_fail.py
initial position: [ 0.3274685 -0.8426258]
true solution: [3.0, 9.0]
found solution: [3.0000024 9.000014 ]
converged: False
num_iterations: 34
hessian at solution:
[[ 7202.0117 -1200.001 ]
 [-1200.001    200.    ]]
approx hessian at solution:
[[ 7280.896  -1213.2786 ]
 [-1213.2787   202.23483]]
```
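As a sanity check (derived by hand, not part of the original report): the analytic Hessian of this Rosenbrock variant at the true minimum matches the printed values up to float32 noise.

```python
# Analytic Hessian of f(x0, x1) = (A - x0)^2 + B*(x1 - x0^2)^2,
# evaluated at the true minimum (3, 9), with A = 3 and B = 100 as above.
# Entries: d2f/dx0^2 = 2 - 4*B*x1 + 12*B*x0^2, d2f/dx0dx1 = -4*B*x0,
# d2f/dx1^2 = 2*B.

A, B = 3.0, 100.0

def rosenbrock_hessian(x0, x1):
    d2f_dx0dx0 = 2.0 - 4.0 * B * x1 + 12.0 * B * x0 * x0
    d2f_dx0dx1 = -4.0 * B * x0
    d2f_dx1dx1 = 2.0 * B
    return [[d2f_dx0dx0, d2f_dx0dx1], [d2f_dx0dx1, d2f_dx1dx1]]

print(rosenbrock_hessian(3.0, 9.0))  # [[7202.0, -1200.0], [-1200.0, 200.0]]
```

This agrees with the `tf.hessians` output above (7202.0117, -1200.001, 200.0 in float32).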
using the latest tensorflow 2 alpha and tfp nightly from pip.
Would it be possible to also output why the routine has failed? Has it actually failed?
If it is at all useful, the LBFGS routine gets the right answer and reports it accordingly.
I am very happy to devote some time to helping with this if I could get a few pointers to start me off. Thanks!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/tensorflow/probability/issues/341, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJtGXfMpeuqD5-Y1N9sZGNufSW_y_ohks5vajkEgaJpZM4cLtKp .
-- Christopher Suter | SWE | cgs@google.com | 352-234-4096
(Actually, I think I took the wrong meaning of tolerance in my reply. That `tolerance` parameter is renamed `grad_tolerance` in the downstream computations. In any case, I think that or the other tolerance values are just too tight for it to conclude convergence.)
Wrong again! Printing `opt.failed` yields `True`. The docstring for this field reads:

> failed: boolean tensor of shape `[...]` indicating for each batch member whether a line search step failed to find a suitable step size satisfying Wolfe conditions. In the absence of any constraints on the number of objective evaluations permitted, this value will be the complement of `converged`. However, if there is a constraint and the search stopped due to available evaluations being exhausted, both `failed` and `converged` will be simultaneously False.
So it sounds like the line search failed at step 34. The line search algorithm is Hager-Zhang. This is pushing towards the boundaries of how well I understand the optimizer code... :) Maybe someone else from the team can chime in. @SiegeLordEx ?
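For readers unfamiliar with what the line search is checking: below is a minimal sketch of the (weak) Wolfe conditions a line search such as Hager-Zhang tries to satisfy, written for a 1-D objective for clarity. The function name and signature are illustrative, not TFP's API.

```python
# Sketch of the weak Wolfe conditions: sufficient decrease (Armijo) plus
# a curvature condition, with the usual constants 0 < c1 < c2 < 1.
# f is the objective, df its derivative, p the search direction,
# alpha the candidate step size.

def wolfe_conditions(f, df, x, p, alpha, c1=1e-4, c2=0.9):
    """Check sufficient decrease and curvature at step size alpha."""
    slope = df(x) * p  # directional derivative along p
    armijo = f(x + alpha * p) <= f(x) + c1 * alpha * slope
    curvature = df(x + alpha * p) * p >= c2 * slope
    return armijo and curvature

# Example: f(x) = x^2 starting at x = 1, descent direction p = -1.
f = lambda x: x * x
df = lambda x: 2.0 * x
print(wolfe_conditions(f, df, 1.0, -1.0, 0.5))  # acceptable step
print(wolfe_conditions(f, df, 1.0, -1.0, 2.0))  # overshoots: Armijo fails
```

In float32, rounding error in `f` and `df` near the minimum can make both conditions impossible to satisfy for any `alpha`, which is one way a line search "fails" arbitrarily close to the solution.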
@csuter thanks for the comments. I see now that the reason for the failure is actually returned (and it is documented), so apologies for not realising, although I do wonder if `line_search_failed` (or similar) would be a better name than `failed`. I would be happy to send over a brief pull request with that change to the whole optimizer module if that is correct and it would be useful?
Looking at the line search code, it seems sensitive to the float type (see `_machine_eps`), so I tried changing `tf.float32` to `tf.float64` and it now works (as in `opt.converged = True` and the solution is spot on) in 44 iterations!
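The float-type sensitivity is easy to see from the machine epsilons alone: float32's epsilon (~1.2e-7) is already larger than the default gradient tolerance of 1e-8, so the convergence test can be unreachable in single precision. A quick check:

```python
# Machine epsilon for each float type, compared with the default
# grad tolerance of 1e-8 used by tfp.optimizer.bfgs_minimize.
import numpy as np

print(np.finfo(np.float32).eps)  # ~1.19e-07, larger than 1e-8
print(np.finfo(np.float64).eps)  # ~2.22e-16, far smaller than 1e-8
```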
Anyway, I can close this issue as that is all cleared up now. Thanks again, and let me know if the pull request would be useful.
Ah, great, glad to hear you're unblocked!
I'd personally be open to the name change but it would be a backward incompatible API change. This may have implications for existing users. Please feel free to file an issue and we can look into whether and how to migrate the name.