Closed: joelberkeley closed this issue 3 years ago.
I think there is a missing comma in:
inverted = bijector.inverse(tf.constant([[-0.06450844 -0.02390611]]))
but perhaps there is also another issue somewhere.
@jeffpollock9 yes, thanks. Updated; the error still happens.
Good catch on the comma! Everything still goes through (it is subtraction, which gets broadcast to the "right" shape), but quietly gives the wrong answer.
Luckily, the same bug shows up, which is that tfp.math.value_and_gradient returns a None gradient, because it tries to trace further up the graph than it can see. Adding a tf.stop_gradient fixes it, I think (it depends on what you want to minimize -- I assume you have a reason for inverting a point and then pushing it back forward!):
opt = tfp.optimizer.lbfgs_minimize(
    lambda x: tfp.math.value_and_gradient(_quadratic, x),
    tf.stop_gradient(inverted))
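For context, a self-contained version of that snippet might look like the sketch below. The objective _quadratic and the starting values are placeholders, since the original code is not shown in this thread.

import tensorflow as tf
import tensorflow_probability as tfp

bijector = tfp.bijectors.Sigmoid()

def _quadratic(x):
    # Illustrative objective; the issue's actual _quadratic is not shown.
    return tf.reduce_sum((x - 0.5) ** 2, axis=-1)

# Inverting caches the result inside the bijector, which is what later
# breaks gradient tracing.
inverted = bijector.inverse(tf.constant([[0.2, 0.7]]))

# tf.stop_gradient detaches the starting point from that cached history,
# so value_and_gradient can compute a real gradient again.
opt = tfp.optimizer.lbfgs_minimize(
    lambda x: tfp.math.value_and_gradient(_quadratic, x),
    tf.stop_gradient(inverted))
print(opt.converged)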
"I assume you have a reason for inverting a point and then pushing it back forward!"
Indeed. We want to optimize a function over a constrained space, so we're training an unconstrained parameter and using a bijector to keep it in the constrained region.
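As a minimal sketch of that pattern (all names and constants here are illustrative, not taken from the issue): train an unconstrained variable and map it through Sigmoid so the optimized quantity stays in (0, 1).

import tensorflow as tf
import tensorflow_probability as tfp

bijector = tfp.bijectors.Sigmoid()

def objective(x):
    # Some function defined on the constrained space (0, 1).
    return tf.reduce_sum((x - 0.25) ** 2)

# Train in unconstrained space; forward() keeps every iterate in (0, 1).
unconstrained = tf.Variable(bijector.inverse(tf.constant([0.9])))
sgd = tf.keras.optimizers.SGD(learning_rate=0.5)
for _ in range(200):
    with tf.GradientTape() as tape:
        loss = objective(bijector.forward(unconstrained))
    grads = tape.gradient(loss, [unconstrained])
    sgd.apply_gradients(zip(grads, [unconstrained]))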
@joelberkeley -- I think stop_gradient will fix the issue once the typo is taken care of; is that true?
@ColCarroll it certainly fixes it for my actual use case. Would you say this is a workaround, or a canonical solution?
If it is the correct solution, can it be added to the docs, please? The docs say:
"real Tensor of shape [..., n]. The starting point, or points when using batching dimensions, of the search procedure. At these points the function value and the gradient norm should be finite."
I don't get the idea that I need to use tf.stop_gradient from that.
This is only a workaround for a known bug -- the root cause is bijectors caching their outputs, and there has been some work done to address it.
I believe you could also add 0 (inverted + 0), call inverted.numpy(), or wrap it in tf.identity, and those should all work.
Apologies that you ran into it -- hopefully this discussion will help other users until the bug is fixed!
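For reference, a quick sketch of those equivalents (reusing a hypothetical inverted as above); each produces a tensor with the same values but a fresh identity, so it no longer hits the bijector's cache:

import tensorflow as tf
import tensorflow_probability as tfp

bijector = tfp.bijectors.Sigmoid()
inverted = bijector.inverse(tf.constant([[0.2, 0.7]]))

start = tf.stop_gradient(inverted)     # the fix suggested above
start = inverted + 0                   # adding 0 yields a new tensor
start = tf.identity(inverted)          # so does tf.identity
start = tf.constant(inverted.numpy())  # or round-trip through NumPy (eager mode)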
OK. Could you show me the ticket that tracks the bug, so I can follow it?
Oy, apologies -- #1190 will track it.
I'm seeing an error when using lbfgs_minimize and the Sigmoid bijector. If I use inverted directly, it errors, but if I use any of the commented-out versions, it works. I'm using tfp version 0.11.1 and tf version 2.3.1.