Open Lescurel opened 5 years ago
This is probably an issue with the new broadcasting tf.where, which has temporarily been rolled back in tf.
Brian Patton | Software Engineer | bjp@google.com
From: Louis KLEIN notifications@github.com Date: Wed, May 8, 2019 at 9:13 AM To: tensorflow/probability Cc: Subscribed
System information
- OS Platform and Distribution: Linux NixOS unstable
- TensorFlow installed from : binary using anaconda
- TensorFlow version : '2.0.0-alpha0'
- TensorFlow Probability version : '0.7.0-dev20190504'
- Python version: 3.6.8
Describe the current behavior
When trying to use the tfp.optimizer.lbfgs_minimize function, I get the following error:
InvalidArgumentError: Inputs to operation Select of type Select must have the same size and shape. Input 0: [1,2] != input 1: [] [Op:Select]
Describe the expected behavior
This should run without issue, as it works under TensorFlow 1.13.1 and TensorFlow Probability 0.6.0, with tf.enable_eager_execution()
Code to reproduce the issue
The following code runs under TF 1.13.1 with TFP 0.6.0, but not under TF 2.0 with TFP 0.7.0:
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
class TestEager():
    def __init__(self):
        # tf.losses.mean_squared_error is not the same under TF 2.0
        self.mse = tf.losses.mean_squared_error
        if tf.__version__ == '2.0.0-alpha0':
            self.mse = tf.losses.MeanSquaredError()
    def __call__(self, inputs):
        loss = 0
        with tf.GradientTape() as tape:
            tape.watch(inputs)
            new_guess = np.random.rand(*inputs.shape)
            loss += self.mse(inputs, new_guess)
        grad = tape.gradient(loss, inputs)
        return loss, grad
def main_eager():
    guess = np.random.rand(1,2,3).astype(np.float32)
    test_eager = TestEager()
    res = tfp.optimizer.lbfgs_minimize(
        test_eager, initial_position=guess, tolerance=1e-8)
    print(res)
if __name__ == "__main__":
    version = tf.__version__
    if version == '2.0.0-alpha0':
        main_eager()
    else:
        tf.enable_eager_execution()
        main_eager()
Other info / logs Traceback :
2019-05-08 14:58:50.115047: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-08 14:58:50.147511: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1800000000 Hz
2019-05-08 14:58:50.148229: I tensorflow/compiler/xla/service/service.cc:162] XLA service 0x55a747566a30 executing computations on platform Host. Devices:
2019-05-08 14:58:50.148279: I tensorflow/compiler/xla/service/service.cc:169] StreamExecutor device (0): ,
Traceback (most recent call last):
File "reprodcuing_bug.py", line 176, in <module>
main_eager()
File "reprodcuing_bug.py", line 159, in main_eager
tolerance=1e-8)
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow_probability/python/optimizer/lbfgs.py", line 260, in minimize
parallel_iterations=parallel_iterations)[0]
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3216, in while_loop_v2
return_same_structure=True)
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3442, in while_loop
loop_vars = body(*loop_vars)
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow_probability/python/optimizer/lbfgs.py", line 238, in _body
tolerance, f_relative_tolerance, x_tolerance, stopping_condition)
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow_probability/python/optimizer/bfgs_utils.py", line 153, in line_search_step
converged=inactive) # No search needed for these.
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow_probability/python/optimizer/linesearch/hager_zhang.py", line 283, in hager_zhang
right=hzl.val_where(init_converged, val_0, val_c))
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow_probability/python/optimizer/linesearch/internal/hager_zhang_lib.py", line 45, in val_where
return cls(*(val_where(cond, t, f) for t, f in zip(tval, fval)))
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow_probability/python/optimizer/linesearch/internal/hager_zhang_lib.py", line 45, in <genexpr>
return cls(*(val_where(cond, t, f) for t, f in zip(tval, fval)))
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow_probability/python/optimizer/linesearch/internal/hager_zhang_lib.py", line 42, in val_where
return tf.where(cond, tval, fval)
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 3231, in where
return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
File "/home/beren/.conda/envs/style_transfer/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 9060, in select
_six.raise_from(_core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Inputs to operation Select of type Select must have the same size and shape. Input 0: [1,2] != input 1: [] [Op:Select]
Any updates on this?
There must have been some fixes pushed because I'm now able to use lbfgs_minimize with TF2.0 (tfp-nightly and tensorflow=2.0.0-beta1).
There might be a regression here. I tried running the same test case as above and encountered the following:
InvalidArgumentError: Inputs to operation Select of type Select must have the same size and shape. Input 0: [1,2] != input 1: [1,2,3] [Op:Select]
In [3]: tf.__version__
Out[3]: '2.0.0-rc0'
In [4]: tfp.__version__
Out[4]: '0.9.0-dev20190905'
In [6]: sys.version
Out[6]: '3.7.3 | packaged by conda-forge
Do you see the same issue with tensorflow-probability==0.8.0rc0?
Same result.
In [2]: main_eager()
InvalidArgumentError: Inputs to operation Select of type Select must have the same size and shape. Input 0: [1,2] != input 1: [1,2,3] [Op:Select]
In [3]: tfp.__version__
Out[3]: '0.8.0-rc0'
FYI, I recently updated my env to 0.9.0-dev20190915 and I'm still seeing the issue.
I also checked that the same error occurs on bfgs_minimize().
I think this might be working now, in that I was able to run the following example:
import numpy as np
import functools
import tensorflow.compat.v2 as tf
import tensorflow_probability as tfp
def _make_val_and_grad_fn(value_fn):
    @functools.wraps(value_fn)
    def val_and_grad(x):
        return tfp.math.value_and_gradient(value_fn, x)
    return val_and_grad
@_make_val_and_grad_fn
def quadratic(x):
    scales = np.array([1.0, 3.0])
    minimum = np.array([0.1, 0.3])
    return tf.reduce_sum(input_tensor=scales * (x - minimum)**2)
def rosenbrock(coord):
    x, y = coord[0], coord[1]
    fv = (1 - x)**2 + 100 * (y - x**2)**2
    dfx = 2 * (x - 1) + 400 * x * (x**2 - y)
    dfy = 200 * (y - x**2)
    return fv, tf.stack([dfx, dfy])
start = tf.constant([-1.2, 1.7])
out_rosenbrock = tfp.optimizer.lbfgs_minimize(rosenbrock, initial_position=start, tolerance=1e-5)
start = tf.constant([-1.2, 1.7])
out_quadratic = tfp.optimizer.lbfgs_minimize(quadratic, initial_position=start, tolerance=1e-5)
@kyleabeauchamp so what is the status of this? What changed that it works now? I see a similar error with TFP 0.8
I'm going to close this since I can't reproduce this. I believe that part of this issue had to do with tf.where. tf.where (V2) has broadcasting support, but doesn't allow for the conditional to be a prefix to the batch shape of the branches (which was a thing in V1). I believe an up to date TF and TFP should not have this issue any more (as we import the correct version of tf.where in the BFGS code with an updated TF and TFP).
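To make the two shape rules mentioned above concrete, here is a minimal pure-Python sketch that models tensor shapes as tuples and does not call TensorFlow. The function names `broadcast_shape` and `v1_select_shape` are hypothetical, and the V1 rule is a simplified reading of the `tf.compat.v1.where` documentation (the condition either has the same shape as the branches, or is rank 1 and matches their first dimension), so treat it as an approximation and not as the actual kernel logic.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """NumPy-style broadcasting, the rule tf.where (V2) follows.

    Shapes are aligned on the right; aligned dims must be equal or 1.
    """
    result = []
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} do not broadcast")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


def v1_select_shape(cond_shape, branch_shape):
    """Simplified model (an assumption) of the V1 Select shape rule."""
    same_shape = cond_shape == branch_shape
    row_select = len(cond_shape) == 1 and branch_shape[:1] == cond_shape
    if not (same_shape or row_select):
        raise ValueError(
            f"Input 0: {list(cond_shape)} != input 1: {list(branch_shape)}")
    return branch_shape


# A rank-1 condition selecting whole rows is accepted by the V1 rule ...
print(v1_select_shape((2,), (2, 3)))
# ... but under V2 broadcasting the condition needs a trailing axis.
print(broadcast_shape((2, 1), (2, 3)))
# The shapes from the error in this thread, (1, 2) vs (1, 2, 3), fail
# under both rules unless the condition is expanded to (1, 2, 1).
print(broadcast_shape((1, 2, 1), (1, 2, 3)))
```

In real TensorFlow code the equivalent of the last two calls would be adding an axis to the condition (for example with `cond[..., tf.newaxis]`) before passing it to `tf.where`.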
@srvasude Sorry to necro, but I'm getting exactly the behaviour of #39970, with everything up to date. Have you tried @kyleabeauchamp's example with batching? Because I'm only seeing the error when attempting to batch minimize.
Here's a mwe:
# pip freeze
absl-py==0.11.0
appdirs==1.4.3
astunparse==1.6.3
CacheControl==0.12.6
cachetools==4.2.1
certifi==2019.11.28
chardet==3.0.4
cloudpickle==1.6.0
colorama==0.4.3
contextlib2==0.6.0
decorator==4.4.2
distlib==0.3.0
distro==1.4.0
dm-tree==0.1.5
flatbuffers==1.12
gast==0.3.3
google-auth==1.25.0
google-auth-oauthlib==0.4.2
google-pasta==0.2.0
grpcio==1.32.0
h5py==2.10.0
html5lib==1.0.1
idna==2.8
ipaddr==2.2.0
Keras-Preprocessing==1.1.2
lockfile==0.12.2
Markdown==3.3.3
msgpack==0.6.2
numpy==1.19.5
oauthlib==3.1.0
opt-einsum==3.3.0
packaging==20.3
pep517==0.8.2
progress==1.5
protobuf==3.14.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.6
pytoml==0.1.21
requests==2.22.0
requests-oauthlib==1.3.0
retrying==1.3.3
rsa==4.7
six==1.15.0
tensorboard==2.4.1
tensorboard-plugin-wit==1.8.0
tensorflow==2.4.1
tensorflow-estimator==2.4.0
tensorflow-probability==0.12.1
termcolor==1.1.0
typing-extensions==3.7.4.3
urllib3==1.25.8
webencodings==0.5.1
Werkzeug==1.0.1
wrapt==1.12.1
# mwe.py
import tensorflow as tf
import tensorflow_probability as tfp
def function_and_gradient(x):
    print("CALLED")
    return x**2, 2*x
start = tf.constant([[-1.2], [1.7]])
opt_result = tfp.optimizer.lbfgs_minimize(function_and_gradient, initial_position=start, tolerance=1e-5)
print(opt_result)
# ... Tensorflow initializes ....
CALLED
CALLED
Traceback (most recent call last):
File "mwe.py", line 9, in <module>
opt_result = tfp.optimizer.lbfgs_minimize(function_and_gradient, initial_position=start, tolerance=1e-5)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow_probability/python/optimizer/lbfgs.py", line 284, in minimize
return tf.while_loop(
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/util/deprecation.py", line 605, in new_func
return func(*args, **kwargs)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2489, in while_loop_v2
return while_loop(
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2735, in while_loop
loop_vars = body(*loop_vars)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow_probability/python/optimizer/lbfgs.py", line 257, in _body
next_state = bfgs_utils.line_search_step(
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow_probability/python/optimizer/bfgs_utils.py", line 210, in line_search_step
ls_result = linesearch.hager_zhang(
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/util/deprecation.py", line 538, in new_func
return func(*args, **kwargs)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow_probability/python/optimizer/linesearch/hager_zhang.py", line 277, in hager_zhang
right=hzl.val_where(init_converged, val_0, val_initial))
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow_probability/python/optimizer/linesearch/internal/hager_zhang_lib.py", line 46, in val_where
return cls(*(val_where(cond, t, f) for t, f in zip(tval, fval)))
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow_probability/python/optimizer/linesearch/internal/hager_zhang_lib.py", line 46, in <genexpr>
return cls(*(val_where(cond, t, f) for t, f in zip(tval, fval)))
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow_probability/python/optimizer/linesearch/internal/hager_zhang_lib.py", line 43, in val_where
return tf1.where(cond, tval, fval)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 4483, in where
return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py", line 8676, in select
_ops.raise_from_not_ok_status(e, name)
File "/tmp/mwe/venv/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6862, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Inputs to operation Select of type Select must have the same size and shape. Input 0: [2,2] != input 1: [2] [Op:Select]
EDIT: Doing something closer to the docs' example and initializing start with numpy routines gets me correct behaviour, so maybe this is misuse; but what's wrong with that mwe?
Confirmed, I have a repro
def testIssue398(self):
    mse = tf.keras.losses.MeanSquaredError()
    def f(inputs):
        loss = 0
        with tf.GradientTape() as tape:
            tape.watch(inputs)
            new_guess = np.random.rand(*inputs.shape)
            loss += mse(inputs, new_guess)
        grad = tape.gradient(loss, inputs)
        return loss, grad
    self.evaluate(tfp.optimizer.lbfgs_minimize(
        f,
        initial_position=np.random.rand(1, 2, 3).astype(np.float32),
        tolerance=1e-8))
This seems to relate somehow to calling tf.reduce_sum; the following works:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np
if __name__ == "__main__":
    def quadratic_function(x):
        return tfp.math.value_and_gradient(
            lambda x: tf.reduce_sum(x**2, axis=1), x)
    start = np.array([[1.0], [-2.0]])
    optim_results = tfp.optimizer.lbfgs_minimize(
        quadratic_function,
        initial_position=start,
        num_correction_pairs=10,
        tolerance=1e-8)
but
def quadratic_function(x):
    return tfp.math.value_and_gradient(
        lambda x: x**2, x)
does not.
(Calling tf.reduce_sum(x, axis=1) flattens the array from e.g. [[1.0],[2.0]] to [1.0, 2.0].)
Update: Can confirm that it's misuse; the docs specify the output should have shape [...], and not [..., 1]. It would be nice if this were caught more gracefully.
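As a rough idea of what "caught more gracefully" could look like, here is a minimal sketch of a shape check run before the optimizer. `check_value_and_gradient_shapes` is a hypothetical helper, not part of TFP; it only encodes the documented contract that the position has shape [..., n], the value has shape [...], and the gradient has shape [..., n]. Shapes are plain tuples so the sketch runs without TensorFlow.

```python
def check_value_and_gradient_shapes(position_shape, value_shape, grad_shape):
    """Hypothetical pre-flight check; not a TFP function.

    Contract assumed from the docs: position [..., n], value [...],
    gradient [..., n].
    """
    position_shape = tuple(position_shape)
    expected_value_shape = position_shape[:-1]  # drop the domain axis n
    if tuple(value_shape) != expected_value_shape:
        raise ValueError(
            f"value has shape {tuple(value_shape)}, expected "
            f"{expected_value_shape} for a position of shape {position_shape}")
    if tuple(grad_shape) != position_shape:
        raise ValueError(
            f"gradient has shape {tuple(grad_shape)}, expected {position_shape}")


# The mwe above: start has shape (2, 1) and x**2 keeps shape (2, 1),
# so the value carries an extra trailing axis and the check rejects it.
try:
    check_value_and_gradient_shapes((2, 1), (2, 1), (2, 1))
except ValueError as err:
    print("rejected:", err)

# Reducing over the last axis gives a value of shape (2,), which passes.
check_value_and_gradient_shapes((2, 1), (2,), (2, 1))
```

The same check would also flag a scalar loss against a (1, 2, 3) starting point, which is consistent with the `[1,2]` vs `[]` shapes in the first error reported in this thread, though that connection is an inference and not something confirmed here.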
Quick note, in case this helps someone else. I had this issue, but the problem for me turned out to be that I thought I had to provide 1:1 correspondence pairs between the returned (loss, gradients). I was trying to return e.g. 10 loss values matching the 10 gradients, even though my loss was only really 1 value broadcasted to the same shape. Actually, once I returned just a shape (1) loss tensor with a (e.g.) shape (10) gradient tensor (for 10 variables being fit), then things worked great.
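A small sketch of that last point, in plain Python so it runs without TensorFlow: the objective returns one loss value and one gradient entry per fitted variable. The function `loss_and_gradient` and the sum-of-squares objective are made up for illustration and are not taken from the comment above.

```python
def loss_and_gradient(params, targets):
    """Sum-of-squares loss (illustrative): one value, len(params) gradients."""
    residuals = [p - t for p, t in zip(params, targets)]
    loss = sum(r * r for r in residuals)   # a single number
    grad = [2.0 * r for r in residuals]    # one entry per variable
    return loss, grad


params = [0.0] * 10                        # 10 variables being fit
targets = [float(i) for i in range(10)]
loss, grad = loss_and_gradient(params, targets)
print(loss, len(grad))
```

Repeating the loss ten times to "match" the gradient would change the value's shape from a single number to ten, which is the kind of mismatch the shape errors in this thread point to.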