Closed Sayam753 closed 4 years ago
Thanks for reporting this; that's a pretty nasty error. It looks like it's not specific to the Autoregressive model; any instance of tfd.LinearGaussianStateSpaceModel has the same problem. I've simplified the failing example a bit; there's a runnable version in this colab: https://colab.research.google.com/drive/14ytDF-74jvDYtJXff0IktJLBin_BuKYJ. (EDIT: I further simplified the example and the linked colab; see my comment below.)
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Generating data: 100 timesteps of 1-D observations
np.random.seed(seed=42)
data = np.random.normal(size=[100, 1]).astype(np.float32)

def log_prob(x):
    return tfd.LinearGaussianStateSpaceModel(
        num_timesteps=100,
        transition_matrix=tf.eye(1),
        transition_noise=tfd.MultivariateNormalDiag(loc=[x], scale_diag=[1.]),
        observation_matrix=tf.eye(1),
        observation_noise=tfd.MultivariateNormalDiag(loc=[0.], scale_diag=[1.]),
        initial_state_prior=tfd.MultivariateNormalDiag(scale_diag=[1e-6]),
    ).log_prob(data)

def vectorize_function(function):
    def vectorizedfn(*q_samples):
        return tf.vectorized_map(
            lambda samples: function(*samples), q_samples)
    return vectorizedfn

v_log_prob = vectorize_function(log_prob)
x = tf.Variable(tf.random.normal([10]))  # batch of 10 parameter values
print(v_log_prob(x))  # Works.

with tf.GradientTape() as tape:
    lp = v_log_prob(x)
g = tape.gradient(lp, x)  # Raises exception.
Using vectorized_map to compute the log prob works, but the gradient raises an error very similar to the one you report: TypeError: Value passed to parameter 'input' has DataType variant not in list of allowed values: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, uint16, complex128, float16, uint32, uint64
(with a stack trace 67 frames deep).
Actually I managed to get this to an even simpler case that uses only TF (no TFP at all):
import numpy as np
import tensorflow as tf

# Generating data
np.random.seed(seed=42)
data = np.random.randn(100).astype(np.float32)

def log_prob(x):
    return tf.reduce_sum(tf.scan(
        lambda _, yi: (yi - x)**2,
        elems=data,
        initializer=tf.convert_to_tensor(0.),))

def vectorize_function(function):
    def vectorizedfn(*q_samples):
        return tf.vectorized_map(
            lambda samples: function(*samples), q_samples)
    return vectorizedfn

v_log_prob = vectorize_function(log_prob)
x = tf.Variable(tf.random.normal([10]))

with tf.GradientTape() as tape:
    lp = v_log_prob(x)
g = tape.gradient(lp, x)  # Raises exception.
The issue seems to lie in taking the gradient of a vectorized scan loop: the forward pass through tf.vectorized_map works, and only the gradient fails.
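For anyone checking a fix or a workaround, the TF-only example has a closed form to compare against: log_prob(x) is just the sum over i of (y_i - x)^2, so its derivative with respect to x is 2 * (n * x - sum(y)). Below is a minimal NumPy sketch of that reference computation. The helper names (log_prob_reference, grad_reference) and the use of float64 and default_rng are my own choices for illustration; they are not part of the original repro.

```python
import numpy as np

# Assumption: float64 data drawn the same way as in the repro (100 standard normals),
# plus a batch of 10 parameter values standing in for the tf.Variable `x`.
rng = np.random.default_rng(42)
data = rng.standard_normal(100)
x = rng.standard_normal(10)

def log_prob_reference(x_scalar, data):
    # Same quantity the scan computes: sum_i (y_i - x)^2.
    return np.sum((data - x_scalar) ** 2)

def grad_reference(x_scalar, data):
    # d/dx sum_i (y_i - x)^2 = 2 * (n * x - sum_i y_i).
    return 2.0 * (data.size * x_scalar - np.sum(data))

# One value per batch element, which is what the vectorized version should return.
expected_lp = np.array([log_prob_reference(xi, data) for xi in x])
expected_grad = np.array([grad_reference(xi, data) for xi in x])
```

Once the gradient of the vectorized function runs, its output should match expected_grad (up to float32 rounding if the TF code stays in float32).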
Since this looks like a TF bug, I went ahead and filed an issue with TF: https://github.com/tensorflow/tensorflow/issues/41789
I'm going to go ahead and close this issue for now, though feel free to reopen if something TFP-specific pops up.
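As a possible stopgap for the toy TF-only case, the batch dimension can be handled by plain broadcasting instead of tf.vectorized_map plus tf.scan. This is only a sketch under the assumption that the per-sample function can be written without a loop; it does not apply directly to LinearGaussianStateSpaceModel, and the name batched_log_prob is hypothetical.

```python
import numpy as np
import tensorflow as tf

np.random.seed(seed=42)
data = tf.constant(np.random.randn(100).astype(np.float32))

def batched_log_prob(x):
    # data[:, tf.newaxis] has shape [100, 1]; x has shape [10].
    # Broadcasting gives [100, 10]; summing over axis 0 leaves one value per x.
    return tf.reduce_sum((data[:, tf.newaxis] - x) ** 2, axis=0)

x = tf.Variable(tf.random.normal([10]))
with tf.GradientTape() as tape:
    lp = batched_log_prob(x)
# Each lp[j] depends only on x[j], so this is the per-element gradient.
g = tape.gradient(lp, x)
```

This sidesteps the failing code path rather than fixing it, so it is only useful when the model can be expressed in a batch-friendly form.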
Hi @davmre, thanks a lot for debugging this issue. Let's see how this turns out on the TF side.
I have been trying to fit an autoregressive model with mean-field ADVI, but using tf.vectorized_map while calculating log_prob results in XLA and dtype issues.

Code Snippet
```python
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability.python.mcmc.transformed_kernel import (
    make_transformed_log_prob,
)

dtype = tf.float32
tfb = tfp.bijectors
tfd = tfp.distributions

# Generating data
np.random.seed(seed=42)
T = 100
y = np.zeros((T,))
for i in range(1, T):
    y[i] = 0.95 * y[i-1] + np.random.normal()
data = y.reshape(-1, 1)

model = tfd.JointDistributionSequential([
    tfd.Normal(loc=0, scale=1.),
    lambda e: tfp.sts.AutoregressiveStateSpaceModel(
        num_timesteps=100,
        coefficients=[e],
        level_scale=0.1,
        initial_state_prior=tfd.MultivariateNormalDiag(scale_diag=[1e-6]),
    )
])

def vectorize_function(function):
    def vectorizedfn(*q_samples):
        return tf.vectorized_map(lambda samples: function(*samples), q_samples)
    return vectorizedfn

joint_log_prob = vectorize_function(lambda *x: model.log_prob(x+(data, )))
# joint_log_prob = lambda *x: model.log_prob(x+(data, ))

# Transformations to bounded space
unconstraining_bijectors = [tfb.Identity()]
target_log_prob = make_transformed_log_prob(
    joint_log_prob,
    unconstraining_bijectors,
    direction="forward",
    enable_bijector_caching=False,
)

def build_mf_advi():
    parameters = model.sample()[:-1]
    dists = []
    for i, parameter in enumerate(parameters):
        shape = parameter.shape
        loc = tf.Variable(
            tf.random.normal(shape, dtype=dtype),
            name=f"meanfield_{i}_loc",
            dtype=dtype,
        )
        scale = tfp.util.TransformedVariable(
            tf.fill(shape, value=tf.constant(1, dtype=dtype)),
            tfb.Softplus(),  # For positive values of scale
            name=f"meanfield_{i}_scale",
        )
        approx_parameter = tfd.Normal(loc=loc, scale=scale)
        dists.append(approx_parameter)
    return tfd.JointDistributionSequential(dists)

posterior = build_mf_advi()
num_steps = 5_000

def trace_fn(trace):
    tf.cond(
        tf.math.mod(trace.step, 100) == 0,
        lambda: tf.print(trace.step, "/", num_steps, "Loss:", trace.loss, end="\r"),
        lambda: tf.print("", end="")
    )
    return trace.loss

opt = tf.optimizers.Adam(learning_rate=0.1)

@tf.function(autograph=False)
def run_approximation():
    elbo_loss = tfp.vi.fit_surrogate_posterior(
        target_log_prob,
        surrogate_posterior=posterior,
        optimizer=opt,
        num_steps=num_steps,
        trace_fn=trace_fn
    )
    return elbo_loss

elbo_loss = run_approximation()
```

Traceback
```python
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2511       with c_api_util.tf_buffer() as buf:
-> 2512         pywrap_tf_session.TF_OperationGetAttrValueProto(self._c_op, name, buf)
   2513       data = pywrap_tf_session.TF_GetBuffer(buf)

InvalidArgumentError: Operation 'while' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py in _MaybeCompile(scope, op, func, grad_fn)
    330     try:
--> 331       xla_compile = op.get_attr("_XlaCompile")
    332       xla_separate_compiled_gradients = op.get_attr(

/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2515       # Convert to ValueError for backwards compatibility.
-> 2516       raise ValueError(str(e))
   2517     x = attr_value_pb2.AttrValue()

ValueError: Operation 'while' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2511       with c_api_util.tf_buffer() as buf:
-> 2512         pywrap_tf_session.TF_OperationGetAttrValueProto(self._c_op, name, buf)
   2513       data = pywrap_tf_session.TF_GetBuffer(buf)

InvalidArgumentError: Operation 'while/while_body/cond' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py in _MaybeCompile(scope, op, func, grad_fn)
    330     try:
--> 331       xla_compile = op.get_attr("_XlaCompile")
    332       xla_separate_compiled_gradients = op.get_attr(

/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2515       # Convert to ValueError for backwards compatibility.
-> 2516       raise ValueError(str(e))
   2517     x = attr_value_pb2.AttrValue()

ValueError: Operation 'while/while_body/cond' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2511       with c_api_util.tf_buffer() as buf:
-> 2512         pywrap_tf_session.TF_OperationGetAttrValueProto(self._c_op, name, buf)
   2513       data = pywrap_tf_session.TF_GetBuffer(buf)

InvalidArgumentError: Operation 'while/while_body/cond/PartitionedCall' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py in _MaybeCompile(scope, op, func, grad_fn)
    330     try:
--> 331       xla_compile = op.get_attr("_XlaCompile")
    332       xla_separate_compiled_gradients = op.get_attr(

/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2515       # Convert to ValueError for backwards compatibility.
-> 2516       raise ValueError(str(e))
   2517     x = attr_value_pb2.AttrValue()

ValueError: Operation 'while/while_body/cond/PartitionedCall' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2511       with c_api_util.tf_buffer() as buf:
-> 2512         pywrap_tf_session.TF_OperationGetAttrValueProto(self._c_op, name, buf)
   2513       data = pywrap_tf_session.TF_GetBuffer(buf)

InvalidArgumentError: Operation 'monte_carlo_variational_loss/expectation/loop_body/JointDistributionSequential/log_prob/monte_carlo_variational_loss_expectation_loop_body_JointDistributionSequential_log_prob_AutoregressiveStateSpaceModel/log_prob/scan/while/TensorArrayV2Write/TensorListSetItem/pfor/Tile' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py in _MaybeCompile(scope, op, func, grad_fn)
    330     try:
--> 331       xla_compile = op.get_attr("_XlaCompile")
    332       xla_separate_compiled_gradients = op.get_attr(

/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in get_attr(self, name)
   2515       # Convert to ValueError for backwards compatibility.
-> 2516       raise ValueError(str(e))
   2517     x = attr_value_pb2.AttrValue()

ValueError: Operation 'monte_carlo_variational_loss/expectation/loop_body/JointDistributionSequential/log_prob/monte_carlo_variational_loss_expectation_loop_body_JointDistributionSequential_log_prob_AutoregressiveStateSpaceModel/log_prob/scan/while/TensorArrayV2Write/TensorListSetItem/pfor/Tile' has no attr named '_XlaCompile'.

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
```

System Details
```bash
numpy - 1.18.5
tensorflow_probability - 0.12.0-dev20200723
tensorflow - 2.4.0-dev20200723
pip - 20.1.1
CPython - 3.7.7
IPython - 7.16.1
System - Darwin Mac 19.6.0 Darwin Kernel Version 19.6.0: Sun Jul 5 00:43:10 PDT 2020; root:xnu-6153.141.1~9/RELEASE_X86_64 x86_64
```

If I do not use tf.vectorized_map, everything works fine. I am not sure whether this issue should be opened on the TensorFlow side instead. Any help in this regard? Thanks