Getting "TypeError: Exception encountered when calling layer "tf.keras.backend.rnn" (type TFOpLambda)" when I employ the Attention Layer

talhakabakus commented 2 years ago

I'm trying to re-implement the text summarization tutorial here. Getting the following error when I employ the Attention Layer:

/usr/local/lib/python3.7/dist-packages/keras/engine/keras_tensor.py in __array__(self, dtype)
    253   def __array__(self, dtype=None):
    254     raise TypeError(
--> 255         f'You are passing {self}, an intermediate Keras symbolic input/output, '
    256         'to a TF API that does not allow registering custom dispatchers, such '
    257         'as `tf.cond`, `tf.function`, gradient tapes, or `tf.map_fn`. '

TypeError: Exception encountered when calling layer "tf.keras.backend.rnn" (type TFOpLambda).

You are passing KerasTensor(type_spec=TensorSpec(shape=(None, 101), dtype=tf.float32, name=None), name='tf.compat.v1.nn.softmax_1/Softmax:0', description="created by layer 'tf.compat.v1.nn.softmax_1'"), an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers, such as `tf.cond`, `tf.function`, gradient tapes, or `tf.map_fn`. Keras Functional model construction only supports TF API calls that *do* support dispatching, such as `tf.math.add` or `tf.reshape`. Other APIs cannot be called directly on symbolic Kerasinputs/outputs. You can work around this limitation by putting the operation in a custom Keras layer `call` and calling that layer on this symbolic input/output.

Call arguments received:
  • step_function=<function AttentionLayer.call.<locals>.energy_step at 0x7f1d5ff279e0>
  • inputs=tf.Tensor(shape=(None, None, 256), dtype=float32)
  • initial_states=['tf.Tensor(shape=(None, 101), dtype=float32)']
  • go_backwards=False
  • mask=None
  • constants=None
  • unroll=False
  • input_length=None
  • time_major=False
  • zero_output_for_mask=False

How can I overcome this error? I've added my software stack below:

- OS: macOS
- TensorFlow: 2.8.0
- Keras: 2.8.0
- Python Version: 3.7.12 (default, Jan 15 2022, 18:48:18)

TheZaraKhan commented 2 years ago

@thushv89 Please solve this issue

eliashossain001 commented 2 years ago

Has anyone solved these issues?

thushv89 commented 2 years ago

Hi, Sorry about the delay and thanks for raising this. Yes there's an unknown issue with the tf.keras.backend.rnn operation in tensorflow>2.6. I haven't been able to look into this in too much depth. I have tried few potential fixes but they all failed. So looks like something major has shifted since tensorflow>2.6.

The easiest solution would be to downgrade to tensorflow==2.6 until the issue is understood. But will keep you posted on the findings.

rudy-becarelli commented 2 years ago

Hi @thushv89 any news? I just tried to use tensorflow==2.6 but i still have the same problem....

AliMi001 commented 2 years ago

what about using the keras additive attention layer _attn_out = AdditiveAttention()([decoder_outputs, encoderoutputs]) would it be a fix ?

thushv89 commented 2 years ago

@rudy-becarelli: Apologies about the delay. Could you try tensorflow==2.5.0?

@AliMi001: It would be a close substitute but not a complete one. For example you can see here how TensorFlow uses AdditiveAttention for implementing Bahdanau attention. However, it does not compute attention sequentially as a sequence would be processed in a RNN. Rather, it computes all outputs first and then apply attention on top of that. I'm not sure what the performance difference is, but it is different from the original Bahdanau attention proposed in the paper.