tensorflow / models

Models and examples built with TensorFlow

Bert2Bert model call fails when setting padded_decode to True #10222

Closed LoicDagnas closed 1 year ago

LoicDagnas commented 3 years ago

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/official/nlp/nhnet/models.py

2. Describe the bug

When using a Bert2Bert model instance with the padded_decode parameter set to True (e.g. for TPU usage), I am forced to specify the batch size of the inputs when calling the model.

3. Steps to reproduce

You can simply run the following code:

import tensorflow as tf
from official.nlp.nhnet.configs import UNITTEST_CONFIG, BERT2BERTConfig
from official.nlp.nhnet.models import Bert2Bert, get_bert2bert_layers

bert2bert_config_dict = UNITTEST_CONFIG.copy()
bert2bert_config_dict["len_title"] = 32
bert2bert_config_dict["max_position_embeddings"] = 200
bert2bert_config_dict["padded_decode"] = True

bert2bert_config = BERT2BERTConfig.from_args(**bert2bert_config_dict)
bert_layer, decoder_layer = get_bert2bert_layers(params=bert2bert_config)

bert2bert = Bert2Bert(bert2bert_config, bert_layer, decoder_layer)

inputs = {
    "input_ids": tf.keras.layers.Input((200,), dtype=tf.int32, name="input_ids"),
    "input_mask": tf.keras.layers.Input((200,), dtype=tf.int32, name="input_mask"),
    "segment_ids": tf.keras.layers.Input((200,), dtype=tf.int32, name="segment_ids"),
    "target_ids": tf.keras.layers.Input((32,), dtype=tf.int32, name="target_ids")
}

output = bert2bert(inputs, mode='predict')

you'll get the following stack trace:

[...]
C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\official\nlp\nhnet\models.py:168 predict_decode  *
        decoded_ids, scores = beam_search.sequence_beam_search(
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\official\nlp\modeling\ops\beam_search.py:622 sequence_beam_search  *
        return sbs.search(initial_ids, initial_cache)
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\official\nlp\modeling\ops\beam_search.py:158 search  *
        state, state_shapes = self._create_initial_state(initial_ids, initial_cache,
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\official\nlp\modeling\ops\beam_search.py:419 _create_initial_state  *
        alive_log_probs = tf.tile(initial_log_probs, [batch_size, 1])
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\ops\gen_array_ops.py:11530 tile  **
        _, _, _op, _outputs = _op_def_library._apply_op_helper(
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\framework\op_def_library.py:525 _apply_op_helper
        raise err
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\framework\op_def_library.py:511 _apply_op_helper
        values = ops.convert_to_tensor(
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\profiler\trace.py:163 wrapped
        return func(*args, **kwargs)
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\framework\ops.py:1566 convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\framework\constant_op.py:346 _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\framework\constant_op.py:271 constant
        return _constant_impl(value, dtype, shape, name, verify_shape=False,
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\framework\constant_op.py:288 _constant_impl
        tensor_util.make_tensor_proto(
    C:\dev\ml\OnnxConversionLab\venv\lib\site-packages\tensorflow\python\framework\tensor_util.py:551 make_tensor_proto
        raise TypeError("Failed to convert object of type %s to Tensor. "

    TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [None, 1]. Consider casting elements to a supported type.

but if you provide the following inputs with the batch size specified:

inputs = {
    "input_ids": tf.keras.layers.Input((200,), dtype=tf.int32, name="input_ids", batch_size=8),
    "input_mask": tf.keras.layers.Input((200,), dtype=tf.int32, name="input_mask", batch_size=8),
    "segment_ids": tf.keras.layers.Input((200,), dtype=tf.int32, name="segment_ids", batch_size=8),
    "target_ids": tf.keras.layers.Input((32,), dtype=tf.int32, name="target_ids", batch_size=8)
}

it will work.
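For context, the stack trace points at the mechanism: with padded_decode=True, beam search takes the batch size from the static shape of the inputs, which is None for a Keras Input created without batch_size, and tf.tile cannot convert the multiples list [None, 1] to a tensor. A minimal pure-Python sketch of that mechanism (names here are illustrative, inferred from the trace, not the library's own):

```python
# Sketch of the failure mode inferred from the stack trace.
static_shape = (None, 200)    # Keras Input((200,)) without batch_size
batch_size = static_shape[0]  # None: batch dim is unknown at trace time
multiples = [batch_size, 1]   # the list passed to tf.tile -> [None, 1]

# tf.tile must convert `multiples` to a constant tensor; None has no
# tensor representation, hence the TypeError in make_tensor_proto.
assert multiples == [None, 1]

static_shape = (8, 200)       # Input(..., batch_size=8)
assert [static_shape[0], 1] == [8, 1]  # fully defined, so tile can proceed
```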

4. Expected behavior

I expected it to work in both cases, i.e. whether or not the batch size is specified.

5. Additional context

X

6. System information

saberkun commented 2 years ago

(1) batch size must be known for beam search; (2) target_ids tensor is not necessary for decoding.

LoicDagnas commented 2 years ago

(1) the batch size must be known only when padded_decode is set to True, right?

(2) yes, in fact it is related to my other issue concerning this model https://github.com/tensorflow/models/issues/10221
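Regarding (1): reading official/nlp/modeling/ops/beam_search.py suggests the batch size is taken from the static input shape only when padded_decode is True; otherwise a dynamic tf.shape op is used, so an unknown static batch dimension is tolerated. A simplified, hedged sketch of that branch (not the library's exact code):

```python
# Simplified sketch of the batch-size selection in beam search
# (assumption: mirrors SequenceBeamSearch in
# official/nlp/modeling/ops/beam_search.py; not the exact code).
def select_batch_size(static_batch, padded_decode):
    if padded_decode:
        # TPU path: shapes must be fully static; a symbolic Keras input
        # without batch_size yields None here and tf.tile later fails.
        return static_batch
    # CPU/GPU path: a dynamic tf.shape(initial_ids)[0] op would be used
    # instead, so an unknown static batch is fine ("dynamic" stands in).
    return "dynamic"

assert select_batch_size(None, True) is None       # fails downstream
assert select_batch_size(8, True) == 8             # static batch: works
assert select_batch_size(None, False) == "dynamic"
```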

laxmareddyp commented 1 year ago

Hi @LoicDagnas,

Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information it contains may no longer be relevant to the current state of the code base. The TF models official team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration, which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings and all the debugging information that could help us investigate. Please follow the release notes to stay up to date with the latest developments in the TF models official space.

github-actions[bot] commented 1 year ago

This issue has been marked stale because it has had no recent activity in the past 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 1 year ago

This issue was closed due to lack of activity after being marked stale for the past 7 days.

google-ml-butler[bot] commented 1 year ago

Are you satisfied with the resolution of your issue?