huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

tf.keras.models.load_model() does not load saved model that includes TFOpenAIGPTLMHeadModel layer #7164

Closed wulikai1993 closed 3 years ago

wulikai1993 commented 4 years ago

To reproduce

Steps to reproduce the behavior:

  1. Load the model with TFOpenAIGPTLMHeadModel
  2. Add input layers
  3. Save the model
  4. Load the saved model
from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf

tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('./trans_model', from_pt=True)  # ./trans_model is the directory containing the pre-trained PyTorch model
max_len = None

input_ids = tf.keras.layers.Input(shape=(max_len,), name='input_ids_layer', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(max_len,), name='token_type_ids_layer', dtype='int32')
keras_input = [input_ids, token_type_ids]

qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
keras_model = tf.keras.Model(inputs= keras_input, outputs = qa_output)
keras_model.summary()
keras_model.save("./saved_model")
print('**************************')
model = tf.keras.models.load_model("./saved_model")
Traceback (most recent call last):
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 395, in assert_same_structure
    expand_composites)
ValueError: The two structures don't have the same nested structure.

First structure: type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')

Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')" is not

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "interact_test.py", line 208, in <module>
    run()
  File "interact_test.py", line 180, in run
    model = tf.keras.models.load_model("./saved_model")
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 187, in load_model
    return saved_model_load.load(filepath, compile, options)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 121, in load
    path, options=options, loader_cls=KerasObjectLoader)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 633, in load_internal
    ckpt_options)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 194, in __init__
    super(KerasObjectLoader, self).__init__(*args, **kwargs)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 130, in __init__
    self._load_all()
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 221, in _load_all
    self._finalize_objects()
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 526, in _finalize_objects
    _finalize_saved_model_layers(layers_revived_from_saved_model)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 706, in _finalize_saved_model_layers
    inputs = infer_inputs_from_restored_call_function(call_fn)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 985, in infer_inputs_from_restored_call_function
    spec = nest.map_structure(common_spec, spec, spec2)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 629, in map_structure
    expand_composites=expand_composites)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 402, in assert_same_structure
    % (str(e), str1, str2))
ValueError: The two structures don't have the same nested structure.

First structure: type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')

Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
wulikai1993 commented 4 years ago

I tried the resolution in #3627, but the output shape changed from 13088 to 768

LysandreJik commented 4 years ago

@jplu might be interested in this.

jplu commented 4 years ago

I cannot really test this because I don't have your trans_model, but as far as I can tell it is not working because you are using the high-level (Keras) API to create a saved_model. For models with custom layers it is recommended to use the low-level API, like this:

from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf

tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('openai-gpt')
max_len = None

input_ids = tf.keras.layers.Input(shape=(max_len,), name='input_ids_layer', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(max_len,), name='token_type_ids_layer', dtype='int32')
keras_input = [input_ids, token_type_ids]

qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
keras_model = tf.keras.Model(inputs= keras_input, outputs = qa_output)
keras_model.summary()
tf.saved_model.save("./saved_model")
print('**************************')
model = tf.saved_model.load("./saved_model")

For me this works well.

wulikai1993 commented 4 years ago

tf.saved_model.save("./saved_model")

You mean tf.saved_model.save(keras_model, "./saved_model")? I tried it:

tf.saved_model.save("./saved_model")
print('**************************')
model = tf.saved_model.load("./saved_model")
tf_logits = model.predict([tf_input_ids, tf_token_type_ids])

and the error:

tf_logits = model.predict([tf_input_ids, tf_token_type_ids])
AttributeError: '_UserObject' object has no attribute 'predict'
jplu commented 4 years ago

you mean tf.saved_model.save(keras_model, "./saved_model")

Yes sorry.

The error you get is normal because you are not loading a Keras model, but a plain TensorFlow SavedModel. To get a prediction you have to do something like:

model([tf_input_ids, None, tf_token_type_ids])
wulikai1993 commented 4 years ago

I changed the code as follows:

tf_logits = model([tf_input_ids, tf_token_type_ids])

and the error:

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (3 total):
    * [<tf.Tensor 'inputs:0' shape=(1, 5) dtype=int64>, <tf.Tensor 'inputs_1:0' shape=(1, 5) dtype=int64>]
    * False
    * None
  Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_layer'), TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids_layer')]
    * False
    * None
  Keyword arguments: {}

Option 2:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1')]
    * True
    * None
  Keyword arguments: {}

Option 3:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_layer'), TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids_layer')]
    * True
    * None
  Keyword arguments: {}

Option 4:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1')]
    * False
    * None
  Keyword arguments: {}
jplu commented 4 years ago

Can you try:

model([tf_input_ids, tf_token_type_ids], False, None)
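
Also note that the expected signatures above are all int32, while the tensors in your traceback are int64, so a cast may be needed as well. A rough, untested sketch (tf_input_ids and tf_token_type_ids being your existing tensors):

import tensorflow as tf

# Cast to int32 to match the TensorSpecs listed in the SavedModel signatures.
tf_input_ids = tf.cast(tf_input_ids, tf.int32)
tf_token_type_ids = tf.cast(tf_token_type_ids, tf.int32)

# Call the restored function with the same positional arguments as the traced options.
tf_logits = model([tf_input_ids, tf_token_type_ids], False, None)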
wulikai1993 commented 4 years ago

I tried it, and got the same error.

More information: in fact, my ultimate objective is to use the SavedModel with TF Serving. But I get this error when querying the server:

Input to reshape is a tensor with 3840 values, but the requested shape has 768\n\t [[{{node functional_1/tf_open_aigptlm_head_model/transformer/Reshape_3}}]]\n\t [[StatefulPartitionedCall/StatefulPartitionedCall]]'}

It means it gets 3840 (768 * 5) values instead of 768. And considering the original error:

First structure: type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')

Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}

Does it mean the input is replicated 5 times? And where does the shape (None, 5) come from?

jplu commented 4 years ago

5 is the default sequence length used when no input shape is given.

It means there is certainly a bug in how Keras symbolic tensors are handled. To be sure, can you run:

saved_model_cli show --dir ./saved_model --tag_set serve --signature_def serving_default
wulikai1993 commented 4 years ago
saved_model_cli show --dir ./saved_model --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['input_ids_layer'] tensor_info:
      dtype: DT_INT32
      shape: (-1, -1)
      name: serving_default_input_ids_layer:0
  inputs['token_type_ids_layer'] tensor_info:
      dtype: DT_INT32
      shape: (-1, -1)
      name: serving_default_token_type_ids_layer:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['tf_open_aigptlm_head_model'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, -1, 13088)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

In the code

keras_model = tf.keras.Model(inputs= keras_input, outputs = qa_output)
keras_model.summary()
keras_model.save("./saved_model")
print('**************************')
model = tf.keras.models.load_model("./saved_model")

When I use keras_model directly, it works perfectly in a dialogue task. But after saving it as a SavedModel, everything collapses.

jplu commented 4 years ago

OK, the first thing I see is that the names don't correspond to the names we use internally; that might be one of the causes of the issue.

wulikai1993 commented 4 years ago

Sorry, which names do you mean? Is it a transformers library bug or my code mistake?

jplu commented 4 years ago

Sorry, which names do you mean? Is it a transformers library bug or my code mistake?

In the lib.

Also, by names I mean that you are using input_ids_layer and token_type_ids_layer, while internally they are input_ids and token_type_ids.

wulikai1993 commented 4 years ago

So what should I do? Remove the _layer suffix?

jplu commented 4 years ago

It won't solve your problem. For now there is no "practical" way to get a saved model unless you set everything yourself:

from transformers import TFOpenAIGPTLMHeadModel, OpenAIGPTTokenizer
import tensorflow as tf

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
model = TFOpenAIGPTLMHeadModel.from_pretrained("openai-gpt")
inputs = tokenizer("Put a sentence here", return_tensors="tf")
model._saved_model_inputs_spec = None
model._set_save_spec(dict(inputs))
tf.saved_model.save(model, "./saved_model")

And then use it normally with TF Serving. This solution has a big constraint: you have to set the size of your input sequence manually. This is, for now, the only solution I can give you because we haven't made the TF models fully TF Serving compliant yet. This is planned for a future release.
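
If you want to check the exported signature locally before putting it behind TF Serving, something along these lines should work (rough, untested sketch; the keyword argument names must match the keys of the tokenizer output used for the save spec, and the sequence length is fixed to whatever "Put a sentence here" tokenized to):

loaded = tf.saved_model.load("./saved_model")
serving_fn = loaded.signatures["serving_default"]
example = tokenizer("Put a sentence here", return_tensors="tf")
outputs = serving_fn(**dict(example))  # e.g. input_ids=..., attention_mask=...
print({name: tensor.shape for name, tensor in outputs.items()})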

wulikai1993 commented 4 years ago

Thanks! Looking forward to the new release.

jplu commented 4 years ago

Or this should do the trick:

from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf

tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('openai-gpt')
input_ids = tf.keras.layers.Input(shape=(128,), name='input_ids', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(128,), name='token_type_ids', dtype='int32')
keras_input = [input_ids, token_type_ids]

qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
keras_model = tf.keras.Model(inputs=keras_input, outputs=[qa_output])
keras_model.trainable = False
keras_model.summary()
keras_model.save("./saved_model", save_format="tf")

With this I can run the saved_model inside the TF Serving Docker image, but in all cases you have to set your sequence length yourself.
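
For reference, querying it through the TF Serving REST API then looks roughly like this (untested sketch; the model name my_gpt and port 8501 are just what a standard docker run would give you, and the input names must match the Input layer names above):

import requests

payload = {
    "instances": [
        {
            "input_ids": [0] * 128,        # one example, already padded to the fixed length of 128
            "token_type_ids": [0] * 128,
        }
    ]
}
response = requests.post("http://localhost:8501/v1/models/my_gpt:predict", json=payload)
print(response.json()["predictions"][0])  # logits for this example, shape (128, 13088) in this case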

wulikai1993 commented 4 years ago

Sorry, I need a dynamic input length for the dialogue task.

jplu commented 4 years ago

As a temporary solution you can set the size of your inputs to the max length of the tokenizer. Since you won't be able to get bigger sequences from the tokenizer, you will be safe.

wulikai1993 commented 4 years ago

I use the following code:

from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf

tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('./trans_model', from_pt=True)
max_len = 128
input_ids = tf.keras.layers.Input(shape=(max_len,), name='input_ids', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(max_len,), name='token_type_ids', dtype='int32')
keras_input = [input_ids, token_type_ids]

qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
keras_model = tf.keras.Model(inputs= keras_input, outputs = qa_output)
keras_model.trainable = False
keras_model.summary()
keras_model.save("./saved_model", save_format="tf")

I use keras_model directly to predict. The strange thing is that the first two predictions always work fine, while the third one fails (the model always uses the previous sequence to predict the next output, so the input length grows by 1 each time).

>>> 你现在做什么工作呢
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 12).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 12).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 12).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 12).
logits shape:  (1, 12, 13088)
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 13).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 13).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 13).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 13).
logits shape:  (1, 13, 13088)
Traceback (most recent call last):
  File "interact_test.py", line 233, in <module>
    run()
  File "interact_test.py", line 225, in run
    out_ids = sample_sequence(history, tokenizer, keras_model, args)
  File "interact_test.py", line 121, in sample_sequence
    tf_logits = model.predict([tf_input_ids, tf_token_type_ids])
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 130, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1599, in predict
    tmp_batch_outputs = predict_function(iterator)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 814, in _call
    results = self._stateful_fn(*args, **kwds)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 550, in call
    ctx=ctx)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Input to reshape is a tensor with 14 values, but the requested shape requires a multiple of 128
         [[node functional_1/tf_open_aigptlm_head_model/transformer/Reshape_2 (defined at /home/t9kuser/.local/lib/python3.6/site-packages/transformers/modeling_tf_openai.py:342) ]] [Op:__inference_predict_function_15804]

Errors may have originated from an input operation.
Input Source operations connected to node functional_1/tf_open_aigptlm_head_model/transformer/Reshape_2:
 functional_1/Cast_1 (defined at interact_test.py:121)

Function call stack:
predict_function
>>> 你好
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 5).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 5).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 5).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 5).
logits shape:  (1, 5, 13088)
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 6).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("input_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 6).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 6).
WARNING:tensorflow:Model was constructed with shape (None, 128) for input Tensor("token_type_ids:0", shape=(None, 128), dtype=int32), but it was called on an input with incompatible shape (None, 6).
logits shape:  (1, 6, 13088)
Traceback (most recent call last):
  File "interact_test.py", line 233, in <module>
    run()
  File "interact_test.py", line 225, in run
    out_ids = sample_sequence(history, tokenizer, keras_model, args)
  File "interact_test.py", line 121, in sample_sequence
    tf_logits = model.predict([tf_input_ids, tf_token_type_ids])
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 130, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1599, in predict
    tmp_batch_outputs = predict_function(iterator)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 814, in _call
    results = self._stateful_fn(*args, **kwds)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 550, in call
    ctx=ctx)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Input to reshape is a tensor with 7 values, but the requested shape requires a multiple of 128
         [[node functional_1/tf_open_aigptlm_head_model/transformer/Reshape_2 (defined at /home/t9kuser/.local/lib/python3.6/site-packages/transformers/modeling_tf_openai.py:342) ]] [Op:__inference_predict_function_15804]

Errors may have originated from an input operation.
Input Source operations connected to node functional_1/tf_open_aigptlm_head_model/transformer/Reshape_2:
 functional_1/Cast_1 (defined at interact_test.py:121)

Function call stack:
predict_function
wulikai1993 commented 4 years ago

And the SavedModel error:

{'error': '{{function_node __inference__wrapped_model_10271}} {{function_node __inference__wrapped_model_10271}} Incompatible shapes: [1,128,768] vs. [1,5,768]\n\t [[{{node functional_1/tf_open_aigptlm_head_model/transformer/add}}]]\n\t [[StatefulPartitionedCall/StatefulPartitionedCall]]'}
wulikai1993 commented 4 years ago

The pretrained model can be downloaded here: https://drive.google.com/file/d/1Wyr-fD4KuF0gWMtZ7STF09O2Ebtly-yq/view?usp=sharing This is my source code; you can reproduce the error with it. Thanks a lot!

# Copyright (c) 2019-present, HuggingFace Inc.
# All rights reserved.
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
import os
import logging
import random
from itertools import chain
from argparse import ArgumentParser
from pprint import pformat
import torch
import tensorflow as tf
import torch.nn.functional as F
import sys
import numpy as np

from transformers import OpenAIGPTLMHeadModel, GPT2LMHeadModel, BertTokenizer, TFOpenAIGPTLMHeadModel

SPECIAL_TOKENS = ["[CLS]", "[SEP]", "[PAD]", "[speaker1]", "[speaker2]"]

def top_filtering(logits, top_k=0, top_p=0.0, threshold=-float('Inf'), filter_value=-float('Inf')):
    """ Filter a distribution of logits using top-k, top-p (nucleus) and/or threshold filtering
        Args:
            logits: logits distribution shape (vocabulary size)
            top_k: <=0: no filtering, >0: keep only top k tokens with highest probability.
            top_p: <=0.0: no filtering, >0.0: keep only a subset S of candidates, where S is the smallest subset
                whose total probability mass is greater than or equal to the threshold top_p.
                In practice, we select the highest probability tokens whose cumulative probability mass exceeds
                the threshold top_p.
            threshold: a minimal threshold to keep logits
    """
    assert logits.dim() == 1  # Only work for batch size 1 for now - could update but it would obfuscate a bit the code
    top_k = min(top_k, logits.size(-1))
    if top_k > 0:
        # Remove all tokens with a probability less than the last token in the top-k tokens
        indices_to_remove = logits < torch.topk(logits, top_k)[0][..., -1, None]
        logits[indices_to_remove] = filter_value

    if top_p > 0.0:
        # Compute cumulative probabilities of sorted tokens
        sorted_logits, sorted_indices = torch.sort(logits, descending=True)
        cumulative_probabilities = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)

        # Remove tokens with cumulative probability above the threshold
        sorted_indices_to_remove = cumulative_probabilities > top_p
        # Shift the indices to the right to keep also the first token above the threshold
        sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].clone()
        sorted_indices_to_remove[..., 0] = 0

        # Back to unsorted indices and set them to -infinity
        indices_to_remove = sorted_indices[sorted_indices_to_remove]
        logits[indices_to_remove] = filter_value

    indices_to_remove = logits < threshold
    logits[indices_to_remove] = filter_value

    return logits

def build_input_from_segments(history, reply, tokenizer, with_eos=True):
    """ Build a sequence of input from 3 segments: persona, history and last reply """
    bos, eos, pad, speaker1, speaker2 = tokenizer.convert_tokens_to_ids(SPECIAL_TOKENS)
    sequence = [[bos]] + history + [reply + ([eos] if with_eos else [])]
#     print('sequence 1', sequence)
    sequence = [sequence[0]] + [[speaker2 if i % 2 else speaker1] + s for i, s in enumerate(sequence[1:])]
#     print('sequence 2', sequence)
    instance = {}
    instance["input_ids"] = list(chain(*sequence))
    instance["token_type_ids"] = [bos] + [speaker2 if i % 2 else speaker1 for i, s in enumerate(sequence[1:])
                                          for _ in s]
    return instance, sequence

def sample_sequence(history, tokenizer, model, args, current_output=None):
    special_tokens_ids = tokenizer.convert_tokens_to_ids(SPECIAL_TOKENS)
#     print(special_tokens_ids)
    if current_output is None:
        current_output = []

    for i in range(args.max_length):
        instance, sequence = build_input_from_segments(history, current_output, tokenizer, with_eos=False)
        input_ids = torch.tensor(instance["input_ids"], dtype=torch.long, device=args.device).unsqueeze(0)
        token_type_ids = torch.tensor(instance["token_type_ids"], dtype=torch.long, device=args.device).unsqueeze(0)
#         print(type(input_ids))
#         print(input_ids.shape)
#         print('input_ids', input_ids)
#         print('token_type_ids', token_type_ids)
#         logits, *_ = model(input_ids, token_type_ids=token_type_ids)
#         print(type(logits))
#         print(logits.shape)
#         print(logits)
        tf_input_ids = input_ids.numpy()
        tf_token_type_ids = token_type_ids.numpy()
#         tf_input_ids_pad = np.pad(tf_input_ids, ((0, 0), (0, 128 - tf_input_ids.shape[1])), 'constant')
#         print(tf_input_ids_pad.shape)
#         print(tf_input_ids_pad)
        tf_logits = model.predict([tf_input_ids, tf_token_type_ids])
        logits = torch.from_numpy(tf_logits)

#         tf_logits, *_ = model(tf_input_ids, token_type_ids=tf.constant(tf_token_type_ids))
#         logits = torch.from_numpy(tf_logits.numpy())
#         print(type(tf_logits))
        print('logits shape: ', tf_logits.shape)
#         print(tf_logits)

        logits = logits[0, -1, :] / args.temperature
        print('logits tmp shape: ', logits.shape)
        logits = top_filtering(logits, top_k=args.top_k, top_p=args.top_p)
        print('logits filter shape: ', logits.shape)
        probs = F.softmax(logits, dim=-1)
        print('probs shape: ', probs.shape)

        prev = torch.topk(probs, 1)[1] if args.no_sample else torch.multinomial(probs, 1)
        print('prev: ', prev)
        if i < args.min_length and prev.item() in special_tokens_ids:
            while prev.item() in special_tokens_ids:
                prev = torch.multinomial(probs, num_samples=1)

        if prev.item() in special_tokens_ids:
            break
        current_output.append(prev.item())

    return current_output

def run():
    parser = ArgumentParser()
    parser.add_argument('--gpt2', action='store_true', help="use gpt2")
    parser.add_argument("--model_checkpoint", type=str, default="./LCCD_GPT", help="Path, url or short name of the model")
    parser.add_argument("--max_history", type=int, default=2, help="Number of previous utterances to keep in history")
    parser.add_argument("--device", type=str, default="cpu",
                        help="Device (cuda or cpu)")

    parser.add_argument("--no_sample", action='store_true', help="Set to use greedy decoding instead of sampling")
    parser.add_argument("--max_length", type=int, default=30, help="Maximum length of the output utterances")
    parser.add_argument("--min_length", type=int, default=1, help="Minimum length of the output utterances")
    parser.add_argument("--seed", type=int, default=42, help="Seed")
    parser.add_argument("--temperature", type=int, default=0.7, help="Sampling softmax temperature")
    parser.add_argument("--top_k", type=int, default=0, help="Filter top-k tokens before sampling (<=0: no filtering)")
    parser.add_argument("--top_p", type=float, default=0.9,
                        help="Nucleus filtering (top-p) before sampling (<=0.0: no filtering)")
    args = parser.parse_args()

    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__file__)
    logger.info(pformat(args))

    if args.model_checkpoint == "":
        logging.error("Checkpoint needed!")
        return

    random.seed(args.seed)
    torch.random.manual_seed(args.seed)
    torch.cuda.manual_seed(args.seed)

    logger.info("Get pretrained model and tokenizer")
    tokenizer_class = BertTokenizer
    tokenizer = tokenizer_class.from_pretrained(args.model_checkpoint, do_lower_case=True)
    tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('./trans_model', from_pt=True)
    max_len = 128

    input_ids = tf.keras.layers.Input(shape=(max_len,), name='input_ids', dtype='int32')
    token_type_ids = tf.keras.layers.Input(shape=(max_len,), name='token_type_ids', dtype='int32')
    keras_input = [input_ids, token_type_ids]

    qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
    print('**************************')
    print(type(qa_output))
    print(qa_output)
    keras_model = tf.keras.Model(inputs= keras_input, outputs = [qa_output])
    keras_model.trainable = False
    keras_model.summary()
#     keras_model.save("./saved_model", save_format="tf")
#     tf.saved_model.save(keras_model, "./saved_model")
#     model = tf.saved_model.load("./saved_model")
#     keras_model.save("./saved_model")
#     print('**************************')
#     model = tf.keras.models.load_model("./saved_model")

    def tokenize(obj):
        if isinstance(obj, str):
            return tokenizer.convert_tokens_to_ids(tokenizer.tokenize(obj))
        if isinstance(obj, dict):
            return dict((n, tokenize(o)) for n, o in obj.items())
        return list(tokenize(o) for o in obj)

    history = []
    while True:
        raw_text = input(">>> ")
        while not raw_text:
            print('Prompt should not be empty!')
            raw_text = input(">>> ")
        sys.stdout.flush()
#         print(raw_text)
        raw_text = " ".join(list(raw_text.replace(" ", "")))
#         print(raw_text)
        sys.stdout.flush()
        history.append(tokenize(raw_text))
#         print('history', history)
        with torch.no_grad():
            out_ids = sample_sequence(history, tokenizer, keras_model, args)
        history.append(out_ids)
        history = history[-(2 * args.max_history + 1):]
        out_text = tokenizer.decode(out_ids, skip_special_tokens=True)
        print(out_text)

if __name__ == "__main__":
    run()
jplu commented 4 years ago

For now I don't really have time to check this, but as far as I can see, the issue is that you are not giving a sequence of 128; your sequences have to be padded to 128. If at each step you add one more element to your sequence, you can remove one padding token each time. Example:

1st iteration, shape [1, 128, 13088]: [[ [embed char 1], [embed char 2], [embed char 3], [embed char 4], [embed char 5], [embed padding], [embed padding], [embed padding], ... [embed padding] ]]

2nd iteration, shape [1, 128, 13088]: [[ [embed char 1], [embed char 2], [embed char 3], [embed char 4], [embed char 5], [embed char 6], [embed padding], [embed padding], ... [embed padding] ]]

And so on.
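
Concretely, something like this (rough, untested sketch using the variables from your sample_sequence; padding with the [PAD] id is just an assumption, any id works as long as you only read logits at real positions):

import numpy as np

pad_id = tokenizer.convert_tokens_to_ids("[PAD]")
seq_len = tf_input_ids.shape[1]

padded_input_ids = np.pad(tf_input_ids, ((0, 0), (0, 128 - seq_len)),
                          "constant", constant_values=pad_id)
padded_token_type_ids = np.pad(tf_token_type_ids, ((0, 0), (0, 128 - seq_len)),
                               "constant", constant_values=pad_id)

tf_logits = model.predict([padded_input_ids, padded_token_type_ids])
# Only the first seq_len positions are real tokens, so take the logits of the
# last real token instead of position 127:
next_token_logits = tf_logits[0, seq_len - 1, :]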

wulikai1993 commented 4 years ago

Yes, I tried this before. But the performance declined a lot. Thanks for your patience!

waring92 commented 4 years ago

Same issue with Electra.

I think the TensorSpec below is the dummy input used to build the TF2 model in the transformers library:

{'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

But I don't know why that dummy input is still there after saving and loading.

jplu commented 4 years ago

All the models are initialized with this input. If you want to change it you have to rebuild it with your own input, as I showed in my previous posts.
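
For Electra, the same trick would look roughly like this (untested sketch; TFElectraModel and google/electra-small-discriminator are only examples, and _set_save_spec is a private Keras method, so it may change between TF versions):

import tensorflow as tf
from transformers import TFElectraModel, ElectraTokenizer

tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
model = TFElectraModel.from_pretrained("google/electra-small-discriminator")

# Build a save spec with your own (fixed) sequence length instead of the default dummy length of 5.
inputs = tokenizer("Put a sentence here", padding="max_length", max_length=128, return_tensors="tf")
model._saved_model_inputs_spec = None
model._set_save_spec(dict(inputs))
tf.saved_model.save(model, "./saved_model")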

jplu commented 4 years ago

@wulikai1993

Yes, I tried this before. But the performance declined a lot. Thanks for your patience!

Indeed the perf will decline, but do you still get the same issue?

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

dshahrokhian commented 3 years ago

@jplu any updates? Default exporting or dynamic shapes are still failing for me

dshahrokhian commented 3 years ago

@jplu Even using your approach, I still get the error on XLM models (max_len set to 192 in the Input):

ValueError: The two structures don't have the same nested structure.

First structure: type=TensorSpec str=TensorSpec(shape=(None, 192), dtype=tf.int32, name='inputs')

Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 192), dtype=tf.int32, name='inputs')" is not

Here is the model template, which works perfectly before exporting and trying to load it again:

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def get_model(transformer, num_classes=1, max_len=512):
    input_word_ids = Input(shape=(192,), dtype=tf.int32, name="input_word_ids")  # sequence length hard-coded to 192 here
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]
    out = Dense(num_classes, activation='sigmoid')(cls_token)

    model = Model(inputs=input_word_ids, outputs=out)
    model.compile(Adam(lr=1e-5), loss='binary_crossentropy', metrics=['accuracy'])

    return model
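
One workaround I have been considering (untested sketch, not a confirmed fix): export with an explicit serving signature so the save doesn't fall back to the library's (None, 5) dummy spec; transformer here is assumed to be the TF XLM model the template is built from:

import tensorflow as tf

model = get_model(transformer)

@tf.function(input_signature=[tf.TensorSpec([None, 192], tf.int32, name="input_word_ids")])
def serving_fn(input_word_ids):
    return {"predictions": model(input_word_ids)}

tf.saved_model.save(model, "./saved_model", signatures={"serving_default": serving_fn})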
fikhrimasri commented 3 years ago

@dshahrokhian I have the same problem. Is there a solution?

dshahrokhian commented 3 years ago

@fikhrimasri I gave up on dynamic shapes for now.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale and been closed because it has not had recent activity. Thank you for your contributions.

If you think this still needs to be addressed please comment on this thread.