Passing inputs to TFGPT2LMHeadModel results in error: 'TensorSliceDataset' object has no attribute 'shape'

rdisipio commented 4 years ago

🐛 Bug

Model I am using (Bert, XLNet....): TFGPT2LMHeadModel

Language I am using the model on (English, Chinese....): English

The problem arises when using:

[ ] the official example scripts: (give details)
[X ] my own modified scripts: (give details)

import numpy as np
import tensorflow as tf
from transformers import *

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = TFGPT2LMHeadModel.from_pretrained('gpt2')

raw_text = "Here comes the sun"
tokens = tokenizer.encode(raw_text, add_special_tokens=False)
inputs = tf.data.Dataset.from_tensor_slices( np.array(tokens) )
inputs = {'input_ids': inputs}
outputs = model(inputs)

The task I am working on is:

[ ] an official GLUE/SQUaD task: (give the name)
[ X ] my own task or dataset: (give details) Trying to work out a stripped-down version of run_generation.py using TFGPT2LMHeadModel only.

To Reproduce

Steps to reproduce the behavior: run the code above; you should get the following error:

Traceback (most recent call last):
  File "./generate_text.py", line 47, in <module>
    out = sample_sequence(tokens, num_samples=num_samples)
  File "./generate_text.py", line 27, in sample_sequence
    outputs = model(inputs)
  File "/Users/Riccardo/development/ideal/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 891, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/Users/Riccardo/development/ideal/lib/python3.7/site-packages/transformers/modeling_tf_gpt2.py", line 490, in call
    transformer_outputs = self.transformer(inputs, **kwargs)
  File "/Users/Riccardo/development/ideal/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 891, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/Users/Riccardo/development/ideal/lib/python3.7/site-packages/transformers/modeling_tf_gpt2.py", line 257, in call
    position_ids = tf.range(past_length, shape_list(input_ids)[-1] + past_length, dtype=tf.int32)[tf.newaxis, :]
  File "/Users/Riccardo/development/ideal/lib/python3.7/site-packages/transformers/modeling_tf_utils.py", line 475, in shape_list
    static = x.shape.as_list()
AttributeError: 'TensorSliceDataset' object has no attribute 'shape'

Expected behavior

Still not sure!

Environment

OS: macOS 10.14.6 (Mojave)
Python version: 3.7.5
Tensorflow version: 2.0.0
Tensorflow Transformers version (or branch): 2.1.1
Using GPU ? No
Distributed or parallel setup ? No
Any other relevant information:

Additional context

LysandreJik commented 4 years ago

Hi! You can simply use tf.constant to build your input tensors, like this:

raw_text = "Here comes the sun"
tokens = tokenizer.encode(raw_text, add_special_tokens=False)
inputs = {'input_ids': tf.constant(tokens)}
outputs = model(inputs)
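
For context on the original error: the traceback ends in shape_list, which reads x.shape. The minimal sketch below uses made-up token IDs instead of real tokenizer output (so no model download is needed) and suggests why a tensor works there while a tf.data.Dataset does not:

```python
import numpy as np
import tensorflow as tf

# Hypothetical token IDs, standing in for tokenizer.encode(...) output.
token_ids = [11, 22, 33, 44]

# What the original script passed to the model: a Dataset object.
dataset = tf.data.Dataset.from_tensor_slices(np.array(token_ids))

# What the suggested fix passes instead: a plain tensor.
tensor = tf.constant(token_ids)

# A Dataset exposes no .shape attribute, which is what shape_list trips over.
dataset_has_shape = hasattr(dataset, "shape")

# A tensor does expose .shape, so shape_list can read it.
tensor_shape = tensor.shape.as_list()

print(dataset_has_shape)
print(tensor_shape)
```

The Dataset is a container to iterate over, not a tensor, so it has to be unpacked (by a loop or by Keras) before its elements reach the model.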

You can use datasets when building a custom loop or using Keras's fit, for example, as these will generally feed the tensors to the model rather than the tf.data.Dataset object itself. Here's how I would go about starting a basic custom loop using a tf.data.Dataset:

import tensorflow as tf
import numpy as np
from transformers import *

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = TFGPT2LMHeadModel.from_pretrained('gpt2')

raw_text = "Here comes the sun"
tokens = tokenizer.encode(raw_text, add_special_tokens=False)
inputs = tf.data.Dataset.from_tensor_slices( np.array([tokens]) )

for input_value in inputs:
    outputs = model(input_value)

Note that I added a dimension ([tokens]) when converting to a NumPy array; otherwise the dataset would hold individual IDs rather than sequences of IDs.
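
To illustrate the effect of the extra dimension, here is a small sketch that compares the two datasets. The token IDs are made up for illustration (they are not real GPT-2 IDs), and the model is left out so that the snippet only exercises tf.data:

```python
import numpy as np
import tensorflow as tf

# Hypothetical token IDs, standing in for tokenizer.encode(...) output.
token_ids = [11, 22, 33, 44]

# from_tensor_slices slices along the first axis of its input.
# A 1-D array therefore yields one scalar ID per element.
flat = tf.data.Dataset.from_tensor_slices(np.array(token_ids))

# Wrapping the list adds a leading axis, so each element is a whole sequence.
nested = tf.data.Dataset.from_tensor_slices(np.array([token_ids]))

# Collect the shape of every element by iterating eagerly (TensorFlow 2.x).
flat_shapes = [element.shape.as_list() for element in flat]
nested_shapes = [element.shape.as_list() for element in nested]

print(flat_shapes)
print(nested_shapes)
```

With the flat array, the loop would call the model four times with a single ID each; with the nested array, it is called once with the full four-token sequence.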

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
