huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

TFDistilBertForSequenceClassification - TypeError: len is not well defined for symbolic Tensors during model.fit() #1378

Closed rickysaurav closed 4 years ago

rickysaurav commented 4 years ago

🐛 Bug

Model I am using: TFDistilBertForSequenceClassification

Language I am using the model on: English

The problem arises when using: model.fit()

The task I am working on is: fine-tuning on a randomly generated sequence classification dataset (see below)

To Reproduce

Steps to reproduce the behavior:

  1. Create a random classification train/test set.
  2. Load the pretrained TFDistilBertForSequenceClassification model.
  3. Call fit() on the model for fine-tuning.

    import numpy as np
    from transformers import TFDistilBertForSequenceClassification

    # Random classification data: 100 sequences of length 12
    x_train = np.random.randint(2000, size=(100, 12))
    x_train[:, 0] = 101    # [CLS] token id
    x_train[:, 11] = 102   # [SEP] token id
    y_train = np.random.randint(2, size=100)

    model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)
    model.compile()
    model.fit(x_train, y_train, epochs=1, batch_size=32, verbose=1)
    TypeError: in converted code:
    relative to /usr/local/lib/python3.6/dist-packages:
    
    transformers/modeling_tf_distilbert.py:680 call  *
        distilbert_output = self.distilbert(inputs, **kwargs)
    tensorflow_core/python/keras/engine/base_layer.py:842 __call__
        outputs = call_fn(cast_inputs, *args, **kwargs)
    transformers/modeling_tf_distilbert.py:447 call  *
        tfmr_output = self.transformer([embedding_output, attention_mask, head_mask], training=training)
    tensorflow_core/python/keras/engine/base_layer.py:891 __call__
        outputs = self.call(cast_inputs, *args, **kwargs)
    transformers/modeling_tf_distilbert.py:382 call
        layer_outputs = layer_module([hidden_state, attn_mask, head_mask[i]], training=training)
    tensorflow_core/python/keras/engine/base_layer.py:891 __call__
        outputs = self.call(cast_inputs, *args, **kwargs)
    transformers/modeling_tf_distilbert.py:324 call
        sa_output = self.attention([x, x, x, attn_mask, head_mask], training=training)
    tensorflow_core/python/keras/engine/base_layer.py:891 __call__
        outputs = self.call(cast_inputs, *args, **kwargs)
    transformers/modeling_tf_distilbert.py:229 call
        assert 2 <= len(tf.shape(mask)) <= 3
    tensorflow_core/python/framework/ops.py:741 __len__
        "shape information.".format(self.name))
    
    TypeError: len is not well defined for symbolic Tensors. (tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/Shape_2:0) Please call `x.shape` rather than `len(x)` for shape information.
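
For context, the failing assert at transformers/modeling_tf_distilbert.py:229 evaluates len(tf.shape(mask)). When the model is called eagerly this works, but model.fit() traces the call as a tf.function (hence "in converted code" in the traceback), where tf.shape(mask) is a symbolic tensor and __len__ raises exactly this TypeError. A minimal sketch of the behaviour (my own illustration, assuming TensorFlow 2.0; check_rank is a hypothetical helper, not library code):

    import tensorflow as tf

    @tf.function
    def check_rank(mask):
        # Inside a traced function, tf.shape(mask) is symbolic, so
        # `assert 2 <= len(tf.shape(mask)) <= 3` would raise
        # "len is not well defined for symbolic Tensors" during tracing.
        # Checking the static rank instead works under tracing:
        assert 2 <= mask.shape.ndims <= 3
        return tf.shape(mask)

    check_rank(tf.ones((4, 12)))  # rank-2 mask, traces without error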

Expected behavior

model.fit() fine-tunes the model on the provided data without raising a TypeError.

Environment

Additional context

Calling the model directly on the same input, as shown in the example in the model docs, works fine.
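
For reference, a minimal sketch of such a direct (eager) call, reusing the constant-token inputs from the reproduction above (my own illustration, not code from the original report):

    import numpy as np
    from transformers import TFDistilBertForSequenceClassification

    model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)
    x = np.random.randint(2000, size=(1, 12))
    x[:, 0], x[:, 11] = 101, 102      # [CLS] / [SEP] token ids, as in the repro
    outputs = model(x)                # eager call, no tf.function tracing
    print(outputs[0].shape)           # logits with shape (1, 2)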

kingxiaotian commented 4 years ago

So, how do we solve this problem?

thomwolf commented 4 years ago

This should be solved on master and in the latest release.
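
If you are still hitting this on an older install, upgrading to the latest release (pip install --upgrade transformers) or installing from master should pick up the fix; a quick way to confirm which version is active (a generic check, not taken from this thread):

    # Sanity check after upgrading transformers
    import transformers
    print(transformers.__version__)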