huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

DistilBert for Tensorflow doesn't work #1453

Closed: p-christ closed this issue 5 years ago

p-christ commented 5 years ago

Model: TFDistilBertForSequenceClassification
Language: English
Task: multi-label classification
Environment: Google Colab
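
Roughly what I'm running (a minimal sketch; the dummy sentences, padding, and hyperparameters below are placeholders rather than my exact multi-label pipeline):

```python
import tensorflow as tf
from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = TFDistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")

# Placeholder data: two dummy sentences, padded to the same length with the pad id (0)
texts = ["first example sentence", "second example sentence"]
encoded = [tokenizer.encode(t) for t in texts]
max_len = max(len(ids) for ids in encoded)
input_ids = tf.constant([ids + [0] * (max_len - len(ids)) for ids in encoded])
labels = tf.constant([0, 1])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# The error below is raised as soon as fit() traces the model in graph mode
model.fit(input_ids, labels, epochs=1, batch_size=2)
```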

Loading the model works fine, but as soon as I call model.fit() I get the error below:

TypeError: in converted code:
    relative to /usr/local/lib/python3.6/dist-packages:

transformers/modeling_tf_distilbert.py:680 call  *
    distilbert_output = self.distilbert(inputs, **kwargs)
tensorflow_core/python/keras/engine/base_layer.py:842 __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
transformers/modeling_tf_distilbert.py:447 call  *
    tfmr_output = self.transformer([embedding_output, attention_mask, head_mask], training=training)
tensorflow_core/python/keras/engine/base_layer.py:891 __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
transformers/modeling_tf_distilbert.py:382 call
    layer_outputs = layer_module([hidden_state, attn_mask, head_mask[i]], training=training)
tensorflow_core/python/keras/engine/base_layer.py:891 __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
transformers/modeling_tf_distilbert.py:324 call
    sa_output = self.attention([x, x, x, attn_mask, head_mask], training=training)
tensorflow_core/python/keras/engine/base_layer.py:891 __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
transformers/modeling_tf_distilbert.py:229 call
    assert 2 <= len(tf.shape(mask)) <= 3
tensorflow_core/python/framework/ops.py:741 __len__
    "shape information.".format(self.name))

TypeError: len is not well defined for symbolic Tensors. (tf_distil_bert_for_sequence_classification_1/distilbert/transformer/layer_._0/attention/Shape_2:0) Please call `x.shape` rather than `len(x)` for shape information.
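
From the last two frames, the failure seems to be that the assertion calls len() on the result of tf.shape(), which becomes a symbolic tensor once Keras traces the model in graph mode. A toy illustration of the same failure outside the library (my own example, not transformers code):

```python
import tensorflow as tf

@tf.function  # forces graph-mode tracing, like Keras does inside fit()
def rank_check_broken(mask):
    # tf.shape() returns a symbolic tensor here, so len() raises the TypeError above
    assert 2 <= len(tf.shape(mask)) <= 3
    return mask

@tf.function
def rank_check_static(mask):
    # the static shape, as the error message suggests, is fine during tracing
    assert 2 <= len(mask.shape) <= 3
    return mask

rank_check_static(tf.zeros((4, 16)))   # works
rank_check_broken(tf.zeros((4, 16)))   # TypeError: len is not well defined for symbolic Tensors
```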

The exact same procedure works with TFBert but not with DistilBert. Does anyone know how to get around this problem?

rickysaurav commented 5 years ago

I have been experiencing the same issue; see #1378.

thomwolf commented 5 years ago

Fixed on master with 23b7138, thanks. It will be in this week's new release, 2.1.

p-christ commented 5 years ago

thanks a lot