huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

TextClassificationPipeline does not work with pretrained BERT model #3316

Closed: ogencoglu closed this issue 4 years ago

ogencoglu commented 4 years ago

🐛 Bug

Information

Model I am using (Bert, XLNet ...): 'nlptown/bert-base-multilingual-uncased-sentiment'

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

The task I am working on is:

To reproduce

Steps to reproduce the behavior:

from transformers import BertModel, BertTokenizer, TextClassificationPipeline

model = BertModel.from_pretrained(pretrained_model_name_or_path='nlptown/bert-base-multilingual-uncased-sentiment')
tokenizer = BertTokenizer.from_pretrained(pretrained_model_name_or_path='nlptown/bert-base-multilingual-uncased-sentiment')
sentiment_analyzer = TextClassificationPipeline(model=model, tokenizer=tokenizer)
sentiment_analyzer('This is awesome!')

/usr/local/lib/python3.7/site-packages/transformers/pipelines.py in __call__(self, *args, **kwargs)
    504     def __call__(self, *args, **kwargs):
    505         outputs = super().__call__(*args, **kwargs)
--> 506         scores = np.exp(outputs) / np.exp(outputs).sum(-1)
    507         return [{"label": self.model.config.id2label[item.argmax()], "score": item.max()} for item in scores]
    508 

ValueError: operands could not be broadcast together with shapes (1,8,768) (1,8) 
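Editor's note on the traceback: `BertModel` returns hidden states of shape (batch, seq_len, hidden_size), not the (batch, num_labels) logits the pipeline's softmax expects. Summing over the last axis then yields a (1, 8) array, which numpy cannot broadcast against (1, 8, 768). A minimal numpy reproduction of that failure:

```python
import numpy as np

# Shape of BertModel's output for an 8-token input: (batch, seq_len, hidden_size)
hidden_states = np.zeros((1, 8, 768))

try:
    # The pipeline's softmax: the sum over the last axis has shape (1, 8),
    # which cannot be broadcast back against (1, 8, 768).
    np.exp(hidden_states) / np.exp(hidden_states).sum(-1)
except ValueError as e:
    print(e)  # operands could not be broadcast together with shapes (1,8,768) (1,8)
```

With proper logits of shape (1, num_labels), the same line works, because the (1,) sum broadcasts cleanly over the last axis.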

Expected behavior

A sentiment score.

Environment info

LysandreJik commented 4 years ago

That would be because you're using a BertModel instead of a BertForSequenceClassification.
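The difference in output shapes can be sketched with a tiny, randomly initialized config so no pretrained weights need to be downloaded (the sizes below are hypothetical, chosen only for illustration):

```python
import torch
from transformers import BertConfig, BertModel, BertForSequenceClassification

# Tiny hypothetical config: no download needed, random weights
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64, num_labels=5)

input_ids = torch.tensor([[1, 2, 3, 4]])  # one dummy 4-token sequence

# BertModel: last hidden states, shape (batch, seq_len, hidden_size)
hidden = BertModel(config)(input_ids)[0]
# BertForSequenceClassification: logits, shape (batch, num_labels)
logits = BertForSequenceClassification(config)(input_ids)[0]

print(hidden.shape)  # torch.Size([1, 4, 32])
print(logits.shape)  # torch.Size([1, 5])
```

Only the (batch, num_labels) logits are something the pipeline's softmax can turn into per-label scores.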

briancleland commented 4 years ago

What model should we be using, and where can we download it from?
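The same hub identifier from the original report should work: `from_pretrained` downloads the weights automatically, so the only change needed is the class. A sketch of the fix (assumes network access to the Hugging Face hub):

```python
from transformers import (BertForSequenceClassification, BertTokenizer,
                          TextClassificationPipeline)

model_id = 'nlptown/bert-base-multilingual-uncased-sentiment'
# BertForSequenceClassification loads the classification head the pipeline needs
model = BertForSequenceClassification.from_pretrained(model_id)
tokenizer = BertTokenizer.from_pretrained(model_id)

sentiment_analyzer = TextClassificationPipeline(model=model, tokenizer=tokenizer)
result = sentiment_analyzer('This is awesome!')
print(result)  # a list of {'label': ..., 'score': ...} dicts
```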