allenai / scibert

A BERT model for scientific text.
https://arxiv.org/abs/1903.10676
Apache License 2.0

Error while running Named Entity Recognition #97

Open Sachit1137 opened 4 years ago

Sachit1137 commented 4 years ago

I tried using SciBERT for NER with the following code:

from transformers import *

tokenizer = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')
model = AutoModel.from_pretrained('allenai/scibert_scivocab_uncased')

nlp = pipeline('ner', model=model, tokenizer=tokenizer)
nlp('Clinical features of culture-proven Mycoplasma pneumoniae infections at King Abdulaziz University Hospital, Jeddah, Saudi Arabia')

When I run it on a sample sentence, I get the following error:

Traceback (most recent call last):
  File "", line 1, in
  File "/python3.6/site-packages/transformers/pipelines.py", line 927, in __call__
    for idx, label_idx in enumerate(labels_idx)
  File "/python3.6/site-packages/transformers/pipelines.py", line 928, in
    if self.model.config.id2label[label_idx] not in self.ignore_labels
KeyError: 422

elkotito commented 4 years ago

from transformers import BertTokenizer, BertForTokenClassification, pipeline

tokenizer = BertTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')
# Load the model with a token-classification head so the pipeline gets per-label scores.
model = BertForTokenClassification.from_pretrained('allenai/scibert_scivocab_uncased')
nlp = pipeline('ner', model=model, tokenizer=tokenizer)

text = 'Clinical features of culture-proven Mycoplasma pneumoniae infections at King Abdulaziz University Hospital, Jeddah, Saudi Arabia'
print(nlp(text))
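
For anyone wondering where `KeyError: 422` comes from, here is a minimal sketch of what is probably happening. It does not call transformers at all. It assumes the NER pipeline takes an argmax over the last dimension of the model output and looks the result up in `config.id2label`, as the traceback suggests, and that the label map is the default two-entry one (`LABEL_0`, `LABEL_1`). The helper `argmax` and the toy vectors are made up for illustration. With `AutoModel` the last dimension is the hidden size (768 for BERT-base models), so the argmax can land on any index up to 767.

```python
HIDDEN_SIZE = 768  # hidden size of BERT-base models (assumed to apply to SciBERT)

# Assumed default label map when the checkpoint config defines no labels.
id2label = {0: "LABEL_0", 1: "LABEL_1"}


def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


# Base model output for one token: a hidden vector, not label scores.
# The peak is placed at index 422 to mirror the error in this issue.
hidden_vector = [0.0] * HIDDEN_SIZE
hidden_vector[422] = 1.0

# Token-classification head output for one token: one score per label.
label_logits = [0.2, 0.9]

base_index = argmax(hidden_vector)
head_index = argmax(label_logits)

print(head_index, id2label[head_index])  # valid lookup

try:
    id2label[base_index]
except KeyError as err:
    # Same failure mode as in the traceback above.
    print("KeyError:", err)
```

Note that `BertForTokenClassification` loaded from the base SciBERT checkpoint gets a freshly initialized classification head, so the snippet above runs without error but the predicted labels are not meaningful until the model is fine-tuned on an NER dataset.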