naver / biobert-pretrained

BioBERT: a pre-trained biomedical language representation model for biomedical text mining

KeyError when running NER on pretrained BioBERT model #19

Closed ianovski closed 4 years ago

ianovski commented 4 years ago

Hello,

I am trying to run NER on the pretrained BioBERT model. I have tried using both biobert_v1.1_pubmed and biobert_large.

I first created the PyTorch model using these steps

I then created and tested an NER pipeline.

My code:

from transformers import BertModel, BertTokenizer, BertConfig, pipeline

model=BertModel.from_pretrained('../../biobert_v1.1_pubmed')
tokenizer = BertTokenizer.from_pretrained('../../biobert_v1.1_pubmed')
config = BertConfig.from_pretrained('../../biobert_v1.1_pubmed')

nlp = pipeline(task = "ner", model = model, config = config, tokenizer = tokenizer, framework = "pt")

sequence = "some sequence of words"
test = nlp(sequence)
print(test)

Error:

Traceback (most recent call last):
  File "load_model.py", line 8, in <module>
    test = nlp(sequence)
  File "/home/dev/.local/lib/python3.6/site-packages/transformers/pipelines.py", line 794, in __call__
    if self.model.config.id2label[label_idx] not in self.ignore_labels:
KeyError: 97
jhyuklee commented 4 years ago

Hi, we would appreciate it if you could open an issue here (https://github.com/dmis-lab/biobert) instead, where we reply whenever possible. Thanks.
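
For context, the traceback suggests a likely cause: `BertModel` is a bare encoder with no token-classification head, so its config carries only the default two-entry `id2label` map, while the NER pipeline takes an argmax over the model's raw output and looks the resulting index up in that map. A minimal sketch of this failure mode (the dictionary below is an assumption mirroring the default `BertConfig` labels, not taken from the thread):

```python
# Default label map of a config without a task-specific head
# (assumption: two generic labels, as in a stock BertConfig).
id2label = {0: "LABEL_0", 1: "LABEL_1"}

# The pipeline's argmax runs over the bare encoder's hidden-state
# dimension rather than over NER logits, so the index can be large.
label_idx = 97

try:
    label = id2label[label_idx]  # mirrors pipelines.py line 794
except KeyError as err:
    print(f"KeyError: {err}")
```

Under this reading, loading a checkpoint that has been fine-tuned for NER via a token-classification class (e.g. `BertForTokenClassification.from_pretrained(...)`) would supply both the classification head and a matching `id2label` map; the pretrained BioBERT weights alone are not fine-tuned for NER.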