allenai / scibert

A BERT model for scientific text.
https://arxiv.org/abs/1903.10676
Apache License 2.0

KeyError with HuggingFace NER Pipeline #84

Closed: tbabinet closed this issue 4 years ago

tbabinet commented 4 years ago

Hello, I've been trying to use scibert_scivocab_uncased with the NER pipeline from HuggingFace, but I can't get it to work. Every time I run the NER pipeline, my script crashes. The code is very simple: nlp(abstract). The error is the following:

KeyError                                  Traceback (most recent call last)
 in
      3     if(len(row.abstract.split())<512):
      4         abst = row.abstract
----> 5         print(nlp(abst))
      6
      7     except ValueError as ve:

c:\users\tbabi\appdata\local\programs\python\python37\lib\site-packages\transformers\pipelines.py in __call__(self, *texts, **kwargs)
    792         answer = []
    793         for idx, label_idx in enumerate(labels_idx):
--> 794             if self.model.config.id2label[label_idx] not in self.ignore_labels:
    795                 answer += [
    796                     {

KeyError: 422

I have been struggling with this for quite some time now. Has anyone else run into this and could offer some help? Thank you!

stefan-it commented 4 years ago

@tbabinet Could you provide a full code snippet?

If you want to use the NER pipeline, you need a fine-tuned version of a model such as SciBERT. The "normal" SciBERT model from the model hub can't be used, because it has no trained token-classification head.
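To illustrate why the lookup in the traceback fails, here is a minimal pure-Python sketch of the same pattern. It assumes the config carried only the library's two default placeholder labels and that the index 422 came from an argmax over a hidden-state vector rather than over label logits; that is a likely explanation, not something the thread confirms. The helper names `lookup_labels` and `missing_indices` are hypothetical and not part of transformers.

```python
# Default-style label map with two placeholder entries.
# Assumption: this mirrors what the reporter's config contained.
id2label = {0: "LABEL_0", 1: "LABEL_1"}


def lookup_labels(label_indices, id2label, ignore_labels=("O",)):
    """Mimic the per-token label lookup shown in the traceback."""
    answer = []
    for idx, label_idx in enumerate(label_indices):
        # Raises KeyError when label_idx is not a known label id.
        if id2label[label_idx] not in ignore_labels:
            answer.append({"index": idx, "entity": id2label[label_idx]})
    return answer


def missing_indices(label_indices, id2label):
    """Return the predicted indices that have no entry in id2label."""
    return sorted({i for i in label_indices if i not in id2label})


# 422 stands in for an index taken over the wrong output dimension.
predicted = [0, 1, 422]
try:
    lookup_labels(predicted, id2label)
except KeyError as err:
    print(f"KeyError: {err}")

print(missing_indices(predicted, id2label))
```

A check like `missing_indices` before the lookup makes it obvious when the model's output size and the label map disagree, which usually means the loaded model has no classification head that matches the config.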

tbabinet commented 4 years ago

Hello. It seems I had simply been misusing my model: I was instantiating an AutoModel, when I should have been using the BertForTokenClassification class. Thanks anyway! :)
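A minimal sketch of that fix, assuming a SciBERT checkpoint that has already been fine-tuned for NER is available locally. The helper name `build_ner_pipeline` and the path in the usage comment are hypothetical. `AutoModelForTokenClassification` is used here in place of `BertForTokenClassification`; for a BERT checkpoint it should resolve to the same class.

```python
def build_ner_pipeline(model_dir):
    """Build a HuggingFace NER pipeline from a token-classification checkpoint."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import (
        AutoModelForTokenClassification,
        AutoTokenizer,
        pipeline,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    # A token-classification class adds the per-token label head that a
    # plain AutoModel lacks.
    model = AutoModelForTokenClassification.from_pretrained(model_dir)
    return pipeline("ner", model=model, tokenizer=tokenizer)


# Usage (hypothetical path to a fine-tuned checkpoint):
# nlp = build_ner_pipeline("path/to/finetuned-scibert-ner")
# print(nlp(abstract))
```

Loading the base scibert_scivocab_uncased weights this way would avoid the KeyError, but the classification head would be randomly initialized, so the predictions would not be meaningful until the model is fine-tuned on labeled NER data.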

ibeltagy commented 4 years ago

Glad you figured it out.