allenai / scibert

A BERT model for scientific text.
https://arxiv.org/abs/1903.10676
Apache License 2.0

How to load a fine-tuned NER model in AllenNLP? #110

Open tomasonjo opened 3 years ago

tomasonjo commented 3 years ago

I trained the NER model on the sciie dataset using the following config:

DATASET='sciie'
TASK='ner'
with_finetuning='_finetune'  # '_finetune' to fine-tune, or '' to skip fine-tuning
dataset_size=38124

export BERT_VOCAB=/home/tomaz/neo4j/scibert/model/vocab.txt
export BERT_WEIGHTS=/home/tomaz/neo4j/scibert/model/weights.tar.gz

Training worked nicely and produced a model.tar.gz archive as output. Now, when I try to load it with the AllenNLP library:

from allennlp.predictors.predictor import Predictor
predictor = Predictor.from_path("model.tar.gz")

I get the following error:

ConfigurationError: bert-pretrained not in acceptable choices for dataset_reader.token_indexers.bert.type: ['single_id', 'characters', 'elmo_characters', 'spacy', 'pretrained_transformer', 'pretrained_transformer_mismatched']. You should either use the --include-package flag to make sure the correct module is loaded, or use a fully qualified class name in your config file like {"model": "my_module.models.MyModel"} to have it imported automatically.

Any idea how to fix this?
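
For context, the message follows from how a name-based registry behaves: a "type" string in the config is accepted only if some module loaded in the current environment has registered a class under that name. Below is a minimal sketch of that mechanism. It is a simplified stand-in, not AllenNLP's actual code, and `register`, `by_name`, `_REGISTRY`, `SingleIdIndexer`, and `ExampleIndexer` are hypothetical names chosen for illustration. It makes no claim about which module or library version provides bert-pretrained.

```python
# Simplified stand-in for a name-based registry. NOT AllenNLP's real code;
# every name defined here is hypothetical and for illustration only.
from typing import Callable, Dict, Type


class ConfigurationError(Exception):
    """Raised when a config names a type that nothing has registered."""


# Maps a config "type" string to the class registered under it.
_REGISTRY: Dict[str, Type] = {}


def register(name: str) -> Callable[[Type], Type]:
    """Class decorator: records the class under `name` when its module is executed."""
    def decorator(cls: Type) -> Type:
        _REGISTRY[name] = cls
        return cls
    return decorator


def by_name(name: str) -> Type:
    """Look up a registered class, or fail with the list of known names."""
    if name not in _REGISTRY:
        raise ConfigurationError(
            f"{name} not in acceptable choices: {sorted(_REGISTRY)}"
        )
    return _REGISTRY[name]


@register("single_id")
class SingleIdIndexer:
    pass


# Nothing has registered "bert-pretrained" yet, so the lookup fails.
try:
    by_name("bert-pretrained")
except ConfigurationError as err:
    print(err)


# Executing this decorator stands in for importing a module that defines the type.
@register("bert-pretrained")
class ExampleIndexer:
    pass


# The same lookup now succeeds.
print(by_name("bert-pretrained").__name__)
```

In this sketch the lookup only starts working once the code that registers the name has run, which is why the error message suggests --include-package or a fully qualified class name.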

tomasonjo commented 3 years ago

I have made a lot of progress. Currently I use:

from allennlp.predictors.predictor import Predictor
from scibert.models.bert_crf_tagger import *
from scibert.models.bert_text_classifier import *
from scibert.models.dummy_seq2seq import *
from scibert.dataset_readers.classification_dataset_reader import *

predictor = Predictor.from_path("scibert_ner/model.tar.gz")
predictor.predict(
  sentence="Did Uriah honestly think he could beat The Legend of Zelda in under three hours?"
)

and I get the following error:

No default predictor for model type bert_crf_tagger.\nPlease specify a predictor explicitly

Predictor.from_path accepts a predictor_name parameter, but I don't know which value would work here.

PetterBerntsson commented 3 years ago

I'm also having trouble running the model (#107), but I was at least able to load my model in a hacky way.

Please keep me updated if you get it running!

gshashi commented 3 years ago

If anyone still encounters this issue: I was able to solve it by passing predictor_name when loading the model, like this:

predictor = Predictor.from_path("scibert_ner/model.tar.gz", predictor_name="sentence-tagger")
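
Putting the pieces from this thread together, a loader might look like the sketch below. It assumes the scibert package and a compatible AllenNLP version are installed. The helper name `load_ner_predictor` and the `SCIBERT_MODULES` tuple are hypothetical. The module paths are copied from the imports earlier in the thread, and "sentence-tagger" is simply the value reported to work above, so it may differ for other versions.

```python
import importlib
from typing import Any, Sequence

# Module paths copied from the imports shown earlier in this thread.
SCIBERT_MODULES = (
    "scibert.models.bert_crf_tagger",
    "scibert.models.bert_text_classifier",
    "scibert.models.dummy_seq2seq",
    "scibert.dataset_readers.classification_dataset_reader",
)


def load_ner_predictor(
    archive_path: str,
    predictor_name: str = "sentence-tagger",  # value reported to work in this thread
    modules: Sequence[str] = SCIBERT_MODULES,
) -> Any:
    """Hypothetical helper: import the scibert modules, then load the archive."""
    # Importing each module runs its registration decorators, so the custom
    # model and dataset reader types are known before the archive is read.
    for module in modules:
        importlib.import_module(module)

    # Imported lazily so that defining this helper does not require AllenNLP.
    from allennlp.predictors.predictor import Predictor

    # Naming the predictor explicitly avoids the
    # "No default predictor for model type bert_crf_tagger" error.
    return Predictor.from_path(archive_path, predictor_name=predictor_name)
```

Usage would then be along the lines of predictor = load_ner_predictor("scibert_ner/model.tar.gz") followed by predictor.predict(sentence="..."), as in the earlier comment.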