mandarjoshi90 / coref

BERT for Coreference Resolution
Apache License 2.0

Add custom vocabulary #69

Closed by kiet13 3 years ago

kiet13 commented 3 years ago

How can I add custom vocabulary to the model? I tried adding some tokens to the vocab.txt file. The tokenizer worked well, but I got a KeyError in the prediction step:

Traceback (most recent call last):
  File "predict.py", line 31, in <module>
    tensorized_example = model.tensorize_example(example, is_training=False)
  File "/home/kietnguyen/recap-coref/coref/independent.py", line 162, in tensorize_example
    sent_input_ids = self.tokenizer.convert_tokens_to_ids(sentence)
  File "/home/kietnguyen/recap-coref/coref/bert/tokenization.py", line 140, in convert_by_vocab
    return convert_by_vocab(self.vocab, tokens)
  File "/home/kietnguyen/recap-coref/coref/bert/tokenization.py", line 140, in convert_by_vocab
    output.append(vocab[item])
KeyError: 'Ngoc'
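
The last frame of the traceback is a plain dictionary lookup (`vocab[item]`), so any token missing from the vocabulary that the model's tokenizer loaded raises a KeyError. A minimal sketch of that failure mode follows. It assumes, without confirmation from the issue, that the tokens came from the edited vocab.txt while the prediction step loaded a different vocabulary file. The two toy vocabularies and the `missing_tokens` helper are hypothetical and are not part of the repository.

```python
def convert_by_vocab(vocab, items):
    """Same lookup as the failing frame: one dict index per token."""
    output = []
    for item in items:
        output.append(vocab[item])
    return output


def missing_tokens(vocab, tokens):
    """Hypothetical helper: tokens that the loaded vocabulary lacks."""
    return sorted({t for t in tokens if t not in vocab})


# Toy vocabularies for illustration only, not the real vocab.txt contents.
edited_vocab = {"[UNK]": 0, "hello": 1, "Ngoc": 2}  # with the custom token
loaded_vocab = {"[UNK]": 0, "hello": 1}             # without it

tokens = ["hello", "Ngoc"]

# Check for gaps before converting, instead of failing mid-prediction.
print("missing from loaded vocab:", missing_tokens(loaded_vocab, tokens))

try:
    convert_by_vocab(loaded_vocab, tokens)
except KeyError as err:
    print("KeyError:", err)

# With the vocabulary that contains the custom token, the lookup succeeds.
ids_ok = convert_by_vocab(edited_vocab, tokens)
print(ids_ok)
```

If the check reports missing tokens, one thing to verify is whether the vocabulary file path used at prediction time points to the edited vocab.txt.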