allenai / kb

KnowBert -- Knowledge Enhanced Contextual Word Representations
Apache License 2.0

Pre-trained model #43

Open NoviceCrom opened 2 years ago

NoviceCrom commented 2 years ago

Hi! I'd like to know how to replace bert-base-uncased with the pre-trained KnowBert model downloaded from the link given. It seems that knowbert_wordnet_model does not contain the relevant tokenizer file.

lshowway commented 2 years ago

@NoviceCrom Two tokenizers are used: the first is the BERT tokenizer, which tokenizes the input text, and the second is a whitespace tokenizer, which is used to extract mentions from the input text.
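
To make the two-tokenizer idea concrete, here is a minimal, self-contained sketch. It is not KnowBert's code: the toy vocabulary, the toy mention list, and every function name below are assumptions made up for illustration. It only shows the general pattern of splitting on whitespace to look up mentions, splitting each word into wordpieces for BERT, and mapping the word-level mention spans onto wordpiece positions.

```python
# Toy illustration only. TOY_VOCAB, TOY_MENTIONS and all function names are
# hypothetical and are NOT part of the KnowBert code base.

TOY_VOCAB = {"[UNK]", "the", "big", "apple", "is", "in", "new", "york", "##s"}
TOY_MENTIONS = {("big", "apple"), ("new", "york")}


def whitespace_tokenize(text):
    """Split on whitespace; these word-level tokens are used for mention lookup."""
    return text.lower().split()


def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first split of one word into wordpieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def find_mentions(words, mentions, max_len=2):
    """Return inclusive (start, end) word spans that match a known mention."""
    spans = []
    for i in range(len(words)):
        for n in range(1, max_len + 1):
            if i + n <= len(words) and tuple(words[i:i + n]) in mentions:
                spans.append((i, i + n - 1))
    return spans


def tokenize_with_mentions(text, vocab=TOY_VOCAB, mentions=TOY_MENTIONS):
    """Return wordpieces plus mention spans expressed in wordpiece indices."""
    words = whitespace_tokenize(text)
    pieces, word_to_piece = [], []
    for word in words:
        word_pieces = wordpiece_tokenize(word, vocab)
        # Record the first and last wordpiece index covered by this word.
        word_to_piece.append((len(pieces), len(pieces) + len(word_pieces) - 1))
        pieces.extend(word_pieces)
    piece_spans = [
        (word_to_piece[start][0], word_to_piece[end][1])
        for start, end in find_mentions(words, mentions)
    ]
    return pieces, piece_spans


if __name__ == "__main__":
    print(tokenize_with_mentions("Apples in New York"))
```

In this sketch, "Apples" is split into two wordpieces, so the mention "New York" sits at word positions 2 and 3 but wordpiece positions 3 and 4. That offset is why the word-level and wordpiece-level tokenizations have to be aligned.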