Closed dummynov1 closed 2 years ago
It is configured after creating the model:

```python
from transformers import AutoTokenizer

model = text.sequence_tagger('bilstm-bert', preproc, bert_model='allenai/scibert_scivocab_cased')
preproc.p.te.tokenizer = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_cased', do_lower_case=False)
```
@amaiya: Thanks for sharing the info on the SciBERT cased model tokenizer config issue (#422). Just to confirm, should I apply your workaround as shown above? Is this how the preproc tokenizer should be initialized before running the learner code? I think I'm doing something wrong; I'm not sure how to pass the `do_lower_case=False` option.
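For context on why `do_lower_case=False` matters here, below is a minimal pure-Python sketch (a toy vocabulary, not the real SciBERT vocab, and not using ktrain or transformers) showing how lowercasing input tokens degrades lookups against a *cased* vocabulary:

```python
# Toy cased vocabulary -- hypothetical, for illustration only.
cased_vocab = {"DNA", "Protein", "the", "[UNK]"}

def lookup(token, vocab, do_lower_case):
    """Map a token to a vocab entry, loosely mimicking a WordPiece lookup."""
    if do_lower_case:
        token = token.lower()
    return token if token in vocab else "[UNK]"

# With lowercasing on, "Protein" is folded to "protein", which the cased
# vocab does not contain, so it degrades to [UNK].
print(lookup("Protein", cased_vocab, do_lower_case=True))   # -> [UNK]
print(lookup("Protein", cased_vocab, do_lower_case=False))  # -> Protein
```

This is why a tokenizer loaded with `do_lower_case=True` (the default for some configs) silently hurts a cased checkpoint like `scibert_scivocab_cased`.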