helboukkouri / character-bert

Main repository for "CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters"
Apache License 2.0

Small error in the Readme #24

Closed by thibault-roux 1 year ago

thibault-roux commented 1 year ago

Hello, thanks a lot for developing character-bert.

I was trying out character-bert, but when I executed the following command:

tokenizer = BertTokenizer.from_pretrained('./pretrained-models/bert-base-uncased/')

I got the following error:

OSError: Model name './pretrained-models/bert-base-uncased/' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). We assumed './pretrained-models/bert-base-uncased/' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.

The problem is pretty trivial to solve: just replace './pretrained-models/bert-base-uncased/' with 'bert-base-uncased'.
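For anyone hitting the same error, here is a minimal sketch of one way to guard against it. The error message says the loader expects a `vocab.txt` file at the given path, so the sketch checks for that file and otherwise falls back to the model name. The helper names `resolve_tokenizer_source` and `load_tokenizer` are hypothetical and not part of character-bert or transformers; only `BertTokenizer.from_pretrained` comes from the library.

```python
import os


def resolve_tokenizer_source(local_dir, model_name):
    """Return local_dir if it contains vocab.txt, otherwise model_name.

    Hypothetical helper: it only checks for the file named in the error
    message and does not validate the file's contents.
    """
    vocab_path = os.path.join(local_dir, "vocab.txt")
    if os.path.isfile(vocab_path):
        return local_dir
    return model_name


def load_tokenizer(local_dir, model_name):
    """Load a BertTokenizer from the local directory or by model name.

    Assumes the transformers package is installed. Loading by model name
    may need network access the first time it is called.
    """
    from transformers import BertTokenizer  # imported lazily on purpose

    source = resolve_tokenizer_source(local_dir, model_name)
    return BertTokenizer.from_pretrained(source)


# Paths and names taken from the command and the fix quoted above.
source = resolve_tokenizer_source(
    "./pretrained-models/bert-base-uncased/", "bert-base-uncased"
)
print("Tokenizer would be loaded from:", source)
```

Calling `load_tokenizer('./pretrained-models/bert-base-uncased/', 'bert-base-uncased')` would then use the local copy when it is present and the model name otherwise. A relative path like this one is resolved against the current working directory, so it is worth checking where the interpreter was started.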

Regards,

thibault-roux commented 1 year ago

My bad, I just did this too fast as I was in the interpreter.