Closed — jmlongriver closed this issue 4 years ago
Yes, we didn't save the vocabulary file because it doesn't change during training, so you can simply reuse the training vocabulary during evaluation (e.g. 'bert-base-cased'). However, I agree that it's more convenient to store the vocabulary file alongside the model checkpoint. I added this in commit bfa93b9a65e530cbfd826628082035539e843d5d.
In the checkpoint there is no vocab file, which makes evaluation fail when loading the checkpoint. It sounds like we need to add a call to "tokenizer.save_pretrained" in the model-saving functionality.
Thanks Min
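For reference, a minimal sketch of the fix being discussed, assuming the project uses a Hugging Face `transformers` tokenizer (the toy vocabulary and directory names below are hypothetical, just to keep the example self-contained and offline; in practice you would start from e.g. 'bert-base-cased'):

```python
import os
import tempfile

from transformers import BertTokenizer

# Hypothetical toy vocabulary so the example runs without downloading anything.
workdir = tempfile.mkdtemp()
vocab_path = os.path.join(workdir, "vocab.txt")
with open(vocab_path, "w") as f:
    f.write("\n".join(["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]))

tokenizer = BertTokenizer(vocab_file=vocab_path, do_lower_case=True)

# The fix: save the tokenizer into the same directory as the model checkpoint,
# so evaluation can reload everything from one place. save_pretrained writes
# vocab.txt (plus tokenizer config files) into the directory.
ckpt_dir = os.path.join(workdir, "checkpoint")
os.makedirs(ckpt_dir, exist_ok=True)
tokenizer.save_pretrained(ckpt_dir)  # call this next to model.save_pretrained(ckpt_dir)

# Evaluation time: the vocabulary now loads from the checkpoint itself,
# with no dependency on the original training setup.
reloaded = BertTokenizer.from_pretrained(ckpt_dir)
print(reloaded.tokenize("hello world"))
```

With this in place, evaluation no longer fails on a missing vocab file, since the checkpoint directory carries its own tokenizer state.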