Closed — jmlongriver closed this issue 4 years ago
Yes, we didn't save the vocabulary file because it doesn't change during training, so you can simply reuse the training vocabulary during evaluation (e.g. 'bert-base-cased'). However, I agree that it's more convenient to store the vocabulary file alongside the model checkpoint. I added this in commit bfa93b9a65e530cbfd826628082035539e843d5d.
In the checkpoint there is no vocab file, which makes evaluation fail when loading the checkpoint. It sounds like we need to add a call to "tokenizer.save_pretrained" in the model-saving functionality.
Thanks Min
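For reference, a minimal sketch of the fix being discussed, assuming the project uses a Hugging Face `transformers` tokenizer (the toy vocabulary and directory names below are hypothetical, just to keep the example self-contained and offline; in practice you would start from e.g. 'bert-base-cased'):

```python
import os
import tempfile

from transformers import BertTokenizer

# Hypothetical toy vocabulary so the example runs without downloading anything.
workdir = tempfile.mkdtemp()
vocab_path = os.path.join(workdir, "vocab.txt")
with open(vocab_path, "w") as f:
    f.write("\n".join(["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]))

tokenizer = BertTokenizer(vocab_file=vocab_path, do_lower_case=True)

# The fix: save the tokenizer into the same directory as the model checkpoint,
# so evaluation can reload everything from one place. save_pretrained writes
# vocab.txt (plus tokenizer config files) into the directory.
ckpt_dir = os.path.join(workdir, "checkpoint")
os.makedirs(ckpt_dir, exist_ok=True)
tokenizer.save_pretrained(ckpt_dir)  # call this next to model.save_pretrained(ckpt_dir)

# Evaluation time: the vocabulary now loads from the checkpoint itself,
# with no dependency on the original training setup.
reloaded = BertTokenizer.from_pretrained(ckpt_dir)
print(reloaded.tokenize("hello world"))
```

With this in place, evaluation no longer fails on a missing vocab file, since the checkpoint directory carries its own tokenizer state.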