Open robotsp opened 2 years ago
Hi,
The current codebase isn't really set up for fine-tuning; however, you can access the tokenizer vocabulary in the model file, which looks like:
model = {
'vocab': sentencepiece_model_object,
'args': args_namespace_from_training,
'weights': model_state_dict
}
when loaded. You can see the load_model(checkpoint_path) function in trainer.py for more details.
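To check what a loaded model file actually holds, something like the sketch below may help. It relies only on the three keys listed above. `describe_checkpoint` is a hypothetical helper, not part of ersatz, and the stand-in dict is made up so the snippet runs on its own.

```python
from argparse import Namespace

EXPECTED_KEYS = ("vocab", "args", "weights")


def describe_checkpoint(checkpoint):
    """Summarize a loaded model dict with 'vocab', 'args' and 'weights' entries.

    Hypothetical helper for illustration; it is not an ersatz function.
    """
    missing = [key for key in EXPECTED_KEYS if key not in checkpoint]
    if missing:
        raise KeyError(f"checkpoint is missing expected keys: {missing}")
    return {
        # Type name of the tokenizer object stored under 'vocab'.
        "vocab_type": type(checkpoint["vocab"]).__name__,
        # Training arguments, converted from a namespace to a plain dict.
        "args": dict(vars(checkpoint["args"])),
        # Number of entries in the state dict.
        "num_weight_tensors": len(checkpoint["weights"]),
    }


# Stand-in checkpoint (assumption): placeholder values, not real ersatz data.
fake_checkpoint = {
    "vocab": object(),
    "args": Namespace(example_option=1),
    "weights": {"layer.weight": [0.0], "layer.bias": [0.0]},
}

summary = describe_checkpoint(fake_checkpoint)
print(summary)
```

With a real model, the dict passed in would be the loaded model file itself. How to obtain it is best taken from load_model(checkpoint_path) in trainer.py, since the on-disk format is not spelled out here.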
I see there is a pretrained model in the repo "https://github.com/rewicks/ersatz-models/tree/main/monolingual/en". Since I cannot find the tokenizer vocabulary, I am not sure how to fine-tune the existing model.