Closed: RuABraun closed this issue 2 years ago.
If you load the finetuned checkpoint and inspect its keys, the model config has two relevant parts: w2v_path and w2v_args.
w2v_path contains the path of the pretrained model.
If you don't want to load the pretrained model, try setting w2v_path to None; the code will then pick up the arguments from w2v_args.
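For reference, a minimal sketch of inspecting these keys, assuming a finetuned wav2vec2 checkpoint saved at a placeholder path finetuned.pt (the exact keys can differ across fairseq versions; older checkpoints store args instead of cfg):

```python
import torch

# fairseq checkpoints are ordinary torch pickles; load on CPU to inspect them.
ckpt = torch.load("finetuned.pt", map_location="cpu")

# The model config sits under cfg -> model (an OmegaConf DictConfig).
model_cfg = ckpt["cfg"]["model"]
print(model_cfg["w2v_path"])  # path of the pretrained model used for finetuning
print(model_cfg["w2v_args"])  # copy of the pretrained model's arguments, if stored
```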
Thank you for the response, let me try that
@harveenchadha I tried loading the finetuned model (as dct), setting dct['cfg']['model']['w2v_path'] = None, saving it, and then using that model for inference. But I got an error:
omegaconf.errors.ValidationError: Non optional field cannot be assigned None
    full_key: w2v_path
    reference_type=Optional[Wav2Vec2CtcConfig]
    object_type=Wav2Vec2CtcConfig
It's quite annoying having to always keep two copies of the model around for inference... surely there must be a way to fix this?
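For reference, the attempt described above looks roughly like this (paths are placeholders). The assignment itself, or the re-validation when the checkpoint is later loaded for inference, trips omegaconf's check because w2v_path is declared as a non-Optional field of Wav2Vec2CtcConfig:

```python
import torch

ckpt = torch.load("finetuned.pt", map_location="cpu")

# ckpt["cfg"] is an OmegaConf config backed by the Wav2Vec2CtcConfig schema,
# so None is rejected for the non-Optional w2v_path field and
# omegaconf.errors.ValidationError is raised (as shown above).
ckpt["cfg"]["model"]["w2v_path"] = None

torch.save(ckpt, "finetuned_no_w2v_path.pt")
```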
@RuABraun I am facing a similar issue. Have you found a solution in the meantime?
Unfortunately not really. I keep a copy of the pretrained model around with the same path as on the host where I trained.
If I find a real solution I will update.
I may have found a solution for the issue. There is an option to pass a dictionary via --model-overrides to change an argument of the model at generation time. You can use this to reset the checkpoint path, which is also explained in the last code line before the citation in this example: https://github.com/pytorch/fairseq/blob/7818f6148da4ea04f0b4b3a2df780004c3580dad/examples/stories/README.md
I personally work with a modified XLMR model for translation and change the argument pretrained_xlm_checkpoint to "interactive" when generating. It then skips loading the pretrained checkpoint at model setup and only loads the finetuned checkpoint. It is a hacky workaround, but it does the job for me.
I don't know whether you have tried --model-overrides already and whether it works with the wav2vec2 model, but maybe it helps! I'm also putting it here so that other people with the same problem can find it more easily.
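For wav2vec2, the same mechanism is exposed programmatically via the arg_overrides parameter of fairseq's checkpoint helpers. A minimal sketch, assuming placeholder paths and that pointing w2v_path at a local copy of the pretrained checkpoint is acceptable (the exact key that needs overriding can vary between fairseq versions):

```python
from fairseq import checkpoint_utils

# Override w2v_path so it points at a pretrained checkpoint that actually
# exists on this machine, instead of editing and re-saving the finetuned
# checkpoint itself (all paths here are placeholders).
overrides = {"w2v_path": "/local/models/wav2vec2_pretrained.pt"}

models, saved_cfg, task = checkpoint_utils.load_model_ensemble_and_task(
    ["/local/models/wav2vec2_finetuned_ctc.pt"],
    arg_overrides=overrides,
)
model = models[0].eval()
```

On the command line, the equivalent would be passing --model-overrides "{'w2v_path': '/local/models/wav2vec2_pretrained.pt'}" to the generation or inference script, assuming the script exposes that flag.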
This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!
Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!
❓ Questions and Help
What is your question?
Say I pretrain a model and then finetune it, which creates a new model. When doing inference with the finetuned model I can get an error because the pretrained model no longer exists (the checkpoint was deleted because training was not finished).
Similarly, I have noticed that if I scp a finetuned model to another host and want to use it there, I get an error because the pretrained model used for finetuning is not on that host. Can I change this behaviour? If yes, how?
Thank you in advance.