microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MIT License

SpeechUT inference and fine-tune problem #34

Closed: ytf-philp closed this issue 1 year ago

ytf-philp commented 1 year ago

I want to fine-tune the released pre-trained SpeechUT model, and I also want to run inference with the released model fine-tuned on MuST-C en-de. However, when I load the checkpoints, they reference some non-local paths, which causes a "FileNotFoundError". How can I solve this problem?

zz12375 commented 1 year ago

I am so sorry for the late response. Hope this information is still helpful.

This happens because the pre-trained model tries to load the data config that was used at the pre-training stage.

You can find the config yaml files in SpeechUT/dataset/MuSTC/en_xx and use them in place of the missing files.
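To see which stored paths a checkpoint would ask for, one option is to walk its saved config and list the string values that look like paths but do not exist locally. The snippet below is only a minimal sketch: `find_missing_paths` and `example_cfg` are hypothetical names, the "contains a slash" test is a rough heuristic, and it assumes the checkpoint config has already been converted to plain dicts and lists (how the config is stored inside the released checkpoints is not confirmed here).

```python
import os


def find_missing_paths(cfg, prefix=""):
    """Recursively collect (key_path, value) pairs whose string value
    looks like a filesystem path and does not exist on this machine."""
    missing = []
    if isinstance(cfg, dict):
        for key, value in cfg.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            missing.extend(find_missing_paths(value, child))
    elif isinstance(cfg, (list, tuple)):
        for index, value in enumerate(cfg):
            missing.extend(find_missing_paths(value, f"{prefix}[{index}]"))
    elif isinstance(cfg, str) and "/" in cfg and not os.path.exists(cfg):
        # Heuristic: a string containing "/" is treated as a path.
        missing.append((prefix, cfg))
    return missing


# Hypothetical stand-in for the config saved inside a checkpoint.
example_cfg = {
    "task": {
        "data": "/nonexistent_dir_for_example/pretraining/data",
        "config_yaml": "config.yaml",  # no slash, so the heuristic skips it
    },
    "model": {"w2v_path": "/nonexistent_dir_for_example/pretrained/model.pt"},
}

for key_path, value in find_missing_paths(example_cfg):
    print(f"{key_path} -> {value}")
```

Each reported key is a candidate for either placing a file at that location or overriding the value when loading the model.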

I am sorry for the trouble; it is a bug. I will fix it as soon as possible.

Please let me know when you run it successfully, thanks!

zz12375 commented 1 year ago

Hi @ytf-philp, I think I have fixed the bug.

  1. The code now loads the dictionaries from the checkpoint by default, instead of asking for non-existent config files (*.yaml).
  2. I still added config.yaml (no longer needed for fine-tuning or inference, but still needed for pre-training) to dataset/MuSTC/en_{de,es,fr}/, just for reference.
  3. At inference time, inference_st.sh will still ask for the pre-trained model, which may cause a FileNotFoundError. To prevent it, use --model-overrides "{'model':{'w2v_path':'/path/to/your/pretrained/model.pt'}}" in inference_st.sh.
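The value passed to --model-overrides is a nested dictionary written as a string, and quoting mistakes are easy to make when editing it by hand. As a minimal sketch, the string can be generated from a Python dict and checked for round-tripping before it is pasted into inference_st.sh. The helper name `build_model_overrides` is hypothetical, and the check assumes the string is parsed as a Python literal, which is an assumption about the loader rather than something stated in this thread.

```python
import ast


def build_model_overrides(w2v_path):
    """Return a string for --model-overrides that points w2v_path at a
    local pre-trained checkpoint."""
    overrides = {"model": {"w2v_path": w2v_path}}
    text = repr(overrides)
    # Sanity check: the string must parse back into the same dict.
    if ast.literal_eval(text) != overrides:
        raise ValueError("override string does not round-trip")
    return text


# Placeholder path taken from the comment above; replace with a real one.
print(build_model_overrides("/path/to/your/pretrained/model.pt"))
```

Because the generated string contains single quotes, it should be wrapped in double quotes on the command line, as in the example in item 3.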

Hope the above information helps, and thank you for your feedback.

ytf-philp commented 1 year ago

Thank you, this bug has been fixed. @zz12375