facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

How to finetune large v2 model? #315

Open adnankarimjs opened 10 months ago

mavlyutovr commented 10 months ago

@adnankarimjs see the example in https://github.com/facebookresearch/seamless_communication/tree/main/src/seamless_communication/cli/m4t/finetune, which shows the command to fine-tune the speech-to-text part of large-v2.

adnankarim commented 10 months ago

Thank you

I already got it working and have finished fine-tuning.

The CLI help does not list the large v2 model as an option:

help="Base model name (seamlessM4T_medium, seamlessM4T_large)",

The help text needs to include seamlessM4T_v2_large as well.

finetune.py is missing it.
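For illustration, the updated help text could look roughly like the sketch below. This is a standalone argparse example, not the actual finetune.py code: the `--model_name` flag is taken from the m4t_predict usage elsewhere in this thread, and the default value and parser description are assumptions.

```python
import argparse

# The first two names come from the current help text; the third is the
# v2 model name. Listing it here is the proposed change, not existing behavior.
MODEL_NAMES = ("seamlessM4T_medium", "seamlessM4T_large", "seamlessM4T_v2_large")


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser whose help text lists all three model names."""
    parser = argparse.ArgumentParser(description="Fine-tuning CLI sketch")
    parser.add_argument(
        "--model_name",
        type=str,
        default="seamlessM4T_medium",  # assumed default, for illustration only
        help=f"Base model name ({', '.join(MODEL_NAMES)})",
    )
    return parser


args = build_parser().parse_args(["--model_name", "seamlessM4T_v2_large"])
print(args.model_name)
```

Only the help string changes here; the argument still accepts any string, so this documents the v2 name rather than validating it.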

Also, there was an assertion error caused by a vocabulary mismatch, which has already been solved in another issue.

StephennFernandes commented 7 months ago

@mavlyutovr hey, I have some custom translation, speech, and TTS models that I want to fine-tune together.

Do the fine-tuning scripts support replacing the existing models with newer ones?

RRThivyan commented 1 month ago

Hi, how can we run inference with a fine-tuned v2_large model for TTS in unsupported Indic languages such as Gujarati and Marathi? I have fine-tuned the model, but it fails to load because of missing weights. I have also changed the model path in the cards YAML file. Do we also need to fine-tune vocoder_v2?

m4t_predict "this is a test run" --task T2ST --tgt_lang "tam" --src_lang "eng" --output_path '/home/jupyter/myfiles' --model_name seamlessM4T_v2_large
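One way to narrow down a "missing weights" load failure is to compare the parameter names stored in the fine-tuned checkpoint with the names the model expects. Below is a minimal sketch in plain Python, assuming both lists of names have already been extracted (for example from the two state dicts). The helper `diff_keys` and the example key names are hypothetical and not part of seamless_communication.

```python
def diff_keys(expected, found):
    """Return (missing, unexpected) parameter names, each as a sorted list."""
    expected, found = set(expected), set(found)
    missing = sorted(expected - found)      # expected by the model, absent in checkpoint
    unexpected = sorted(found - expected)   # present in checkpoint, unknown to the model
    return missing, unexpected


# Hypothetical key names, for illustration only.
expected_keys = ["encoder.weight", "decoder.weight", "vocoder.weight"]
checkpoint_keys = ["model.encoder.weight", "model.decoder.weight"]

missing, unexpected = diff_keys(expected_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If the same names appear in both lists and differ only by a common prefix, the checkpoint may have been saved from a wrapper module. If only some components are missing, they may not have been part of the fine-tuning run. Both readings are guesses to verify against the actual checkpoint.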
