jasonppy / VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

MFA alignment temp file #23

Closed by friendlyFriend4000 8 months ago

friendlyFriend4000 commented 8 months ago

I am trying to clone my own voice, but when I use my own file, MFA outputs the default demo text ("but when I approaches....") instead of mine.

jasonppy commented 8 months ago

Thanks!

In Inference_tts.ipynb, search for "transcript" and change `orig_transcript` and `target_transcript` accordingly. Note that the target transcript should still start with the prompt (i.e. the part taken from `orig_transcript`). You will probably also need to change `cut_off_sec` accordingly.
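
As a rough illustration of the prefix requirement, the sketch below builds a target transcript from the leading words of the original one. The sentence text, `new_text`, `n_prompt_words`, and the helper `build_target_transcript` are hypothetical and not part of the notebook; only `orig_transcript` and `target_transcript` are names from the notebook.

```python
# Hypothetical example values: substitute your own recording's transcript.
orig_transcript = "This is the sentence I recorded in my own voice for the demo."
new_text = "and this part should be generated by the model."
n_prompt_words = 6  # assumed: how many leading words of the recording act as the prompt


def build_target_transcript(orig: str, n_words: int, continuation: str) -> str:
    """Keep the first n_words of the original transcript, then append new text."""
    words = orig.split()
    if not 1 <= n_words <= len(words):
        raise ValueError("n_words must be between 1 and the transcript length")
    prompt = " ".join(words[:n_words])
    return f"{prompt} {continuation.strip()}"


target_transcript = build_target_transcript(orig_transcript, n_prompt_words, new_text)

# Sanity check: the target must begin with the prompt taken from orig_transcript.
prompt_text = " ".join(orig_transcript.split()[:n_prompt_words])
assert target_transcript.startswith(prompt_text)
print(target_transcript)
```

The number of prompt words has to agree with `cut_off_sec`: the audio up to the cut-off should contain exactly those words.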

friendlyFriend4000 commented 8 months ago

Thanks for getting back so quickly! I did change the orig_audio and orig_transcript strings to match what I say, and I also ran Whisper to transcribe it properly. When I first ran the demo, the third cell aligned the demo files. Now that I have changed the files and strings, it does not re-align the new files: I get a message that the alignment has already been done and is being skipped, no matter what I do. I can't set a proper cut_off_sec without the right MFA file.

rishiad commented 8 months ago

Set the cut-off to 0.1 and set your target transcript. That worked for me.

jasonppy commented 8 months ago

Thanks for the help. Just to clarify: `cut_off_sec` should be the time, in seconds, at which your prompt ends, and it should fall at the end of a word in the prompt, so there is no one-size-fits-all value.
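
One way to pick such a value is to read the word end times from the alignment CSV. The sketch below assumes the CSV has `Begin`, `End`, `Label`, and `Type` columns with word rows marked `words`; check this against your own file. The sample rows and timings are made up for illustration, and the helpers `word_end_times` and `cut_off_after` are hypothetical.

```python
import csv
import io

# Made-up sample in the assumed column layout; replace with the contents of
# your own alignment CSV.
SAMPLE_CSV = """Begin,End,Label,Type,Speaker
0.10,0.32,hello,words,temp
0.32,0.61,there,words,temp
0.61,0.95,friend,words,temp
0.10,0.20,HH,phones,temp
"""


def word_end_times(csv_text):
    """Return (word, end_time_in_seconds) pairs for word-level rows only."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Label"], float(row["End"])) for row in reader if row["Type"] == "words"]


def cut_off_after(words, n_words):
    """Return the end time of the n-th word (1-based), i.e. a word boundary."""
    if not 1 <= n_words <= len(words):
        raise ValueError("n_words must be between 1 and the number of aligned words")
    return words[n_words - 1][1]


words = word_end_times(SAMPLE_CSV)
cut_off_sec = cut_off_after(words, 2)  # prompt = first two words of the recording
print(cut_off_sec)
```

To use a real file, read it with `open(path, newline="").read()` and pass the text to `word_end_times`.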

jasonppy commented 8 months ago

Thanks! There are two ways to make MFA re-align the new files:

  1. Manually delete ./demo/temp.
  2. Change the mfa align line in the 3rd cell to `os.system(f"mfa align -j 1 --output_format csv --clean {temp_folder} english_us_arpa english_us_arpa {align_temp}")`. Note that I added `--clean` so that it removes the results of previous runs.
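
For the first option, a minimal sketch of doing the deletion from Python instead of by hand is shown below. The helper name `clear_alignment_cache` is hypothetical, and the default path simply mirrors the `./demo/temp` folder mentioned above; adjust it if your notebook uses a different temp folder.

```python
import shutil
from pathlib import Path


def clear_alignment_cache(temp_dir="./demo/temp"):
    """Delete the temp folder so the next alignment run starts from scratch.

    Returns True if a folder was removed, False if there was nothing to remove.
    """
    path = Path(temp_dir)
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False
```

Calling `clear_alignment_cache()` in a cell before the alignment cell should have the same effect as deleting the folder manually. Note that it removes everything inside the folder.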

friendlyFriend4000 commented 8 months ago

I used the second option (adding `--clean`) and it works now! It aligns the new file I put in the demo folder. Many thanks.