CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time
Other
51.65k stars 8.66k forks

missing SV2TTS/ #1156

Closed prakharpbuf closed 1 year ago

prakharpbuf commented 1 year ago

Hey, I'm trying to fine-tune the pretrained model, but it looks like I am missing the SV2TTS/ directory, which contains train.txt, etc. I do have a saved_models/ directory with three *.pt files for the three components of this TTS architecture.

prakharpbuf commented 1 year ago

I went through some GitHub issues and concluded that I don't need that SV2TTS/ directory. It will be created fresh when I run synthesizer_preprocess_audio.py.

prakharpbuf commented 1 year ago

I have downloaded the LibriSpeech train-clean-100 dataset, and I'll practice fine-tuning the model on one of its speakers before I create my own dataset and fine-tune on that. The train-clean-100 dataset contains .flac audio files, with all the transcripts for a directory collected in a single <speaker>-<book>.trans.txt file. This program instead requires one transcript file per utterance, i.e. utterance-xx.flac paired with utterance-xx.txt.
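
To make the mismatch concrete, here is a minimal sketch of how one line of a .trans.txt file could map onto a per-utterance transcript. It assumes each line has the form "<utterance id> <text>"; the helper name `split_transcript_line` and the sample line are made up for illustration and are not part of the repository or the dataset.

```python
def split_transcript_line(line):
    """Split one '<utterance id> <text>' line into (utterance_id, text)."""
    # partition() splits on the first space only, so the text keeps its spaces.
    utterance_id, _, text = line.strip().partition(" ")
    return utterance_id, text


# Hypothetical sample line in the assumed layout (not copied from the dataset).
sample = "19-198-0000 THIS IS A PLACEHOLDER SENTENCE\n"
utterance_id, text = split_transcript_line(sample)

# The text would go into a file named after the utterance, next to its .flac.
print(f"{utterance_id}.txt", "<-", text)
```

Each utterance id doubles as the base name of its audio file, which is why the id alone is enough to name the matching .txt file.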

Does anyone have any code written to get the transcripts in the required format?

I found a script for preprocessing these transcripts for the Mozilla datasets but nothing for LibriSpeech. I will wait till tomorrow, otherwise I will write my own, share it here for anyone who needs it, and close this issue.

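prakharpbuf commented 1 year ago

import os

# Root of the extracted LibriSpeech subset; adjust to your own location.
DATASET_ROOT = r'C:\LibriSpeech\train-clean-100'

for root, dirs, files in os.walk(DATASET_ROOT):
    if len(files) == 0:
        continue
    # Audio lives in <speaker>/<book>/, so the last two path components
    # give the name of the transcript file: <speaker>-<book>.trans.txt
    head, book = os.path.split(root)
    head, speaker = os.path.split(head)
    transFilePath = os.path.join(root, f"{speaker}-{book}.trans.txt")
    try:
        with open(transFilePath, encoding="utf-8") as transFile:
            for line in transFile:
                # Each line is "<utterance id> <transcript text>".
                parts = line.strip().split(" ", 1)
                if len(parts) != 2:
                    continue  # skip blank or malformed lines
                utterance, text = parts
                # Write <utterance id>.txt next to <utterance id>.flac,
                # overwriting any file left by a previous run.
                utteranceFilePath = os.path.join(root, f"{utterance}.txt")
                with open(utteranceFilePath, 'w', encoding="utf-8") as utteranceFile:
                    utteranceFile.write(text + "\n")
        # os.remove(transFilePath)
    except OSError as e:
        # Directories without a matching .trans.txt end up here.
        print(e)
        continue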
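
After running the script, it may be worth checking that every audio file actually received a transcript. The following is a rough sketch of such a check, assuming each .flac should have a .txt file with the same base name in the same directory; the function name `find_missing_transcripts` is hypothetical and not part of the repository.

```python
from pathlib import Path


def find_missing_transcripts(dataset_root):
    """Return the .flac files under dataset_root that have no sibling .txt file."""
    root = Path(dataset_root)
    return sorted(
        flac for flac in root.rglob("*.flac")
        if not flac.with_suffix(".txt").exists()
    )


if __name__ == "__main__":
    # Same dataset location as in the script above; adjust as needed.
    missing = find_missing_transcripts(r"C:\LibriSpeech\train-clean-100")
    print(f"{len(missing)} audio files without a transcript")
    for path in missing[:10]:
        print(path)
```

A non-empty result points at utterances whose line was missing or malformed in the corresponding .trans.txt file.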