IIEleven11 / StyleTTS2FineTune

Multi file processing #5

Closed 78Alpha closed 10 months ago

78Alpha commented 10 months ago
  1. Made the scripts work with generic paths, as long as the terminal's working directory is the repository root (StyleTTSFineTune-main).
  2. Added support for processing multiple srt files and audio files, so larger datasets can be built.
  3. Changed the phonemizer step to phonemize everything at once and build train_list afterwards. This avoids a memory protection error in phonemizer that occurs if you call it in a loop or on datasets larger than 2000 files.
  4. Updated the README to reflect the code changes and the practice for using whisperx on multiple files.
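
For illustration, item 3 can be sketched roughly as follows. This is a minimal sketch and not the code in this PR: `build_train_list`, `phonemize_batch`, and `speaker_id` are hypothetical names, and the `path|phonemes|speaker` line layout is an assumption about the train_list format. The point is only that the phonemizer is called once on the whole list of transcripts instead of once per file.

```python
from typing import Callable, Iterable, List, Tuple


def build_train_list(
    entries: Iterable[Tuple[str, str]],
    phonemize_batch: Callable[[List[str]], List[str]],
    speaker_id: str = "0",
) -> List[str]:
    """Phonemize all transcripts in one call, then assemble the list lines.

    entries: (audio_path, transcript) pairs.
    phonemize_batch: hypothetical callable that maps a list of texts to a
        list of phoneme strings of the same length.
    """
    entries = list(entries)
    paths = [path for path, _ in entries]
    texts = [text for _, text in entries]

    # Single batched call instead of one call per file.
    phonemes = phonemize_batch(texts)
    if len(phonemes) != len(texts):
        raise ValueError("phonemizer returned a different number of lines")

    # Assumed layout: path|phonemes|speaker (one line per audio segment).
    return [
        f"{path}|{ph.strip()}|{speaker_id}"
        for path, ph in zip(paths, phonemes)
    ]
```

With the phonemizer package installed, `phonemize_batch` could be something like `lambda texts: phonemize(texts, language="en-us", backend="espeak")`, since `phonemizer.phonemize` accepts a list of strings; the language and backend here are assumptions.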

I did not change how the phonemizer works; it still drops the punctuation like the original. I tried to preserve the punctuation, but ran into some issues.

A possible future addition is appending an EOS token with 100 ms of silence to the end of every file to prevent short stops.
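
The silence-padding half of that idea could look roughly like the sketch below. It is not part of this PR: `append_silence` is a hypothetical helper, it assumes 16-bit PCM WAV input (where zero bytes are silence), and it does not cover adding an EOS token to the transcript.

```python
import wave


def append_silence(src_path: str, dst_path: str, silence_ms: int = 100) -> int:
    """Copy a 16-bit PCM WAV file and append `silence_ms` of silence.

    Returns the number of silent frames that were appended.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)

    # Zero-valued samples are only silence for signed 16-bit PCM.
    if params.sampwidth != 2:
        raise ValueError("this sketch only handles 16-bit PCM WAV files")

    n_silent = int(params.framerate * silence_ms / 1000)
    silence = b"\x00" * (n_silent * params.nchannels * params.sampwidth)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the file is closed.
        dst.writeframes(frames + silence)

    return n_silent
```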

IIEleven11 commented 10 months ago

Don't know how I missed this. I'll take a look shortly.

IIEleven11 commented 10 months ago

Solid, thank you!