Made the scripts work with generic paths, as long as they are run from a terminal at the repository root (StyleTTSFineTune-main).
Altered the scripts so that multiple srt files and audio files can be processed, for larger datasets.
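As a rough illustration of handling several srt/audio files together, here is a minimal sketch that pairs each srt file with the audio file of the same base name. The function name pair_by_stem and the matching-by-filename convention are assumptions for illustration, not necessarily how the scripts in this repo do it.

```python
from pathlib import Path


def pair_by_stem(srt_paths, audio_paths):
    """Pair each srt file with the audio file that shares its base name.

    Hypothetical helper: assumes "foo.srt" belongs to "foo.wav".
    srt files without a matching audio file are skipped.
    """
    audio_by_stem = {Path(p).stem: Path(p) for p in audio_paths}
    pairs = []
    for srt in sorted(Path(p) for p in srt_paths):
        audio = audio_by_stem.get(srt.stem)
        if audio is not None:
            pairs.append((srt, audio))
    return pairs
```

In practice the two lists could come from something like Path("srt").glob("*.srt") and Path("audio").glob("*.wav"), where the folder names are placeholders.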
Changed the phonemizer loop to phonemize everything in a single call and build train_list afterwards. This avoids a memory protection error that phonemizer raises when it is called in a loop or on datasets of more than 2000 files.
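The idea of phonemizing once and building train_list afterwards can be sketched roughly as below. This is not the exact code in the repo: build_train_list, the phonemize_batch callable, and the "filename|phonemes|speaker" line layout are assumptions made for illustration.

```python
def build_train_list(entries, phonemize_batch, speaker_id="0"):
    """Phonemize all texts in one call, then assemble the list lines.

    entries: iterable of (audio_filename, text) tuples.
    phonemize_batch: callable taking a list of strings and returning a
        list of phonemized strings in the same order (assumed interface).
    """
    entries = list(entries)
    filenames = [name for name, _ in entries]
    texts = [text for _, text in entries]

    # One batched call instead of one call per file.
    phonemes = phonemize_batch(texts)
    if len(phonemes) != len(texts):
        raise ValueError("phonemizer returned a different number of items")

    # Build the lines only after phonemization has finished.
    return [
        f"{name}|{ph.strip()}|{speaker_id}"
        for name, ph in zip(filenames, phonemes)
    ]
```

With the phonemizer package, phonemize_batch could be something like lambda texts: phonemize(texts, language="en-us", backend="espeak"), since phonemize accepts a list of strings; the language and backend shown are placeholders.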
Updated the ReadMe to reflect the code changes and the practice for using whisperx on multiple files.
I did not alter how the phonemizer works; it still drops the punctuation, like the original. I tried to preserve the punctuation, but that ran into some issues.
A possible future addition is an EOS token with 100 ms of silence at the end of every file, to prevent short stops.
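For the silence part of that idea, a minimal sketch could look like the following. It is not implemented in this change; append_silence is a hypothetical helper, and it assumes raw signed PCM samples (for example 16-bit WAV data), where zero bytes are silent.

```python
def append_silence(pcm, sample_rate, sample_width=2, channels=1, silence_ms=100):
    """Return the PCM bytes with silence_ms of silence appended.

    Assumes signed PCM (zero-valued samples are silent). This does not
    hold for 8-bit unsigned WAV data, where silence is the value 128.
    """
    n_frames = sample_rate * silence_ms // 1000
    return pcm + b"\x00" * (n_frames * sample_width * channels)
```

The frames could be read and written with the standard wave module (readframes / writeframes), passing the file's own sample rate, sample width and channel count.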