IIEleven11 / StyleTTS2FineTune

Why does phonemize run slower and slower when the dataset is large? #3

Closed blldd closed 11 months ago

blldd commented 11 months ago

Hi eleven, thanks for your guide. I am trying to prepare the VCTK dataset with your phonemize scripts, and I found that the phonemize function gets slower and slower. Can you figure out why? Thanks a lot.

IIEleven11 commented 11 months ago

Well, it is working harder and harder the more data there is. One thing you may be running into is that whisperx has trouble creating an accurate .json. Last night I updated the repo with a new script that uses the whisperx .srt file instead. The accuracy is considerably higher, and it may be easier for the phonemizer to handle.

If you're still having issues you can try phonemizing in smaller portions.
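
For anyone wanting to try the smaller-portions approach, here is a rough sketch of one way to do it. It is not the repo's script: `phonemize_in_chunks`, `chunked`, and the `chunk_size` default are hypothetical names and values chosen for illustration, and the actual phonemizer call is passed in as `phonemize_fn` so you can plug in whatever your script already uses.

```python
from typing import Callable, Iterator, List


def chunked(items: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def phonemize_in_chunks(
    texts: List[str],
    phonemize_fn: Callable[[List[str]], List[str]],
    chunk_size: int = 500,  # assumed value, tune for your dataset
) -> List[str]:
    """Run phonemize_fn on small batches instead of the whole dataset.

    phonemize_fn is a placeholder for your own phonemizer call. It must
    take a list of strings and return a list of the same length.
    """
    results: List[str] = []
    for batch in chunked(texts, chunk_size):
        phonemes = phonemize_fn(batch)
        if len(phonemes) != len(batch):
            raise ValueError("phonemize_fn returned a different number of items")
        results.extend(phonemes)
    return results
```

If you use the `phonemizer` package, `phonemize_fn` could be a small wrapper around its `phonemize` function with your usual language and backend settings. Whether this actually removes the slowdown depends on what is causing it, so treat it as something to experiment with.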