How to use voice files instead of pure TTS?

neosapience / editts

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022)

https://editts.github.io



How to use voice files instead of pure TTS? #1

Open: opened by Vadim2S 3 years ago

Vadim2S commented 3 years ago

In the paper you describe a test on the LJ Speech dataset (Section 4.3, Content replacement). Can you provide code for loading voice files instead of pure sample generation in tts.py?

jaketae commented 2 years ago

Hello @Vadim2S, thanks for opening this issue, and apologies for the belated reply. The modeling code is a direct adaptation of Grad-TTS, so you can refer to the upstream repository for detailed instructions on how to load data. Hope this helps!
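For readers who want a starting point before digging into the upstream repository, here is a minimal sketch of the very first step only: reading a voice file into memory. It uses just the Python standard library and assumes a 16-bit PCM WAV file. The function name `load_wav_mono16` is made up for illustration and is not part of EdiTTS or Grad-TTS. Grad-TTS works on mel-spectrograms, so the samples would still need to be converted with the same settings the model was trained with; those settings live in the upstream repository and are not reproduced here.

```python
import struct
import wave


def load_wav_mono16(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    Samples are floats in [-1.0, 1.0); multi-channel audio is averaged to mono.
    Hypothetical helper for illustration, not an EdiTTS or Grad-TTS function.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # Little-endian signed 16-bit integers, interleaved by channel.
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)

    # Average the channels of each frame and scale to floats.
    samples = [
        sum(ints[i:i + n_channels]) / (n_channels * 32768.0)
        for i in range(0, len(ints), n_channels)
    ]
    return samples, sample_rate
```

If the returned sample rate differs from the one the model expects, the audio would have to be resampled before any further processing.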

mvoodarla commented 1 year ago

I have this question as well. Looking at the inference code, it is unclear how I could drop-in replace running Grad-TTS with my own source WAV file. Any tips would be appreciated :) Or if you have pointers to any other methods of doing something similar.

jaketae commented 1 year ago

Hey @mvoodarla, thanks for opening this issue.

"how I could drop-in replace running Grad-TTS with my own source WAV file"

Could you explain in more detail what you mean by this? I assume this is in the context of content replacement.

mvoodarla commented 1 year ago

Yes, I would like to be able to submit a source WAV file, with or without a text transcription of it, and replace a word or a set of words that were said in that file. Does that make sense?
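One part of this request can be sketched independently of the model: working out which spectrogram frames a given word occupies, so that only that region is marked for editing. The sketch below assumes the word's start and end times are already known (for example from a forced aligner, which is not shown), and it uses a 22050 Hz sample rate and a hop length of 256 samples as placeholder defaults. Both function names and both defaults are assumptions for illustration and should be checked against the actual model configuration. It does not show how EdiTTS itself performs the replacement.

```python
import math


def span_to_frames(start_sec, end_sec, sample_rate=22050, hop_length=256):
    """Map a time span in seconds to a half-open frame range [first, last).

    sample_rate and hop_length are assumed defaults, not values taken
    from the EdiTTS code.
    """
    if not 0 <= start_sec < end_sec:
        raise ValueError("expected 0 <= start_sec < end_sec")
    # Round outward so the whole word is covered.
    first = math.floor(start_sec * sample_rate / hop_length)
    last = math.ceil(end_sec * sample_rate / hop_length)
    return first, last


def frame_mask(n_frames, first, last):
    """Return a list of booleans, True for frames inside [first, last)."""
    last = min(last, n_frames)  # clip to the spectrogram length
    return [first <= i < last for i in range(n_frames)]


# Example: a word spoken between 1.0 s and 2.0 s in a 400-frame spectrogram.
first, last = span_to_frames(1.0, 2.0)
mask = frame_mask(400, first, last)
```

Rounding the start down and the end up errs on the side of editing slightly more audio than the word strictly occupies, which seems safer than clipping part of it.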