Closed: genabotpub closed this issue 3 years ago
The question is unrelated to this repository, so I cannot help with it.
https://github.com/kan-bayashi/ParallelWaveGAN/issues/251 Also, I don't think it's a good idea to ask the same question in multiple places.
Thank you for your response. I posted the question in two places because the two repositories provide different implementations, and I hoped at least one of them might offer the relevant capability. If this question is unrelated to this repository, then could you please indicate which part of the end-to-end TTS chain should be responsible for detecting and generating speech marks?
Best regards, gen
Hello,
Thank you very much for this project.
I am looking for a way to generate speech marks that identify the boundary and playback duration, in milliseconds, of each rendered word.
At the moment we use a speech segmentation model to achieve that, but we are looking for a method that does not require this postprocessing step.
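To make the target output concrete, here is a minimal sketch of the speech marks we have in mind. It assumes the synthesis model could expose a per-word frame count, which is purely hypothetical: I do not know whether either repository provides this. The names `SpeechMark` and `frames_to_speech_marks`, and the `hop_size` and `sample_rate` values, are illustrative assumptions as well.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SpeechMark:
    """One rendered word with its start offset and playback duration."""
    word: str
    start_ms: float
    duration_ms: float


def frames_to_speech_marks(
    words: Sequence[str],
    frame_counts: Sequence[int],
    hop_size: int = 240,       # assumed samples per acoustic frame
    sample_rate: int = 24000,  # assumed output sample rate in Hz
) -> List[SpeechMark]:
    """Convert hypothetical per-word frame counts into millisecond marks."""
    if len(words) != len(frame_counts):
        raise ValueError("words and frame_counts must have the same length")

    ms_per_frame = 1000.0 * hop_size / sample_rate
    marks: List[SpeechMark] = []
    elapsed_frames = 0
    for word, n_frames in zip(words, frame_counts):
        marks.append(
            SpeechMark(
                word=word,
                start_ms=elapsed_frames * ms_per_frame,
                duration_ms=n_frames * ms_per_frame,
            )
        )
        elapsed_frames += n_frames
    return marks


if __name__ == "__main__":
    # Frame counts here are made up for illustration only.
    for mark in frames_to_speech_marks(["hello", "world"], [30, 50]):
        print(mark)
```

Each mark's start is the running total of the preceding frames, so the marks would line up with the rendered audio only if the frame counts are the ones actually used during synthesis.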
Would appreciate any advice on the best way to achieve this.
Thank you, gen