Open FJCCOMMISH opened 5 days ago
Is there a way to expose and control more settings, see their current values, and ensure consistent output across reads?
Sure! I'll expose the seed as a controllable setting. The variability is a natural outcome of Tortoise (or any neural-net-based TTS), and a fixed seed will keep the output consistent across generations for the same inputs.
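To illustrate why a fixed seed helps, here is a minimal sketch using only Python's standard library. It is not Tortoise's code or API: `sample_prosody` and its `seed` parameter are hypothetical stand-ins for any sampler that draws random numbers while generating speech. The point is only that the same seed and the same input reproduce the same random draws.

```python
import random


def sample_prosody(text, seed=None):
    """Toy stand-in for a stochastic TTS sampler (hypothetical, not Tortoise's API).

    Returns one random "pace factor" per word, mimicking the random choices
    a neural TTS model makes during generation.
    """
    # seed=None lets Python seed from OS entropy, so unseeded calls vary.
    rng = random.Random(seed)
    return [(word, round(rng.uniform(0.8, 1.2), 3)) for word in text.split()]


text = "The same sentence, read twice."

# Same text and same seed: the random draws, and so the output, are identical.
fixed_a = sample_prosody(text, seed=1234)
fixed_b = sample_prosody(text, seed=1234)
print(fixed_a == fixed_b)  # True

# No seed: each call draws fresh random numbers, so the reads usually differ.
print(sample_prosody(text))
print(sample_prosody(text))
```

In a PyTorch-based model the equivalent step would typically be a call such as `torch.manual_seed(seed)` before generation, though where exactly that belongs in this project is an assumption here.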
Even with the same text (sentences) and settings, multiple generations result in radically different pacing, style, and inflection.
This file contains several audio clips of the same text, all generated with the same models and settings: https://we.tl/t-0M8VeAMAt0
Note the differences in style, inflection, pronunciation, and pacing.