Hello!
I trained two voices from scratch, one at medium quality and the other at high quality.
When I export them to ONNX and test them, the medium voice has an RTF of around 0.09, which is very fast. The high-quality voice, however, has an RTF of around 0.55, which is much slower, and I really don't hear any difference in quality.
Is this expected?
I'm running it on Windows.
Thanks!

I typically train the high-quality voices at a higher sample rate as well, which contributes to them sounding better. For some datasets, though, there isn't going to be a major difference.
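
For anyone comparing numbers like these, here is a minimal sketch of how RTF (real-time factor) is usually computed: the time spent synthesizing divided by the duration of the audio produced, so lower is faster. The `synthesize` callable, the 22050 Hz sample rate, and the 10-second clip length are assumptions made for illustration; they are not taken from the post or from any particular library.

```python
import time
from typing import Callable, Sequence


def real_time_factor(synth_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    audio_seconds = num_samples / sample_rate
    return synth_seconds / audio_seconds


def measure_rtf(
    synthesize: Callable[[str], Sequence[float]],  # hypothetical: text -> audio samples
    text: str,
    sample_rate: int,
) -> float:
    """Time one call to `synthesize` and convert the result to an RTF."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples), sample_rate)


# Worked arithmetic with assumed values: 10 s of audio at 22050 Hz.
SAMPLE_RATE = 22050              # assumed sample rate
NUM_SAMPLES = 10 * SAMPLE_RATE   # assumed clip length of 10 seconds

# 0.9 s of compute for 10 s of audio gives an RTF of about 0.09.
medium_rtf = real_time_factor(0.9, NUM_SAMPLES, SAMPLE_RATE)
# 5.5 s of compute for the same clip gives an RTF of about 0.55.
high_rtf = real_time_factor(5.5, NUM_SAMPLES, SAMPLE_RATE)
```

With the figures quoted above, the high-quality voice takes roughly six times as long per second of audio (0.55 / 0.09), but both values are below 1.0, so both still synthesize faster than real time.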