We're having trouble running inference efficiently at scale. Right now we're processing the audio parts one by one, which seems to be the default inference path, but is there any support for batch inference to speed things up, along the lines of what vLLM and other LLM serving libraries do?
Batched serving is largely a solved problem for LLMs, or at least there are plenty of inference options, so I'm wondering whether the same exists for TTS. Thanks.
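For context, here's roughly the shape of what we're doing now versus what we're hoping for. This is just an illustrative sketch, not real library code: `synthesize` and `synthesize_batch` are hypothetical stand-ins for the actual TTS forward pass.

```python
def synthesize(text: str) -> list[float]:
    # Stand-in for a single TTS forward pass (hypothetical).
    # Returns dummy "audio samples" so the sketch is self-contained.
    return [float(ord(c)) for c in text]

def synthesize_batch(texts: list[str]) -> list[list[float]]:
    # The API shape we're looking for: one call over a whole batch,
    # so model overhead is amortized (the way vLLM batches LLM requests).
    # Here it just loops internally, since this is only an illustration.
    return [synthesize(t) for t in texts]

# Current approach: one utterance at a time.
audios_sequential = [synthesize(t) for t in ["hello", "world"]]

# Desired approach: one batched call.
audios_batched = synthesize_batch(["hello", "world"])
```

In other words, we can fake batching with a loop, but we're after a real batched forward pass (padded/packed inputs, single GPU call) rather than per-item calls.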