Closed lumpidu closed 9 months ago
This affects how we play the audio after it has been cached by the caching layer. If we implement this feature, we have to rearrange the order of caching and audio playback.
This is actually very difficult to implement because of the current caching behavior. We would have to append audio to a cache item while that item has already been announced via the queue inside the Speech-Service. In that case we could not play the audio from the cache item, because it is not yet fully available. Instead we would need to open a side queue that is fed from within the network observer's response handling, while the observer also appends the same audio to the cache item. This will work, but the edge cases have to be dealt with thoroughly.
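To make the side-queue idea a bit more concrete, here is a rough sketch in Python rather than the app's own code. Every name in it (`CacheItem`, `TeeObserver`, `on_chunk`, `on_complete`, `on_error`, `playable_audio`) is hypothetical and only stands in for the real cache item and network observer. It assumes the observer gets one callback per received chunk, and it shows one possible handling of a single edge case: a failed download drops the partial cache item.

```python
import queue


class CacheItem:
    """Hypothetical cache entry; `complete` stays False while audio is still arriving."""

    def __init__(self, key):
        self.key = key
        self.audio = bytearray()
        self.complete = False


class TeeObserver:
    """Hypothetical network observer that sends each chunk to two places:
    the cache item (for later replays) and a side queue (for immediate playback)."""

    END = None  # sentinel put on the side queue when no more audio follows

    def __init__(self, cache, key):
        self.cache = cache
        self.item = CacheItem(key)
        self.side_queue = queue.Queue()
        # The item is visible in the cache right away, but flagged as incomplete.
        cache[key] = self.item

    def on_chunk(self, chunk):
        self.item.audio.extend(chunk)  # append to the cache item
        self.side_queue.put(chunk)     # and feed the player at the same time

    def on_complete(self):
        self.item.complete = True
        self.side_queue.put(self.END)

    def on_error(self):
        # Edge case: never leave a truncated item behind in the cache.
        self.cache.pop(self.item.key, None)
        self.side_queue.put(self.END)


def playable_audio(cache, key):
    """Return cached audio only once the item is complete, otherwise None."""
    item = cache.get(key)
    if item is None or not item.complete:
        return None
    return bytes(item.audio)
```

With this split, the player reads from `side_queue` during the first, still-downloading playback, and only later requests for the same text go through `playable_audio`. Other edge cases, such as a stop request in the middle of the download, are not covered here.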
Closing. Superseded by new VITS voice Steinn.
Instead of waiting for the complete audio to be returned, we should read the audio in chunks and feed it to the audio queue that is used for the on-device voices. The benefit would be less delay for longer text passages.
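A minimal sketch of the chunked approach, written in Python for illustration only and not taken from the app. The names `feed_chunks`, `play_from_queue`, `stream_audio` and the 4096-byte chunk size are assumptions, and the "player" here just collects chunks in a list where the real one would write them to the audio output.

```python
import io
import queue
import threading

CHUNK_SIZE = 4096     # bytes per read; an arbitrary illustrative value
END_OF_STREAM = None  # sentinel telling the consumer that no more audio follows


def feed_chunks(response_stream, audio_queue, chunk_size=CHUNK_SIZE):
    """Read the response in chunks and enqueue each one as soon as it is read."""
    while True:
        chunk = response_stream.read(chunk_size)
        if not chunk:  # empty bytes means the stream is exhausted
            break
        audio_queue.put(chunk)
    audio_queue.put(END_OF_STREAM)


def play_from_queue(audio_queue, sink):
    """Stand-in for the player: consume chunks in arrival order."""
    while True:
        chunk = audio_queue.get()
        if chunk is END_OF_STREAM:
            break
        sink.append(chunk)


def stream_audio(response_stream, chunk_size=CHUNK_SIZE):
    """Run producer and consumer concurrently and return the chunks that were 'played'."""
    audio_queue = queue.Queue()
    played = []
    player = threading.Thread(target=play_from_queue, args=(audio_queue, played))
    player.start()  # playback can begin as soon as the first chunk is queued
    feed_chunks(response_stream, audio_queue, chunk_size)
    player.join()
    return played
```

For example, `stream_audio(io.BytesIO(audio_bytes))` hands the player its first chunk after a single read instead of after the whole response, which is where the reduced delay for long passages would come from.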