Open AryanEmbered opened 3 months ago
Thanks for reaching out. Since everything is completely open source, it's already as open an implementation as it can be, I think? I don't see what integration with DirectML would add here. Streaming inputs to LLMs isn't feasible. The LLM's output is already streamed to the TTS engine in real time, without waiting for the full response.
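To illustrate what streaming LLM output to a TTS engine can look like, here is a minimal sketch, not this project's actual code. It assumes the LLM yields text tokens one at a time and that the TTS engine accepts whole sentences; `sentences_from_tokens` and the `speak` callback in the usage note are hypothetical names made up for this example.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at ., ! or ? followed by whitespace (a deliberately simple rule).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence; the last part
        # may still be growing, so it stays in the buffer.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

With a hypothetical `speak(text)` function, usage would be `for s in sentences_from_tokens(llm_stream): speak(s)`, so speech can start after the first sentence rather than after the full response.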
Also, why not stream the responses to the model as they come, instead of waiting for the entire response before TTS starts?