dusty-nv / NanoLLM

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
https://dusty-nv.github.io/NanoLLM/
MIT License
196 stars, 31 forks

TTS needs a way to drop inputs from a chat model linked to Video Auto Prompt #19

Open TadayukiOkada opened 5 months ago

TadayukiOkada commented 5 months ago

Since the auto prompt with video input generates output faster than TTS can speak it, TTS can't keep up and the delay keeps growing. It would be ideal if TTS could skip stale outputs and use only the latest one. However, this might not work with the 'word' output. Perhaps we could use the 'final' output to jump to the most recent complete response?

dusty-nv commented 5 months ago

Hi Tadayuki! Yes, that is a good point. I hadn't tried hooking TTS directly up to the autoprompter output yet. Let's try using the 'final' output, disabling the TTS buffering, and setting drop_inputs=True on TTS. That way it will at least process the latest response if it gets behind.
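To illustrate the idea behind drop_inputs=True, here is a minimal, library-independent sketch of a consumer that discards stale queued inputs and processes only the newest one. It is not NanoLLM's implementation: the class `LatestOnlyWorker` and its `submit`/`step` methods are hypothetical names invented for this example, and only the `drop_inputs` flag name is taken from the comment above.

```python
import queue


class LatestOnlyWorker:
    """Hypothetical stand-in for a slow consumer such as a TTS stage."""

    def __init__(self, process, drop_inputs=True):
        self.process = process          # callable applied to each accepted input
        self.drop_inputs = drop_inputs  # if True, skip everything but the newest input
        self.inputs = queue.Queue()
        self.processed = []             # results, kept for inspection

    def submit(self, item):
        """Called by the producer (e.g. once per complete 'final' response)."""
        self.inputs.put(item)

    def step(self):
        """Process one pending input. Returns False if nothing was queued."""
        try:
            item = self.inputs.get_nowait()
        except queue.Empty:
            return False

        if self.drop_inputs:
            # Drain the backlog so only the most recent input survives.
            while True:
                try:
                    item = self.inputs.get_nowait()
                except queue.Empty:
                    break

        self.processed.append(self.process(item))
        return True


# The producer got ahead by three responses while the consumer was busy.
worker = LatestOnlyWorker(process=str.upper, drop_inputs=True)
for response in ["a person walks in", "the person sits down", "the room is empty"]:
    worker.submit(response)

worker.step()
print(worker.processed)
```

Dropping like this only makes sense when each queued item is a complete response, which is why the 'final' output fits better than word-level output: skipping individual words would leave TTS with fragments of several different sentences.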


From: TadayukiOkada. Sent: Sunday, June 16, 2024 11:59:16 AM. To: dusty-nv/NanoLLM. Subject: [dusty-nv/NanoLLM] TTS needs a way to drop inputs for Video Auto Prompt (Issue #19)
