LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Enable streaming on KoboldAI Lite when using remote hosts #966

Open morbidCode opened 3 months ago

morbidCode commented 3 months ago

Hello,

I’ve got an OpenRouter account with some credits to use LLM models. I noticed that the official KoboldAI Lite page (https://koboldai.net) supports OpenRouter's API, which is great for me. However, I can't stream the generated responses, and I saw this limitation mentioned on the KoboldCpp wiki page. Is there any way to turn on streaming for remote APIs? It would be nice to be able to cut off a response immediately if I don't like the generation, and save some credits.
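For reference, OpenRouter's OpenAI-compatible endpoint does support streaming over server-sent events when `stream: true` is set, so the limitation seems to be on the Lite UI side rather than the API. A minimal client-side sketch of what I mean (the model id and prompt here are just placeholders):

```python
import json
import os

import requests

# Placeholder key; read from the environment in practice.
API_KEY = os.environ.get("OPENROUTER_API_KEY", "sk-or-...")

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "openai/gpt-3.5-turbo",  # placeholder model id
        "messages": [{"role": "user", "content": "Write a short story."}],
        "stream": True,  # ask for server-sent events
    },
    stream=True,
)

for line in resp.iter_lines():
    # SSE lines look like: data: {...json chunk...}
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"].get("content", "")
    print(delta, end="", flush=True)
    # Closing the connection here (resp.close()) would abort the
    # generation partway through, which is what saves credits.
```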

Thanks!