Hello,
I’ve got an OpenRouter account with some credits to use LLM models. I noticed that the official KoboldAI Lite page (https://koboldai.net) supports OpenRouter’s API, which is great for me. However, I can't stream the generated responses, and I saw this limitation mentioned on the KoboldCpp wiki page. Is there any way to turn on streaming for remote APIs? It would be nice to be able to cut off the response immediately if I don't like the generation and save some credits.
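For clarity, this is the kind of streaming behavior I'm hoping for. A minimal Python sketch, assuming OpenRouter's OpenAI-compatible chat completions endpoint and a placeholder model name (not how KoboldAI Lite actually implements it):

```python
import json
import requests

API_KEY = "sk-or-..."  # placeholder; your own OpenRouter API key

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",  # OpenAI-compatible endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "openai/gpt-4o-mini",  # example model; any model on your account
        "messages": [{"role": "user", "content": "Write a short story."}],
        "stream": True,  # ask the server to send tokens as an SSE stream
    },
    stream=True,  # let requests yield the response incrementally
)

for line in resp.iter_lines():
    # SSE frames look like: "data: {...json chunk...}"
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"].get("content", "")
    print(delta, end="", flush=True)
    # If the generation goes off the rails, closing the connection here
    # should stop further tokens from being generated and billed:
    # resp.close(); break
```

That early `resp.close()` is exactly the "cut off the response and save credits" behavior I'd like to have from the Lite UI when using a remote API.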
Thanks!