khoj-ai / khoj

Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.
https://khoj.dev
GNU Affero General Public License v3.0
12.64k stars 640 forks

Add a way to exit chat request early from the chat UI #508

Open sabaimran opened 11 months ago

sabaimran commented 11 months ago

The local llama chat response can sometimes take minutes. If you want to tweak your request and resend it, you have to wait for the current response to finish before you can retry. Add some way to send an interrupt signal from the UI to cancel the request.

See relevant discussion on Discord.

sabaimran commented 11 months ago

Mini-update. I looked into this today and found that gpt4all only supports short-circuiting the model response after tokens have already started emitting. So you can't stop it from 'thinking', so to speak, once it's been given a query. To that end, I'll update the UI so that you can cancel the query once tokens are being spit out, but not before then.
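For anyone following along, here is a rough sketch of what "cancel once tokens are being emitted" could look like. It does not call gpt4all or any Khoj code: `stream_tokens` is a hypothetical stand-in for a model's generate loop that invokes a per-token callback and stops when the callback returns `False`, and `cancellable` and `demo` are likewise made-up names for illustration. The assumption is that a stop button in the UI would just set a flag that the callback checks.

```python
import threading
from typing import Callable, Iterable, List

# Callback signature assumed for this sketch: (token_id, token) -> keep generating?
TokenCallback = Callable[[int, str], bool]


def stream_tokens(tokens: Iterable[str], on_token: TokenCallback) -> List[str]:
    """Hypothetical stand-in for a local model's generate loop.

    Emits tokens one at a time and stops as soon as the callback
    returns False. Nothing can be interrupted before the first token.
    """
    emitted: List[str] = []
    for token_id, token in enumerate(tokens):
        emitted.append(token)
        if not on_token(token_id, token):
            break
    return emitted


def cancellable(cancel_event: threading.Event) -> TokenCallback:
    """Build a callback that halts generation once cancel_event is set."""
    def on_token(token_id: int, token: str) -> bool:
        return not cancel_event.is_set()
    return on_token


def demo() -> List[str]:
    cancel = threading.Event()
    keep_going = cancellable(cancel)

    def on_token(token_id: int, token: str) -> bool:
        if token_id == 2:
            cancel.set()  # simulate the user clicking a stop button
        return keep_going(token_id, token)

    return stream_tokens(["The", " quick", " brown", " fox", " jumps"], on_token)


if __name__ == "__main__":
    print("".join(demo()))
```

In a real server the event would be set from a different request handler or thread than the one streaming the response, which is why the sketch uses `threading.Event` rather than a plain boolean.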

Hopefully the time-to-first-token issue will be less of a headache for folks using Mistral. That'll become the default model (see commit https://github.com/khoj-ai/khoj/commit/0f1ebcae18abc8969cb367564077ef8d20695be3) in the next release.