khoj-ai / khoj

Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.
https://khoj.dev
GNU Affero General Public License v3.0

offline chat generates garbage output #516

Closed mtoniott closed 11 months ago

mtoniott commented 11 months ago

Hello,

Thank you for this software, which keeps getting better every day.

I recently updated khoj, and the offline chat, which was giving me normal answers before, is now outputting garbage with "Mississippi" in it each time, for some reason. [image]

I tried reinstalling khoj in a new venv, which did not work. I also tried turning off offline chat, removing the model from my .cache directory, and redownloading it, with the same result.

I guess it is linked to the fact that it wants to use my Intel integrated graphics to accelerate the queries, but I did not find a way to turn that off. I get the following line in the terminal:

llama.cpp: using Vulkan on Intel(R) Iris(R) Plus Graphics 655 (CFL GT3)

Any idea on how to fix this?

debanjum commented 11 months ago

Ah that's unfortunate that you're seeing a regression in behavior. The chat model being loaded into your GPU could be the reason for this regression for sure. Let me look into testing this out on my end.

Can you share your machine specs, specifically the RAM, processor and GPU?

We had started using an upgraded default model for offline chat (Mistral instead of Llama 2), and Khoj now tries to use the GPU when a Vulkan-supported GPU is available on the user's machine. Intel and AMD GPUs haven't been tested, as we don't have such machines ourselves.

debanjum commented 11 months ago

Vulkan support in our upstream dependency (GPT4All) still needs some ironing out. Until then, I've exposed a CLI flag that lets users disable GPU use for offline chat.

To use this fix:

  1. Upgrade to the latest pre-release version of Khoj: pip install --upgrade --pre khoj-assistant
  2. Start Khoj server with --disable-chat-on-gpu flag: khoj --disable-chat-on-gpu
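
For readers curious how an opt-out flag like this can gate GPU use, here is a minimal sketch. It is not Khoj's actual implementation: only the flag name --disable-chat-on-gpu comes from this thread, while build_parser, pick_device, the "khoj-sketch" program name and the "gpu"/"cpu" strings are hypothetical names chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag name is taken from the thread.
    parser = argparse.ArgumentParser(prog="khoj-sketch")
    parser.add_argument(
        "--disable-chat-on-gpu",
        action="store_true",
        help="Run the offline chat model on the CPU even if a GPU is detected.",
    )
    return parser


def pick_device(disable_chat_on_gpu: bool, gpu_available: bool) -> str:
    # Use the GPU only when one is available and the user has not opted out.
    # The returned strings are illustrative labels, not a real library's values.
    if gpu_available and not disable_chat_on_gpu:
        return "gpu"
    return "cpu"
```

With this sketch, parsing ["--disable-chat-on-gpu"] and passing the result to pick_device selects the CPU even when a GPU is reported as available, which mirrors the workaround described in the steps above.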

@mtoniott: Let me know whether this mitigates the issue with offline chat generating gibberish output.

mtoniott commented 11 months ago

It works now. Thank you!

[image]