khoj-ai / khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (e.g. gpt, claude, gemini, llama, qwen, mistral).
https://khoj.dev

"Aborted" after Offline Model download #933

Open SchinkTasia opened 1 month ago

SchinkTasia commented 1 month ago

Describe the bug

When I add a new offline model and start a chat with any input, I can see in the console that Khoj begins downloading the model. But after the download finishes, Khoj crashes with nothing more than "Aborted (core dumped)". I used -vv as a start parameter, but I don't get any more information.

To Reproduce

I tried two of my own offline models and the one that comes preinstalled. Nothing worked.

Screenshots

[screenshot attached]

Platform

If self-hosted

Additional context

Where can I get more information?
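
One way to get more information than a bare "Aborted" is to load the downloaded GGUF file directly with llama-cpp-python (the binding Khoj uses for offline chat), so llama.cpp's own load-time diagnostics print in your session. A minimal sketch; the model path is a placeholder for wherever the file was downloaded (typically the Hugging Face cache directory):

```python
# Load the GGUF model directly, outside Khoj, to isolate the crash.
# The model_path below is a placeholder -- substitute the real file.
from llama_cpp import Llama

llm = Llama(
    model_path="/path/to/model.gguf",
    n_ctx=2048,    # modest context window to limit memory use
    verbose=True,  # print llama.cpp's load-time diagnostics
)
print(llm("Hello", max_tokens=16))
```

If this also dies with "Aborted (core dumped)", the problem is in the model-loading layer rather than in Khoj itself.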

debanjum commented 1 month ago

How much RAM/VRAM does your machine have? This seems like Khoj has run out of memory and crashed.

Can you also try using one of the smaller (2B, 3B) default models, like Gemma 2 2B, to see if you can get a response from Khoj with them?

You would need to update your ServerChatSettings in the Khoj admin panel at localhost:42110/server/admin
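
To answer the RAM question precisely, here is a quick check you can run from the same Python environment. This assumes the psutil package, which is not a Khoj dependency:

```python
# Report total and available system memory before loading a model.
# Assumes psutil is installed (pip install psutil); not part of Khoj.
import psutil

vm = psutil.virtual_memory()
print(f"Total RAM:     {vm.total / 2**30:.1f} GiB")
print(f"Available RAM: {vm.available / 2**30:.1f} GiB")
```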

SchinkTasia commented 1 month ago

I tested the bartowski/gemma-2-2b-it-GGUF model, but I get the same abort message. The machine has 12GB of RAM in total.

debanjum commented 1 month ago

I see, that does sound strange. 12GB of RAM should be enough for Khoj to work (though without a GPU it'd be slow).

Did you switch the chat model to Gemma 2 2B in both the Khoj Admin panel at http://localhost:42110/server/admin/database/serverchatsettings/ and your user settings at http://localhost:42110/settings?

Just want to make sure this happens even when Khoj is using a single, small chat model.

SchinkTasia commented 1 month ago

Yeah, I changed the settings in both.

It is a GPU installation for an AMD RX 6900 XT with 16GB of VRAM.

debanjum commented 1 month ago

I see, good to know. Not sure what's up; you have a decent-sized GPU. What command did you use to install Khoj with GPU support?

  1. Can you check that you have the required prerequisites to use your GPU with the llama.cpp Python binding we use here?

  2. As a fallback, you can use Khoj with Ollama to get started with offline chat models running on your GPU (see the sketch below this list).
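
If you go the Ollama route, a quick way to sanity-check the server before wiring it into Khoj is to hit its OpenAI-compatible endpoint directly. A minimal sketch, assuming the openai Python package is installed, Ollama is on its default port 11434, and `ollama pull gemma2:2b` has already been run:

```python
# Sanity-check a local Ollama server via its OpenAI-compatible API.
# Assumptions: openai package installed, Ollama on default port 11434,
# and the gemma2:2b model already pulled.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # key is a dummy
resp = client.chat.completions.create(
    model="gemma2:2b",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(resp.choices[0].message.content)
```

If this prints a reply, the serving side works and any remaining problem is in the Khoj configuration.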

SchinkTasia commented 1 month ago

I tried the first step, but everything looks alright.

```
Requirement already satisfied: llama-cpp-python in ./khoj/lib/python3.12/site-packages (0.2.88)
Requirement already satisfied: typing-extensions>=4.5.0 in ./khoj/lib/python3.12/site-packages (from llama-cpp-python) (4.12.2)
Requirement already satisfied: numpy>=1.20.0 in ./khoj/lib/python3.12/site-packages (from llama-cpp-python) (1.26.4)
Requirement already satisfied: diskcache>=5.6.1 in ./khoj/lib/python3.12/site-packages (from llama-cpp-python) (5.6.3)
Requirement already satisfied: jinja2>=2.11.3 in ./khoj/lib/python3.12/site-packages (from llama-cpp-python) (3.1.4)
Requirement already satisfied: MarkupSafe>=2.0 in ./khoj/lib/python3.12/site-packages (from jinja2>=2.11.3->llama-cpp-python) (3.0.0)
```
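
Note that "Requirement already satisfied" only shows the package is installed, not that the wheel was compiled with GPU (ROCm/hipBLAS) support. A small check, assuming a recent llama-cpp-python that exposes llama.cpp's `llama_supports_gpu_offload`:

```python
# Check whether the installed llama-cpp-python wheel can offload to GPU.
# Assumes a recent version exposing llama.cpp's llama_supports_gpu_offload.
import llama_cpp

print("llama-cpp-python version:", llama_cpp.__version__)
print("GPU offload supported:", llama_cpp.llama_supports_gpu_offload())
```

If this prints False, the binding was built CPU-only, so the whole model loads into system RAM regardless of the GPU.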

I will now try the second step.

SchinkTasia commented 1 month ago

I am using an Ubuntu VM, and apparently my GPU never reached the VM. When I try to find the GPU in Ubuntu, I see that the system doesn't have one.

I guess this is the problem. I'll try to fix it and will post here later.
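
To confirm whether the guest can see a GPU at all, one option is to scan the PCI bus from inside the VM (the Python equivalent of `lspci | grep -i vga`). A Linux-only sketch:

```python
# List PCI display controllers (class code 0x03xxxx) visible to this
# Linux system. No output means no GPU is passed through to the VM.
from pathlib import Path

for dev in Path("/sys/bus/pci/devices").iterdir():
    cls = (dev / "class").read_text().strip()
    if cls.startswith("0x03"):  # 0x03 = display controller class
        vendor = (dev / "vendor").read_text().strip()  # 0x1002 = AMD
        print(f"{dev.name}: class={cls} vendor={vendor}")
```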