khoj-ai / khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (e.g. GPT, Claude, Gemini, Llama, Qwen, Mistral).
https://khoj.dev
GNU Affero General Public License v3.0

Is the context length of Llama2 2k or 4k? #394

Closed Ellen7ions closed 1 year ago

Ellen7ions commented 1 year ago

I notice that the context length is hardcoded here: https://github.com/khoj-ai/khoj/blob/02e216c13584c6bd41afb637c7e41fccd84bf8a4/src/khoj/processor/conversation/gpt4all/chat_model.py#L164

But I see that the context length of Llama2 7B is 4k here.

Is this a bug?

sabaimran commented 1 year ago

Hey @Ellen7ions! Thanks for reporting. We're experimenting with this a little bit. I see that the context window can be 2048 tokens, but the number of embeddings can be 4096. I think that may refer to the number of tokens the model can emit, but I'm not sure. We'll need to figure out what the correct value is; we're updating it in #393.
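Whatever the correct window turns out to be, the underlying fix is the same: make the limit a parameter rather than a hardcoded constant, and trim the conversation history to fit it. The sketch below illustrates that idea; it is not the actual Khoj implementation. The function name `truncate_messages` and the parameter `max_prompt_size` are illustrative, and the whitespace split is a crude stand-in for the model's real tokenizer.

```python
def truncate_messages(messages, max_prompt_size=2048):
    """Drop the oldest messages until the estimated token count fits
    within max_prompt_size. Always keeps at least the latest message.

    Token counting uses a naive whitespace split purely for
    illustration; a real implementation would use the model's
    tokenizer to count tokens accurately.
    """
    def estimate_tokens(text):
        return len(text.split())

    messages = list(messages)  # avoid mutating the caller's list
    while (
        len(messages) > 1
        and sum(estimate_tokens(m) for m in messages) > max_prompt_size
    ):
        messages.pop(0)  # discard the oldest message first
    return messages
```

With the limit passed in as an argument, switching between a 2k and 4k context (or any other model-specific value) becomes a configuration change rather than a code edit.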

debanjum commented 1 year ago

This should be addressed in #393. Based on the previous investigation, the Llama2 model we use for Khoj supports a prompt size of 2K.