Hey @Ellen7ions! Thanks for reporting. We're experimenting with this a bit. I see that the context window can be 2048 tokens, while the number of embeddings can be 4096. I think that means the number of tokens the model can emit, but I'm not sure. We'll need to figure out what the correct value is; updating in #393.
This should be addressed in #393. Based on the previous investigation, the Llama2 model we use for Khoj supports a prompt size of 2K.
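To make the implication of a 2K prompt budget concrete: older chat turns have to be dropped before the prompt is rendered. A minimal sketch of that idea, not Khoj's actual code; `truncate_history` and the whitespace-based token count are stand-ins (a real implementation would use the model's tokenizer):

```python
# Sketch only, not Khoj's actual code: trim chat history so the rendered
# prompt stays within a fixed token budget. count_tokens is a crude
# whitespace-based stand-in for the model's real tokenizer.
MAX_PROMPT_TOKENS = 2048  # the 2K prompt size noted above

def count_tokens(text: str) -> int:
    return len(text.split())

def truncate_history(system_prompt: str, messages: list[str]) -> list[str]:
    budget = MAX_PROMPT_TOKENS - count_tokens(system_prompt)
    kept: list[str] = []
    # Walk newest-to-oldest so the most recent turns survive truncation.
    for message in reversed(messages):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return list(reversed(kept))
```

Walking the history newest-to-oldest keeps the most recent turns, which is usually the right trade-off for chat.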
I notice that the context length is hardcoded here: https://github.com/khoj-ai/khoj/blob/02e216c13584c6bd41afb637c7e41fccd84bf8a4/src/khoj/processor/conversation/gpt4all/chat_model.py#L164
But I see that the context length of Llama2 7B is 4K here.
Is this a bug?
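If it is, one possible fix would be to stop hardcoding the value and make it configurable with a safe default. A rough sketch of what I mean, not actual Khoj code; `KHOJ_MAX_PROMPT_SIZE` and `get_max_prompt_size` are invented names, not existing Khoj settings:

```python
import os

# Sketch only: read the max prompt size from the environment instead of
# hardcoding it. KHOJ_MAX_PROMPT_SIZE is a hypothetical variable name.
DEFAULT_MAX_PROMPT_SIZE = 2048  # the currently hardcoded value

def get_max_prompt_size() -> int:
    raw = os.getenv("KHOJ_MAX_PROMPT_SIZE", "")
    try:
        return int(raw)
    except ValueError:
        # Unset or malformed value: fall back to the conservative default.
        return DEFAULT_MAX_PROMPT_SIZE
```

That way users running Llama2 (4K context) could raise the limit without a code change, while the 2K default stays safe for older models.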