c4fun opened this issue 1 year ago
When I changed the `gpt4all_model` to llama2, it did not solve the context problem; the context window is still 2048:
```
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.07 MB
llama_model_load_internal: mem required = 5407.71 MB (+ 1026.00 MB per state)
llama_new_context_with_model: kv self size = 1024.00 MB
```
This seems like a problem in ggml, for which I have not found a solution yet. An issue in the gpt4all repo reflects the same unsolved status: https://github.com/nomic-ai/gpt4all/issues/664#issuecomment-1556233279
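For what it's worth, in llama.cpp-based backends `n_ctx` is a load-time parameter rather than a value read from the ggjt model file, so simply swapping in a LLaMA 2 model does not enlarge the window; the loader has to request a larger context. The gpt4all binding did not expose this at the time (which is what the linked issue is about), but here is a minimal sketch with the llama-cpp-python binding, which does expose it; the model path is a placeholder:

```python
from llama_cpp import Llama

# Sketch: llama.cpp does not infer the trained context length from
# ggjt v3 files, so n_ctx must be passed explicitly at load time.
llm = Llama(
    model_path="./models/llama-2-7b-chat.ggmlv3.q4_0.bin",  # placeholder path
    n_ctx=4096,  # LLaMA 2 was trained with a 4k context window
)
print(llm.n_ctx())  # prints 4096 if the larger context was applied
```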
### Bug behavior
When using the process described in the README.md, I bumped into a context-window error.
### Analysis
The context window of all llama models is 2048, as stated by the `n_ctx` parameter at the beginning of the `python reverie.py` log. Of course, that is because LLaMA 1 was trained with a 2k context length.
### Suggestion
Change the default model in `reverie/backend_server/utils.py` from `orca-mini` to `llama-2`, as llama-2 has a context window of 4k.
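For concreteness, the change might look like the following sketch of `reverie/backend_server/utils.py`, assuming the model is selected through the `gpt4all_model` variable mentioned above; the exact variable name and model filenames are assumptions:

```python
# reverie/backend_server/utils.py (sketch; variable name and filenames are assumptions)

# Old default: orca-mini is LLaMA-1 based, so it inherits the 2048-token limit.
# gpt4all_model = "orca-mini-7b.ggmlv3.q4_0.bin"

# Proposed default: LLaMA 2 was trained with a 4096-token context window.
gpt4all_model = "llama-2-7b-chat.ggmlv3.q4_0.bin"
```

Note that swapping the model file alone only helps if the loader also requests a 4k `n_ctx`; otherwise the window stays at the 2048 default, as the log above shows.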