linfengca opened this issue 1 year ago
Which model are you using, and which format (GGML or GPTQ)?
Encountered a similar issue. I am using Mistral 7B on my PC (RTX A4000):
MODEL_ID = "TheBloke/Mistral-7B-Instruct-v0.1-GGUF"
MODEL_BASENAME = "mistral-7b-instruct-v0.1.Q8_0.gguf"
I have an Intel-based MacBook Pro with 16 GB RAM. Here is the model I used, the zephyr-7B-beta-GGUF model:
MODEL_ID = "TheBloke/zephyr-7B-beta-GGUF"
MODEL_BASENAME = "zephyr-7b-beta.Q4_K_M.gguf"
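For reference, a minimal sketch of how an ID/basename pair like this typically resolves to a local .gguf file, using the standard huggingface_hub API; this is an illustration, not localGPT's exact loading code:

```python
# Hedged sketch: resolving MODEL_ID / MODEL_BASENAME to a local GGUF file.
from huggingface_hub import hf_hub_download

MODEL_ID = "TheBloke/zephyr-7B-beta-GGUF"
MODEL_BASENAME = "zephyr-7b-beta.Q4_K_M.gguf"

model_path = hf_hub_download(repo_id=MODEL_ID, filename=MODEL_BASENAME)
print(model_path)  # local cache path to the quantized model file
```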
When running the run_localGPT.py file, I encountered this error:
pydantic.error_wrappers.ValidationError: 1 validation error for LLMChain
llm none is not an allowed value (type=type_error.none.not_allowed)
I fixed it following others' recommendation with:
pip install llama-cpp-python==0.1.83
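For anyone hitting the same ValidationError: it typically means the model loader returned None (for example, because the installed llama-cpp-python build could not read the GGUF file), and LangChain's LLMChain rejects llm=None at validation time. A minimal sketch of the failure mode and a guard, assuming a LangChain LlamaCpp loader; the wiring is illustrative, not localGPT's exact code:

```python
# Sketch of the failure behind "llm none is not an allowed value": if the
# GGUF fails to load and the exception is swallowed, the loader returns
# None, which LLMChain's pydantic validation then rejects.
from langchain.llms import LlamaCpp
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

def load_llm(model_path: str):
    try:
        # Fails on llama-cpp-python builds without GGUF support
        return LlamaCpp(model_path=model_path, n_ctx=4096, max_tokens=512)
    except Exception as exc:
        print(f"Model failed to load: {exc}")
        return None

llm = load_llm("models/zephyr-7b-beta.Q4_K_M.gguf")
if llm is None:
    # Failing fast here is clearer than pydantic's validation error later
    raise SystemExit("LLM did not load; check your llama-cpp-python version")

chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template("{question}"))
```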
Then I ran the run_localGPT.py file again, and this occurred:
ValueError: Requested tokens (5543) exceed context window of 4096
@PromtEngineer
I've tried changing CONTEXT_WINDOW_SIZE from 4096 to 2048 and then to 1024, and none of these values worked.
I'm not sure where the issue is. Can you help?
@stonez56 you need to change the chunk_size here. Set it to a smaller value such as 700; if that works, keep increasing it toward something like 800. Play around with the value.
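To make the suggestion concrete: the 5543 requested tokens come from the assembled prompt (retrieved chunks plus the question), so shrinking CONTEXT_WINDOW_SIZE only tightens the limit further; you either shrink the chunks or enlarge the window. A hedged sketch of the chunk_size change, assuming the ingestion step uses LangChain's RecursiveCharacterTextSplitter (the values follow the suggestion above; the exact splitter settings in localGPT's ingest.py may differ):

```python
# Sketch: smaller chunks at ingest time -> shorter retrieved context at
# query time, so the assembled prompt fits inside the context window.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.docstore.document import Document

# Placeholder input; localGPT actually loads files from its source folder
documents = [Document(page_content=open("my_notes.txt").read())]

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=700,     # start small, as suggested; try ~800 if this works
    chunk_overlap=200,  # keep overlap so chunks aren't cut mid-thought
)
texts = text_splitter.split_documents(documents)
```

Note that chunks are embedded at ingest time, so after changing chunk_size you will likely need to re-run ingest.py to rebuild the vector store before the change takes effect.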
@PromtEngineer I changed CONTEXT_WINDOW_SIZE in the constants.py file from 4096 to 8192 and it worked! :)
If I change 4096 to 8192 (max_ctx_size = 8192), it raises the same error message...
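The mixed results likely depend on where the number actually lands and on the model: the error only goes away if the value reaches llama-cpp's n_ctx, and only if the model can handle the larger window (Mistral-family models such as Zephyr generally can; a model trained at 4096 may still fail or produce degraded output). A minimal sketch of the wiring this implies, assuming CONTEXT_WINDOW_SIZE is passed through to LangChain's LlamaCpp; the exact plumbing in localGPT may differ:

```python
# Sketch: the constants.py change only takes effect if the value reaches
# llama-cpp-python's n_ctx when the model is constructed.
from langchain.llms import LlamaCpp

CONTEXT_WINDOW_SIZE = 8192  # must exceed the 5543 tokens being requested
MAX_NEW_TOKENS = 512

llm = LlamaCpp(
    model_path="models/zephyr-7b-beta.Q4_K_M.gguf",
    n_ctx=CONTEXT_WINDOW_SIZE,  # the context window llama.cpp allocates
    max_tokens=MAX_NEW_TOKENS,  # leave room for the completion as well
)
```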