amit2103 opened 1 year ago
I get this error too with a desktop 4090 (24GB VRAM) + 32GB system RAM, on Windows 11. Model: TheBloke/Llama-2-7B-Chat-GGML, on cuda (default settings with the default document).
not enough space in the buffer (needed 156291200, largest block available 18104320)
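To put the numbers in the error into perspective, here is a quick back-of-the-envelope conversion (a plain arithmetic sketch; the byte counts are taken directly from the error message above):

```python
# Figures from the error message: the allocator wanted one contiguous
# buffer of ~149 MiB, but the largest free block was only ~17 MiB.
needed = 156_291_200      # bytes requested
available = 18_104_320    # largest free block, in bytes
MIB = 1024 ** 2

print(f"needed: {needed / MIB:.1f} MiB")
print(f"largest free block: {available / MIB:.1f} MiB")
print(f"shortfall: {(needed - available) / MIB:.1f} MiB")
```

So the allocation falls short by well over 100 MiB of contiguous memory, which is why this points at VRAM pressure (or fragmentation) rather than system RAM.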
Edit: I suspect it is related to VRAM and not system RAM, because for the same prompt ("can a state declare its independence from the united states after being part of the united states"), using --device_type cpu works (answer: 'it is not possible for a state to declare its independence from the United States'), but --device_type cuda shows the 'not enough space in the buffer' error.
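The cpu-vs-cuda workaround above can be sketched as a small device-selection helper. This is just an illustrative sketch, not code from the project: the 200 MB threshold is a guess, and it simply falls back to CPU when PyTorch/CUDA is unavailable or free VRAM looks too small.

```python
def choose_device(min_free_bytes: int = 200 * 1024 * 1024) -> str:
    """Return 'cuda' only if CUDA is usable and has enough free VRAM.

    min_free_bytes is an illustrative threshold, not a value from the project.
    """
    try:
        import torch
        if torch.cuda.is_available():
            # mem_get_info() reports (free, total) bytes on the current device.
            free, _total = torch.cuda.mem_get_info()
            if free >= min_free_bytes:
                return "cuda"
    except ImportError:
        pass  # PyTorch not installed: CPU is the only option
    return "cpu"

print(choose_device())
```

Something like this would let the same script run on the 4090 box when VRAM is free and silently degrade to --device_type cpu behavior otherwise.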
Edit2: I used to run the commands from Windows Terminal; I have now switched to PyCharm and to the model Wizard-Vicuna-13B-Uncensored-GPTQ-4bit-128g. I no longer have the issue.
Yes, I am having the same issue. I am using my own application hosted on Streamlit and running into the same error ... so you shifted to Vicuna and it works?
Thanks for creating the awesome project. I was trying to play around with it using a couple of PDFs (mostly around 300 pages each).
I have a laptop 3080 with 16GB VRAM, and I was using MODEL_ID = "TheBloke/Llama-2-7B-Chat-GGML" with MODEL_BASENAME = "llama-2-7b-chat.ggmlv3.q2_K.bin".
I repeatedly get the error